linux-nvdimm.lists.01.org archive mirror
* [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices
@ 2020-02-21  3:26 Alastair D'Silva
  2020-02-21  3:26 ` [PATCH v3 01/27] powerpc: Add OPAL calls for LPC memory alloc/release Alastair D'Silva
                   ` (27 more replies)
  0 siblings, 28 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:26 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

This series adds support for OpenCAPI Persistent Memory devices, exposing
them as nvdimms so that we can make use of the existing infrastructure.

Alastair D'Silva (27):
  powerpc: Add OPAL calls for LPC memory alloc/release
  mm/memory_hotplug: Allow check_hotplug_memory_addressable to be called
    from drivers
  powerpc: Map & release OpenCAPI LPC memory
  ocxl: Remove unnecessary externs
  ocxl: Address kernel doc errors & warnings
  ocxl: Tally up the LPC memory on a link & allow it to be mapped
  ocxl: Add functions to map/unmap LPC memory
  ocxl: Emit a log message showing how much LPC memory was detected
  ocxl: Save the device serial number in ocxl_fn
  powerpc: Add driver for OpenCAPI Persistent Memory
  powerpc: Enable the OpenCAPI Persistent Memory driver for
    powernv_defconfig
  powerpc/powernv/pmem: Add register addresses & status values to the
    header
  powerpc/powernv/pmem: Read the capability registers & wait for device
    ready
  powerpc/powernv/pmem: Add support for Admin commands
  powerpc/powernv/pmem: Add support for near storage commands
  powerpc/powernv/pmem: Register a character device for userspace to
    interact with
  powerpc/powernv/pmem: Implement the Read Error Log command
  powerpc/powernv/pmem: Add controller dump IOCTLs
  powerpc/powernv/pmem: Add an IOCTL to report controller statistics
  powerpc/powernv/pmem: Forward events to userspace
  powerpc/powernv/pmem: Add an IOCTL to request controller health & perf
    data
  powerpc/powernv/pmem: Implement the heartbeat command
  powerpc/powernv/pmem: Add debug IOCTLs
  powerpc/powernv/pmem: Expose SMART data via ndctl
  powerpc/powernv/pmem: Expose the serial number in sysfs
  powerpc/powernv/pmem: Expose the firmware version in sysfs
  MAINTAINERS: Add myself & nvdimm/ocxl to ocxl

 MAINTAINERS                                   |    3 +
 arch/powerpc/configs/powernv_defconfig        |    5 +
 arch/powerpc/include/asm/opal-api.h           |    2 +
 arch/powerpc/include/asm/opal.h               |    3 +
 arch/powerpc/include/asm/pnv-ocxl.h           |   40 +-
 arch/powerpc/platforms/powernv/Kconfig        |    3 +
 arch/powerpc/platforms/powernv/Makefile       |    1 +
 arch/powerpc/platforms/powernv/ocxl.c         |   43 +
 arch/powerpc/platforms/powernv/opal-call.c    |    2 +
 arch/powerpc/platforms/powernv/pmem/Kconfig   |   21 +
 arch/powerpc/platforms/powernv/pmem/Makefile  |    7 +
 arch/powerpc/platforms/powernv/pmem/ocxl.c    | 1991 +++++++++++++++++
 .../platforms/powernv/pmem/ocxl_internal.c    |  213 ++
 .../platforms/powernv/pmem/ocxl_internal.h    |  254 +++
 .../platforms/powernv/pmem/ocxl_sysfs.c       |   46 +
 drivers/misc/ocxl/config.c                    |   74 +-
 drivers/misc/ocxl/core.c                      |   61 +
 drivers/misc/ocxl/link.c                      |   53 +
 drivers/misc/ocxl/ocxl_internal.h             |   45 +-
 include/linux/memory_hotplug.h                |    5 +
 include/misc/ocxl.h                           |  122 +-
 include/uapi/linux/ndctl.h                    |    1 +
 include/uapi/nvdimm/ocxl-pmem.h               |  127 ++
 mm/memory_hotplug.c                           |    4 +-
 24 files changed, 3029 insertions(+), 97 deletions(-)
 create mode 100644 arch/powerpc/platforms/powernv/pmem/Kconfig
 create mode 100644 arch/powerpc/platforms/powernv/pmem/Makefile
 create mode 100644 arch/powerpc/platforms/powernv/pmem/ocxl.c
 create mode 100644 arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
 create mode 100644 arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
 create mode 100644 arch/powerpc/platforms/powernv/pmem/ocxl_sysfs.c
 create mode 100644 include/uapi/nvdimm/ocxl-pmem.h

-- 
2.24.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org


* [PATCH v3 01/27] powerpc: Add OPAL calls for LPC memory alloc/release
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
@ 2020-02-21  3:26 ` Alastair D'Silva
  2020-02-24  5:49   ` Andrew Donnellan
  2020-02-21  3:26 ` [PATCH v3 02/27] mm/memory_hotplug: Allow check_hotplug_memory_addressable to be called from drivers Alastair D'Silva
                   ` (26 subsequent siblings)
  27 siblings, 1 reply; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:26 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

Add OPAL calls for LPC memory alloc/release

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
---
 arch/powerpc/include/asm/opal-api.h        | 2 ++
 arch/powerpc/include/asm/opal.h            | 3 +++
 arch/powerpc/platforms/powernv/opal-call.c | 2 ++
 3 files changed, 7 insertions(+)

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index c1f25a760eb1..9298e603001b 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -208,6 +208,8 @@
 #define OPAL_HANDLE_HMI2			166
 #define	OPAL_NX_COPROC_INIT			167
 #define OPAL_XIVE_GET_VP_STATE			170
+#define OPAL_NPU_MEM_ALLOC			171
+#define OPAL_NPU_MEM_RELEASE			172
 #define OPAL_MPIPL_UPDATE			173
 #define OPAL_MPIPL_REGISTER_TAG			174
 #define OPAL_MPIPL_QUERY_TAG			175
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 9986ac34b8e2..8f7727e0f9ce 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -39,6 +39,9 @@ int64_t opal_npu_spa_clear_cache(uint64_t phb_id, uint32_t bdfn,
 				uint64_t PE_handle);
 int64_t opal_npu_tl_set(uint64_t phb_id, uint32_t bdfn, long cap,
 			uint64_t rate_phys, uint32_t size);
+int64_t opal_npu_mem_alloc(uint64_t phb_id, uint32_t bdfn,
+			uint64_t size, uint64_t *bar);
+int64_t opal_npu_mem_release(uint64_t phb_id, uint32_t bdfn);
 
 int64_t opal_console_write(int64_t term_number, __be64 *length,
 			   const uint8_t *buffer);
diff --git a/arch/powerpc/platforms/powernv/opal-call.c b/arch/powerpc/platforms/powernv/opal-call.c
index 5cd0f52d258f..f26e58b72c04 100644
--- a/arch/powerpc/platforms/powernv/opal-call.c
+++ b/arch/powerpc/platforms/powernv/opal-call.c
@@ -287,6 +287,8 @@ OPAL_CALL(opal_pci_set_pbcq_tunnel_bar,		OPAL_PCI_SET_PBCQ_TUNNEL_BAR);
 OPAL_CALL(opal_sensor_read_u64,			OPAL_SENSOR_READ_U64);
 OPAL_CALL(opal_sensor_group_enable,		OPAL_SENSOR_GROUP_ENABLE);
 OPAL_CALL(opal_nx_coproc_init,			OPAL_NX_COPROC_INIT);
+OPAL_CALL(opal_npu_mem_alloc,			OPAL_NPU_MEM_ALLOC);
+OPAL_CALL(opal_npu_mem_release,			OPAL_NPU_MEM_RELEASE);
 OPAL_CALL(opal_mpipl_update,			OPAL_MPIPL_UPDATE);
 OPAL_CALL(opal_mpipl_register_tag,		OPAL_MPIPL_REGISTER_TAG);
 OPAL_CALL(opal_mpipl_query_tag,			OPAL_MPIPL_QUERY_TAG);
-- 
2.24.1
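
The patch above pairs each new firmware token with a wrapper via OPAL_CALL(). As a rough userspace illustration of that pattern, the sketch below uses a macro to generate wrappers that forward their arguments plus a fixed token to a single entry point. Every sketch_* and SKETCH_* name is hypothetical, the two-argument signature is a simplification, and the entry point only records the token; only the values 171 and 172 come from the patch.

```c
#include <stdint.h>

/* Token values taken from the opal-api.h hunk; the SKETCH_ names are made up */
#define SKETCH_NPU_MEM_ALLOC	171
#define SKETCH_NPU_MEM_RELEASE	172

/* Records the most recent token so the dispatch can be observed */
static int64_t sketch_last_token;

/* Hypothetical stand-in for the firmware entry point: it only records the token */
static int64_t sketch_firmware_entry(int64_t a0, int64_t a1, int64_t token)
{
	(void)a0;
	(void)a1;
	sketch_last_token = token;
	return 0;
}

/* Generate a wrapper that forwards its arguments together with a fixed token */
#define SKETCH_CALL(name, token)				\
static int64_t name(int64_t a0, int64_t a1)			\
{								\
	return sketch_firmware_entry(a0, a1, token);		\
}

SKETCH_CALL(sketch_npu_mem_alloc, SKETCH_NPU_MEM_ALLOC)
SKETCH_CALL(sketch_npu_mem_release, SKETCH_NPU_MEM_RELEASE)
```

Calling sketch_npu_mem_alloc(0, 0) leaves 171 in sketch_last_token; calling sketch_npu_mem_release(0, 0) leaves 172.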

* [PATCH v3 02/27] mm/memory_hotplug: Allow check_hotplug_memory_addressable to be called from drivers
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
  2020-02-21  3:26 ` [PATCH v3 01/27] powerpc: Add OPAL calls for LPC memory alloc/release Alastair D'Silva
@ 2020-02-21  3:26 ` Alastair D'Silva
  2020-02-21  7:03   ` Andrew Donnellan
  2020-02-21  3:26 ` [PATCH v3 03/27] powerpc: Map & release OpenCAPI LPC memory Alastair D'Silva
                   ` (25 subsequent siblings)
  27 siblings, 1 reply; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:26 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

When setting up OpenCAPI-connected persistent memory, the range check may
not be performed until quite late (or perhaps not at all, if the user does
not establish a DAX device).

This patch makes the range check callable so we can perform the check while
probing the OpenCAPI SCM device.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 include/linux/memory_hotplug.h | 5 +++++
 mm/memory_hotplug.c            | 4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index f4d59155f3d4..34a69aecc45e 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -337,6 +337,11 @@ static inline void __remove_memory(int nid, u64 start, u64 size) {}
 extern void set_zone_contiguous(struct zone *zone);
 extern void clear_zone_contiguous(struct zone *zone);
 
+#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
+int check_hotplug_memory_addressable(unsigned long pfn,
+		unsigned long nr_pages);
+#endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */
+
 extern void __ref free_area_init_core_hotplug(int nid);
 extern int __add_memory(int nid, u64 start, u64 size);
 extern int add_memory(int nid, u64 start, u64 size);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0a54ffac8c68..14945f033594 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -276,8 +276,8 @@ static int check_pfn_span(unsigned long pfn, unsigned long nr_pages,
 	return 0;
 }
 
-static int check_hotplug_memory_addressable(unsigned long pfn,
-					    unsigned long nr_pages)
+int check_hotplug_memory_addressable(unsigned long pfn,
+				     unsigned long nr_pages)
 {
 	const u64 max_addr = PFN_PHYS(pfn + nr_pages) - 1;
 
-- 
2.24.1
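
The hunk above shows only the first line of the range check, which computes the last byte of the range from pfn and nr_pages. As a userspace sketch of how such a check could work, the function below rejects a range whose last byte lies above a physical-address limit. The constants SKETCH_PAGE_SHIFT and SKETCH_MAX_PHYSMEM_BITS, the function name, and the -E2BIG return value are assumptions for illustration; the rest of the kernel function's body is not shown in this patch.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	16	/* assumption: 64 KiB pages */
#define SKETCH_MAX_PHYSMEM_BITS	46	/* assumption: 64 TiB addressable */

/* Return 0 if the whole range is addressable, -E2BIG otherwise */
static int sketch_check_addressable(uint64_t pfn, uint64_t nr_pages)
{
	/* Last byte of the range, mirroring PFN_PHYS(pfn + nr_pages) - 1 */
	const uint64_t max_addr = ((pfn + nr_pages) << SKETCH_PAGE_SHIFT) - 1;

	/* Any bit set at or above the limit means the range is out of reach */
	if (max_addr >> SKETCH_MAX_PHYSMEM_BITS)
		return -E2BIG;

	return 0;
}
```

With these constants, a one-page range ending exactly at the 64 TiB boundary passes, and the page immediately after it is rejected.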

* [PATCH v3 03/27] powerpc: Map & release OpenCAPI LPC memory
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
  2020-02-21  3:26 ` [PATCH v3 01/27] powerpc: Add OPAL calls for LPC memory alloc/release Alastair D'Silva
  2020-02-21  3:26 ` [PATCH v3 02/27] mm/memory_hotplug: Allow check_hotplug_memory_addressable to be called from drivers Alastair D'Silva
@ 2020-02-21  3:26 ` Alastair D'Silva
  2020-02-24  2:51   ` Andrew Donnellan
                     ` (2 more replies)
  2020-02-21  3:26 ` [PATCH v3 04/27] ocxl: Remove unnecessary externs Alastair D'Silva
                   ` (24 subsequent siblings)
  27 siblings, 3 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:26 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

This patch adds platform support to map & release LPC memory.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/include/asm/pnv-ocxl.h   |  4 +++
 arch/powerpc/platforms/powernv/ocxl.c | 43 +++++++++++++++++++++++++++
 2 files changed, 47 insertions(+)

diff --git a/arch/powerpc/include/asm/pnv-ocxl.h b/arch/powerpc/include/asm/pnv-ocxl.h
index 7de82647e761..0b2a6707e555 100644
--- a/arch/powerpc/include/asm/pnv-ocxl.h
+++ b/arch/powerpc/include/asm/pnv-ocxl.h
@@ -32,5 +32,9 @@ extern int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle)
 
 extern int pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr);
 extern void pnv_ocxl_free_xive_irq(u32 irq);
+#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
+u64 pnv_ocxl_platform_lpc_setup(struct pci_dev *pdev, u64 size);
+void pnv_ocxl_platform_lpc_release(struct pci_dev *pdev);
+#endif
 
 #endif /* _ASM_PNV_OCXL_H */
diff --git a/arch/powerpc/platforms/powernv/ocxl.c b/arch/powerpc/platforms/powernv/ocxl.c
index 8c65aacda9c8..f2edbcc67361 100644
--- a/arch/powerpc/platforms/powernv/ocxl.c
+++ b/arch/powerpc/platforms/powernv/ocxl.c
@@ -475,6 +475,49 @@ void pnv_ocxl_spa_release(void *platform_data)
 }
 EXPORT_SYMBOL_GPL(pnv_ocxl_spa_release);
 
+#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
+u64 pnv_ocxl_platform_lpc_setup(struct pci_dev *pdev, u64 size)
+{
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	u32 bdfn = pci_dev_id(pdev);
+	__be64 base_addr_be64;
+	u64 base_addr;
+	int rc;
+
+	rc = opal_npu_mem_alloc(phb->opal_id, bdfn, size, &base_addr_be64);
+	if (rc) {
+		dev_warn(&pdev->dev,
+			 "OPAL could not allocate LPC memory, rc=%d\n", rc);
+		return 0;
+	}
+
+	base_addr = be64_to_cpu(base_addr_be64);
+
+	rc = check_hotplug_memory_addressable(base_addr >> PAGE_SHIFT,
+					      size >> PAGE_SHIFT);
+	if (rc)
+		return 0;
+
+	return base_addr;
+}
+EXPORT_SYMBOL_GPL(pnv_ocxl_platform_lpc_setup);
+
+void pnv_ocxl_platform_lpc_release(struct pci_dev *pdev)
+{
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	u32 bdfn = pci_dev_id(pdev);
+	int rc;
+
+	rc = opal_npu_mem_release(phb->opal_id, bdfn);
+	if (rc)
+		dev_warn(&pdev->dev,
+			 "OPAL reported rc=%d when releasing LPC memory\n", rc);
+}
+EXPORT_SYMBOL_GPL(pnv_ocxl_platform_lpc_release);
+#endif
+
 int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle)
 {
 	struct spa_data *data = (struct spa_data *) platform_data;
-- 
2.24.1
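
The setup path in the patch above has three steps: ask firmware for a base address, convert it from big-endian, and range-check it, returning 0 on any failure. The userspace sketch below models that flow. The mock firmware call, the sketch_* names, and the address limit are all hypothetical stand-ins; only the ordering of the steps and the return-0-on-failure convention are taken from the patch.

```c
#include <stdint.h>

#define SKETCH_MAX_PHYSMEM_BITS	46	/* assumption: 64 TiB addressable */

/* Decode a big-endian 64-bit value byte by byte, independent of host endianness */
static uint64_t sketch_be64_to_cpu(const unsigned char be[8])
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | be[i];
	return v;
}

/* Hypothetical firmware stand-in: on rc == 0, writes base as big-endian bytes */
static int sketch_mock_alloc(uint64_t base, int rc, unsigned char out_be[8])
{
	int i;

	if (rc)
		return rc;
	for (i = 7; i >= 0; i--) {
		out_be[i] = base & 0xff;
		base >>= 8;
	}
	return 0;
}

/* Returns the base address in host byte order, or 0 on any failure */
static uint64_t sketch_lpc_setup(uint64_t fw_base, int fw_rc, uint64_t size)
{
	unsigned char base_be[8];
	uint64_t base;

	if (sketch_mock_alloc(fw_base, fw_rc, base_be))
		return 0;	/* firmware refused the allocation */

	base = sketch_be64_to_cpu(base_be);

	/* Reject the range if its last byte is above the assumed limit */
	if ((base + size - 1) >> SKETCH_MAX_PHYSMEM_BITS)
		return 0;

	return base;
}
```

Because 0 doubles as the error value, a caller cannot tell a firmware failure from a failed range check; the sketch keeps that property of the patch rather than changing it.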

* [PATCH v3 04/27] ocxl: Remove unnecessary externs
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (2 preceding siblings ...)
  2020-02-21  3:26 ` [PATCH v3 03/27] powerpc: Map & release OpenCAPI LPC memory Alastair D'Silva
@ 2020-02-21  3:26 ` Alastair D'Silva
  2020-02-21  6:06   ` Andrew Donnellan
                     ` (2 more replies)
  2020-02-21  3:26 ` [PATCH v3 05/27] ocxl: Address kernel doc errors & warnings Alastair D'Silva
                   ` (23 subsequent siblings)
  27 siblings, 3 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:26 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

Function declarations don't need the extern keyword. Remove the existing
ones so the declarations are consistent with newer code.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/include/asm/pnv-ocxl.h | 32 ++++++++++++++---------------
 include/misc/ocxl.h                 |  6 +++---
 2 files changed, 18 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/pnv-ocxl.h b/arch/powerpc/include/asm/pnv-ocxl.h
index 0b2a6707e555..b23c99bc0c84 100644
--- a/arch/powerpc/include/asm/pnv-ocxl.h
+++ b/arch/powerpc/include/asm/pnv-ocxl.h
@@ -9,29 +9,27 @@
 #define PNV_OCXL_TL_BITS_PER_RATE       4
 #define PNV_OCXL_TL_RATE_BUF_SIZE       ((PNV_OCXL_TL_MAX_TEMPLATE+1) * PNV_OCXL_TL_BITS_PER_RATE / 8)
 
-extern int pnv_ocxl_get_actag(struct pci_dev *dev, u16 *base, u16 *enabled,
-			u16 *supported);
-extern int pnv_ocxl_get_pasid_count(struct pci_dev *dev, int *count);
+int pnv_ocxl_get_actag(struct pci_dev *dev, u16 *base, u16 *enabled, u16 *supported);
+int pnv_ocxl_get_pasid_count(struct pci_dev *dev, int *count);
 
-extern int pnv_ocxl_get_tl_cap(struct pci_dev *dev, long *cap,
+int pnv_ocxl_get_tl_cap(struct pci_dev *dev, long *cap,
 			char *rate_buf, int rate_buf_size);
-extern int pnv_ocxl_set_tl_conf(struct pci_dev *dev, long cap,
+int pnv_ocxl_set_tl_conf(struct pci_dev *dev, long cap,
 			uint64_t rate_buf_phys, int rate_buf_size);
 
-extern int pnv_ocxl_get_xsl_irq(struct pci_dev *dev, int *hwirq);
-extern void pnv_ocxl_unmap_xsl_regs(void __iomem *dsisr, void __iomem *dar,
-				void __iomem *tfc, void __iomem *pe_handle);
-extern int pnv_ocxl_map_xsl_regs(struct pci_dev *dev, void __iomem **dsisr,
-				void __iomem **dar, void __iomem **tfc,
-				void __iomem **pe_handle);
+int pnv_ocxl_get_xsl_irq(struct pci_dev *dev, int *hwirq);
+void pnv_ocxl_unmap_xsl_regs(void __iomem *dsisr, void __iomem *dar,
+			     void __iomem *tfc, void __iomem *pe_handle);
+int pnv_ocxl_map_xsl_regs(struct pci_dev *dev, void __iomem **dsisr,
+			  void __iomem **dar, void __iomem **tfc,
+			  void __iomem **pe_handle);
 
-extern int pnv_ocxl_spa_setup(struct pci_dev *dev, void *spa_mem, int PE_mask,
-			void **platform_data);
-extern void pnv_ocxl_spa_release(void *platform_data);
-extern int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle);
+int pnv_ocxl_spa_setup(struct pci_dev *dev, void *spa_mem, int PE_mask, void **platform_data);
+void pnv_ocxl_spa_release(void *platform_data);
+int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle);
 
-extern int pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr);
-extern void pnv_ocxl_free_xive_irq(u32 irq);
+int pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr);
+void pnv_ocxl_free_xive_irq(u32 irq);
 #ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
 u64 pnv_ocxl_platform_lpc_setup(struct pci_dev *pdev, u64 size);
 void pnv_ocxl_platform_lpc_release(struct pci_dev *pdev);
diff --git a/include/misc/ocxl.h b/include/misc/ocxl.h
index 06dd5839e438..0a762e387418 100644
--- a/include/misc/ocxl.h
+++ b/include/misc/ocxl.h
@@ -173,7 +173,7 @@ int ocxl_context_detach(struct ocxl_context *ctx);
  *
  * Returns 0 on success, negative on failure
  */
-extern int ocxl_afu_irq_alloc(struct ocxl_context *ctx, int *irq_id);
+int ocxl_afu_irq_alloc(struct ocxl_context *ctx, int *irq_id);
 
 /**
  * Frees an IRQ associated with an AFU context
@@ -182,7 +182,7 @@ extern int ocxl_afu_irq_alloc(struct ocxl_context *ctx, int *irq_id);
  *
  * Returns 0 on success, negative on failure
  */
-extern int ocxl_afu_irq_free(struct ocxl_context *ctx, int irq_id);
+int ocxl_afu_irq_free(struct ocxl_context *ctx, int irq_id);
 
 /**
  * Gets the address of the trigger page for an IRQ
@@ -193,7 +193,7 @@ extern int ocxl_afu_irq_free(struct ocxl_context *ctx, int irq_id);
  *
  * returns the trigger page address, or 0 if the IRQ is not valid
  */
-extern u64 ocxl_afu_irq_get_addr(struct ocxl_context *ctx, int irq_id);
+u64 ocxl_afu_irq_get_addr(struct ocxl_context *ctx, int irq_id);
 
 /**
  * Provide a callback to be called when an IRQ is triggered
-- 
2.24.1
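
The cleanup above relies on a C rule: a function declaration with no storage-class specifier already has external linkage, so writing extern adds nothing. A minimal sketch, using a hypothetical function name:

```c
/* Both declarations name the same function with external linkage */
extern int sketch_add_one(int x);
int sketch_add_one(int x);

/* One definition satisfies both declarations */
int sketch_add_one(int x)
{
	return x + 1;
}
```

The compiler accepts the two declarations as compatible redeclarations, which is why dropping extern from the headers changes no behaviour.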

* [PATCH v3 05/27] ocxl: Address kernel doc errors & warnings
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (3 preceding siblings ...)
  2020-02-21  3:26 ` [PATCH v3 04/27] ocxl: Remove unnecessary externs Alastair D'Silva
@ 2020-02-21  3:26 ` Alastair D'Silva
  2020-02-24  2:11   ` Andrew Donnellan
  2020-02-21  3:26 ` [PATCH v3 06/27] ocxl: Tally up the LPC memory on a link & allow it to be mapped Alastair D'Silva
                   ` (22 subsequent siblings)
  27 siblings, 1 reply; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:26 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

This patch addresses warnings and errors from the kernel doc scripts for
the OpenCAPI driver.

It also makes minor tweaks to make the docs more consistent.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 drivers/misc/ocxl/config.c        | 24 ++++----
 drivers/misc/ocxl/ocxl_internal.h |  9 +--
 include/misc/ocxl.h               | 96 ++++++++++++-------------------
 3 files changed, 55 insertions(+), 74 deletions(-)

diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
index c8e19bfb5ef9..a62e3d7db2bf 100644
--- a/drivers/misc/ocxl/config.c
+++ b/drivers/misc/ocxl/config.c
@@ -273,16 +273,16 @@ static int read_afu_info(struct pci_dev *dev, struct ocxl_fn_config *fn,
 }
 
 /**
- * Read the template version from the AFU
- * dev: the device for the AFU
- * fn: the AFU offsets
- * len: outputs the template length
- * version: outputs the major<<8,minor version
+ * read_template_version() - Read the template version from the AFU
+ * @dev: the device for the AFU
+ * @fn: the AFU offsets
+ * @len: outputs the template length
+ * @version: outputs the major<<8,minor version
  *
  * Returns 0 on success, negative on failure
  */
 static int read_template_version(struct pci_dev *dev, struct ocxl_fn_config *fn,
-		u16 *len, u16 *version)
+				 u16 *len, u16 *version)
 {
 	u32 val32;
 	u8 major, minor;
@@ -476,16 +476,16 @@ static int validate_afu(struct pci_dev *dev, struct ocxl_afu_config *afu)
 }
 
 /**
- * Populate AFU metadata regarding LPC memory
- * dev: the device for the AFU
- * fn: the AFU offsets
- * afu: the AFU struct to populate the LPC metadata into
+ * read_afu_lpc_memory_info() - Populate AFU metadata regarding LPC memory
+ * @dev: the device for the AFU
+ * @fn: the AFU offsets
+ * @afu: the AFU struct to populate the LPC metadata into
  *
  * Returns 0 on success, negative on failure
  */
 static int read_afu_lpc_memory_info(struct pci_dev *dev,
-				struct ocxl_fn_config *fn,
-				struct ocxl_afu_config *afu)
+				    struct ocxl_fn_config *fn,
+				    struct ocxl_afu_config *afu)
 {
 	int rc;
 	u32 val32;
diff --git a/drivers/misc/ocxl/ocxl_internal.h b/drivers/misc/ocxl/ocxl_internal.h
index 345bf843a38e..198e4e4bc51d 100644
--- a/drivers/misc/ocxl/ocxl_internal.h
+++ b/drivers/misc/ocxl/ocxl_internal.h
@@ -122,11 +122,12 @@ int ocxl_config_check_afu_index(struct pci_dev *dev,
 				struct ocxl_fn_config *fn, int afu_idx);
 
 /**
- * Update values within a Process Element
+ * ocxl_link_update_pe() - Update values within a Process Element
+ * @link_handle: the link handle associated with the process element
+ * @pasid: the PASID for the AFU context
+ * @tid: the new thread id for the process element
  *
- * link_handle: the link handle associated with the process element
- * pasid: the PASID for the AFU context
- * tid: the new thread id for the process element
+ * Returns 0 on success
  */
 int ocxl_link_update_pe(void *link_handle, int pasid, __u16 tid);
 
diff --git a/include/misc/ocxl.h b/include/misc/ocxl.h
index 0a762e387418..357ef1aadbc0 100644
--- a/include/misc/ocxl.h
+++ b/include/misc/ocxl.h
@@ -62,8 +62,7 @@ struct ocxl_context;
 // Device detection & initialisation
 
 /**
- * Open an OpenCAPI function on an OpenCAPI device
- *
+ * ocxl_function_open() - Open an OpenCAPI function on an OpenCAPI device
  * @dev: The PCI device that contains the function
  *
  * Returns an opaque pointer to the function, or an error pointer (check with IS_ERR)
@@ -71,8 +70,7 @@ struct ocxl_context;
 struct ocxl_fn *ocxl_function_open(struct pci_dev *dev);
 
 /**
- * Get the list of AFUs associated with a PCI function device
- *
+ * ocxl_function_afu_list() - Get the list of AFUs associated with a PCI function device
  * Returns a list of struct ocxl_afu *
  *
  * @fn: The OpenCAPI function containing the AFUs
@@ -80,8 +78,7 @@ struct ocxl_fn *ocxl_function_open(struct pci_dev *dev);
 struct list_head *ocxl_function_afu_list(struct ocxl_fn *fn);
 
 /**
- * Fetch an AFU instance from an OpenCAPI function
- *
+ * ocxl_function_fetch_afu() - Fetch an AFU instance from an OpenCAPI function
  * @fn: The OpenCAPI function to get the AFU from
  * @afu_idx: The index of the AFU to get
  *
@@ -92,23 +89,20 @@ struct list_head *ocxl_function_afu_list(struct ocxl_fn *fn);
 struct ocxl_afu *ocxl_function_fetch_afu(struct ocxl_fn *fn, u8 afu_idx);
 
 /**
- * Take a reference to an AFU
- *
+ * ocxl_afu_get() - Take a reference to an AFU
  * @afu: The AFU to increment the reference count on
  */
 void ocxl_afu_get(struct ocxl_afu *afu);
 
 /**
- * Release a reference to an AFU
- *
+ * ocxl_afu_put() - Release a reference to an AFU
  * @afu: The AFU to decrement the reference count on
  */
 void ocxl_afu_put(struct ocxl_afu *afu);
 
 
 /**
- * Get the configuration information for an OpenCAPI function
- *
+ * ocxl_function_config() - Get the configuration information for an OpenCAPI function
  * @fn: The OpenCAPI function to get the config for
  *
  * Returns the function config, or NULL on error
@@ -116,8 +110,7 @@ void ocxl_afu_put(struct ocxl_afu *afu);
 const struct ocxl_fn_config *ocxl_function_config(struct ocxl_fn *fn);
 
 /**
- * Close an OpenCAPI function
- *
+ * ocxl_function_close() - Close an OpenCAPI function
  * This will free any AFUs previously retrieved from the function, and
  * detach and associated contexts. The contexts must by freed by the caller.
  *
@@ -129,8 +122,7 @@ void ocxl_function_close(struct ocxl_fn *fn);
 // Context allocation
 
 /**
- * Allocate an OpenCAPI context
- *
+ * ocxl_context_alloc() - Allocate an OpenCAPI context
  * @context: The OpenCAPI context to allocate, must be freed with ocxl_context_free
  * @afu: The AFU the context belongs to
  * @mapping: The mapping to unmap when the context is closed (may be NULL)
@@ -139,14 +131,13 @@ int ocxl_context_alloc(struct ocxl_context **context, struct ocxl_afu *afu,
 			struct address_space *mapping);
 
 /**
- * Free an OpenCAPI context
- *
+ * ocxl_context_free() - Free an OpenCAPI context
  * @ctx: The OpenCAPI context to free
  */
 void ocxl_context_free(struct ocxl_context *ctx);
 
 /**
- * Grant access to an MM to an OpenCAPI context
+ * ocxl_context_attach() - Grant access to an MM to an OpenCAPI context
  * @ctx: The OpenCAPI context to attach
  * @amr: The value of the AMR register to restrict access
  * @mm: The mm to attach to the context
@@ -157,7 +148,7 @@ int ocxl_context_attach(struct ocxl_context *ctx, u64 amr,
 				struct mm_struct *mm);
 
 /**
- * Detach an MM from an OpenCAPI context
+ * ocxl_context_detach() - Detach an MM from an OpenCAPI context
  * @ctx: The OpenCAPI context to attach
  *
  * Returns 0 on success, negative on failure
@@ -167,7 +158,7 @@ int ocxl_context_detach(struct ocxl_context *ctx);
 // AFU IRQs
 
 /**
- * Allocate an IRQ associated with an AFU context
+ * ocxl_afu_irq_alloc() - Allocate an IRQ associated with an AFU context
  * @ctx: the AFU context
  * @irq_id: out, the IRQ ID
  *
@@ -176,7 +167,7 @@ int ocxl_context_detach(struct ocxl_context *ctx);
 int ocxl_afu_irq_alloc(struct ocxl_context *ctx, int *irq_id);
 
 /**
- * Frees an IRQ associated with an AFU context
+ * ocxl_afu_irq_free() - Frees an IRQ associated with an AFU context
  * @ctx: the AFU context
  * @irq_id: the IRQ ID
  *
@@ -185,7 +176,7 @@ int ocxl_afu_irq_alloc(struct ocxl_context *ctx, int *irq_id);
 int ocxl_afu_irq_free(struct ocxl_context *ctx, int irq_id);
 
 /**
- * Gets the address of the trigger page for an IRQ
+ * ocxl_afu_irq_get_addr() - Gets the address of the trigger page for an IRQ
  * This can then be provided to an AFU which will write to that
  * page to trigger the IRQ.
  * @ctx: The AFU context that the IRQ is associated with
@@ -196,7 +187,7 @@ int ocxl_afu_irq_free(struct ocxl_context *ctx, int irq_id);
 u64 ocxl_afu_irq_get_addr(struct ocxl_context *ctx, int irq_id);
 
 /**
- * Provide a callback to be called when an IRQ is triggered
+ * ocxl_irq_set_handler() - Provide a callback to be called when an IRQ is triggered
  * @ctx: The AFU context that the IRQ is associated with
  * @irq_id: The IRQ ID
  * @handler: the callback to be called when the IRQ is triggered
@@ -213,8 +204,7 @@ int ocxl_irq_set_handler(struct ocxl_context *ctx, int irq_id,
 // AFU Metadata
 
 /**
- * Get a pointer to the config for an AFU
- *
+ * ocxl_afu_config() - Get a pointer to the config for an AFU
  * @afu: a pointer to the AFU to get the config for
  *
  * Returns a pointer to the AFU config
@@ -222,27 +212,24 @@ int ocxl_irq_set_handler(struct ocxl_context *ctx, int irq_id,
 struct ocxl_afu_config *ocxl_afu_config(struct ocxl_afu *afu);
 
 /**
- * Assign opaque hardware specific information to an OpenCAPI AFU.
- *
- * @dev: The PCI device associated with the OpenCAPI device
+ * ocxl_afu_set_private() - Assign opaque hardware specific information to an OpenCAPI AFU.
+ * @afu: The OpenCAPI AFU
  * @private: the opaque hardware specific information to assign to the driver
  */
 void ocxl_afu_set_private(struct ocxl_afu *afu, void *private);
 
 /**
- * Fetch the hardware specific information associated with an external OpenCAPI
- * AFU. This may be consumed by an external OpenCAPI driver.
- *
- * @afu: The AFU
+ * ocxl_afu_get_private() - Fetch the hardware specific information associated with
+ * an external OpenCAPI AFU. This may be consumed by an external OpenCAPI driver.
+ * @afu: The OpenCAPI AFU
  *
  * Returns the opaque pointer associated with the device, or NULL if not set
  */
-void *ocxl_afu_get_private(struct ocxl_afu *dev);
+void *ocxl_afu_get_private(struct ocxl_afu *afu);
 
 // Global MMIO
 /**
- * Read a 32 bit value from global MMIO
- *
+ * ocxl_global_mmio_read32() - Read a 32 bit value from global MMIO
  * @afu: The AFU
  * @offset: The Offset from the start of MMIO
  * @endian: the endianness that the MMIO data is in
@@ -251,11 +238,10 @@ void *ocxl_afu_get_private(struct ocxl_afu *dev);
  * Returns 0 for success, negative on error
  */
 int ocxl_global_mmio_read32(struct ocxl_afu *afu, size_t offset,
-				enum ocxl_endian endian, u32 *val);
+			    enum ocxl_endian endian, u32 *val);
 
 /**
- * Read a 64 bit value from global MMIO
- *
+ * ocxl_global_mmio_read64() - Read a 64 bit value from global MMIO
  * @afu: The AFU
  * @offset: The Offset from the start of MMIO
  * @endian: the endianness that the MMIO data is in
@@ -264,11 +250,10 @@ int ocxl_global_mmio_read32(struct ocxl_afu *afu, size_t offset,
  * Returns 0 for success, negative on error
  */
 int ocxl_global_mmio_read64(struct ocxl_afu *afu, size_t offset,
-				enum ocxl_endian endian, u64 *val);
+			    enum ocxl_endian endian, u64 *val);
 
 /**
- * Write a 32 bit value to global MMIO
- *
+ * ocxl_global_mmio_write32() - Write a 32 bit value to global MMIO
  * @afu: The AFU
  * @offset: The Offset from the start of MMIO
  * @endian: the endianness that the MMIO data is in
@@ -277,11 +262,10 @@ int ocxl_global_mmio_read64(struct ocxl_afu *afu, size_t offset,
  * Returns 0 for success, negative on error
  */
 int ocxl_global_mmio_write32(struct ocxl_afu *afu, size_t offset,
-				enum ocxl_endian endian, u32 val);
+			     enum ocxl_endian endian, u32 val);
 
 /**
- * Write a 64 bit value to global MMIO
- *
+ * ocxl_global_mmio_write64() - Write a 64 bit value to global MMIO
  * @afu: The AFU
  * @offset: The Offset from the start of MMIO
  * @endian: the endianness that the MMIO data is in
@@ -290,11 +274,10 @@ int ocxl_global_mmio_write32(struct ocxl_afu *afu, size_t offset,
  * Returns 0 for success, negative on error
  */
 int ocxl_global_mmio_write64(struct ocxl_afu *afu, size_t offset,
-				enum ocxl_endian endian, u64 val);
+			     enum ocxl_endian endian, u64 val);
 
 /**
- * Set bits in a 32 bit global MMIO register
- *
+ * ocxl_global_mmio_set32() - Set bits in a 32 bit global MMIO register
  * @afu: The AFU
  * @offset: The Offset from the start of MMIO
  * @endian: the endianness that the MMIO data is in
@@ -303,11 +286,10 @@ int ocxl_global_mmio_write64(struct ocxl_afu *afu, size_t offset,
  * Returns 0 for success, negative on error
  */
 int ocxl_global_mmio_set32(struct ocxl_afu *afu, size_t offset,
-				enum ocxl_endian endian, u32 mask);
+			   enum ocxl_endian endian, u32 mask);
 
 /**
- * Set bits in a 64 bit global MMIO register
- *
+ * ocxl_global_mmio_set64() - Set bits in a 64 bit global MMIO register
  * @afu: The AFU
  * @offset: The Offset from the start of MMIO
  * @endian: the endianness that the MMIO data is in
@@ -316,11 +298,10 @@ int ocxl_global_mmio_set32(struct ocxl_afu *afu, size_t offset,
  * Returns 0 for success, negative on error
  */
 int ocxl_global_mmio_set64(struct ocxl_afu *afu, size_t offset,
-				enum ocxl_endian endian, u64 mask);
+			   enum ocxl_endian endian, u64 mask);
 
 /**
- * Set bits in a 32 bit global MMIO register
- *
+ * ocxl_global_mmio_clear32() - Clear bits in a 32 bit global MMIO register
  * @afu: The AFU
  * @offset: The Offset from the start of MMIO
  * @endian: the endianness that the MMIO data is in
@@ -329,11 +310,10 @@ int ocxl_global_mmio_set64(struct ocxl_afu *afu, size_t offset,
  * Returns 0 for success, negative on error
  */
 int ocxl_global_mmio_clear32(struct ocxl_afu *afu, size_t offset,
-				enum ocxl_endian endian, u32 mask);
+			     enum ocxl_endian endian, u32 mask);
 
 /**
- * Set bits in a 64 bit global MMIO register
- *
+ * ocxl_global_mmio_clear64() - Clear bits in a 64 bit global MMIO register
  * @afu: The AFU
  * @offset: The Offset from the start of MMIO
  * @endian: the endianness that the MMIO data is in
@@ -342,7 +322,7 @@ int ocxl_global_mmio_clear32(struct ocxl_afu *afu, size_t offset,
  * Returns 0 for success, negative on error
  */
 int ocxl_global_mmio_clear64(struct ocxl_afu *afu, size_t offset,
-				enum ocxl_endian endian, u64 mask);
+			     enum ocxl_endian endian, u64 mask);
 
 // Functions left here are for compatibility with the cxlflash driver
 
-- 
2.24.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply related	[flat|nested] 130+ messages in thread
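
The set/clear helpers documented in the patch above each take a mask. As a minimal userspace sketch, assuming plain read-modify-write semantics (the real ocxl helpers perform MMIO accesses with endian conversion, which is not modelled here), the intended difference between the two families could look like this. struct fake_reg32, fake_reg_set32() and fake_reg_clear32() are hypothetical names, not part of the ocxl API.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32 bit MMIO register; not part of ocxl. */
struct fake_reg32 {
	uint32_t value;
};

/* Set every bit that is set in mask, leaving the other bits untouched. */
static void fake_reg_set32(struct fake_reg32 *reg, uint32_t mask)
{
	reg->value |= mask;
}

/* Clear every bit that is set in mask, leaving the other bits untouched. */
static void fake_reg_clear32(struct fake_reg32 *reg, uint32_t mask)
{
	reg->value &= ~mask;
}
```

In both cases the mask selects the bits to change, so a caller never needs to know the register's current contents.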

* [PATCH v3 06/27] ocxl: Tally up the LPC memory on a link & allow it to be mapped
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (4 preceding siblings ...)
  2020-02-21  3:26 ` [PATCH v3 05/27] ocxl: Address kernel doc errors & warnings Alastair D'Silva
@ 2020-02-21  3:26 ` Alastair D'Silva
  2020-02-24  5:25   ` Andrew Donnellan
  2020-02-25 16:30   ` Frederic Barrat
  2020-02-21  3:27 ` [PATCH v3 07/27] ocxl: Add functions to map/unmap LPC memory Alastair D'Silva
                   ` (21 subsequent siblings)
  27 siblings, 2 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:26 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

Tally up the LPC memory on an OpenCAPI link & allow it to be mapped

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 drivers/misc/ocxl/core.c          | 10 ++++++
 drivers/misc/ocxl/link.c          | 53 +++++++++++++++++++++++++++++++
 drivers/misc/ocxl/ocxl_internal.h | 33 +++++++++++++++++++
 3 files changed, 96 insertions(+)

diff --git a/drivers/misc/ocxl/core.c b/drivers/misc/ocxl/core.c
index b7a09b21ab36..2531c6cf19a0 100644
--- a/drivers/misc/ocxl/core.c
+++ b/drivers/misc/ocxl/core.c
@@ -230,8 +230,18 @@ static int configure_afu(struct ocxl_afu *afu, u8 afu_idx, struct pci_dev *dev)
 	if (rc)
 		goto err_free_pasid;
 
+	if (afu->config.lpc_mem_size || afu->config.special_purpose_mem_size) {
+		rc = ocxl_link_add_lpc_mem(afu->fn->link, afu->config.lpc_mem_offset,
+					   afu->config.lpc_mem_size +
+					   afu->config.special_purpose_mem_size);
+		if (rc)
+			goto err_free_mmio;
+	}
+
 	return 0;
 
+err_free_mmio:
+	unmap_mmio_areas(afu);
 err_free_pasid:
 	reclaim_afu_pasid(afu);
 err_free_actag:
diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
index 58d111afd9f6..1e039cc5ebe5 100644
--- a/drivers/misc/ocxl/link.c
+++ b/drivers/misc/ocxl/link.c
@@ -84,6 +84,11 @@ struct ocxl_link {
 	int dev;
 	atomic_t irq_available;
 	struct spa *spa;
+	struct mutex lpc_mem_lock; /* protects lpc_mem & lpc_mem_sz */
+	u64 lpc_mem_sz; /* Total amount of LPC memory presented on the link */
+	u64 lpc_mem;
+	int lpc_consumers;
+
 	void *platform_data;
 };
 static struct list_head links_list = LIST_HEAD_INIT(links_list);
@@ -396,6 +401,8 @@ static int alloc_link(struct pci_dev *dev, int PE_mask, struct ocxl_link **out_l
 	if (rc)
 		goto err_spa;
 
+	mutex_init(&link->lpc_mem_lock);
+
 	/* platform specific hook */
 	rc = pnv_ocxl_spa_setup(dev, link->spa->spa_mem, PE_mask,
 				&link->platform_data);
@@ -711,3 +718,49 @@ void ocxl_link_free_irq(void *link_handle, int hw_irq)
 	atomic_inc(&link->irq_available);
 }
 EXPORT_SYMBOL_GPL(ocxl_link_free_irq);
+
+int ocxl_link_add_lpc_mem(void *link_handle, u64 offset, u64 size)
+{
+	struct ocxl_link *link = (struct ocxl_link *) link_handle;
+
+	// Check for overflow
+	if (offset > (offset + size))
+		return -EINVAL;
+
+	mutex_lock(&link->lpc_mem_lock);
+	link->lpc_mem_sz = max(link->lpc_mem_sz, offset + size);
+
+	mutex_unlock(&link->lpc_mem_lock);
+
+	return 0;
+}
+
+u64 ocxl_link_lpc_map(void *link_handle, struct pci_dev *pdev)
+{
+	struct ocxl_link *link = (struct ocxl_link *) link_handle;
+
+	mutex_lock(&link->lpc_mem_lock);
+
+	if (!link->lpc_mem)
+		link->lpc_mem = pnv_ocxl_platform_lpc_setup(pdev, link->lpc_mem_sz);
+
+	if (link->lpc_mem)
+		link->lpc_consumers++;
+	mutex_unlock(&link->lpc_mem_lock);
+
+	return link->lpc_mem;
+}
+
+void ocxl_link_lpc_release(void *link_handle, struct pci_dev *pdev)
+{
+	struct ocxl_link *link = (struct ocxl_link *) link_handle;
+
+	mutex_lock(&link->lpc_mem_lock);
+	WARN_ON(--link->lpc_consumers < 0);
+	if (link->lpc_consumers == 0) {
+		pnv_ocxl_platform_lpc_release(pdev);
+		link->lpc_mem = 0;
+	}
+
+	mutex_unlock(&link->lpc_mem_lock);
+}
diff --git a/drivers/misc/ocxl/ocxl_internal.h b/drivers/misc/ocxl/ocxl_internal.h
index 198e4e4bc51d..d0c8c4838f42 100644
--- a/drivers/misc/ocxl/ocxl_internal.h
+++ b/drivers/misc/ocxl/ocxl_internal.h
@@ -142,4 +142,37 @@ int ocxl_irq_offset_to_id(struct ocxl_context *ctx, u64 offset);
 u64 ocxl_irq_id_to_offset(struct ocxl_context *ctx, int irq_id);
 void ocxl_afu_irq_free_all(struct ocxl_context *ctx);
 
+/**
+ * ocxl_link_add_lpc_mem() - Increment the amount of memory required by an OpenCAPI link
+ *
+ * @link_handle: The OpenCAPI link handle
+ * @offset: The offset of the memory to add
+ * @size: The amount of memory to increment by
+ *
+ * Returns 0 on success, negative on overflow
+ */
+int ocxl_link_add_lpc_mem(void *link_handle, u64 offset, u64 size);
+
+/**
+ * ocxl_link_lpc_map() - Map the LPC memory for an OpenCAPI device
+ * Since LPC memory belongs to a link, the whole LPC memory available
+ * on the link must be mapped in order to make it accessible to a device.
+ * @link_handle: The OpenCAPI link handle
+ * @pdev: A device that is on the link
+ *
+ * Returns the address of the mapped LPC memory, or 0 on error
+ */
+u64 ocxl_link_lpc_map(void *link_handle, struct pci_dev *pdev);
+
+/**
+ * ocxl_link_lpc_release() - Release the LPC memory device for an OpenCAPI device
+ *
+ * Offlines LPC memory on an OpenCAPI link for a device. If this is the
+ * last device on the link to release the memory, unmap it from the link.
+ *
+ * @link_handle: The OpenCAPI link handle
+ * @pdev: A device that is on the link
+ */
+void ocxl_link_lpc_release(void *link_handle, struct pci_dev *pdev);
+
 #endif /* _OCXL_INTERNAL_H_ */
-- 
2.24.1

^ permalink raw reply related	[flat|nested] 130+ messages in thread
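
The bookkeeping added to the link in the patch above can be modelled in userspace. The following is only a sketch under stated assumptions: locking is omitted, the platform mapping call is replaced by a caller-supplied fake base address, and struct link_model with its three helpers are hypothetical names that merely mirror the shape of the patch.

```c
#include <stdint.h>

/* Hypothetical userspace model of the per-link bookkeeping; locking omitted. */
struct link_model {
	uint64_t lpc_mem_sz;	/* highest offset + size seen so far */
	uint64_t lpc_mem;	/* nonzero once "mapped" */
	int lpc_consumers;	/* number of outstanding map calls */
};

/* Record one AFU's range; returns -1 if offset + size wraps around. */
static int link_add_lpc_mem(struct link_model *link, uint64_t offset,
			    uint64_t size)
{
	uint64_t end = offset + size; /* unsigned arithmetic wraps on overflow */

	if (end < offset)
		return -1;
	if (end > link->lpc_mem_sz)
		link->lpc_mem_sz = end;
	return 0;
}

/* The first caller "maps" the memory; later callers only take a reference. */
static uint64_t link_lpc_map(struct link_model *link, uint64_t fake_base)
{
	if (!link->lpc_mem)
		link->lpc_mem = fake_base;
	if (link->lpc_mem)
		link->lpc_consumers++;
	return link->lpc_mem;
}

/* The last caller to release drops the mapping. */
static void link_lpc_release(struct link_model *link)
{
	if (--link->lpc_consumers == 0)
		link->lpc_mem = 0;
}
```

Note that the tally is the maximum of offset + size across AFUs rather than a running sum, so two AFUs with disjoint ranges yield the end of the furthest range.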

* [PATCH v3 07/27] ocxl: Add functions to map/unmap LPC memory
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (5 preceding siblings ...)
  2020-02-21  3:26 ` [PATCH v3 06/27] ocxl: Tally up the LPC memory on a link & allow it to be mapped Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-02-24  6:02   ` Andrew Donnellan
  2020-02-25 17:01   ` Frederic Barrat
  2020-02-21  3:27 ` [PATCH v3 08/27] ocxl: Emit a log message showing how much LPC memory was detected Alastair D'Silva
                   ` (20 subsequent siblings)
  27 siblings, 2 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

Add functions to map/unmap LPC memory

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 drivers/misc/ocxl/core.c          | 51 +++++++++++++++++++++++++++++++
 drivers/misc/ocxl/ocxl_internal.h |  3 ++
 include/misc/ocxl.h               | 21 +++++++++++++
 3 files changed, 75 insertions(+)

diff --git a/drivers/misc/ocxl/core.c b/drivers/misc/ocxl/core.c
index 2531c6cf19a0..75ff14e3882a 100644
--- a/drivers/misc/ocxl/core.c
+++ b/drivers/misc/ocxl/core.c
@@ -210,6 +210,56 @@ static void unmap_mmio_areas(struct ocxl_afu *afu)
 	release_fn_bar(afu->fn, afu->config.global_mmio_bar);
 }
 
+int ocxl_afu_map_lpc_mem(struct ocxl_afu *afu)
+{
+	struct pci_dev *dev = to_pci_dev(afu->fn->dev.parent);
+
+	if ((afu->config.lpc_mem_size + afu->config.special_purpose_mem_size) == 0)
+		return 0;
+
+	afu->lpc_base_addr = ocxl_link_lpc_map(afu->fn->link, dev);
+	if (afu->lpc_base_addr == 0)
+		return -EINVAL;
+
+	if (afu->config.lpc_mem_size > 0) {
+		afu->lpc_res.start = afu->lpc_base_addr + afu->config.lpc_mem_offset;
+		afu->lpc_res.end = afu->lpc_res.start + afu->config.lpc_mem_size - 1;
+	}
+
+	if (afu->config.special_purpose_mem_size > 0) {
+		afu->special_purpose_res.start = afu->lpc_base_addr +
+						 afu->config.special_purpose_mem_offset;
+		afu->special_purpose_res.end = afu->special_purpose_res.start +
+					       afu->config.special_purpose_mem_size - 1;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(ocxl_afu_map_lpc_mem);
+
+struct resource *ocxl_afu_lpc_mem(struct ocxl_afu *afu)
+{
+	return &afu->lpc_res;
+}
+EXPORT_SYMBOL_GPL(ocxl_afu_lpc_mem);
+
+static void unmap_lpc_mem(struct ocxl_afu *afu)
+{
+	struct pci_dev *dev = to_pci_dev(afu->fn->dev.parent);
+
+	if (afu->lpc_res.start || afu->special_purpose_res.start) {
+		void *link = afu->fn->link;
+
+		// only release the link when the last consumer calls release
+		ocxl_link_lpc_release(link, dev);
+
+		afu->lpc_res.start = 0;
+		afu->lpc_res.end = 0;
+		afu->special_purpose_res.start = 0;
+		afu->special_purpose_res.end = 0;
+	}
+}
+
 static int configure_afu(struct ocxl_afu *afu, u8 afu_idx, struct pci_dev *dev)
 {
 	int rc;
@@ -251,6 +301,7 @@ static int configure_afu(struct ocxl_afu *afu, u8 afu_idx, struct pci_dev *dev)
 
 static void deconfigure_afu(struct ocxl_afu *afu)
 {
+	unmap_lpc_mem(afu);
 	unmap_mmio_areas(afu);
 	reclaim_afu_pasid(afu);
 	reclaim_afu_actag(afu);
diff --git a/drivers/misc/ocxl/ocxl_internal.h b/drivers/misc/ocxl/ocxl_internal.h
index d0c8c4838f42..ce0cac1da416 100644
--- a/drivers/misc/ocxl/ocxl_internal.h
+++ b/drivers/misc/ocxl/ocxl_internal.h
@@ -52,6 +52,9 @@ struct ocxl_afu {
 	void __iomem *global_mmio_ptr;
 	u64 pp_mmio_start;
 	void *private;
+	u64 lpc_base_addr; /* Covers both LPC & special purpose memory */
+	struct resource lpc_res;
+	struct resource special_purpose_res;
 };
 
 enum ocxl_context_status {
diff --git a/include/misc/ocxl.h b/include/misc/ocxl.h
index 357ef1aadbc0..d8b0b4d46bfb 100644
--- a/include/misc/ocxl.h
+++ b/include/misc/ocxl.h
@@ -203,6 +203,27 @@ int ocxl_irq_set_handler(struct ocxl_context *ctx, int irq_id,
 
 // AFU Metadata
 
+/**
+ * ocxl_afu_map_lpc_mem() - Map the LPC system & special purpose memory for an AFU
+ * Do not call this during device discovery, as there may be multiple
+ * devices on a link, and the memory is mapped for the whole link, not
+ * just one device. It should only be called after all devices have
+ * registered their memory on the link.
+ *
+ * @afu: The AFU that has the LPC memory to map
+ *
+ * Returns 0 on success, negative on failure
+ */
+int ocxl_afu_map_lpc_mem(struct ocxl_afu *afu);
+
+/**
+ * ocxl_afu_lpc_mem() - Get the physical address range of LPC memory for an AFU
+ * @afu: The AFU associated with the LPC memory
+ *
+ * Returns a pointer to the resource struct for the physical address range
+ */
+struct resource *ocxl_afu_lpc_mem(struct ocxl_afu *afu);
+
 /**
  * ocxl_afu_config() - Get a pointer to the config for an AFU
  * @afu: a pointer to the AFU to get the config for
-- 
2.24.1

^ permalink raw reply related	[flat|nested] 130+ messages in thread
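
The start/end arithmetic in ocxl_afu_map_lpc_mem() above relies on resource ranges being inclusive, which is why each end is computed as start + size - 1. A minimal userspace sketch of that convention follows; struct range, make_range() and range_size() are hypothetical stand-ins for struct resource and its helpers, and size is assumed to be nonzero, as the "> 0" guards in the patch ensure.

```c
#include <stdint.h>

/* Hypothetical simplified analogue of struct resource: an inclusive range. */
struct range {
	uint64_t start;
	uint64_t end;	/* last valid address, so size = end - start + 1 */
};

/* Build the range covering size bytes at base + offset; size must be > 0. */
static struct range make_range(uint64_t base, uint64_t offset, uint64_t size)
{
	struct range r;

	r.start = base + offset;
	r.end = r.start + size - 1;
	return r;
}

/* Recover the size of an inclusive range. */
static uint64_t range_size(const struct range *r)
{
	return r->end - r->start + 1;
}
```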

* [PATCH v3 08/27] ocxl: Emit a log message showing how much LPC memory was detected
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (6 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 07/27] ocxl: Add functions to map/unmap LPC memory Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-02-24  6:06   ` Andrew Donnellan
  2020-02-25 17:03   ` Frederic Barrat
  2020-02-21  3:27 ` [PATCH v3 09/27] ocxl: Save the device serial number in ocxl_fn Alastair D'Silva
                   ` (19 subsequent siblings)
  27 siblings, 2 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

This patch emits a message showing how much LPC memory & special purpose
memory was detected on an OCXL device.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 drivers/misc/ocxl/config.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
index a62e3d7db2bf..701ae6216abf 100644
--- a/drivers/misc/ocxl/config.c
+++ b/drivers/misc/ocxl/config.c
@@ -568,6 +568,10 @@ static int read_afu_lpc_memory_info(struct pci_dev *dev,
 		afu->special_purpose_mem_size =
 			total_mem_size - lpc_mem_size;
 	}
+
+	dev_info(&dev->dev, "Probed LPC memory of %#llx bytes and special purpose memory of %#llx bytes\n",
+		afu->lpc_mem_size, afu->special_purpose_mem_size);
+
 	return 0;
 }
 
-- 
2.24.1

^ permalink raw reply related	[flat|nested] 130+ messages in thread
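
For readers unfamiliar with the %#llx conversion used in the message above, here is a small userspace sketch using standard C snprintf(). format_probe_msg() is a hypothetical helper, and the kernel's own printf implementation may render some values (zero in particular) differently from the C library, so this only illustrates the standard C behaviour.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format the probe message into buf. With standard C, "%#llx" prefixes
 * nonzero values with "0x" and prints zero as a bare "0".
 */
static int format_probe_msg(char *buf, size_t len,
			    unsigned long long lpc_size,
			    unsigned long long special_size)
{
	return snprintf(buf, len,
			"Probed LPC memory of %#llx bytes and special purpose memory of %#llx bytes",
			lpc_size, special_size);
}
```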

* [PATCH v3 09/27] ocxl: Save the device serial number in ocxl_fn
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (7 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 08/27] ocxl: Emit a log message showing how much LPC memory was detected Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-02-21  3:27 ` [PATCH v3 10/27] powerpc: Add driver for OpenCAPI Persistent Memory Alastair D'Silva
                   ` (18 subsequent siblings)
  27 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

This patch retrieves the serial number of the card and makes it available
to consumers of the ocxl driver via the ocxl_fn struct.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
---
 drivers/misc/ocxl/config.c | 46 ++++++++++++++++++++++++++++++++++++++
 include/misc/ocxl.h        |  1 +
 2 files changed, 47 insertions(+)

diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
index 701ae6216abf..ce33fafa7b50 100644
--- a/drivers/misc/ocxl/config.c
+++ b/drivers/misc/ocxl/config.c
@@ -71,6 +71,51 @@ static int find_dvsec_afu_ctrl(struct pci_dev *dev, u8 afu_idx)
 	return 0;
 }
 
+/**
+ * get_function_0() - Find a related PCI device (function 0)
+ * @dev: PCI device to match
+ *
+ * Returns a pointer to the related device, or NULL if not found
+ */
+static struct pci_dev *get_function_0(struct pci_dev *dev)
+{
+	unsigned int devfn = PCI_DEVFN(PCI_SLOT(dev->devfn), 0);
+
+	return pci_get_domain_bus_and_slot(pci_domain_nr(dev->bus),
+					   dev->bus->number, devfn);
+}
+
+static void read_serial(struct pci_dev *dev, struct ocxl_fn_config *fn)
+{
+	u32 low, high;
+	int pos;
+
+	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_DSN);
+	if (pos) {
+		pci_read_config_dword(dev, pos + 0x04, &low);
+		pci_read_config_dword(dev, pos + 0x08, &high);
+
+		fn->serial = low | ((u64)high) << 32;
+
+		return;
+	}
+
+	if (PCI_FUNC(dev->devfn) != 0) {
+		struct pci_dev *related = get_function_0(dev);
+
+		if (!related) {
+			fn->serial = 0;
+			return;
+		}
+
+		read_serial(related, fn);
+		pci_dev_put(related);
+		return;
+	}
+
+	fn->serial = 0;
+}
+
 static void read_pasid(struct pci_dev *dev, struct ocxl_fn_config *fn)
 {
 	u16 val;
@@ -208,6 +253,7 @@ int ocxl_config_read_function(struct pci_dev *dev, struct ocxl_fn_config *fn)
 	int rc;
 
 	read_pasid(dev, fn);
+	read_serial(dev, fn);
 
 	rc = read_dvsec_tl(dev, fn);
 	if (rc) {
diff --git a/include/misc/ocxl.h b/include/misc/ocxl.h
index d8b0b4d46bfb..b8514dc64bd0 100644
--- a/include/misc/ocxl.h
+++ b/include/misc/ocxl.h
@@ -46,6 +46,7 @@ struct ocxl_fn_config {
 	int dvsec_afu_info_pos; /* offset of the AFU information DVSEC */
 	s8 max_pasid_log;
 	s8 max_afu_index;
+	u64 serial;
 };
 
 enum ocxl_endian {
-- 
2.24.1

^ permalink raw reply related	[flat|nested] 130+ messages in thread
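
Two pieces of arithmetic in the patch above are easy to check in isolation: assembling the 64 bit serial from two config-space dwords, and deriving function 0 of the same slot from a devfn. The sketch below restates the usual PCI devfn encoding (5 bit slot, 3 bit function) with local macros; DEVFN*, combine_serial() and function_0_of() are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Local re-statements of the PCI devfn encoding: 5 bit slot, 3 bit function. */
#define DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define DEVFN_SLOT(devfn)	(((devfn) >> 3) & 0x1f)
#define DEVFN_FUNC(devfn)	((devfn) & 0x07)

/* Combine the low and high dwords of the serial number into one value. */
static uint64_t combine_serial(uint32_t low, uint32_t high)
{
	return (uint64_t)low | ((uint64_t)high << 32);
}

/* Return the devfn of function 0 in the same slot as devfn. */
static unsigned int function_0_of(unsigned int devfn)
{
	return DEVFN(DEVFN_SLOT(devfn), 0);
}
```

Casting high to a 64 bit type before shifting matters: shifting a 32 bit value by 32 is undefined in C.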

* [PATCH v3 10/27] powerpc: Add driver for OpenCAPI Persistent Memory
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (8 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 09/27] ocxl: Save the device serial number in ocxl_fn Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-02-26  5:07   ` Andrew Donnellan
                     ` (2 more replies)
  2020-02-21  3:27 ` [PATCH v3 11/27] powerpc: Enable the OpenCAPI Persistent Memory driver for powernv_defconfig Alastair D'Silva
                   ` (17 subsequent siblings)
  27 siblings, 3 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

This driver exposes LPC memory on OpenCAPI pmem cards
as an NVDIMM, allowing the existing nvdimm infrastructure
to be used.

Namespace metadata is stored on the media itself, so
reserve_metadata() maps 1 section's worth of PMEM storage
at the start to hold this. The rest of the PMEM range is registered
with libnvdimm as an nvdimm. ndctl_config_read/write/size() provide
callbacks to libnvdimm to access the metadata.
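
As a rough illustration of the layout described above (not the driver's code), the split between the label area and the remaining pmem range, plus an overflow-safe bounds check for label accesses, could be sketched in userspace as follows. The label area size is passed in as a parameter because its value is not stated here; struct pmem_layout, split_lpc_range() and label_access_ok() are hypothetical names, and the "device too small" check is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

struct pmem_layout {
	uint64_t metadata_start;	/* label area begins at the start of LPC memory */
	uint64_t pmem_start;		/* first byte after the label area */
	uint64_t pmem_end;		/* inclusive end of the LPC memory */
};

/* Split [start, start + size) into label area and pmem; false if too small. */
static bool split_lpc_range(uint64_t start, uint64_t size,
			    uint64_t label_area_size, struct pmem_layout *out)
{
	if (size <= label_area_size)
		return false;

	out->metadata_start = start;
	out->pmem_start = start + label_area_size;
	out->pmem_end = start + size - 1;
	return true;
}

/* Check offset + length against the label area without risking wraparound. */
static bool label_access_ok(uint64_t offset, uint64_t length,
			    uint64_t label_area_size)
{
	return offset <= label_area_size &&
	       length <= label_area_size - offset;
}
```

Comparing length against the remaining space, rather than comparing offset + length against the limit, avoids a wrapped sum slipping past the check.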

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/platforms/powernv/Kconfig        |   3 +
 arch/powerpc/platforms/powernv/Makefile       |   1 +
 arch/powerpc/platforms/powernv/pmem/Kconfig   |  15 +
 arch/powerpc/platforms/powernv/pmem/Makefile  |   7 +
 arch/powerpc/platforms/powernv/pmem/ocxl.c    | 473 ++++++++++++++++++
 .../platforms/powernv/pmem/ocxl_internal.h    |  28 ++
 6 files changed, 527 insertions(+)
 create mode 100644 arch/powerpc/platforms/powernv/pmem/Kconfig
 create mode 100644 arch/powerpc/platforms/powernv/pmem/Makefile
 create mode 100644 arch/powerpc/platforms/powernv/pmem/ocxl.c
 create mode 100644 arch/powerpc/platforms/powernv/pmem/ocxl_internal.h

diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
index 938803eab0ad..fc8976af0e52 100644
--- a/arch/powerpc/platforms/powernv/Kconfig
+++ b/arch/powerpc/platforms/powernv/Kconfig
@@ -50,3 +50,6 @@ config PPC_VAS
 config SCOM_DEBUGFS
 	bool "Expose SCOM controllers via debugfs"
 	depends on DEBUG_FS
+
+source "arch/powerpc/platforms/powernv/pmem/Kconfig"
+
diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile
index c0f8120045c3..0bbd72988b6f 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -21,3 +21,4 @@ obj-$(CONFIG_PPC_VAS)	+= vas.o vas-window.o vas-debug.o
 obj-$(CONFIG_OCXL_BASE)	+= ocxl.o
 obj-$(CONFIG_SCOM_DEBUGFS) += opal-xscom.o
 obj-$(CONFIG_PPC_SECURE_BOOT) += opal-secvar.o
+obj-$(CONFIG_LIBNVDIMM) += pmem/
diff --git a/arch/powerpc/platforms/powernv/pmem/Kconfig b/arch/powerpc/platforms/powernv/pmem/Kconfig
new file mode 100644
index 000000000000..c5d927520920
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/pmem/Kconfig
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: GPL-2.0-only
+if LIBNVDIMM
+
+config OCXL_PMEM
+	tristate "OpenCAPI Persistent Memory"
+	depends on LIBNVDIMM && PPC_POWERNV && PCI && EEH && ZONE_DEVICE && OCXL
+	help
+	  Exposes devices that implement the OpenCAPI Storage Class Memory
+	  specification as persistent memory regions. You may also want
+	  DEV_DAX, DEV_DAX_PMEM & FS_DAX if you plan on using DAX devices
+	  stacked on top of this driver.
+
+	  Select N if unsure.
+
+endif
diff --git a/arch/powerpc/platforms/powernv/pmem/Makefile b/arch/powerpc/platforms/powernv/pmem/Makefile
new file mode 100644
index 000000000000..1c55c4193175
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/pmem/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+
+ccflags-$(CONFIG_PPC_WERROR)	+= -Werror
+
+obj-$(CONFIG_OCXL_PMEM) += ocxlpmem.o
+
+ocxlpmem-y := ocxl.o
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
new file mode 100644
index 000000000000..3c4eeb5dcc0f
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
@@ -0,0 +1,473 @@
+// SPDX-License-Id
+// Copyright 2019 IBM Corp.
+
+/*
+ * A driver for OpenCAPI devices that implement the Storage Class
+ * Memory specification.
+ */
+
+#include <linux/module.h>
+#include <misc/ocxl.h>
+#include <linux/ndctl.h>
+#include <linux/mm_types.h>
+#include <linux/memory_hotplug.h>
+#include "ocxl_internal.h"
+
+
+static const struct pci_device_id ocxlpmem_pci_tbl[] = {
+	{ PCI_DEVICE(PCI_VENDOR_ID_IBM, 0x0625), },
+	{ }
+};
+
+MODULE_DEVICE_TABLE(pci, ocxlpmem_pci_tbl);
+
+#define NUM_MINORS 256 // Total to reserve
+
+static dev_t ocxlpmem_dev;
+static struct class *ocxlpmem_class;
+static struct mutex minors_idr_lock;
+static struct idr minors_idr;
+
+/**
+ * ndctl_config_write() - Handle a ND_CMD_SET_CONFIG_DATA command from ndctl
+ * @ocxlpmem: the device metadata
+ * @command: the incoming data to write
+ * Return: 0 on success, negative on failure
+ */
+static int ndctl_config_write(struct ocxlpmem *ocxlpmem,
+			      struct nd_cmd_set_config_hdr *command)
+{
+	if (command->in_offset + command->in_length > LABEL_AREA_SIZE)
+		return -EINVAL;
+
+	memcpy_flushcache(ocxlpmem->metadata_addr + command->in_offset, command->in_buf,
+			  command->in_length);
+
+	return 0;
+}
+
+/**
+ * ndctl_config_read() - Handle a ND_CMD_GET_CONFIG_DATA command from ndctl
+ * @ocxlpmem: the device metadata
+ * @command: the read request
+ * Return: 0 on success, negative on failure
+ */
+static int ndctl_config_read(struct ocxlpmem *ocxlpmem,
+			     struct nd_cmd_get_config_data_hdr *command)
+{
+	if (command->in_offset + command->in_length > LABEL_AREA_SIZE)
+		return -EINVAL;
+
+	memcpy_mcsafe(command->out_buf, ocxlpmem->metadata_addr + command->in_offset,
+		      command->in_length);
+
+	return 0;
+}
+
+/**
+ * ndctl_config_size() - Handle a ND_CMD_GET_CONFIG_SIZE command from ndctl
+ * @command: the read request
+ * Return: 0 on success, negative on failure
+ */
+static int ndctl_config_size(struct nd_cmd_get_config_size *command)
+{
+	command->status = 0;
+	command->config_size = LABEL_AREA_SIZE;
+	command->max_xfer = PAGE_SIZE;
+
+	return 0;
+}
+
+static int ndctl(struct nvdimm_bus_descriptor *nd_desc,
+		 struct nvdimm *nvdimm,
+		 unsigned int cmd, void *buf, unsigned int buf_len, int *cmd_rc)
+{
+	struct ocxlpmem *ocxlpmem = container_of(nd_desc, struct ocxlpmem, bus_desc);
+
+	switch (cmd) {
+	case ND_CMD_GET_CONFIG_SIZE:
+		*cmd_rc = ndctl_config_size(buf);
+		return 0;
+
+	case ND_CMD_GET_CONFIG_DATA:
+		*cmd_rc = ndctl_config_read(ocxlpmem, buf);
+		return 0;
+
+	case ND_CMD_SET_CONFIG_DATA:
+		*cmd_rc = ndctl_config_write(ocxlpmem, buf);
+		return 0;
+
+	default:
+		return -ENOTTY;
+	}
+}
+
+/**
+ * reserve_metadata() - Reserve space for nvdimm metadata
+ * @ocxlpmem: the device metadata
+ * @lpc_mem: The resource representing the LPC memory of the OpenCAPI device
+ */
+static int reserve_metadata(struct ocxlpmem *ocxlpmem,
+			    struct resource *lpc_mem)
+{
+	ocxlpmem->metadata_addr = devm_memremap(&ocxlpmem->dev, lpc_mem->start,
+						LABEL_AREA_SIZE, MEMREMAP_WB);
+	if (IS_ERR(ocxlpmem->metadata_addr))
+		return PTR_ERR(ocxlpmem->metadata_addr);
+
+	return 0;
+}
+
+/**
+ * register_lpc_mem() - Discover persistent memory on a device and register it with the NVDIMM subsystem
+ * @ocxlpmem: the device metadata
+ * Return: 0 on success
+ */
+static int register_lpc_mem(struct ocxlpmem *ocxlpmem)
+{
+	struct nd_region_desc region_desc;
+	struct nd_mapping_desc nd_mapping_desc;
+	struct resource *lpc_mem;
+	const struct ocxl_afu_config *config;
+	const struct ocxl_fn_config *fn_config;
+	int rc;
+	unsigned long nvdimm_cmd_mask = 0;
+	unsigned long nvdimm_flags = 0;
+	int target_node;
+	char serial[16+1];
+
+	// Set up the reserved metadata area
+	rc = ocxl_afu_map_lpc_mem(ocxlpmem->ocxl_afu);
+	if (rc < 0)
+		return rc;
+
+	lpc_mem = ocxl_afu_lpc_mem(ocxlpmem->ocxl_afu);
+	if (lpc_mem == NULL || lpc_mem->start == 0)
+		return -EINVAL;
+
+	config = ocxl_afu_config(ocxlpmem->ocxl_afu);
+	fn_config = ocxl_function_config(ocxlpmem->ocxl_fn);
+
+	rc = reserve_metadata(ocxlpmem, lpc_mem);
+	if (rc)
+		return rc;
+
+	ocxlpmem->bus_desc.provider_name = "ocxl-pmem";
+	ocxlpmem->bus_desc.ndctl = ndctl;
+	ocxlpmem->bus_desc.module = THIS_MODULE;
+
+	ocxlpmem->nvdimm_bus = nvdimm_bus_register(&ocxlpmem->dev,
+						   &ocxlpmem->bus_desc);
+	if (!ocxlpmem->nvdimm_bus)
+		return -EINVAL;
+
+	ocxlpmem->pmem_res.start = (u64)lpc_mem->start + LABEL_AREA_SIZE;
+	ocxlpmem->pmem_res.end = (u64)lpc_mem->start + config->lpc_mem_size - 1;
+	ocxlpmem->pmem_res.name = "OpenCAPI persistent memory";
+
+	set_bit(ND_CMD_GET_CONFIG_SIZE, &nvdimm_cmd_mask);
+	set_bit(ND_CMD_GET_CONFIG_DATA, &nvdimm_cmd_mask);
+	set_bit(ND_CMD_SET_CONFIG_DATA, &nvdimm_cmd_mask);
+
+	set_bit(NDD_ALIASING, &nvdimm_flags);
+
+	snprintf(serial, sizeof(serial), "%llx", fn_config->serial);
+	nd_mapping_desc.nvdimm = nvdimm_create(ocxlpmem->nvdimm_bus, ocxlpmem,
+				 NULL, nvdimm_flags, nvdimm_cmd_mask,
+				 0, NULL);
+	if (!nd_mapping_desc.nvdimm)
+		return -ENOMEM;
+
+	if (nvdimm_bus_check_dimm_count(ocxlpmem->nvdimm_bus, 1))
+		return -EINVAL;
+
+	nd_mapping_desc.start = ocxlpmem->pmem_res.start;
+	nd_mapping_desc.size = resource_size(&ocxlpmem->pmem_res);
+	nd_mapping_desc.position = 0;
+
+	ocxlpmem->nd_set.cookie1 = fn_config->serial + 1; // allow for empty serial
+	ocxlpmem->nd_set.cookie2 = fn_config->serial + 1;
+
+	target_node = of_node_to_nid(ocxlpmem->pdev->dev.of_node);
+
+	memset(&region_desc, 0, sizeof(region_desc));
+	region_desc.res = &ocxlpmem->pmem_res;
+	region_desc.numa_node = NUMA_NO_NODE;
+	region_desc.target_node = target_node;
+	region_desc.num_mappings = 1;
+	region_desc.mapping = &nd_mapping_desc;
+	region_desc.nd_set = &ocxlpmem->nd_set;
+
+	set_bit(ND_REGION_PAGEMAP, &region_desc.flags);
+	/*
+	 * NB: libnvdimm copies the data from region_desc into its own
+	 * structures so passing a stack pointer is fine.
+	 */
+	ocxlpmem->nd_region = nvdimm_pmem_region_create(ocxlpmem->nvdimm_bus,
+							&region_desc);
+	if (!ocxlpmem->nd_region)
+		return -EINVAL;
+
+	dev_info(&ocxlpmem->dev,
+		 "Onlining %lluMB of persistent memory\n",
+		 nd_mapping_desc.size / SZ_1M);
+
+	return 0;
+}
+
+/**
+ * allocate_minor() - Allocate a minor number to use for an OpenCAPI pmem device
+ * @ocxlpmem: the device metadata
+ * Return: the allocated minor number
+ */
+static int allocate_minor(struct ocxlpmem *ocxlpmem)
+{
+	int minor;
+
+	mutex_lock(&minors_idr_lock);
+	minor = idr_alloc(&minors_idr, ocxlpmem, 0, NUM_MINORS, GFP_KERNEL);
+	mutex_unlock(&minors_idr_lock);
+	return minor;
+}
+
+static void free_minor(struct ocxlpmem *ocxlpmem)
+{
+	mutex_lock(&minors_idr_lock);
+	idr_remove(&minors_idr, MINOR(ocxlpmem->dev.devt));
+	mutex_unlock(&minors_idr_lock);
+}
+
+/**
+ * free_ocxlpmem() - Free all members of an ocxlpmem struct
+ * @ocxlpmem: the device struct to clear
+ */
+static void free_ocxlpmem(struct ocxlpmem *ocxlpmem)
+{
+	int rc;
+
+	if (ocxlpmem->nvdimm_bus)
+		nvdimm_bus_unregister(ocxlpmem->nvdimm_bus);
+
+	free_minor(ocxlpmem);
+
+	if (ocxlpmem->metadata_addr)
+		devm_memunmap(&ocxlpmem->dev, ocxlpmem->metadata_addr);
+
+	if (ocxlpmem->ocxl_context) {
+		rc = ocxl_context_detach(ocxlpmem->ocxl_context);
+		if (rc == -EBUSY)
+			dev_warn(&ocxlpmem->dev, "Timeout detaching ocxl context\n");
+		else
+			ocxl_context_free(ocxlpmem->ocxl_context);
+
+	}
+
+	if (ocxlpmem->ocxl_afu)
+		ocxl_afu_put(ocxlpmem->ocxl_afu);
+
+	if (ocxlpmem->ocxl_fn)
+		ocxl_function_close(ocxlpmem->ocxl_fn);
+
+	kfree(ocxlpmem);
+}
+
+/**
+ * free_ocxlpmem_dev() - Free an OpenCAPI persistent memory device
+ * @dev: The device struct
+ */
+static void free_ocxlpmem_dev(struct device *dev)
+{
+	struct ocxlpmem *ocxlpmem = container_of(dev, struct ocxlpmem, dev);
+
+	free_ocxlpmem(ocxlpmem);
+}
+
+/**
+ * ocxlpmem_register() - Register an OpenCAPI pmem device with the kernel
+ * @ocxlpmem: the device metadata
+ * Return: 0 on success, negative on failure
+ */
+static int ocxlpmem_register(struct ocxlpmem *ocxlpmem)
+{
+	int rc;
+	int minor = allocate_minor(ocxlpmem);
+
+	if (minor < 0)
+		return minor;
+
+	ocxlpmem->dev.release = free_ocxlpmem_dev;
+	rc = dev_set_name(&ocxlpmem->dev, "ocxlpmem%d", minor);
+	if (rc < 0)
+		return rc;
+
+	ocxlpmem->dev.devt = MKDEV(MAJOR(ocxlpmem_dev), minor);
+	ocxlpmem->dev.class = ocxlpmem_class;
+	ocxlpmem->dev.parent = &ocxlpmem->pdev->dev;
+
+	return device_register(&ocxlpmem->dev);
+}
+
+/**
+ * ocxlpmem_remove() - Free an OpenCAPI persistent memory device
+ * @pdev: the PCI device information struct
+ */
+static void ocxlpmem_remove(struct pci_dev *pdev)
+{
+	if (PCI_FUNC(pdev->devfn) == 0) {
+		struct ocxlpmem_function0 *func0 = pci_get_drvdata(pdev);
+
+		if (func0) {
+			ocxl_function_close(func0->ocxl_fn);
+			func0->ocxl_fn = NULL;
+		}
+	} else {
+		struct ocxlpmem *ocxlpmem = pci_get_drvdata(pdev);
+
+		if (ocxlpmem)
+			device_unregister(&ocxlpmem->dev);
+	}
+}
+
+/**
+ * probe_function0() - Set up function 0 for an OpenCAPI persistent memory device
+ * This is important as it enables templates higher than 0 across all other functions,
+ * which in turn enables higher bandwidth accesses
+ * @pdev: the PCI device information struct
+ * Return: 0 on success, negative on failure
+ */
+static int probe_function0(struct pci_dev *pdev)
+{
+	struct ocxlpmem_function0 *func0 = NULL;
+	struct ocxl_fn *fn;
+
+	func0 = kzalloc(sizeof(*func0), GFP_KERNEL);
+	if (!func0)
+		return -ENOMEM;
+
+	func0->pdev = pdev;
+	fn = ocxl_function_open(pdev);
+	if (IS_ERR(fn)) {
+		kfree(func0);
+		dev_err(&pdev->dev, "failed to open OCXL function\n");
+		return PTR_ERR(fn);
+	}
+	func0->ocxl_fn = fn;
+
+	pci_set_drvdata(pdev, func0);
+
+	return 0;
+}
+
+/**
+ * probe() - Init an OpenCAPI persistent memory device
+ * @pdev: the PCI device information struct
+ * @ent: The entry from ocxlpmem_pci_tbl
+ * Return: 0 on success, negative on failure
+ */
+static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
+{
+	struct ocxlpmem *ocxlpmem;
+	int rc;
+
+	if (PCI_FUNC(pdev->devfn) == 0)
+		return probe_function0(pdev);
+	else if (PCI_FUNC(pdev->devfn) != 1)
+		return 0;
+
+	ocxlpmem = kzalloc(sizeof(*ocxlpmem), GFP_KERNEL);
+	if (!ocxlpmem) {
+		dev_err(&pdev->dev, "Could not allocate OpenCAPI persistent memory metadata\n");
+		rc = -ENOMEM;
+		goto err;
+	}
+	ocxlpmem->pdev = pdev;
+
+	pci_set_drvdata(pdev, ocxlpmem);
+
+	ocxlpmem->ocxl_fn = ocxl_function_open(pdev);
+	if (IS_ERR(ocxlpmem->ocxl_fn)) {
+		rc = PTR_ERR(ocxlpmem->ocxl_fn);
+		kfree(ocxlpmem);
+		pci_set_drvdata(pdev, NULL);
+		dev_err(&pdev->dev, "failed to open OCXL function\n");
+		goto err;
+	}
+
+	ocxlpmem->ocxl_afu = ocxl_function_fetch_afu(ocxlpmem->ocxl_fn, 0);
+	if (ocxlpmem->ocxl_afu == NULL) {
+		dev_err(&pdev->dev, "Could not get OCXL AFU from function\n");
+		rc = -ENXIO;
+		goto err;
+	}
+
+	ocxl_afu_get(ocxlpmem->ocxl_afu);
+
+	// Resources allocated below here are cleaned up in the release handler
+
+	rc = ocxlpmem_register(ocxlpmem);
+	if (rc) {
+		dev_err(&pdev->dev, "Could not register OpenCAPI persistent memory device with the kernel\n");
+		goto err;
+	}
+
+	rc = ocxl_context_alloc(&ocxlpmem->ocxl_context, ocxlpmem->ocxl_afu, NULL);
+	if (rc) {
+		dev_err(&pdev->dev, "Could not allocate OCXL context\n");
+		goto err;
+	}
+
+	rc = ocxl_context_attach(ocxlpmem->ocxl_context, 0, NULL);
+	if (rc) {
+		dev_err(&pdev->dev, "Could not attach ocxl context\n");
+		goto err;
+	}
+
+	rc = register_lpc_mem(ocxlpmem);
+	if (rc) {
+		dev_err(&pdev->dev, "Could not register OpenCAPI persistent memory with libnvdimm\n");
+		goto err;
+	}
+
+	return 0;
+
+err:
+	/*
+	 * Further cleanup is done in the release handler via free_ocxlpmem().
+	 * This allows us to keep the character device live to handle IOCTLs to
+	 * investigate issues if the card has an error.
+	 */
+
+	dev_err(&pdev->dev,
+		"Error detected, will not register OpenCAPI persistent memory\n");
+	return rc;
+}
+
+static struct pci_driver pci_driver = {
+	.name = "ocxl-pmem",
+	.id_table = ocxlpmem_pci_tbl,
+	.probe = probe,
+	.remove = ocxlpmem_remove,
+	.shutdown = ocxlpmem_remove,
+};
+
+static int __init ocxlpmem_init(void)
+{
+	int rc = 0;
+
+	rc = pci_register_driver(&pci_driver);
+	if (rc)
+		return rc;
+
+	return 0;
+}
+
+static void ocxlpmem_exit(void)
+{
+	pci_unregister_driver(&pci_driver);
+}
+
+module_init(ocxlpmem_init);
+module_exit(ocxlpmem_exit);
+
+MODULE_DESCRIPTION("OpenCAPI Persistent Memory");
+MODULE_LICENSE("GPL");
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
new file mode 100644
index 000000000000..0faf3740e9b8
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
@@ -0,0 +1,28 @@
+// SPDX-License-Identifier: GPL-2.0+
+// Copyright 2019 IBM Corp.
+
+#include <linux/pci.h>
+#include <misc/ocxl.h>
+#include <linux/libnvdimm.h>
+#include <linux/mm.h>
+
+#define LABEL_AREA_SIZE	(1UL << PA_SECTION_SHIFT)
+
+struct ocxlpmem_function0 {
+	struct pci_dev *pdev;
+	struct ocxl_fn *ocxl_fn;
+};
+
+struct ocxlpmem {
+	struct device dev;
+	struct pci_dev *pdev;
+	struct ocxl_fn *ocxl_fn;
+	struct nd_interleave_set nd_set;
+	struct nvdimm_bus_descriptor bus_desc;
+	struct nvdimm_bus *nvdimm_bus;
+	struct ocxl_afu *ocxl_afu;
+	struct ocxl_context *ocxl_context;
+	void *metadata_addr;
+	struct resource pmem_res;
+	struct nd_region *nd_region;
+};
-- 
2.24.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org


* [PATCH v3 11/27] powerpc: Enable the OpenCAPI Persistent Memory driver for powernv_defconfig
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (9 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 10/27] powerpc: Add driver for OpenCAPI Persistent Memory Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-02-25  3:01   ` Andrew Donnellan
  2020-02-21  3:27 ` [PATCH v3 12/27] powerpc/powernv/pmem: Add register addresses & status values to the header Alastair D'Silva
                   ` (16 subsequent siblings)
  27 siblings, 1 reply; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

This patch enables the OpenCAPI Persistent Memory driver, as well
as DAX support, for the 'powernv' platform.

DAX is not a strict requirement for the functioning of the driver, but it
is likely that a user will want to create a DAX device on top of their
persistent memory device.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/configs/powernv_defconfig | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/configs/powernv_defconfig b/arch/powerpc/configs/powernv_defconfig
index 71749377d164..921d77bbd3d2 100644
--- a/arch/powerpc/configs/powernv_defconfig
+++ b/arch/powerpc/configs/powernv_defconfig
@@ -348,3 +348,8 @@ CONFIG_KVM_BOOK3S_64=m
 CONFIG_KVM_BOOK3S_64_HV=m
 CONFIG_VHOST_NET=m
 CONFIG_PRINTK_TIME=y
+CONFIG_ZONE_DEVICE=y
+CONFIG_OCXL_PMEM=m
+CONFIG_DEV_DAX=m
+CONFIG_DEV_DAX_PMEM=m
+CONFIG_FS_DAX=y
-- 
2.24.1

* [PATCH v3 12/27] powerpc/powernv/pmem: Add register addresses & status values to the header
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (10 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 11/27] powerpc: Enable the OpenCAPI Persistent Memory driver for powernv_defconfig Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-02-27  5:08   ` Andrew Donnellan
  2020-02-21  3:27 ` [PATCH v3 13/27] powerpc/powernv/pmem: Read the capability registers & wait for device ready Alastair D'Silva
                   ` (15 subsequent siblings)
  27 siblings, 1 reply; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

These values have been taken from the device specifications.
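
As a rough illustration of how the CHI status bits in this header can be
consumed, here is a minimal userspace sketch. The bit positions are copied
from the header below; BIT_ULL() is redefined locally as a stand-in for the
kernel macro, and chi_all_set() and chi_unknown_bits() are hypothetical
helpers that are not part of this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's BIT_ULL() macro. */
#define BIT_ULL(n) (1ULL << (n))

/* Bit positions copied from the header added by this patch. */
#define GLOBAL_MMIO_CHI_ACRA	BIT_ULL(0)
#define GLOBAL_MMIO_CHI_NSCRA	BIT_ULL(1)
#define GLOBAL_MMIO_CHI_CRDY	BIT_ULL(4)
#define GLOBAL_MMIO_CHI_CFFS	BIT_ULL(5)
#define GLOBAL_MMIO_CHI_MA	BIT_ULL(6)
#define GLOBAL_MMIO_CHI_ELA	BIT_ULL(7)
#define GLOBAL_MMIO_CHI_CDA	BIT_ULL(8)
#define GLOBAL_MMIO_CHI_CHFS	BIT_ULL(9)

#define GLOBAL_MMIO_CHI_ALL	(GLOBAL_MMIO_CHI_ACRA | \
				 GLOBAL_MMIO_CHI_NSCRA | \
				 GLOBAL_MMIO_CHI_CRDY | \
				 GLOBAL_MMIO_CHI_CFFS | \
				 GLOBAL_MMIO_CHI_MA | \
				 GLOBAL_MMIO_CHI_ELA | \
				 GLOBAL_MMIO_CHI_CDA | \
				 GLOBAL_MMIO_CHI_CHFS)

/* Hypothetical helper: true if every bit in 'mask' is set in 'chi'. */
static bool chi_all_set(uint64_t chi, uint64_t mask)
{
	return (chi & mask) == mask;
}

/* Hypothetical helper: bits set in 'chi' that the header does not define. */
static uint64_t chi_unknown_bits(uint64_t chi)
{
	return chi & ~GLOBAL_MMIO_CHI_ALL;
}
```

A caller could, for example, test chi_all_set(chi, GLOBAL_MMIO_CHI_CRDY |
GLOBAL_MMIO_CHI_MA) on a value read from GLOBAL_MMIO_CHI.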

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 .../platforms/powernv/pmem/ocxl_internal.h    | 72 +++++++++++++++++++
 1 file changed, 72 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
index 0faf3740e9b8..9cf3e42750e7 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
@@ -8,6 +8,78 @@
 
 #define LABEL_AREA_SIZE	(1UL << PA_SECTION_SHIFT)
 
+#define GLOBAL_MMIO_CHI		0x000
+#define GLOBAL_MMIO_CHIC	0x008
+#define GLOBAL_MMIO_CHIE	0x010
+#define GLOBAL_MMIO_CHIEC	0x018
+#define GLOBAL_MMIO_HCI		0x020
+#define GLOBAL_MMIO_HCIC	0x028
+#define GLOBAL_MMIO_IMA0_OHP	0x040
+#define GLOBAL_MMIO_IMA0_CFP	0x048
+#define GLOBAL_MMIO_IMA1_OHP	0x050
+#define GLOBAL_MMIO_IMA1_CFP	0x058
+#define GLOBAL_MMIO_ACMA_CREQO	0x100
+#define GLOBAL_MMIO_ACMA_CRSPO	0x104
+#define GLOBAL_MMIO_ACMA_CDBO	0x108
+#define GLOBAL_MMIO_ACMA_CDBS	0x10c
+#define GLOBAL_MMIO_NSCMA_CREQO	0x120
+#define GLOBAL_MMIO_NSCMA_CRSPO	0x124
+#define GLOBAL_MMIO_NSCMA_CDBO	0x128
+#define GLOBAL_MMIO_NSCMA_CDBS	0x12c
+#define GLOBAL_MMIO_CSTS	0x140
+#define GLOBAL_MMIO_FWVER	0x148
+#define GLOBAL_MMIO_CCAP0	0x160
+#define GLOBAL_MMIO_CCAP1	0x168
+
+#define GLOBAL_MMIO_CHI_ACRA	BIT_ULL(0)
+#define GLOBAL_MMIO_CHI_NSCRA	BIT_ULL(1)
+#define GLOBAL_MMIO_CHI_CRDY	BIT_ULL(4)
+#define GLOBAL_MMIO_CHI_CFFS	BIT_ULL(5)
+#define GLOBAL_MMIO_CHI_MA	BIT_ULL(6)
+#define GLOBAL_MMIO_CHI_ELA	BIT_ULL(7)
+#define GLOBAL_MMIO_CHI_CDA	BIT_ULL(8)
+#define GLOBAL_MMIO_CHI_CHFS	BIT_ULL(9)
+
+#define GLOBAL_MMIO_CHI_ALL	(GLOBAL_MMIO_CHI_ACRA | \
+				 GLOBAL_MMIO_CHI_NSCRA | \
+				 GLOBAL_MMIO_CHI_CRDY | \
+				 GLOBAL_MMIO_CHI_CFFS | \
+				 GLOBAL_MMIO_CHI_MA | \
+				 GLOBAL_MMIO_CHI_ELA | \
+				 GLOBAL_MMIO_CHI_CDA | \
+				 GLOBAL_MMIO_CHI_CHFS)
+
+#define GLOBAL_MMIO_HCI_ACRW				BIT_ULL(0)
+#define GLOBAL_MMIO_HCI_NSCRW				BIT_ULL(1)
+#define GLOBAL_MMIO_HCI_AFU_RESET			BIT_ULL(2)
+#define GLOBAL_MMIO_HCI_FW_DEBUG			BIT_ULL(3)
+#define GLOBAL_MMIO_HCI_CONTROLLER_DUMP			BIT_ULL(4)
+#define GLOBAL_MMIO_HCI_CONTROLLER_DUMP_COLLECTED	BIT_ULL(5)
+#define GLOBAL_MMIO_HCI_REQ_HEALTH_PERF			BIT_ULL(6)
+
+#define ADMIN_COMMAND_HEARTBEAT		0x00u
+#define ADMIN_COMMAND_SHUTDOWN		0x01u
+#define ADMIN_COMMAND_FW_UPDATE		0x02u
+#define ADMIN_COMMAND_FW_DEBUG		0x03u
+#define ADMIN_COMMAND_ERRLOG		0x04u
+#define ADMIN_COMMAND_SMART		0x05u
+#define ADMIN_COMMAND_CONTROLLER_STATS	0x06u
+#define ADMIN_COMMAND_CONTROLLER_DUMP	0x07u
+#define ADMIN_COMMAND_CMD_CAPS		0x08u
+#define ADMIN_COMMAND_MAX		0x08u
+
+#define STATUS_SUCCESS		0x00
+#define STATUS_MEM_UNAVAILABLE	0x20
+#define STATUS_BAD_OPCODE	0x50
+#define STATUS_BAD_REQUEST_PARM	0x51
+#define STATUS_BAD_DATA_PARM	0x52
+#define STATUS_DEBUG_BLOCKED	0x70
+#define STATUS_FAIL		0xFF
+
+#define STATUS_FW_UPDATE_BLOCKED 0x21
+#define STATUS_FW_ARG_INVALID	0x51
+#define STATUS_FW_INVALID	0x52
+
 struct ocxlpmem_function0 {
 	struct pci_dev *pdev;
 	struct ocxl_fn *ocxl_fn;
-- 
2.24.1

* [PATCH v3 13/27] powerpc/powernv/pmem: Read the capability registers & wait for device ready
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (11 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 12/27] powerpc/powernv/pmem: Add register addresses & status values to the header Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-02-27  3:54   ` Andrew Donnellan
  2020-03-02 17:51   ` Frederic Barrat
  2020-02-21  3:27 ` [PATCH v3 14/27] powerpc/powernv/pmem: Add support for Admin commands Alastair D'Silva
                   ` (14 subsequent siblings)
  27 siblings, 2 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

This patch reads timeouts & firmware version from the controller, and
uses those timeouts to wait for the controller to report that it is ready
before handing the memory over to libnvdimm.
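
The decode and wait logic can be sketched in userspace as follows. This is
only an illustration under stated assumptions: the field layout mirrors the
shifts and masks used in read_device_metadata() below, while struct
ccap0_fields, struct fake_controller and fake_read_chi() are hypothetical
stand-ins for the real MMIO accesses, and the one second sleep between polls
is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

#define CHI_CRDY (1ULL << 4)	/* GLOBAL_MMIO_CHI_CRDY */
#define CHI_MA   (1ULL << 6)	/* GLOBAL_MMIO_CHI_MA */

/* Hypothetical container for the fields extracted from CCAP0. */
struct ccap0_fields {
	uint16_t scm_revision;
	uint16_t read_latency;
	uint8_t readiness_timeout;		/* seconds */
	uint8_t memory_available_timeout;	/* seconds */
};

/* Same shifts and masks as read_device_metadata() in this patch. */
static struct ccap0_fields decode_ccap0(uint64_t val)
{
	struct ccap0_fields f;

	f.scm_revision = val & 0xFFFF;
	f.read_latency = (val >> 32) & 0xFF;
	f.readiness_timeout = (val >> 48) & 0x0F;
	f.memory_available_timeout = (uint8_t)(val >> 52);
	return f;
}

/* Simulated controller: reports ready once 'ready_after' polls have passed. */
struct fake_controller {
	unsigned int polls;
	unsigned int ready_after;
};

static uint64_t fake_read_chi(struct fake_controller *c)
{
	if (c->polls++ >= c->ready_after)
		return CHI_CRDY | CHI_MA;
	return 0;
}

static bool chi_is_usable(uint64_t chi)
{
	return (chi & CHI_CRDY) && (chi & CHI_MA);
}

/* Mirrors the probe() loop: returns 0 when usable, -1 on timeout. */
static int wait_until_usable(struct fake_controller *c, unsigned int timeout)
{
	unsigned int elapsed = 0;

	while (!chi_is_usable(fake_read_chi(c))) {
		if (elapsed++ > timeout)
			return -1;
		/* The driver sleeps for one second here (msleep(1000)). */
	}
	return 0;
}
```

In the driver, the timeout passed to the loop is the sum of
readiness_timeout and memory_available_timeout.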

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/platforms/powernv/pmem/Makefile  |  2 +-
 arch/powerpc/platforms/powernv/pmem/ocxl.c    | 92 +++++++++++++++++++
 .../platforms/powernv/pmem/ocxl_internal.c    | 19 ++++
 .../platforms/powernv/pmem/ocxl_internal.h    | 24 +++++
 4 files changed, 136 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/platforms/powernv/pmem/ocxl_internal.c

diff --git a/arch/powerpc/platforms/powernv/pmem/Makefile b/arch/powerpc/platforms/powernv/pmem/Makefile
index 1c55c4193175..4ceda25907d4 100644
--- a/arch/powerpc/platforms/powernv/pmem/Makefile
+++ b/arch/powerpc/platforms/powernv/pmem/Makefile
@@ -4,4 +4,4 @@ ccflags-$(CONFIG_PPC_WERROR)	+= -Werror
 
 obj-$(CONFIG_OCXL_PMEM) += ocxlpmem.o
 
-ocxlpmem-y := ocxl.o
+ocxlpmem-y := ocxl.o ocxl_internal.o
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
index 3c4eeb5dcc0f..431212c9f0cc 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
@@ -8,6 +8,7 @@
 
 #include <linux/module.h>
 #include <misc/ocxl.h>
+#include <linux/delay.h>
 #include <linux/ndctl.h>
 #include <linux/mm_types.h>
 #include <linux/memory_hotplug.h>
@@ -215,6 +216,36 @@ static int register_lpc_mem(struct ocxlpmem *ocxlpmem)
 	return 0;
 }
 
+/**
+ * is_usable() - Is a controller usable?
+ * @ocxlpmem: the device metadata
+ * @verbose: True to log errors
+ * Return: true if the controller is usable
+ */
+static bool is_usable(const struct ocxlpmem *ocxlpmem, bool verbose)
+{
+	u64 chi = 0;
+	int rc = ocxlpmem_chi(ocxlpmem, &chi);
+
+	if (rc < 0)
+		return false;
+
+	if (!(chi & GLOBAL_MMIO_CHI_CRDY)) {
+		if (verbose)
+			dev_err(&ocxlpmem->dev, "controller is not ready.\n");
+		return false;
+	}
+
+	if (!(chi & GLOBAL_MMIO_CHI_MA)) {
+		if (verbose)
+			dev_err(&ocxlpmem->dev,
+				"controller does not have memory available.\n");
+		return false;
+	}
+
+	return true;
+}
+
 /**
  * allocate_minor() - Allocate a minor number to use for an OpenCAPI pmem device
  * @ocxlpmem: the device metadata
@@ -328,6 +359,48 @@ static void ocxlpmem_remove(struct pci_dev *pdev)
 	}
 }
 
+/**
+ * read_device_metadata() - Retrieve config information from the AFU and save it for future use
+ * @ocxlpmem: the device metadata
+ * Return: 0 on success, negative on failure
+ */
+static int read_device_metadata(struct ocxlpmem *ocxlpmem)
+{
+	u64 val;
+	int rc;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CCAP0,
+				     OCXL_LITTLE_ENDIAN, &val);
+	if (rc)
+		return rc;
+
+	ocxlpmem->scm_revision = val & 0xFFFF;
+	ocxlpmem->read_latency = (val >> 32) & 0xFF;
+	ocxlpmem->readiness_timeout = (val >> 48) & 0x0F;
+	ocxlpmem->memory_available_timeout = val >> 52;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CCAP1,
+				     OCXL_LITTLE_ENDIAN, &val);
+	if (rc)
+		return rc;
+
+	ocxlpmem->max_controller_dump_size = val & 0xFFFFFFFF;
+
+	// Extract firmware version text
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_FWVER,
+				     OCXL_HOST_ENDIAN, (u64 *)ocxlpmem->fw_version);
+	if (rc)
+		return rc;
+
+	ocxlpmem->fw_version[8] = '\0';
+
+	dev_info(&ocxlpmem->dev,
+		 "Firmware version '%s' SCM revision %d:%d\n", ocxlpmem->fw_version,
+		 ocxlpmem->scm_revision >> 4, ocxlpmem->scm_revision & 0x0F);
+
+	return 0;
+}
+
 /**
  * probe_function0() - Set up function 0 for an OpenCAPI persistent memory device
  * This is important as it enables templates higher than 0 across all other functions,
@@ -368,6 +441,7 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
 	struct ocxlpmem *ocxlpmem;
 	int rc;
+	u16 elapsed, timeout;
 
 	if (PCI_FUNC(pdev->devfn) == 0)
 		return probe_function0(pdev);
@@ -422,6 +496,24 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err;
 	}
 
+	if (read_device_metadata(ocxlpmem)) {
+		dev_err(&pdev->dev, "Could not read metadata\n");
+		goto err;
+	}
+
+	elapsed = 0;
+	timeout = ocxlpmem->readiness_timeout + ocxlpmem->memory_available_timeout;
+	while (!is_usable(ocxlpmem, false)) {
+		if (elapsed++ > timeout) {
+			dev_warn(&ocxlpmem->dev, "OpenCAPI Persistent Memory ready timeout.\n");
+			(void)is_usable(ocxlpmem, true);
+			rc = -ENXIO;
+			goto err;
+		}
+
+		msleep(1000);
+	}
+
 	rc = register_lpc_mem(ocxlpmem);
 	if (rc) {
 		dev_err(&pdev->dev, "Could not register OpenCAPI persistent memory with libnvdimm\n");
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
new file mode 100644
index 000000000000..617ca943b1b8
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
@@ -0,0 +1,19 @@
+// SPDX-License-Identifier: GPL-2.0+
+// Copyright 2019 IBM Corp.
+
+#include <misc/ocxl.h>
+#include <linux/delay.h>
+#include "ocxl_internal.h"
+
+int ocxlpmem_chi(const struct ocxlpmem *ocxlpmem, u64 *chi)
+{
+	u64 val;
+	int rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHI,
+					 OCXL_LITTLE_ENDIAN, &val);
+	if (rc)
+		return rc;
+
+	*chi = val;
+
+	return 0;
+}
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
index 9cf3e42750e7..ba0301533d00 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
@@ -97,4 +97,28 @@ struct ocxlpmem {
 	void *metadata_addr;
 	struct resource pmem_res;
 	struct nd_region *nd_region;
+	char fw_version[8+1];
+
+	u32 max_controller_dump_size;
+	u16 scm_revision; // major/minor
+	u8 readiness_timeout;  /* The worst case time (in seconds) that the host shall
+				* wait for the controller to become operational following a reset (CHI.CRDY).
+				*/
+	u8 memory_available_timeout;   /* The worst case time (in seconds) that the host shall
+					* wait for memory to become available following a reset (CHI.MA).
+					*/
+
+	u16 read_latency; /* The nominal measure of latency (in nanoseconds)
+			   * associated with an unassisted read of a memory block.
+			   * This represents the capability of the raw media technology without assistance
+			   */
 };
+
+/**
+ * ocxlpmem_chi() - Get the value of the CHI register
+ * @ocxlpmem: the device metadata
+ * @chi: returns the CHI value
+ *
+ * Returns 0 on success, negative on error
+ */
+int ocxlpmem_chi(const struct ocxlpmem *ocxlpmem, u64 *chi);
-- 
2.24.1

* [PATCH v3 14/27] powerpc/powernv/pmem: Add support for Admin commands
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (12 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 13/27] powerpc/powernv/pmem: Read the capability registers & wait for device ready Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-02-27  8:22   ` Andrew Donnellan
  2020-02-27 17:01   ` Dan Williams
  2020-02-21  3:27 ` [PATCH v3 15/27] powerpc/powernv/pmem: Add support for near storage commands Alastair D'Silva
                   ` (13 subsequent siblings)
  27 siblings, 2 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

This patch reads the metadata required to issue admin commands, and adds
helper functions to construct the commands and check their completion.
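
The bit packing used by these helpers can be sketched in userspace as
follows. The shifts and masks follow extract_command_metadata(),
scm_command_request(), command_response() and
admin_command_complete_timeout() in the diff below; the struct and function
names in the sketch are hypothetical and exist only for illustration.

```c
#include <stdint.h>

#define DEFAULT_TIMEOUT		100	/* ms, as in ocxl_internal.h */
#define TIMEOUT_SLEEP_MILLIS	32	/* poll interval used by the patch */

/* Hypothetical pair holding the two halves of the CREQO register. */
struct command_offsets {
	uint32_t request_offset;
	uint32_t response_offset;
};

/* Upper 32 bits: request offset; lower 32 bits: response offset. */
static struct command_offsets split_creqo(uint64_t reg)
{
	struct command_offsets o;

	o.request_offset = reg >> 32;
	o.response_offset = reg & 0xFFFFFFFF;
	return o;
}

/* First 64-bit word of a request: op-code in bits 0-7, id from bit 16. */
static uint64_t build_request_header(uint8_t op_code, uint16_t id)
{
	return (uint64_t)op_code | ((uint64_t)id << 16);
}

/* Hypothetical view of the first 64-bit word of a response. */
struct response_header {
	uint8_t status;
	uint16_t id;
};

static struct response_header parse_response_header(uint64_t val)
{
	struct response_header r;

	r.status = val & 0xff;
	r.id = (val >> 16) & 0xffff;
	return r;
}

/* Number of polls for a timeout in ms, falling back to DEFAULT_TIMEOUT. */
static uint32_t timeout_to_polls(uint32_t timeout_ms)
{
	uint32_t polls = timeout_ms / TIMEOUT_SLEEP_MILLIS;

	if (!polls)
		polls = DEFAULT_TIMEOUT / TIMEOUT_SLEEP_MILLIS;
	return polls;
}
```

A response whose id differs from the id of the last request only triggers a
warning in command_response(); the status byte is still returned to the
caller.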

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/platforms/powernv/pmem/ocxl.c    |  65 ++++++++
 .../platforms/powernv/pmem/ocxl_internal.c    | 153 ++++++++++++++++++
 .../platforms/powernv/pmem/ocxl_internal.h    |  61 +++++++
 3 files changed, 279 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
index 431212c9f0cc..4e782d22605b 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
@@ -216,6 +216,58 @@ static int register_lpc_mem(struct ocxlpmem *ocxlpmem)
 	return 0;
 }
 
+/**
+ * extract_command_metadata() - Extract command data from MMIO & save it for further use
+ * @ocxlpmem: the device metadata
+ * @offset: The base address of the command data structures (address of CREQO)
+ * @command_metadata: A pointer to the command metadata to populate
+ * Return: 0 on success, negative on failure
+ */
+static int extract_command_metadata(struct ocxlpmem *ocxlpmem, u32 offset,
+					struct command_metadata *command_metadata)
+{
+	int rc;
+	u64 tmp;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, offset, OCXL_LITTLE_ENDIAN,
+				     &tmp);
+	if (rc)
+		return rc;
+
+	command_metadata->request_offset = tmp >> 32;
+	command_metadata->response_offset = tmp & 0xFFFFFFFF;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, offset + 8, OCXL_LITTLE_ENDIAN,
+				     &tmp);
+	if (rc)
+		return rc;
+
+	command_metadata->data_offset = tmp >> 32;
+	command_metadata->data_size = tmp & 0xFFFFFFFF;
+
+	command_metadata->id = 0;
+
+	return 0;
+}
+
+/**
+ * setup_command_metadata() - Set up the command metadata
+ * @ocxlpmem: the device metadata
+ */
+static int setup_command_metadata(struct ocxlpmem *ocxlpmem)
+{
+	int rc;
+
+	mutex_init(&ocxlpmem->admin_command.lock);
+
+	rc = extract_command_metadata(ocxlpmem, GLOBAL_MMIO_ACMA_CREQO,
+				      &ocxlpmem->admin_command);
+	if (rc)
+		return rc;
+
+	return 0;
+}
+
 /**
  * is_usable() - Is a controller usable?
  * @ocxlpmem: the device metadata
@@ -456,6 +508,14 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	}
 	ocxlpmem->pdev = pdev;
 
+	ocxlpmem->timeouts[ADMIN_COMMAND_ERRLOG] = 2000; // ms
+	ocxlpmem->timeouts[ADMIN_COMMAND_HEARTBEAT] = 100; // ms
+	ocxlpmem->timeouts[ADMIN_COMMAND_SMART] = 100; // ms
+	ocxlpmem->timeouts[ADMIN_COMMAND_CONTROLLER_DUMP] = 1000; // ms
+	ocxlpmem->timeouts[ADMIN_COMMAND_CONTROLLER_STATS] = 100; // ms
+	ocxlpmem->timeouts[ADMIN_COMMAND_SHUTDOWN] = 1000; // ms
+	ocxlpmem->timeouts[ADMIN_COMMAND_FW_UPDATE] = 16000; // ms
+
 	pci_set_drvdata(pdev, ocxlpmem);
 
 	ocxlpmem->ocxl_fn = ocxl_function_open(pdev);
@@ -501,6 +561,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err;
 	}
 
+	if (setup_command_metadata(ocxlpmem)) {
+		dev_err(&pdev->dev, "Could not read OCXL command metadata\n");
+		goto err;
+	}
+
 	elapsed = 0;
 	timeout = ocxlpmem->readiness_timeout + ocxlpmem->memory_available_timeout;
 	while (!is_usable(ocxlpmem, false)) {
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
index 617ca943b1b8..583f48023025 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
@@ -17,3 +17,156 @@ int ocxlpmem_chi(const struct ocxlpmem *ocxlpmem, u64 *chi)
 
 	return 0;
 }
+
+#define COMMAND_REQUEST_SIZE (8 * sizeof(u64))
+static int scm_command_request(const struct ocxlpmem *ocxlpmem,
+			       struct command_metadata *cmd, u8 op_code)
+{
+	u64 val = op_code;
+	int rc;
+	u8 i;
+
+	cmd->op_code = op_code;
+	cmd->id++;
+
+	val |= ((u64)cmd->id) << 16;
+
+	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, cmd->request_offset,
+				      OCXL_LITTLE_ENDIAN, val);
+	if (rc)
+		return rc;
+
+	for (i = sizeof(u64); i < COMMAND_REQUEST_SIZE; i += sizeof(u64)) {
+		rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
+					      cmd->request_offset + i,
+					      OCXL_LITTLE_ENDIAN, 0);
+		if (rc)
+			return rc;
+	}
+
+	return 0;
+}
+
+int admin_command_request(struct ocxlpmem *ocxlpmem, u8 op_code)
+{
+	u64 val;
+	int rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHI,
+					 OCXL_LITTLE_ENDIAN, &val);
+	if (rc)
+		return rc;
+
+	return scm_command_request(ocxlpmem, &ocxlpmem->admin_command, op_code);
+}
+
+static int command_response(const struct ocxlpmem *ocxlpmem,
+			    const struct command_metadata *cmd)
+{
+	u64 val;
+	u16 id;
+	u8 status;
+	int rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+					 cmd->response_offset,
+					 OCXL_LITTLE_ENDIAN, &val);
+	if (rc)
+		return rc;
+
+	status = val & 0xff;
+	id = (val >> 16) & 0xffff;
+
+	if (id != cmd->id) {
+		dev_warn(&ocxlpmem->dev,
+			 "Expected response for command %d, but received response for command %d instead.\n",
+			 cmd->id, id);
+	}
+
+	return status;
+}
+
+int admin_response(const struct ocxlpmem *ocxlpmem)
+{
+	return command_response(ocxlpmem, &ocxlpmem->admin_command);
+}
+
+
+int admin_command_execute(const struct ocxlpmem *ocxlpmem)
+{
+	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCI,
+				      OCXL_LITTLE_ENDIAN, GLOBAL_MMIO_HCI_ACRW);
+}
+
+static bool admin_command_complete(const struct ocxlpmem *ocxlpmem)
+{
+	u64 val = 0;
+
+	int rc = ocxlpmem_chi(ocxlpmem, &val);
+
+	WARN_ON(rc);
+
+	return (val & GLOBAL_MMIO_CHI_ACRA) != 0;
+}
+
+int admin_command_complete_timeout(const struct ocxlpmem *ocxlpmem,
+				   int command)
+{
+	u32 timeout = ocxlpmem->timeouts[command];
+	// 32 is the next power of 2 greater than the 20ms minimum for msleep
+#define TIMEOUT_SLEEP_MILLIS 32
+	timeout /= TIMEOUT_SLEEP_MILLIS;
+	if (!timeout)
+		timeout = DEFAULT_TIMEOUT / TIMEOUT_SLEEP_MILLIS;
+
+	while (timeout-- > 0) {
+		if (admin_command_complete(ocxlpmem))
+			return 0;
+		msleep(TIMEOUT_SLEEP_MILLIS);
+	}
+
+	if (admin_command_complete(ocxlpmem))
+		return 0;
+
+	return -EBUSY;
+}
+
+int admin_response_handled(const struct ocxlpmem *ocxlpmem)
+{
+	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIC,
+				      OCXL_LITTLE_ENDIAN, GLOBAL_MMIO_CHI_ACRA);
+}
+
+void warn_status(const struct ocxlpmem *ocxlpmem, const char *message,
+		     u8 status)
+{
+	const char *text = "Unknown";
+
+	switch (status) {
+	case STATUS_SUCCESS:
+		text = "Success";
+		break;
+
+	case STATUS_MEM_UNAVAILABLE:
+		text = "Persistent memory unavailable";
+		break;
+
+	case STATUS_BAD_OPCODE:
+		text = "Bad opcode";
+		break;
+
+	case STATUS_BAD_REQUEST_PARM:
+		text = "Bad request parameter";
+		break;
+
+	case STATUS_BAD_DATA_PARM:
+		text = "Bad data parameter";
+		break;
+
+	case STATUS_DEBUG_BLOCKED:
+		text = "Debug action blocked";
+		break;
+
+	case STATUS_FAIL:
+		text = "Failed";
+		break;
+	}
+
+	dev_warn(&ocxlpmem->dev, "%s: %s (%x)\n", message, text, status);
+}
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
index ba0301533d00..2fef68c71271 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
@@ -7,6 +7,7 @@
 #include <linux/mm.h>
 
 #define LABEL_AREA_SIZE	(1UL << PA_SECTION_SHIFT)
+#define DEFAULT_TIMEOUT 100
 
 #define GLOBAL_MMIO_CHI		0x000
 #define GLOBAL_MMIO_CHIC	0x008
@@ -80,6 +81,16 @@
 #define STATUS_FW_ARG_INVALID	0x51
 #define STATUS_FW_INVALID	0x52
 
+struct command_metadata {
+	u32 request_offset;
+	u32 response_offset;
+	u32 data_offset;
+	u32 data_size;
+	struct mutex lock;
+	u16 id;
+	u8 op_code;
+};
+
 struct ocxlpmem_function0 {
 	struct pci_dev *pdev;
 	struct ocxl_fn *ocxl_fn;
@@ -95,9 +106,11 @@ struct ocxlpmem {
 	struct ocxl_afu *ocxl_afu;
 	struct ocxl_context *ocxl_context;
 	void *metadata_addr;
+	struct command_metadata admin_command;
 	struct resource pmem_res;
 	struct nd_region *nd_region;
 	char fw_version[8+1];
+	u32 timeouts[ADMIN_COMMAND_MAX+1];
 
 	u32 max_controller_dump_size;
 	u16 scm_revision; // major/minor
@@ -122,3 +135,51 @@ struct ocxlpmem {
  * Returns 0 on success, negative on error
  */
 int ocxlpmem_chi(const struct ocxlpmem *ocxlpmem, u64 *chi);
+
+/**
+ * admin_command_request() - Issue an admin command request
+ * @ocxlpmem: the device metadata
+ * @op_code: The op-code for the command
+ *
+ * Returns an identifier for the command, or negative on error
+ */
+int admin_command_request(struct ocxlpmem *ocxlpmem, u8 op_code);
+
+/**
+ * admin_response() - Validate an admin response
+ * @ocxlpmem: the device metadata
+ * Returns the status code of the command, or negative on error
+ */
+int admin_response(const struct ocxlpmem *ocxlpmem);
+
+/**
+ * admin_command_execute() - Notify the controller to start processing a pending admin command
+ * @ocxlpmem: the device metadata
+ * Returns 0 on success, negative on error
+ */
+int admin_command_execute(const struct ocxlpmem *ocxlpmem);
+
+/**
+ * admin_command_complete_timeout() - Wait for an admin command to finish executing
+ * @ocxlpmem: the device metadata
+ * @command: the admin command to wait for completion (determines the timeout)
+ * Returns 0 on success, -EBUSY on timeout
+ */
+int admin_command_complete_timeout(const struct ocxlpmem *ocxlpmem,
+				   int command);
+
+/**
+ * admin_response_handled() - Notify the controller that the admin response has been handled
+ * @ocxlpmem: the device metadata
+ * Returns 0 on success, negative on failure
+ */
+int admin_response_handled(const struct ocxlpmem *ocxlpmem);
+
+/**
+ * warn_status() - Emit a kernel warning showing a command status.
+ * @ocxlpmem: the device metadata
+ * @message: A message to accompany the warning
+ * @status: The command status
+ */
+void warn_status(const struct ocxlpmem *ocxlpmem, const char *message,
+		 u8 status);
-- 
2.24.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* [PATCH v3 15/27] powerpc/powernv/pmem: Add support for near storage commands
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (13 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 14/27] powerpc/powernv/pmem: Add support for Admin commands Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-02-27  8:30   ` Andrew Donnellan
                     ` (2 more replies)
  2020-02-21  3:27 ` [PATCH v3 16/27] powerpc/powernv/pmem: Register a character device for userspace to interact with Alastair D'Silva
                   ` (12 subsequent siblings)
  27 siblings, 3 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

Similar to the previous patch, this adds support for near storage commands.
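
Unlike the admin path, this patch exposes only ns_command_complete(), with
no timeout wrapper. A caller could poll it with the same bounded loop that
admin_command_complete_timeout() uses. The following is a minimal userspace
sketch of that loop, not driver code: struct fake_device, fake_complete(),
poll_budget() and wait_complete() are hypothetical names, and the sleep is
left out so the model runs instantly.

```c
#include <stdbool.h>
#include <stdint.h>

#define TIMEOUT_SLEEP_MILLIS 32	/* polling step used by the admin helper */
#define DEFAULT_TIMEOUT 100	/* ms, fallback when no timeout is known */

/* Hypothetical stand-in for the device: completes after N polls. */
struct fake_device {
	unsigned int polls;
	unsigned int ready_after;
};

/* Hypothetical stand-in for ns_command_complete(). */
static bool fake_complete(struct fake_device *dev)
{
	return dev->polls++ >= dev->ready_after;
}

/* Number of sleep intervals allowed for a timeout given in ms. */
static uint32_t poll_budget(uint32_t timeout_ms)
{
	uint32_t n = timeout_ms / TIMEOUT_SLEEP_MILLIS;

	return n ? n : DEFAULT_TIMEOUT / TIMEOUT_SLEEP_MILLIS;
}

/* Returns 0 on completion, -1 on timeout (the driver returns -EBUSY). */
static int wait_complete(struct fake_device *dev, uint32_t timeout_ms)
{
	uint32_t budget = poll_budget(timeout_ms);

	while (budget-- > 0) {
		if (fake_complete(dev))
			return 0;
		/* the kernel helper calls msleep(TIMEOUT_SLEEP_MILLIS) here */
	}

	/* one last check after the final sleep interval */
	return fake_complete(dev) ? 0 : -1;
}
```

Because the timeout is divided by the 32ms step, any timeout below 32ms
falls back to the default budget rather than polling zero times.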

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/platforms/powernv/pmem/ocxl.c    |  6 +++
 .../platforms/powernv/pmem/ocxl_internal.c    | 41 +++++++++++++++++++
 .../platforms/powernv/pmem/ocxl_internal.h    | 37 +++++++++++++++++
 3 files changed, 84 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
index 4e782d22605b..b8bd7e703b19 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
@@ -259,12 +259,18 @@ static int setup_command_metadata(struct ocxlpmem *ocxlpmem)
 	int rc;
 
 	mutex_init(&ocxlpmem->admin_command.lock);
+	mutex_init(&ocxlpmem->ns_command.lock);
 
 	rc = extract_command_metadata(ocxlpmem, GLOBAL_MMIO_ACMA_CREQO,
 				      &ocxlpmem->admin_command);
 	if (rc)
 		return rc;
 
+	rc = extract_command_metadata(ocxlpmem, GLOBAL_MMIO_NSCMA_CREQO,
+					  &ocxlpmem->ns_command);
+	if (rc)
+		return rc;
+
 	return 0;
 }
 
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
index 583f48023025..3e0b133feddf 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
@@ -133,6 +133,47 @@ int admin_response_handled(const struct ocxlpmem *ocxlpmem)
 				      OCXL_LITTLE_ENDIAN, GLOBAL_MMIO_CHI_ACRA);
 }
 
+int ns_command_request(struct ocxlpmem *ocxlpmem, u8 op_code)
+{
+	u64 val;
+	int rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHI,
+					 OCXL_LITTLE_ENDIAN, &val);
+	if (rc)
+		return rc;
+
+	if (!(val & GLOBAL_MMIO_CHI_NSCRA))
+		return -EBUSY;
+
+	return scm_command_request(ocxlpmem, &ocxlpmem->ns_command, op_code);
+}
+
+int ns_response(const struct ocxlpmem *ocxlpmem)
+{
+	return command_response(ocxlpmem, &ocxlpmem->ns_command);
+}
+
+int ns_command_execute(const struct ocxlpmem *ocxlpmem)
+{
+	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCI,
+				      OCXL_LITTLE_ENDIAN, GLOBAL_MMIO_HCI_NSCRW);
+}
+
+bool ns_command_complete(const struct ocxlpmem *ocxlpmem)
+{
+	u64 val = 0;
+	int rc = ocxlpmem_chi(ocxlpmem, &val);
+
+	WARN_ON(rc);
+
+	return (val & GLOBAL_MMIO_CHI_NSCRA) != 0;
+}
+
+int ns_response_handled(const struct ocxlpmem *ocxlpmem)
+{
+	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIC,
+				      OCXL_LITTLE_ENDIAN, GLOBAL_MMIO_CHI_NSCRA);
+}
+
 void warn_status(const struct ocxlpmem *ocxlpmem, const char *message,
 		     u8 status)
 {
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
index 2fef68c71271..28e2020f6355 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
@@ -107,6 +107,7 @@ struct ocxlpmem {
 	struct ocxl_context *ocxl_context;
 	void *metadata_addr;
 	struct command_metadata admin_command;
+	struct command_metadata ns_command;
 	struct resource pmem_res;
 	struct nd_region *nd_region;
 	char fw_version[8+1];
@@ -175,6 +176,42 @@ int admin_command_complete_timeout(const struct ocxlpmem *ocxlpmem,
  */
 int admin_response_handled(const struct ocxlpmem *ocxlpmem);
 
+/**
+ * ns_command_request() - Issue a near storage command request
+ * @ocxlpmem: the device metadata
+ * @op_code: The op-code for the command
+ * Returns an identifier for the command, or negative on error
+ */
+int ns_command_request(struct ocxlpmem *ocxlpmem, u8 op_code);
+
+/**
+ * ns_response() - Validate a near storage response
+ * @ocxlpmem: the device metadata
+ * Returns the status code of the command, or negative on error
+ */
+int ns_response(const struct ocxlpmem *ocxlpmem);
+
+/**
+ * ns_command_execute() - Notify the controller to start processing a pending near storage command
+ * @ocxlpmem: the device metadata
+ * Returns 0 on success, negative on error
+ */
+int ns_command_execute(const struct ocxlpmem *ocxlpmem);
+
+/**
+ * ns_command_complete() - Has the near storage command completed
+ * @ocxlpmem: the device metadata
+ * Returns true if the previous near storage command has completed
+ */
+bool ns_command_complete(const struct ocxlpmem *ocxlpmem);
+
+/**
+ * ns_response_handled() - Notify the controller that the near storage response has been handled
+ * @ocxlpmem: the device metadata
+ * Returns 0 on success, negative on failure
+ */
+int ns_response_handled(const struct ocxlpmem *ocxlpmem);
+
 /**
  * warn_status() - Emit a kernel warning showing a command status.
  * @ocxlpmem: the device metadata
-- 
2.24.1

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* [PATCH v3 16/27] powerpc/powernv/pmem: Register a character device for userspace to interact with
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (14 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 15/27] powerpc/powernv/pmem: Add support for near storage commands Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-03-02  5:34   ` Andrew Donnellan
  2020-03-03  9:28   ` Frederic Barrat
  2020-02-21  3:27 ` [PATCH v3 17/27] powerpc/powernv/pmem: Implement the Read Error Log command Alastair D'Silva
                   ` (11 subsequent siblings)
  27 siblings, 2 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

This patch introduces a character device (/dev/ocxl-scmX), which later
patches in this series will use to interact with userspace.
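
From userspace, the node is opened like any other character device; each
open takes a reference on the underlying device and each close drops it.
A minimal sketch, assuming the node name from the description above
(/dev/ocxl-scm0 is an illustrative path, and open_pmem_node() is a
hypothetical helper, not part of this series):

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Open the device node for later ioctl use.
 * Returns a file descriptor, or -1 with errno set by open(2).
 */
static int open_pmem_node(const char *path)
{
	return open(path, O_RDWR);
}

/* Close a descriptor returned by open_pmem_node(); ignores fd < 0. */
static void close_pmem_node(int fd)
{
	if (fd >= 0)
		close(fd);
}
```

A caller would typically pass "/dev/ocxl-scm0" and check for -1 before
issuing any ioctl on the descriptor.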

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/platforms/powernv/pmem/ocxl.c    | 116 +++++++++++++++++-
 .../platforms/powernv/pmem/ocxl_internal.h    |   2 +
 2 files changed, 116 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
index b8bd7e703b19..63109a870d2c 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
@@ -10,6 +10,7 @@
 #include <misc/ocxl.h>
 #include <linux/delay.h>
 #include <linux/ndctl.h>
+#include <linux/fs.h>
 #include <linux/mm_types.h>
 #include <linux/memory_hotplug.h>
 #include "ocxl_internal.h"
@@ -339,6 +340,9 @@ static void free_ocxlpmem(struct ocxlpmem *ocxlpmem)
 
 	free_minor(ocxlpmem);
 
+	if (ocxlpmem->cdev.owner)
+		cdev_del(&ocxlpmem->cdev);
+
 	if (ocxlpmem->metadata_addr)
 		devm_memunmap(&ocxlpmem->dev, ocxlpmem->metadata_addr);
 
@@ -396,6 +400,70 @@ static int ocxlpmem_register(struct ocxlpmem *ocxlpmem)
 	return device_register(&ocxlpmem->dev);
 }
 
+static void ocxlpmem_put(struct ocxlpmem *ocxlpmem)
+{
+	put_device(&ocxlpmem->dev);
+}
+
+static struct ocxlpmem *ocxlpmem_get(struct ocxlpmem *ocxlpmem)
+{
+	return (get_device(&ocxlpmem->dev) == NULL) ? NULL : ocxlpmem;
+}
+
+static struct ocxlpmem *find_and_get_ocxlpmem(dev_t devno)
+{
+	struct ocxlpmem *ocxlpmem;
+	int minor = MINOR(devno);
+	/*
+	 * We don't declare an RCU critical section here, as our AFU
+	 * is protected by a reference counter on the device. By the time the
+	 * minor number of a device is removed from the idr, the ref count of
+	 * the device is already at 0, so no user API will access that AFU and
+	 * this function can't return it.
+	 */
+	ocxlpmem = idr_find(&minors_idr, minor);
+	if (ocxlpmem)
+		ocxlpmem_get(ocxlpmem);
+	return ocxlpmem;
+}
+
+static int file_open(struct inode *inode, struct file *file)
+{
+	struct ocxlpmem *ocxlpmem;
+
+	ocxlpmem = find_and_get_ocxlpmem(inode->i_rdev);
+	if (!ocxlpmem)
+		return -ENODEV;
+
+	file->private_data = ocxlpmem;
+	return 0;
+}
+
+static int file_release(struct inode *inode, struct file *file)
+{
+	struct ocxlpmem *ocxlpmem = file->private_data;
+
+	ocxlpmem_put(ocxlpmem);
+	return 0;
+}
+
+static const struct file_operations fops = {
+	.owner		= THIS_MODULE,
+	.open		= file_open,
+	.release	= file_release,
+};
+
+/**
+ * create_cdev() - Create the chardev in /dev for the device
+ * @ocxlpmem: the SCM metadata
+ * Return: 0 on success, negative on failure
+ */
+static int create_cdev(struct ocxlpmem *ocxlpmem)
+{
+	cdev_init(&ocxlpmem->cdev, &fops);
+	return cdev_add(&ocxlpmem->cdev, ocxlpmem->dev.devt, 1);
+}
+
 /**
  * ocxlpmem_remove() - Free an OpenCAPI persistent memory device
  * @pdev: the PCI device information struct
@@ -572,6 +640,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err;
 	}
 
+	if (create_cdev(ocxlpmem)) {
+		dev_err(&pdev->dev, "Could not create character device\n");
+		goto err;
+	}
+
 	elapsed = 0;
 	timeout = ocxlpmem->readiness_timeout + ocxlpmem->memory_available_timeout;
 	while (!is_usable(ocxlpmem, false)) {
@@ -613,20 +686,59 @@ static struct pci_driver pci_driver = {
 	.shutdown = ocxlpmem_remove,
 };
 
+static int file_init(void)
+{
+	int rc;
+
+	mutex_init(&minors_idr_lock);
+	idr_init(&minors_idr);
+
+	rc = alloc_chrdev_region(&ocxlpmem_dev, 0, NUM_MINORS, "ocxl-pmem");
+	if (rc) {
+		idr_destroy(&minors_idr);
+		pr_err("Unable to allocate OpenCAPI persistent memory major number: %d\n", rc);
+		return rc;
+	}
+
+	ocxlpmem_class = class_create(THIS_MODULE, "ocxl-pmem");
+	if (IS_ERR(ocxlpmem_class)) {
+		idr_destroy(&minors_idr);
+		pr_err("Unable to create ocxl-pmem class\n");
+		unregister_chrdev_region(ocxlpmem_dev, NUM_MINORS);
+		return PTR_ERR(ocxlpmem_class);
+	}
+
+	return 0;
+}
+
+static void file_exit(void)
+{
+	class_destroy(ocxlpmem_class);
+	unregister_chrdev_region(ocxlpmem_dev, NUM_MINORS);
+	idr_destroy(&minors_idr);
+}
+
 static int __init ocxlpmem_init(void)
 {
-	int rc = 0;
+	int rc;
 
-	rc = pci_register_driver(&pci_driver);
+	rc = file_init();
 	if (rc)
 		return rc;
 
+	rc = pci_register_driver(&pci_driver);
+	if (rc) {
+		file_exit();
+		return rc;
+	}
+
 	return 0;
 }
 
 static void ocxlpmem_exit(void)
 {
 	pci_unregister_driver(&pci_driver);
+	file_exit();
 }
 
 module_init(ocxlpmem_init);
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
index 28e2020f6355..d2d81fec7bb1 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
@@ -2,6 +2,7 @@
 // Copyright 2019 IBM Corp.
 
 #include <linux/pci.h>
+#include <linux/cdev.h>
 #include <misc/ocxl.h>
 #include <linux/libnvdimm.h>
 #include <linux/mm.h>
@@ -99,6 +100,7 @@ struct ocxlpmem_function0 {
 struct ocxlpmem {
 	struct device dev;
 	struct pci_dev *pdev;
+	struct cdev cdev;
 	struct ocxl_fn *ocxl_fn;
 	struct nd_interleave_set nd_set;
 	struct nvdimm_bus_descriptor bus_desc;
-- 
2.24.1

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* [PATCH v3 17/27] powerpc/powernv/pmem: Implement the Read Error Log command
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (15 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 16/27] powerpc/powernv/pmem: Register a character device for userspace to interact with Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-03-03 10:36   ` Frederic Barrat
  2020-03-04  5:58   ` Andrew Donnellan
  2020-02-21  3:27 ` [PATCH v3 18/27] powerpc/powernv/pmem: Add controller dump IOCTLs Alastair D'Silva
                   ` (10 subsequent siblings)
  27 siblings, 2 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

The read error log command extracts information from the controller's
internal error log.

This patch exposes this information in 2 ways:
- During probe, if an error occurs & a log is available, print it to the
  console
- After probe, make the error log available to userspace via an IOCTL.
  Userspace is notified of pending error logs in a later patch
  ("powerpc/powernv/pmem: Forward events to userspace")
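
The first 64-bit words of the response pack several fields each. The
sketch below mirrors the shifts and masks used in the patch so the layout
can be checked outside the kernel. The function and struct names are
hypothetical, and the field layout is taken from the code in this patch
rather than from the device specification.

```c
#include <stdbool.h>
#include <stdint.h>

#define ERROR_LOG_ID 0x454C	/* ASCII 'E', 'L' */
#define LOG_TYPE_GENERAL 0x00	/* mirrors OCXL_PMEM_ERROR_LOG_TYPE_GENERAL */

/* Word at data offset 0x00: identifier in bits 63..48, length in 15..0. */
static bool parse_error_log_header(uint64_t word, uint16_t *length)
{
	uint16_t id = (uint16_t)(word >> 48);

	if (id != ERROR_LOG_ID)
		return false;

	*length = (uint16_t)(word & 0xFFFF);
	return true;
}

/* Fields carried by the word at data offset 0x10. */
struct error_log_info {
	uint8_t type;
	uint32_t action_flags;
	uint32_t power_on_seconds;
};

/* Action flags are only meaningful for the general log type. */
static struct error_log_info decode_error_log_info(uint64_t word)
{
	struct error_log_info info;

	info.type = (uint8_t)(word >> 56);
	info.action_flags = (info.type == LOG_TYPE_GENERAL) ?
			    (uint32_t)((word >> 32) & 0xFFFFFF) : 0;
	info.power_on_seconds = (uint32_t)(word & 0xFFFFFFFF);
	return info;
}
```

For instance, a header word of 0x454C000000000040 carries the 'EL'
identifier and announces 0x40 bytes of log data.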

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/platforms/powernv/pmem/ocxl.c    | 269 ++++++++++++++++++
 .../platforms/powernv/pmem/ocxl_internal.h    |   1 +
 include/uapi/nvdimm/ocxl-pmem.h               |  46 +++
 3 files changed, 316 insertions(+)
 create mode 100644 include/uapi/nvdimm/ocxl-pmem.h

diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
index 63109a870d2c..2b64504f9129 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
@@ -447,10 +447,219 @@ static int file_release(struct inode *inode, struct file *file)
 	return 0;
 }
 
+/**
+ * error_log_header_parse() - Parse the first 64 bits of the error log command response
+ * @ocxlpmem: the device metadata
+ * @length: out, returns the number of bytes in the response (excluding the 64 bit header)
+ */
+static int error_log_header_parse(struct ocxlpmem *ocxlpmem, u16 *length)
+{
+	int rc;
+	u64 val;
+
+	u16 data_identifier;
+	u32 data_length;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset,
+				     OCXL_LITTLE_ENDIAN, &val);
+	if (rc)
+		return rc;
+
+	data_identifier = val >> 48;
+	data_length = val & 0xFFFF;
+
+	if (data_identifier != 0x454C) { // 'EL'
+		dev_err(&ocxlpmem->dev,
+			"Bad data identifier for error log data, expected 'EL', got '%2s' (%#x), data_length=%u\n",
+			(char *)&data_identifier,
+			(unsigned int)data_identifier, data_length);
+		return -EINVAL;
+	}
+
+	*length = data_length;
+	return 0;
+}
+
+static int error_log_offset_0x08(struct ocxlpmem *ocxlpmem,
+				 u32 *log_identifier, u32 *program_ref_code)
+{
+	int rc;
+	u64 val;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset + 0x08,
+				     OCXL_LITTLE_ENDIAN, &val);
+	if (rc)
+		return rc;
+
+	*log_identifier = val >> 32;
+	*program_ref_code = val & 0xFFFFFFFF;
+
+	return 0;
+}
+
+static int read_error_log(struct ocxlpmem *ocxlpmem,
+			  struct ioctl_ocxl_pmem_error_log *log, bool buf_is_user)
+{
+	u64 val;
+	u16 user_buf_length;
+	u16 buf_length;
+	u16 i;
+	int rc;
+
+	if (log->buf_size % 8)
+		return -EINVAL;
+
+	rc = ocxlpmem_chi(ocxlpmem, &val);
+	if (rc)
+		return rc;
+
+	if (!(val & GLOBAL_MMIO_CHI_ELA))
+		return -EAGAIN;
+
+	user_buf_length = log->buf_size;
+
+	mutex_lock(&ocxlpmem->admin_command.lock);
+
+	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_ERRLOG);
+	if (rc)
+		goto out;
+
+	rc = admin_command_execute(ocxlpmem);
+	if (rc)
+		goto out;
+
+	rc = admin_command_complete_timeout(ocxlpmem, ADMIN_COMMAND_ERRLOG);
+	if (rc < 0) {
+		dev_warn(&ocxlpmem->dev, "Read error log timed out\n");
+		goto out;
+	}
+
+	rc = admin_response(ocxlpmem);
+	if (rc < 0)
+		goto out;
+	if (rc != STATUS_SUCCESS) {
+		warn_status(ocxlpmem, "Unexpected status from retrieve error log", rc);
+		goto out;
+	}
+
+
+	rc = error_log_header_parse(ocxlpmem, &log->buf_size);
+	if (rc)
+		goto out;
+	// log->buf_size now contains the returned buffer size, not the user size
+
+	rc = error_log_offset_0x08(ocxlpmem, &log->log_identifier,
+				       &log->program_reference_code);
+	if (rc)
+		goto out;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset + 0x10,
+				     OCXL_LITTLE_ENDIAN, &val);
+	if (rc)
+		goto out;
+
+	log->error_log_type = val >> 56;
+	log->action_flags = (log->error_log_type == OCXL_PMEM_ERROR_LOG_TYPE_GENERAL) ?
+			    (val >> 32) & 0xFFFFFF : 0;
+	log->power_on_seconds = val & 0xFFFFFFFF;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset + 0x18,
+				     OCXL_LITTLE_ENDIAN, &log->timestamp);
+	if (rc)
+		goto out;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset + 0x20,
+				     OCXL_HOST_ENDIAN, &log->wwid[0]);
+	if (rc)
+		goto out;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset + 0x28,
+				     OCXL_HOST_ENDIAN, &log->wwid[1]);
+	if (rc)
+		goto out;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset + 0x30,
+				     OCXL_HOST_ENDIAN, (u64 *)log->fw_revision);
+	if (rc)
+		goto out;
+	log->fw_revision[8] = '\0';
+
+	buf_length = (user_buf_length < log->buf_size) ?
+		     user_buf_length : log->buf_size;
+	for (i = 0; i < buf_length; i += 8) {
+		/* Log data is assumed to follow the 0x48 byte header, as in dump_error_log() */
+		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+					     ocxlpmem->admin_command.data_offset + 0x48 + i,
+					     OCXL_HOST_ENDIAN, &val);
+		if (rc)
+			goto out;
+
+		if (buf_is_user) {
+			if (copy_to_user(&log->buf[i], &val, sizeof(u64))) {
+				rc = -EFAULT;
+				goto out;
+			}
+		} else {
+			memcpy(&log->buf[i], &val, sizeof(u64));
+		}
+	}
+
+	rc = admin_response_handled(ocxlpmem);
+	if (rc)
+		goto out;
+
+out:
+	mutex_unlock(&ocxlpmem->admin_command.lock);
+	return rc;
+
+}
+
+static int ioctl_error_log(struct ocxlpmem *ocxlpmem,
+		struct ioctl_ocxl_pmem_error_log __user *uarg)
+{
+	struct ioctl_ocxl_pmem_error_log args;
+	int rc;
+
+	if (copy_from_user(&args, uarg, sizeof(args)))
+		return -EFAULT;
+
+	rc = read_error_log(ocxlpmem, &args, true);
+	if (rc)
+		return rc;
+
+	if (copy_to_user(uarg, &args, sizeof(args)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
+{
+	struct ocxlpmem *ocxlpmem = file->private_data;
+	int rc = -EINVAL;
+
+	switch (cmd) {
+	case IOCTL_OCXL_PMEM_ERROR_LOG:
+		rc = ioctl_error_log(ocxlpmem,
+				     (struct ioctl_ocxl_pmem_error_log __user *)args);
+		break;
+	}
+	return rc;
+}
+
 static const struct file_operations fops = {
 	.owner		= THIS_MODULE,
 	.open		= file_open,
 	.release	= file_release,
+	.unlocked_ioctl = file_ioctl,
+	.compat_ioctl   = file_ioctl,
 };
 
 /**
@@ -527,6 +736,60 @@ static int read_device_metadata(struct ocxlpmem *ocxlpmem)
 	return 0;
 }
 
+static const char *decode_error_log_type(u8 error_log_type)
+{
+	switch (error_log_type) {
+	case 0x00:
+		return "general";
+	case 0x01:
+		return "predictive failure";
+	case 0x02:
+		return "thermal warning";
+	case 0x03:
+		return "data loss";
+	case 0x04:
+		return "health & performance";
+	default:
+		return "unknown";
+	}
+}
+
+static void dump_error_log(struct ocxlpmem *ocxlpmem)
+{
+	struct ioctl_ocxl_pmem_error_log log;
+	u32 buf_size;
+	u8 *buf;
+	int rc;
+
+	if (ocxlpmem->admin_command.data_size == 0)
+		return;
+
+	buf_size = ocxlpmem->admin_command.data_size - 0x48;
+	buf = kzalloc(buf_size, GFP_KERNEL);
+	if (!buf)
+		return;
+
+	log.buf = buf;
+	log.buf_size = buf_size;
+
+	rc = read_error_log(ocxlpmem, &log, false);
+	if (rc < 0)
+		goto out;
+
+	dev_warn(&ocxlpmem->dev,
+		 "OCXL PMEM Error log: WWID=0x%016llx%016llx LID=0x%x PRC=%x type=0x%x %s, Uptime=%u seconds timestamp=0x%llx\n",
+		 log.wwid[0], log.wwid[1],
+		 log.log_identifier, log.program_reference_code,
+		 log.error_log_type,
+		 decode_error_log_type(log.error_log_type),
+		 log.power_on_seconds, log.timestamp);
+	print_hex_dump(KERN_WARNING, "buf", DUMP_PREFIX_OFFSET, 16, 1, buf,
+		       log.buf_size, false);
+
+out:
+	kfree(buf);
+}
+
 /**
  * probe_function0() - Set up function 0 for an OpenCAPI persistent memory device
  * This is important as it enables templates higher than 0 across all other functions,
@@ -568,6 +831,7 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	struct ocxlpmem *ocxlpmem;
 	int rc;
 	u16 elapsed, timeout;
+	u64 chi;
 
 	if (PCI_FUNC(pdev->devfn) == 0)
 		return probe_function0(pdev);
@@ -667,6 +931,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	return 0;
 
 err:
+	if (ocxlpmem &&
+		    (ocxlpmem_chi(ocxlpmem, &chi) == 0) &&
+		    (chi & GLOBAL_MMIO_CHI_ELA))
+		dump_error_log(ocxlpmem);
+
 	/*
 	 * Further cleanup is done in the release handler via free_ocxlpmem()
 	 * This allows us to keep the character device live to handle IOCTLs to
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
index d2d81fec7bb1..b953ee522ed4 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
@@ -5,6 +5,7 @@
 #include <linux/cdev.h>
 #include <misc/ocxl.h>
 #include <linux/libnvdimm.h>
+#include <uapi/nvdimm/ocxl-pmem.h>
 #include <linux/mm.h>
 
 #define LABEL_AREA_SIZE	(1UL << PA_SECTION_SHIFT)
diff --git a/include/uapi/nvdimm/ocxl-pmem.h b/include/uapi/nvdimm/ocxl-pmem.h
new file mode 100644
index 000000000000..b10f8ac0c20f
--- /dev/null
+++ b/include/uapi/nvdimm/ocxl-pmem.h
@@ -0,0 +1,46 @@
+/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
+/* Copyright 2017 IBM Corp. */
+#ifndef _UAPI_OCXL_SCM_H
+#define _UAPI_OCXL_SCM_H
+
+#include <linux/types.h>
+#include <linux/ioctl.h>
+
+#define OCXL_PMEM_ERROR_LOG_ACTION_RESET	(1 << (32-32))
+#define OCXL_PMEM_ERROR_LOG_ACTION_CHKFW	(1 << (53-32))
+#define OCXL_PMEM_ERROR_LOG_ACTION_REPLACE	(1 << (54-32))
+#define OCXL_PMEM_ERROR_LOG_ACTION_DUMP		(1 << (55-32))
+
+#define OCXL_PMEM_ERROR_LOG_TYPE_GENERAL		(0x00)
+#define OCXL_PMEM_ERROR_LOG_TYPE_PREDICTIVE_FAILURE	(0x01)
+#define OCXL_PMEM_ERROR_LOG_TYPE_THERMAL_WARNING	(0x02)
+#define OCXL_PMEM_ERROR_LOG_TYPE_DATA_LOSS		(0x03)
+#define OCXL_PMEM_ERROR_LOG_TYPE_HEALTH_PERFORMANCE	(0x04)
+
+struct ioctl_ocxl_pmem_error_log {
+	__u32 log_identifier; /* out */
+	__u32 program_reference_code; /* out */
+	__u32 action_flags; /* out, recommended course of action */
+	__u32 power_on_seconds; /* out, number of seconds the controller had been powered on when the error occurred */
+	__u64 timestamp; /* out, relative time since the current IPL */
+	__u64 wwid[2]; /* out, the NAA formatted WWID associated with the controller */
+	char  fw_revision[8+1]; /* out, firmware revision as null terminated text */
+	__u16 buf_size; /* in/out, buffer size provided/required.
+			 * If required is greater than provided, the buffer
+			 * will be truncated to the amount provided. If it's
+			 * less, then only the required bytes will be populated.
+			 * If it is 0, then there are no more error log entries.
+			 */
+	__u8  error_log_type;
+	__u8  reserved1;
+	__u32 reserved2;
+	__u64 reserved3[2];
+	__u8 *buf; /* pointer to output buffer */
+};
+
+/* ioctl numbers */
+#define OCXL_PMEM_MAGIC 0x5C
+/* SCM devices */
+#define IOCTL_OCXL_PMEM_ERROR_LOG			_IOWR(OCXL_PMEM_MAGIC, 0x01, struct ioctl_ocxl_pmem_error_log)
+
+#endif /* _UAPI_OCXL_SCM_H */
-- 
2.24.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply related	[flat|nested] 130+ messages in thread

* [PATCH v3 18/27] powerpc/powernv/pmem: Add controller dump IOCTLs
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (16 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 17/27] powerpc/powernv/pmem: Implement the Read Error Log command Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-03-03 18:04   ` Frederic Barrat
  2020-03-04  6:53   ` Andrew Donnellan
  2020-02-21  3:27 ` [PATCH v3 19/27] powerpc/powernv/pmem: Add an IOCTL to report controller statistics Alastair D'Silva
                   ` (9 subsequent siblings)
  27 siblings, 2 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

This patch adds IOCTLs to allow userspace to request & fetch dumps
of the internal controller state.

This is useful during debugging or when a fatal error on the controller
has occurred.
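
Each data fetch sends the controller a single 64-bit request word holding
the dump offset and the number of bytes wanted. The sketch below shows the
validation and packing that the ioctl performs, as a userspace model.
pack_dump_request() is a hypothetical name, and the 32-bit width of the
offset is an assumption, since the full struct is not shown here.

```c
#include <stdint.h>

/*
 * Build the request word: offset in the upper 32 bits, size in the lower.
 * Returns 0 on success, -1 if the size is not a multiple of 8 or exceeds
 * the command data area (the driver returns -EINVAL in both cases).
 */
static int pack_dump_request(uint32_t offset, uint16_t buf_size,
			     uint32_t data_size, uint64_t *word)
{
	if (buf_size % 8)
		return -1;

	if (buf_size > data_size)
		return -1;

	*word = ((uint64_t)offset << 32) | buf_size;
	return 0;
}
```

The size must be a multiple of 8 because the data area is read back one
64-bit MMIO word at a time.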

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/platforms/powernv/pmem/ocxl.c | 132 +++++++++++++++++++++
 include/uapi/nvdimm/ocxl-pmem.h            |  15 +++
 2 files changed, 147 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
index 2b64504f9129..2cabafe1fc58 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
@@ -640,6 +640,124 @@ static int ioctl_error_log(struct ocxlpmem *ocxlpmem,
 	return 0;
 }
 
+static int ioctl_controller_dump_data(struct ocxlpmem *ocxlpmem,
+		struct ioctl_ocxl_pmem_controller_dump_data __user *uarg)
+{
+	struct ioctl_ocxl_pmem_controller_dump_data args;
+	u16 i;
+	u64 val;
+	int rc;
+
+	if (copy_from_user(&args, uarg, sizeof(args)))
+		return -EFAULT;
+
+	if (args.buf_size % 8)
+		return -EINVAL;
+
+	if (args.buf_size > ocxlpmem->admin_command.data_size)
+		return -EINVAL;
+
+	mutex_lock(&ocxlpmem->admin_command.lock);
+
+	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_CONTROLLER_DUMP);
+	if (rc)
+		goto out;
+
+	val = ((u64)args.offset) << 32;
+	val |= args.buf_size;
+	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
+				      ocxlpmem->admin_command.request_offset + 0x08,
+				      OCXL_LITTLE_ENDIAN, val);
+	if (rc)
+		goto out;
+
+	rc = admin_command_execute(ocxlpmem);
+	if (rc)
+		goto out;
+
+	rc = admin_command_complete_timeout(ocxlpmem,
+					    ADMIN_COMMAND_CONTROLLER_DUMP);
+	if (rc < 0) {
+		dev_warn(&ocxlpmem->dev, "Controller dump timed out\n");
+		goto out;
+	}
+
+	rc = admin_response(ocxlpmem);
+	if (rc < 0)
+		goto out;
+	if (rc != STATUS_SUCCESS) {
+		warn_status(ocxlpmem,
+			    "Unexpected status from controller dump",
+			    rc);
+		goto out;
+	}
+
+	for (i = 0; i < args.buf_size; i += 8) {
+		u64 val;
+
+		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+					     ocxlpmem->admin_command.data_offset + i,
+					     OCXL_HOST_ENDIAN, &val);
+		if (rc)
+			goto out;
+
+		if (copy_to_user(&args.buf[i], &val, sizeof(u64))) {
+			rc = -EFAULT;
+			goto out;
+		}
+	}
+
+	if (copy_to_user(uarg, &args, sizeof(args))) {
+		rc = -EFAULT;
+		goto out;
+	}
+
+	rc = admin_response_handled(ocxlpmem);
+	if (rc)
+		goto out;
+
+out:
+	mutex_unlock(&ocxlpmem->admin_command.lock);
+	return rc;
+}
+
+int request_controller_dump(struct ocxlpmem *ocxlpmem)
+{
+	int rc;
+	u64 busy = 1;
+
+	rc = ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIC,
+				    OCXL_LITTLE_ENDIAN,
+				    GLOBAL_MMIO_CHI_CDA);
+	if (rc)
+		return rc;
+	rc = ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCI,
+				    OCXL_LITTLE_ENDIAN,
+				    GLOBAL_MMIO_HCI_CONTROLLER_DUMP);
+	if (rc)
+		return rc;
+
+	while (busy) {
+		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+					     GLOBAL_MMIO_HCI,
+					     OCXL_LITTLE_ENDIAN, &busy);
+		if (rc)
+			return rc;
+
+		busy &= GLOBAL_MMIO_HCI_CONTROLLER_DUMP;
+		cond_resched();
+	}
+
+	return 0;
+}
+
+static int ioctl_controller_dump_complete(struct ocxlpmem *ocxlpmem)
+{
+	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCI,
+				    OCXL_LITTLE_ENDIAN,
+				    GLOBAL_MMIO_HCI_CONTROLLER_DUMP_COLLECTED);
+}
+
 static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
 {
 	struct ocxlpmem *ocxlpmem = file->private_data;
@@ -650,7 +768,21 @@ static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
 		rc = ioctl_error_log(ocxlpmem,
 				     (struct ioctl_ocxl_pmem_error_log __user *)args);
 		break;
+
+	case IOCTL_OCXL_PMEM_CONTROLLER_DUMP:
+		rc = request_controller_dump(ocxlpmem);
+		break;
+
+	case IOCTL_OCXL_PMEM_CONTROLLER_DUMP_DATA:
+		rc = ioctl_controller_dump_data(ocxlpmem,
+						(struct ioctl_ocxl_pmem_controller_dump_data __user *)args);
+		break;
+
+	case IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE:
+		rc = ioctl_controller_dump_complete(ocxlpmem);
+		break;
 	}
+
 	return rc;
 }
 
diff --git a/include/uapi/nvdimm/ocxl-pmem.h b/include/uapi/nvdimm/ocxl-pmem.h
index b10f8ac0c20f..d4d8512d03f7 100644
--- a/include/uapi/nvdimm/ocxl-pmem.h
+++ b/include/uapi/nvdimm/ocxl-pmem.h
@@ -38,9 +38,24 @@ struct ioctl_ocxl_pmem_error_log {
 	__u8 *buf; /* pointer to output buffer */
 };
 
+struct ioctl_ocxl_pmem_controller_dump_data {
+	__u8 *buf; /* pointer to output buffer */
+	__u16 buf_size; /* in/out, buffer size provided/required.
+			 * If required is greater than provided, the buffer
+			 * will be truncated to the amount provided. If it is
+			 * less, then only the required bytes will be populated.
+			 * If it is 0, then there is no more dump data available.
+			 */
+	__u32 offset; /* in, Offset within the dump */
+	__u64 reserved[8];
+};
+
 /* ioctl numbers */
 #define OCXL_PMEM_MAGIC 0x5C
 /* SCM devices */
 #define IOCTL_OCXL_PMEM_ERROR_LOG			_IOWR(OCXL_PMEM_MAGIC, 0x01, struct ioctl_ocxl_pmem_error_log)
+#define IOCTL_OCXL_PMEM_CONTROLLER_DUMP			_IO(OCXL_PMEM_MAGIC, 0x02)
+#define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_DATA		_IOWR(OCXL_PMEM_MAGIC, 0x03, struct ioctl_ocxl_pmem_controller_dump_data)
+#define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE	_IO(OCXL_PMEM_MAGIC, 0x04)
 
 #endif /* _UAPI_OCXL_SCM_H */
-- 
2.24.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
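
For readers following the controller dump data path in the patch above, here
is a minimal standalone sketch of the request word that
ioctl_controller_dump_data() writes at request_offset + 0x08, together with
the two buffer size checks it performs first. The helper names
dump_request_word() and dump_buf_size_ok() are hypothetical and are not part
of the driver; data_size stands in for ocxlpmem->admin_command.data_size,
whose real type is not shown in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Build the 64-bit request word for a controller dump read:
 * dump offset in the upper 32 bits, chunk size in the low bits.
 * Mirrors "val = ((u64)args.offset) << 32; val |= args.buf_size;".
 */
static uint64_t dump_request_word(uint32_t offset, uint16_t buf_size)
{
	return ((uint64_t)offset << 32) | buf_size;
}

/*
 * The ioctl rejects sizes that are not a multiple of 8 (the data area
 * is read 64 bits at a time) or that exceed the admin command data
 * area. data_size is an assumed stand-in for admin_command.data_size.
 */
static bool dump_buf_size_ok(uint16_t buf_size, uint32_t data_size)
{
	return (buf_size % 8) == 0 && buf_size <= data_size;
}
```

A caller walking a dump would presumably advance offset by the chunk size on
each call and stop once the reported buf_size is 0, as described in the
struct comment in the uapi header.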


* [PATCH v3 19/27] powerpc/powernv/pmem: Add an IOCTL to report controller statistics
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (17 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 18/27] powerpc/powernv/pmem: Add controller dump IOCTLs Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-03-04  9:25   ` Frederic Barrat
  2020-03-05  0:46   ` Andrew Donnellan
  2020-02-21  3:27 ` [PATCH v3 20/27] powerpc/powernv/pmem: Forward events to userspace Alastair D'Silva
                   ` (8 subsequent siblings)
  27 siblings, 2 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

The controller can report a number of statistics that are useful
in evaluating the performance and reliability of the card.

This patch exposes this information via an IOCTL.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/platforms/powernv/pmem/ocxl.c | 185 +++++++++++++++++++++
 include/uapi/nvdimm/ocxl-pmem.h            |  17 ++
 2 files changed, 202 insertions(+)
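
As an illustration of the response layout handled below, this standalone
sketch decodes the 64-bit header word the same way
controller_stats_header_parse() does (identifier in bits 63..48, payload
length in bits 31..0) and splits a word that packs two 32-bit values, as is
done for reset_count and reset_uptime. The names stats_header_decode(),
split_u64() and CS_IDENTIFIER are hypothetical and exist only for this
example.

```c
#include <stdint.h>

#define CS_IDENTIFIER 0x4353u /* ASCII 'C' (0x43) followed by 'S' (0x53) */

/*
 * Split the header word: identifier in bits 63..48, payload length
 * (excluding the header itself) in bits 31..0.
 * Returns 0 on success, -1 if the identifier is not 'CS'.
 */
static int stats_header_decode(uint64_t header, uint32_t *length)
{
	uint16_t identifier = (uint16_t)(header >> 48);

	if (identifier != CS_IDENTIFIER)
		return -1;

	*length = (uint32_t)(header & 0xFFFFFFFFu);
	return 0;
}

/* Split a word holding two 32-bit values (high half, low half). */
static void split_u64(uint64_t word, uint32_t *high, uint32_t *low)
{
	*high = (uint32_t)(word >> 32);
	*low = (uint32_t)(word & 0xFFFFFFFFu);
}
```

With a header word of 0x4353000000000140 the decoded length is 0x140, which
is the value the ioctl expects.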

diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
index 2cabafe1fc58..009d4fd29e7d 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
@@ -758,6 +758,186 @@ static int ioctl_controller_dump_complete(struct ocxlpmem *ocxlpmem)
 				    GLOBAL_MMIO_HCI_CONTROLLER_DUMP_COLLECTED);
 }
 
+/**
+ * controller_stats_header_parse() - Parse the first 64 bits of the controller stats admin command response
+ * @ocxlpmem: the device metadata
+ * @length: out, returns the number of bytes in the response (excluding the 64 bit header)
+ */
+static int controller_stats_header_parse(struct ocxlpmem *ocxlpmem,
+	u32 *length)
+{
+	int rc;
+	u64 val;
+
+	u16 data_identifier;
+	u32 data_length;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset,
+				     OCXL_LITTLE_ENDIAN, &val);
+	if (rc)
+		return rc;
+
+	data_identifier = val >> 48;
+	data_length = val & 0xFFFFFFFF;
+
+	if (data_identifier != 0x4353) { // 'CS'
+		dev_err(&ocxlpmem->dev,
+			"Bad data identifier for controller stats, expected 'CS', got '%-.*s'\n",
+			2, (char *)&data_identifier);
+		return -EINVAL;
+	}
+
+	*length = data_length;
+	return 0;
+}
+
+static int ioctl_controller_stats(struct ocxlpmem *ocxlpmem,
+				  struct ioctl_ocxl_pmem_controller_stats __user *uarg)
+{
+	struct ioctl_ocxl_pmem_controller_stats args;
+	u32 length;
+	int rc;
+	u64 val;
+
+	memset(&args, '\0', sizeof(args));
+
+	mutex_lock(&ocxlpmem->admin_command.lock);
+
+	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_CONTROLLER_STATS);
+	if (rc)
+		goto out;
+
+	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
+				      ocxlpmem->admin_command.request_offset + 0x08,
+				      OCXL_LITTLE_ENDIAN, 0);
+	if (rc)
+		goto out;
+
+	rc = admin_command_execute(ocxlpmem);
+	if (rc)
+		goto out;
+
+
+	rc = admin_command_complete_timeout(ocxlpmem,
+					    ADMIN_COMMAND_CONTROLLER_STATS);
+	if (rc < 0) {
+		dev_warn(&ocxlpmem->dev, "Controller stats timed out\n");
+		goto out;
+	}
+
+	rc = admin_response(ocxlpmem);
+	if (rc < 0)
+		goto out;
+	if (rc != STATUS_SUCCESS) {
+		warn_status(ocxlpmem,
+			    "Unexpected status from controller stats", rc);
+		goto out;
+	}
+
+	rc = controller_stats_header_parse(ocxlpmem, &length);
+	if (rc)
+		goto out;
+
+	if (length != 0x140)
+		dev_warn(&ocxlpmem->dev,
+			 "Unexpected length for controller stats data, expected 0x140, got 0x%x\n",
+			 length);
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset + 0x08 + 0x08,
+				     OCXL_LITTLE_ENDIAN, &val);
+	if (rc)
+		goto out;
+
+	args.reset_count = val >> 32;
+	args.reset_uptime = val & 0xFFFFFFFF;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset + 0x08 + 0x10,
+				     OCXL_LITTLE_ENDIAN, &val);
+	if (rc)
+		goto out;
+
+	args.power_on_uptime = val >> 32;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x08,
+				     OCXL_LITTLE_ENDIAN, &args.host_load_count);
+	if (rc)
+		goto out;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x10,
+				     OCXL_LITTLE_ENDIAN, &args.host_store_count);
+	if (rc)
+		goto out;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x18,
+				     OCXL_LITTLE_ENDIAN, &args.media_read_count);
+	if (rc)
+		goto out;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x20,
+				     OCXL_LITTLE_ENDIAN, &args.media_write_count);
+	if (rc)
+		goto out;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x28,
+				     OCXL_LITTLE_ENDIAN, &args.cache_hit_count);
+	if (rc)
+		goto out;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x30,
+				     OCXL_LITTLE_ENDIAN, &args.cache_miss_count);
+	if (rc)
+		goto out;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x38,
+				     OCXL_LITTLE_ENDIAN, &args.media_read_latency);
+	if (rc)
+		goto out;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x40,
+				     OCXL_LITTLE_ENDIAN, &args.media_write_latency);
+	if (rc)
+		goto out;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x48,
+				     OCXL_LITTLE_ENDIAN, &args.cache_read_latency);
+	if (rc)
+		goto out;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x50,
+				     OCXL_LITTLE_ENDIAN, &args.cache_write_latency);
+	if (rc)
+		goto out;
+
+	if (copy_to_user(uarg, &args, sizeof(args))) {
+		rc = -EFAULT;
+		goto out;
+	}
+
+	rc = admin_response_handled(ocxlpmem);
+	if (rc)
+		goto out;
+
+	rc = 0;
+	goto out;
+
+out:
+	mutex_unlock(&ocxlpmem->admin_command.lock);
+	return rc;
+}
+
 static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
 {
 	struct ocxlpmem *ocxlpmem = file->private_data;
@@ -781,6 +961,11 @@ static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
 	case IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE:
 		rc = ioctl_controller_dump_complete(ocxlpmem);
 		break;
+
+	case IOCTL_OCXL_PMEM_CONTROLLER_STATS:
+		rc = ioctl_controller_stats(ocxlpmem,
+					    (struct ioctl_ocxl_pmem_controller_stats __user *)args);
+		break;
 	}
 
 	return rc;
diff --git a/include/uapi/nvdimm/ocxl-pmem.h b/include/uapi/nvdimm/ocxl-pmem.h
index d4d8512d03f7..add223aa2fdb 100644
--- a/include/uapi/nvdimm/ocxl-pmem.h
+++ b/include/uapi/nvdimm/ocxl-pmem.h
@@ -50,6 +50,22 @@ struct ioctl_ocxl_pmem_controller_dump_data {
 	__u64 reserved[8];
 };
 
+struct ioctl_ocxl_pmem_controller_stats {
+	__u32 reset_count;
+	__u32 reset_uptime; /* seconds */
+	__u32 power_on_uptime; /* seconds */
+	__u64 host_load_count;
+	__u64 host_store_count;
+	__u64 media_read_count;
+	__u64 media_write_count;
+	__u64 cache_hit_count;
+	__u64 cache_miss_count;
+	__u64 media_read_latency; /* nanoseconds */
+	__u64 media_write_latency; /* nanoseconds */
+	__u64 cache_read_latency; /* nanoseconds */
+	__u64 cache_write_latency; /* nanoseconds */
+};
+
 /* ioctl numbers */
 #define OCXL_PMEM_MAGIC 0x5C
 /* SCM devices */
@@ -57,5 +73,6 @@ struct ioctl_ocxl_pmem_controller_dump_data {
 #define IOCTL_OCXL_PMEM_CONTROLLER_DUMP			_IO(OCXL_PMEM_MAGIC, 0x02)
 #define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_DATA		_IOWR(OCXL_PMEM_MAGIC, 0x03, struct ioctl_ocxl_pmem_controller_dump_data)
 #define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE	_IO(OCXL_PMEM_MAGIC, 0x04)
+#define IOCTL_OCXL_PMEM_CONTROLLER_STATS		_IO(OCXL_PMEM_MAGIC, 0x05)
 
 #endif /* _UAPI_OCXL_SCM_H */
-- 
2.24.1

* [PATCH v3 20/27] powerpc/powernv/pmem: Forward events to userspace
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (18 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 19/27] powerpc/powernv/pmem: Add an IOCTL to report controller statistics Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-03-03  7:02   ` Andrew Donnellan
  2020-03-04 11:00   ` Frederic Barrat
  2020-02-21  3:27 ` [PATCH v3 21/27] powerpc/powernv/pmem: Add an IOCTL to request controller health & perf data Alastair D'Silva
                   ` (7 subsequent siblings)
  27 siblings, 2 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

Some of the interrupts that the card generates are better handled
by a userspace daemon, in particular:
  - Controller Hardware/Firmware Fatal
  - Controller Dump Available
  - Error Log Available

This patch allows a userspace application to register an eventfd with
the driver via IOCTL_OCXL_PMEM_EVENTFD to receive notifications of these
interrupts.

Userspace can then identify which events have occurred by calling
IOCTL_OCXL_PMEM_EVENT_CHECK and checking the result against the
IOCTL_OCXL_PMEM_EVENT_* masks.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/platforms/powernv/pmem/ocxl.c    | 216 ++++++++++++++++++
 .../platforms/powernv/pmem/ocxl_internal.h    |   5 +
 include/uapi/nvdimm/ocxl-pmem.h               |  16 ++
 3 files changed, 237 insertions(+)
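
To show how a userspace daemon might interpret the mask returned by
IOCTL_OCXL_PMEM_EVENT_CHECK, here is a small standalone sketch. The bit
positions are copied from the uapi header added by this patch; the EV_*
macro names, describe_events() and the printed event names are hypothetical
and used only for this example.

```c
#include <stdint.h>
#include <stdio.h>

/* Bit positions copied from the uapi header added by this patch. */
#define EV_CONTROLLER_DUMP_AVAILABLE	(1ULL << 0)
#define EV_ERROR_LOG_AVAILABLE		(1ULL << 1)
#define EV_HARDWARE_FATAL		(1ULL << 2)
#define EV_FIRMWARE_FATAL		(1ULL << 3)

/*
 * Write a comma-separated list of the events set in mask into buf.
 * Returns the number of events written; stops early if buf is full.
 */
static int describe_events(uint64_t mask, char *buf, size_t len)
{
	static const struct {
		uint64_t bit;
		const char *name;
	} events[] = {
		{ EV_CONTROLLER_DUMP_AVAILABLE, "controller-dump" },
		{ EV_ERROR_LOG_AVAILABLE, "error-log" },
		{ EV_HARDWARE_FATAL, "hardware-fatal" },
		{ EV_FIRMWARE_FATAL, "firmware-fatal" },
	};
	size_t used = 0;
	size_t i;
	int count = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < sizeof(events) / sizeof(events[0]); i++) {
		int n;

		if (!(mask & events[i].bit))
			continue;

		n = snprintf(buf + used, len - used, "%s%s",
			     count ? "," : "", events[i].name);
		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0'; /* drop the partial entry */
			break;
		}
		used += (size_t)n;
		count++;
	}

	return count;
}
```

In a real daemon the mask would come from the ioctl after the registered
eventfd becomes readable; that part needs the device node and is left out
here.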

diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
index 009d4fd29e7d..e46696d3cc36 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
@@ -10,6 +10,7 @@
 #include <misc/ocxl.h>
 #include <linux/delay.h>
 #include <linux/ndctl.h>
+#include <linux/eventfd.h>
 #include <linux/fs.h>
 #include <linux/mm_types.h>
 #include <linux/memory_hotplug.h>
@@ -335,11 +336,22 @@ static void free_ocxlpmem(struct ocxlpmem *ocxlpmem)
 {
 	int rc;
 
+	// Disable doorbells
+	(void)ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIEC,
+				     OCXL_LITTLE_ENDIAN,
+				     GLOBAL_MMIO_CHI_ALL);
+
 	if (ocxlpmem->nvdimm_bus)
 		nvdimm_bus_unregister(ocxlpmem->nvdimm_bus);
 
 	free_minor(ocxlpmem);
 
+	if (ocxlpmem->irq_addr[1])
+		iounmap(ocxlpmem->irq_addr[1]);
+
+	if (ocxlpmem->irq_addr[0])
+		iounmap(ocxlpmem->irq_addr[0]);
+
 	if (ocxlpmem->cdev.owner)
 		cdev_del(&ocxlpmem->cdev);
 
@@ -443,6 +455,11 @@ static int file_release(struct inode *inode, struct file *file)
 {
 	struct ocxlpmem *ocxlpmem = file->private_data;
 
+	if (ocxlpmem->ev_ctx) {
+		eventfd_ctx_put(ocxlpmem->ev_ctx);
+		ocxlpmem->ev_ctx = NULL;
+	}
+
 	ocxlpmem_put(ocxlpmem);
 	return 0;
 }
@@ -938,6 +955,51 @@ static int ioctl_controller_stats(struct ocxlpmem *ocxlpmem,
 	return rc;
 }
 
+static int ioctl_eventfd(struct ocxlpmem *ocxlpmem,
+		 struct ioctl_ocxl_pmem_eventfd __user *uarg)
+{
+	struct ioctl_ocxl_pmem_eventfd args;
+
+	if (copy_from_user(&args, uarg, sizeof(args)))
+		return -EFAULT;
+
+	if (ocxlpmem->ev_ctx)
+		return -EINVAL;
+
+	ocxlpmem->ev_ctx = eventfd_ctx_fdget(args.eventfd);
+	if (!ocxlpmem->ev_ctx)
+		return -EFAULT;
+
+	return 0;
+}
+
+static int ioctl_event_check(struct ocxlpmem *ocxlpmem, u64 __user *uarg)
+{
+	u64 val = 0;
+	int rc;
+	u64 chi = 0;
+
+	rc = ocxlpmem_chi(ocxlpmem, &chi);
+	if (rc < 0)
+		return rc;
+
+	if (chi & GLOBAL_MMIO_CHI_ELA)
+		val |= IOCTL_OCXL_PMEM_EVENT_ERROR_LOG_AVAILABLE;
+
+	if (chi & GLOBAL_MMIO_CHI_CDA)
+		val |= IOCTL_OCXL_PMEM_EVENT_CONTROLLER_DUMP_AVAILABLE;
+
+	if (chi & GLOBAL_MMIO_CHI_CFFS)
+		val |= IOCTL_OCXL_PMEM_EVENT_FIRMWARE_FATAL;
+
+	if (chi & GLOBAL_MMIO_CHI_CHFS)
+		val |= IOCTL_OCXL_PMEM_EVENT_HARDWARE_FATAL;
+
+	rc = copy_to_user(uarg, &val, sizeof(val)) ? -EFAULT : 0;
+
+	return rc;
+}
+
 static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
 {
 	struct ocxlpmem *ocxlpmem = file->private_data;
@@ -966,6 +1028,15 @@ static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
 		rc = ioctl_controller_stats(ocxlpmem,
 					    (struct ioctl_ocxl_pmem_controller_stats __user *)args);
 		break;
+
+	case IOCTL_OCXL_PMEM_EVENTFD:
+		rc = ioctl_eventfd(ocxlpmem,
+				   (struct ioctl_ocxl_pmem_eventfd __user *)args);
+		break;
+
+	case IOCTL_OCXL_PMEM_EVENT_CHECK:
+		rc = ioctl_event_check(ocxlpmem, (u64 __user *)args);
+		break;
 	}
 
 	return rc;
@@ -1107,6 +1178,146 @@ static void dump_error_log(struct ocxlpmem *ocxlpmem)
 	kfree(buf);
 }
 
+static irqreturn_t imn0_handler(void *private)
+{
+	struct ocxlpmem *ocxlpmem = private;
+	u64 chi = 0;
+
+	(void)ocxlpmem_chi(ocxlpmem, &chi);
+
+	if (chi & GLOBAL_MMIO_CHI_ELA) {
+		dev_warn(&ocxlpmem->dev, "Error log is available\n");
+
+		if (ocxlpmem->ev_ctx)
+			eventfd_signal(ocxlpmem->ev_ctx, 1);
+	}
+
+	if (chi & GLOBAL_MMIO_CHI_CDA) {
+		dev_warn(&ocxlpmem->dev, "Controller dump is available\n");
+
+		if (ocxlpmem->ev_ctx)
+			eventfd_signal(ocxlpmem->ev_ctx, 1);
+	}
+
+
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t imn1_handler(void *private)
+{
+	struct ocxlpmem *ocxlpmem = private;
+	u64 chi = 0;
+
+	(void)ocxlpmem_chi(ocxlpmem, &chi);
+
+	if (chi & (GLOBAL_MMIO_CHI_CFFS | GLOBAL_MMIO_CHI_CHFS)) {
+		dev_err(&ocxlpmem->dev,
+			"Controller status is fatal, chi=0x%llx, going offline\n", chi);
+
+		if (ocxlpmem->nvdimm_bus) {
+			nvdimm_bus_unregister(ocxlpmem->nvdimm_bus);
+			ocxlpmem->nvdimm_bus = NULL;
+		}
+
+		if (ocxlpmem->ev_ctx)
+			eventfd_signal(ocxlpmem->ev_ctx, 1);
+	}
+
+	return IRQ_HANDLED;
+}
+
+
+/**
+ * ocxlpmem_setup_irq() - Set up the IRQs for the OpenCAPI Persistent Memory device
+ * @ocxlpmem: the device metadata
+ * Return: 0 on success, negative on failure
+ */
+static int ocxlpmem_setup_irq(struct ocxlpmem *ocxlpmem)
+{
+	int rc;
+	u64 irq_addr;
+
+	rc = ocxl_afu_irq_alloc(ocxlpmem->ocxl_context, &ocxlpmem->irq_id[0]);
+	if (rc)
+		return rc;
+
+	rc = ocxl_irq_set_handler(ocxlpmem->ocxl_context, ocxlpmem->irq_id[0],
+				  imn0_handler, NULL, ocxlpmem);
+
+	irq_addr = ocxl_afu_irq_get_addr(ocxlpmem->ocxl_context, ocxlpmem->irq_id[0]);
+	if (!irq_addr)
+		return -EINVAL;
+
+	ocxlpmem->irq_addr[0] = ioremap(irq_addr, PAGE_SIZE);
+	if (!ocxlpmem->irq_addr[0])
+		return -EINVAL;
+
+	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_IMA0_OHP,
+				      OCXL_LITTLE_ENDIAN,
+				      (u64)ocxlpmem->irq_addr[0]);
+	if (rc)
+		goto out_irq0;
+
+	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_IMA0_CFP,
+				      OCXL_LITTLE_ENDIAN, 0);
+	if (rc)
+		goto out_irq0;
+
+	rc = ocxl_afu_irq_alloc(ocxlpmem->ocxl_context, &ocxlpmem->irq_id[1]);
+	if (rc)
+		goto out_irq0;
+
+
+	rc = ocxl_irq_set_handler(ocxlpmem->ocxl_context, ocxlpmem->irq_id[1],
+				  imn1_handler, NULL, ocxlpmem);
+	if (rc)
+		goto out_irq0;
+
+	irq_addr = ocxl_afu_irq_get_addr(ocxlpmem->ocxl_context, ocxlpmem->irq_id[1]);
+	if (!irq_addr) {
+		rc = -EFAULT;
+		goto out_irq0;
+	}
+
+	ocxlpmem->irq_addr[1] = ioremap(irq_addr, PAGE_SIZE);
+	if (!ocxlpmem->irq_addr[1]) {
+		rc = -EINVAL;
+		goto out_irq0;
+	}
+
+	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_IMA1_OHP,
+				      OCXL_LITTLE_ENDIAN,
+				      (u64)ocxlpmem->irq_addr[1]);
+	if (rc)
+		goto out_irq1;
+
+	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_IMA1_CFP,
+				      OCXL_LITTLE_ENDIAN, 0);
+	if (rc)
+		goto out_irq1;
+
+	// Enable doorbells
+	rc = ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIE,
+				    OCXL_LITTLE_ENDIAN,
+				    GLOBAL_MMIO_CHI_ELA | GLOBAL_MMIO_CHI_CDA |
+				    GLOBAL_MMIO_CHI_CFFS | GLOBAL_MMIO_CHI_CHFS |
+				    GLOBAL_MMIO_CHI_NSCRA);
+	if (rc)
+		goto out_irq1;
+
+	return 0;
+
+out_irq1:
+	iounmap(ocxlpmem->irq_addr[1]);
+	ocxlpmem->irq_addr[1] = NULL;
+
+out_irq0:
+	iounmap(ocxlpmem->irq_addr[0]);
+	ocxlpmem->irq_addr[0] = NULL;
+
+	return rc;
+}
+
 /**
  * probe_function0() - Set up function 0 for an OpenCAPI persistent memory device
  * This is important as it enables templates higher than 0 across all other functions,
@@ -1216,6 +1427,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err;
 	}
 
+	if (ocxlpmem_setup_irq(ocxlpmem)) {
+		dev_err(&pdev->dev, "Could not set up OCXL IRQs\n");
+		goto err;
+	}
+
 	if (setup_command_metadata(ocxlpmem)) {
 		dev_err(&pdev->dev, "Could not read OCXL command matada\n");
 		goto err;
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
index b953ee522ed4..927690f4888f 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
@@ -103,6 +103,10 @@ struct ocxlpmem {
 	struct pci_dev *pdev;
 	struct cdev cdev;
 	struct ocxl_fn *ocxl_fn;
+#define SCM_IRQ_COUNT 2
+	int irq_id[SCM_IRQ_COUNT];
+	struct dev_pagemap irq_pgmap[SCM_IRQ_COUNT];
+	void *irq_addr[SCM_IRQ_COUNT];
 	struct nd_interleave_set nd_set;
 	struct nvdimm_bus_descriptor bus_desc;
 	struct nvdimm_bus *nvdimm_bus;
@@ -113,6 +117,7 @@ struct ocxlpmem {
 	struct command_metadata ns_command;
 	struct resource pmem_res;
 	struct nd_region *nd_region;
+	struct eventfd_ctx *ev_ctx;
 	char fw_version[8+1];
 	u32 timeouts[ADMIN_COMMAND_MAX+1];
 
diff --git a/include/uapi/nvdimm/ocxl-pmem.h b/include/uapi/nvdimm/ocxl-pmem.h
index add223aa2fdb..988eb0bc413d 100644
--- a/include/uapi/nvdimm/ocxl-pmem.h
+++ b/include/uapi/nvdimm/ocxl-pmem.h
@@ -66,6 +66,20 @@ struct ioctl_ocxl_pmem_controller_stats {
 	__u64 cache_write_latency; /* nanoseconds */
 };
 
+struct ioctl_ocxl_pmem_eventfd {
+	__s32 eventfd;
+	__u32 reserved;
+};
+
+#ifndef BIT_ULL
+#define BIT_ULL(nr)	(1ULL << (nr))
+#endif
+
+#define IOCTL_OCXL_PMEM_EVENT_CONTROLLER_DUMP_AVAILABLE	BIT_ULL(0)
+#define IOCTL_OCXL_PMEM_EVENT_ERROR_LOG_AVAILABLE	BIT_ULL(1)
+#define IOCTL_OCXL_PMEM_EVENT_HARDWARE_FATAL		BIT_ULL(2)
+#define IOCTL_OCXL_PMEM_EVENT_FIRMWARE_FATAL		BIT_ULL(3)
+
 /* ioctl numbers */
 #define OCXL_PMEM_MAGIC 0x5C
 /* SCM devices */
@@ -74,5 +88,7 @@ struct ioctl_ocxl_pmem_controller_stats {
 #define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_DATA		_IOWR(OCXL_PMEM_MAGIC, 0x03, struct ioctl_ocxl_pmem_controller_dump_data)
 #define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE	_IO(OCXL_PMEM_MAGIC, 0x04)
 #define IOCTL_OCXL_PMEM_CONTROLLER_STATS		_IO(OCXL_PMEM_MAGIC, 0x05)
+#define IOCTL_OCXL_PMEM_EVENTFD				_IOW(OCXL_PMEM_MAGIC, 0x06, struct ioctl_ocxl_pmem_eventfd)
+#define IOCTL_OCXL_PMEM_EVENT_CHECK			_IOR(OCXL_PMEM_MAGIC, 0x07, __u64)
 
 #endif /* _UAPI_OCXL_SCM_H */
-- 
2.24.1

* [PATCH v3 21/27] powerpc/powernv/pmem: Add an IOCTL to request controller health & perf data
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (19 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 20/27] powerpc/powernv/pmem: Forward events to userspace Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-02-28  6:12   ` Andrew Donnellan
  2020-02-21  3:27 ` [PATCH v3 22/27] powerpc/powernv/pmem: Implement the heartbeat command Alastair D'Silva
                   ` (6 subsequent siblings)
  27 siblings, 1 reply; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

When health & performance data is requested from the controller,
it responds with an error log containing the requested information.

This patch allows the request to be issued via an IOCTL.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/platforms/powernv/pmem/ocxl.c | 16 ++++++++++++++++
 include/uapi/nvdimm/ocxl-pmem.h            |  1 +
 2 files changed, 17 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
index e46696d3cc36..081883a8247a 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
@@ -1000,6 +1000,18 @@ static int ioctl_event_check(struct ocxlpmem *ocxlpmem, u64 __user *uarg)
 	return rc;
 }
 
+/**
+ * req_controller_health_perf() - Request controller health & performance data
+ * @ocxlpmem: the device metadata
+ * Return: 0 on success, negative on failure
+ */
+int req_controller_health_perf(struct ocxlpmem *ocxlpmem)
+{
+	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCI,
+				      OCXL_LITTLE_ENDIAN,
+				      GLOBAL_MMIO_HCI_REQ_HEALTH_PERF);
+}
+
 static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
 {
 	struct ocxlpmem *ocxlpmem = file->private_data;
@@ -1037,6 +1049,10 @@ static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
 	case IOCTL_OCXL_PMEM_EVENT_CHECK:
 		rc = ioctl_event_check(ocxlpmem, (u64 __user *)args);
 		break;
+
+	case IOCTL_OCXL_PMEM_REQUEST_HEALTH:
+		rc = req_controller_health_perf(ocxlpmem);
+		break;
 	}
 
 	return rc;
diff --git a/include/uapi/nvdimm/ocxl-pmem.h b/include/uapi/nvdimm/ocxl-pmem.h
index 988eb0bc413d..0d03abb44001 100644
--- a/include/uapi/nvdimm/ocxl-pmem.h
+++ b/include/uapi/nvdimm/ocxl-pmem.h
@@ -90,5 +90,6 @@ struct ioctl_ocxl_pmem_eventfd {
 #define IOCTL_OCXL_PMEM_CONTROLLER_STATS		_IO(OCXL_PMEM_MAGIC, 0x05)
 #define IOCTL_OCXL_PMEM_EVENTFD				_IOW(OCXL_PMEM_MAGIC, 0x06, struct ioctl_ocxl_pmem_eventfd)
 #define IOCTL_OCXL_PMEM_EVENT_CHECK			_IOR(OCXL_PMEM_MAGIC, 0x07, __u64)
+#define IOCTL_OCXL_PMEM_REQUEST_HEALTH			_IO(OCXL_PMEM_MAGIC, 0x08)
 
 #endif /* _UAPI_OCXL_SCM_H */
-- 
2.24.1

* [PATCH v3 22/27] powerpc/powernv/pmem: Implement the heartbeat command
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (20 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 21/27] powerpc/powernv/pmem: Add an IOCTL to request controller health & perf data Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-02-28  6:20   ` Andrew Donnellan
  2020-03-04 14:25   ` Frederic Barrat
  2020-02-21  3:27 ` [PATCH v3 23/27] powerpc/powernv/pmem: Add debug IOCTLs Alastair D'Silva
                   ` (5 subsequent siblings)
  27 siblings, 2 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

The heartbeat admin command is a simple admin command that exercises
the communication mechanisms within the controller.

This patch issues a heartbeat command to the card during init to ensure
we can communicate with the card's controller.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/platforms/powernv/pmem/ocxl.c | 43 ++++++++++++++++++++++
 1 file changed, 43 insertions(+)
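
The heartbeat below follows the same sequence as the other admin commands in
this series: request, execute, wait for completion, read the response, then
mark the response as handled. The following standalone sketch models that
sequence against a fake device so the control flow can be exercised outside
the kernel. struct admin_ops, struct fake_dev, run_admin_command() and the
fake_* callbacks are all hypothetical; the real driver calls its
admin_command_*() helpers directly and holds admin_command.lock around the
whole sequence, which is omitted here.

```c
#include <stddef.h>

/* Hypothetical callbacks standing in for the driver's admin helpers. */
struct admin_ops {
	int (*request)(void *dev, int opcode);
	int (*execute)(void *dev);
	int (*wait_complete)(void *dev, int opcode);
	int (*response)(void *dev);	/* < 0 on error, else a status code */
	int (*handled)(void *dev);
};

/*
 * Run one admin command. Returns a negative value if any step fails,
 * otherwise the status code reported by the controller.
 */
static int run_admin_command(const struct admin_ops *ops, void *dev,
			     int opcode)
{
	int rc;

	rc = ops->request(dev, opcode);
	if (rc)
		return rc;

	rc = ops->execute(dev);
	if (rc)
		return rc;

	rc = ops->wait_complete(dev, opcode);
	if (rc < 0)
		return rc;

	rc = ops->response(dev);
	if (rc < 0)
		return rc;

	(void)ops->handled(dev);
	return rc;
}

/* Fake device used only to exercise the sequence above. */
struct fake_dev {
	int calls;	/* number of steps that were reached */
	int timeout;	/* non-zero: completion wait fails */
	int status;	/* status code returned by the response step */
};

static int fake_request(void *dev, int opcode)
{
	(void)opcode;
	((struct fake_dev *)dev)->calls++;
	return 0;
}

static int fake_execute(void *dev)
{
	((struct fake_dev *)dev)->calls++;
	return 0;
}

static int fake_wait(void *dev, int opcode)
{
	struct fake_dev *d = dev;

	(void)opcode;
	d->calls++;
	return d->timeout ? -1 : 0;
}

static int fake_response(void *dev)
{
	struct fake_dev *d = dev;

	d->calls++;
	return d->status;
}

static int fake_handled(void *dev)
{
	((struct fake_dev *)dev)->calls++;
	return 0;
}

static const struct admin_ops fake_ops = {
	.request = fake_request,
	.execute = fake_execute,
	.wait_complete = fake_wait,
	.response = fake_response,
	.handled = fake_handled,
};
```

Note that, as in heartbeat(), a non-zero status from the response step is
passed back to the caller, so probe() treats any status other than success
as a failure.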

diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
index 081883a8247a..e01f6f9fc180 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
@@ -306,6 +306,44 @@ static bool is_usable(const struct ocxlpmem *ocxlpmem, bool verbose)
 	return true;
 }
 
+/**
+ * heartbeat() - Issue a heartbeat command to the controller
+ * @ocxlpmem: the device metadata
+ * Return: 0 if the controller responded correctly, negative on error
+ */
+static int heartbeat(struct ocxlpmem *ocxlpmem)
+{
+	int rc;
+
+	mutex_lock(&ocxlpmem->admin_command.lock);
+
+	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_HEARTBEAT);
+	if (rc)
+		goto out;
+
+	rc = admin_command_execute(ocxlpmem);
+	if (rc)
+		goto out;
+
+	rc = admin_command_complete_timeout(ocxlpmem, ADMIN_COMMAND_HEARTBEAT);
+	if (rc < 0) {
+		dev_err(&ocxlpmem->dev, "Heartbeat timeout\n");
+		goto out;
+	}
+
+	rc = admin_response(ocxlpmem);
+	if (rc < 0)
+		goto out;
+	if (rc != STATUS_SUCCESS)
+		warn_status(ocxlpmem, "Unexpected status from heartbeat", rc);
+
+	(void)admin_response_handled(ocxlpmem);
+
+out:
+	mutex_unlock(&ocxlpmem->admin_command.lock);
+	return rc;
+}
+
 /**
  * allocate_minor() - Allocate a minor number to use for an OpenCAPI pmem device
  * @ocxlpmem: the device metadata
@@ -1458,6 +1496,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err;
 	}
 
+	if (heartbeat(ocxlpmem)) {
+		dev_err(&pdev->dev, "Heartbeat failed\n");
+		goto err;
+	}
+
 	elapsed = 0;
 	timeout = ocxlpmem->readiness_timeout + ocxlpmem->memory_available_timeout;
 	while (!is_usable(ocxlpmem, false)) {
-- 
2.24.1

* [PATCH v3 23/27] powerpc/powernv/pmem: Add debug IOCTLs
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (21 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 22/27] powerpc/powernv/pmem: Implement the heartbeat command Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-03-04 15:21   ` Frederic Barrat
  2020-03-05  3:11   ` Andrew Donnellan
  2020-02-21  3:27 ` [PATCH v3 24/27] powerpc/powernv/pmem: Expose SMART data via ndctl Alastair D'Silva
                   ` (4 subsequent siblings)
  27 siblings, 2 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

These IOCTLs provide low level access to the card to aid in debugging
controller/FPGA firmware.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/platforms/powernv/pmem/Kconfig |   6 +
 arch/powerpc/platforms/powernv/pmem/ocxl.c  | 249 ++++++++++++++++++++
 include/uapi/nvdimm/ocxl-pmem.h             |  32 +++
 3 files changed, 287 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pmem/Kconfig b/arch/powerpc/platforms/powernv/pmem/Kconfig
index c5d927520920..3f44429d70c9 100644
--- a/arch/powerpc/platforms/powernv/pmem/Kconfig
+++ b/arch/powerpc/platforms/powernv/pmem/Kconfig
@@ -12,4 +12,10 @@ config OCXL_PMEM
 
 	  Select N if unsure.
 
+config OCXL_PMEM_DEBUG
+	bool "OpenCAPI Persistent Memory debugging"
+	depends on OCXL_PMEM
+	help
+	  Enables low level IOCTLs for OpenCAPI Persistent Memory firmware development
+
 endif
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
index e01f6f9fc180..d4ce5e9e0521 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
@@ -1050,6 +1050,235 @@ int req_controller_health_perf(struct ocxlpmem *ocxlpmem)
 				      GLOBAL_MMIO_HCI_REQ_HEALTH_PERF);
 }
 
+#ifdef CONFIG_OCXL_PMEM_DEBUG
+/**
+ * enable_fwdebug() - Enable FW debug on the controller
+ * @ocxlpmem: the device metadata
+ * Return: 0 on success, negative on failure
+ */
+static int enable_fwdebug(const struct ocxlpmem *ocxlpmem)
+{
+	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCI,
+				      OCXL_LITTLE_ENDIAN,
+				      GLOBAL_MMIO_HCI_FW_DEBUG);
+}
+
+/**
+ * disable_fwdebug() - Disable FW debug on the controller
+ * @ocxlpmem: the device metadata
+ * Return: 0 on success, negative on failure
+ */
+static int disable_fwdebug(const struct ocxlpmem *ocxlpmem)
+{
+	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCIC,
+				      OCXL_LITTLE_ENDIAN,
+				      GLOBAL_MMIO_HCI_FW_DEBUG);
+}
+
+static int ioctl_fwdebug(struct ocxlpmem *ocxlpmem,
+			     struct ioctl_ocxl_pmem_fwdebug __user *uarg)
+{
+	struct ioctl_ocxl_pmem_fwdebug args;
+	u64 val;
+	int i;
+	int rc;
+
+	if (copy_from_user(&args, uarg, sizeof(args)))
+		return -EFAULT;
+
+	// Buffer size must be a multiple of 8
+	if ((args.buf_size & 0x07))
+		return -EINVAL;
+
+	if (args.buf_size > ocxlpmem->admin_command.data_size)
+		return -EINVAL;
+
+	mutex_lock(&ocxlpmem->admin_command.lock);
+
+	rc = enable_fwdebug(ocxlpmem);
+	if (rc)
+		goto out;
+
+	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_FW_DEBUG);
+	if (rc)
+		goto out;
+
+	// Write DebugAction & FunctionCode
+	val = ((u64)args.debug_action << 56) | ((u64)args.function_code << 40);
+
+	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
+				      ocxlpmem->admin_command.request_offset + 0x08,
+				      OCXL_LITTLE_ENDIAN, val);
+	if (rc)
+		goto out;
+
+	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
+				      ocxlpmem->admin_command.request_offset + 0x10,
+				      OCXL_LITTLE_ENDIAN, args.debug_parameter_1);
+	if (rc)
+		goto out;
+
+	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
+				      ocxlpmem->admin_command.request_offset + 0x18,
+				      OCXL_LITTLE_ENDIAN, args.debug_parameter_2);
+	if (rc)
+		goto out;
+
+	for (i = 0x20; i < 0x38; i += 0x08) {
+		rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
+					      ocxlpmem->admin_command.request_offset + i,
+					      OCXL_LITTLE_ENDIAN, 0);
+		if (rc)
+			goto out;
+	}
+
+	// Populate admin command buffer
+	if (args.buf_size) {
+		for (i = 0; i < args.buf_size; i += sizeof(u64)) {
+			u64 val;
+
+			rc = copy_from_user(&val, &args.buf[i], sizeof(u64)) ? -EFAULT : 0;
+			if (rc)
+				goto out;
+			rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
+						      ocxlpmem->admin_command.data_offset + i,
+						      OCXL_HOST_ENDIAN, val);
+			if (rc)
+				goto out;
+		}
+	}
+
+	rc = admin_command_execute(ocxlpmem);
+	if (rc)
+		goto out;
+
+	rc = admin_command_complete_timeout(ocxlpmem,
+					    ocxlpmem->timeouts[ADMIN_COMMAND_FW_DEBUG]);
+	if (rc < 0)
+		goto out;
+
+	rc = admin_response(ocxlpmem);
+	if (rc < 0)
+		goto out;
+	if (rc != STATUS_SUCCESS) {
+		warn_status(ocxlpmem, "Unexpected status from FW Debug", rc);
+		goto out;
+	}
+
+	if (args.buf_size) {
+		for (i = 0; i < args.buf_size; i += sizeof(u64)) {
+			u64 val;
+
+			rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+						     ocxlpmem->admin_command.data_offset + i,
+						     OCXL_HOST_ENDIAN, &val);
+			if (rc)
+				goto out;
+
+			if (copy_to_user(&args.buf[i], &val, sizeof(u64))) {
+				rc = -EFAULT;
+				goto out;
+			}
+		}
+	}
+
+	rc = admin_response_handled(ocxlpmem);
+	if (rc)
+		goto out;
+
+	rc = disable_fwdebug(ocxlpmem);
+	if (rc)
+		goto out;
+
+out:
+	mutex_unlock(&ocxlpmem->admin_command.lock);
+	return rc;
+}
+
+static int ioctl_shutdown(struct ocxlpmem *ocxlpmem)
+{
+	int rc;
+
+	mutex_lock(&ocxlpmem->admin_command.lock);
+
+	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_SHUTDOWN);
+	if (rc)
+		goto out;
+
+	rc = admin_command_execute(ocxlpmem);
+	if (rc)
+		goto out;
+
+	rc = admin_command_complete_timeout(ocxlpmem, ADMIN_COMMAND_SHUTDOWN);
+	if (rc < 0) {
+		dev_warn(&ocxlpmem->dev, "Shutdown timed out\n");
+		goto out;
+	}
+
+	rc = 0;
+	goto out;
+
+out:
+	mutex_unlock(&ocxlpmem->admin_command.lock);
+	return rc;
+}
+
+static int ioctl_mmio_write(struct ocxlpmem *ocxlpmem,
+				struct ioctl_ocxl_pmem_mmio __user *uarg)
+{
+	struct ioctl_ocxl_pmem_mmio args;
+
+	if (copy_from_user(&args, uarg, sizeof(args)))
+		return -EFAULT;
+
+	return ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, args.address,
+					OCXL_LITTLE_ENDIAN, args.val);
+}
+
+static int ioctl_mmio_read(struct ocxlpmem *ocxlpmem,
+				     struct ioctl_ocxl_pmem_mmio __user *uarg)
+{
+	struct ioctl_ocxl_pmem_mmio args;
+	int rc;
+
+	if (copy_from_user(&args, uarg, sizeof(args)))
+		return -EFAULT;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, args.address,
+				     OCXL_LITTLE_ENDIAN, &args.val);
+	if (rc)
+		return rc;
+
+	if (copy_to_user(uarg, &args, sizeof(args)))
+		return -EFAULT;
+
+	return 0;
+}
+#else /* CONFIG_OCXL_PMEM_DEBUG */
+static int ioctl_fwdebug(struct ocxlpmem *ocxlpmem,
+			     struct ioctl_ocxl_pmem_fwdebug __user *uarg)
+{
+	return -EPERM;
+}
+
+static int ioctl_shutdown(struct ocxlpmem *ocxlpmem)
+{
+	return -EPERM;
+}
+
+static int ioctl_mmio_write(struct ocxlpmem *ocxlpmem,
+				struct ioctl_ocxl_pmem_mmio __user *uarg)
+{
+	return -EPERM;
+}
+
+static int ioctl_mmio_read(struct ocxlpmem *ocxlpmem,
+			       struct ioctl_ocxl_pmem_mmio __user *uarg)
+{
+	return -EPERM;
+}
+#endif /* CONFIG_OCXL_PMEM_DEBUG */
+
 static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
 {
 	struct ocxlpmem *ocxlpmem = file->private_data;
@@ -1091,6 +1320,26 @@ static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
 	case IOCTL_OCXL_PMEM_REQUEST_HEALTH:
 		rc = req_controller_health_perf(ocxlpmem);
 		break;
+
+	case IOCTL_OCXL_PMEM_FWDEBUG:
+		rc = ioctl_fwdebug(ocxlpmem,
+				   (struct ioctl_ocxl_pmem_fwdebug __user *)args);
+		break;
+
+	case IOCTL_OCXL_PMEM_SHUTDOWN:
+		rc = ioctl_shutdown(ocxlpmem);
+		break;
+
+	case IOCTL_OCXL_PMEM_MMIO_WRITE:
+		rc = ioctl_mmio_write(ocxlpmem,
+				      (struct ioctl_ocxl_pmem_mmio __user *)args);
+		break;
+
+	case IOCTL_OCXL_PMEM_MMIO_READ:
+		rc = ioctl_mmio_read(ocxlpmem,
+				     (struct ioctl_ocxl_pmem_mmio __user *)args);
+		break;
+
 	}
 
 	return rc;
diff --git a/include/uapi/nvdimm/ocxl-pmem.h b/include/uapi/nvdimm/ocxl-pmem.h
index 0d03abb44001..e20a4f8be82a 100644
--- a/include/uapi/nvdimm/ocxl-pmem.h
+++ b/include/uapi/nvdimm/ocxl-pmem.h
@@ -6,6 +6,28 @@
 #include <linux/types.h>
 #include <linux/ioctl.h>
 
+enum ocxlpmem_fwdebug_action {
+	OCXL_PMEM_FWDEBUG_READ_CONTROLLER_MEMORY = 0x01,
+	OCXL_PMEM_FWDEBUG_WRITE_CONTROLLER_MEMORY = 0x02,
+	OCXL_PMEM_FWDEBUG_ENABLE_FUNCTION = 0x03,
+	OCXL_PMEM_FWDEBUG_DISABLE_FUNCTION = 0x04,
+	OCXL_PMEM_FWDEBUG_GET_PEL = 0x05, // Retrieve Persistent Error Log
+};
+
+struct ioctl_ocxl_pmem_buffer_info {
+	__u32	admin_command_buffer_size; // out
+	__u32	near_storage_buffer_size; // out
+};
+
+struct ioctl_ocxl_pmem_fwdebug { // All args are inputs
+	enum ocxlpmem_fwdebug_action debug_action;
+	__u16 function_code;
+	__u16 buf_size; // Size of optional data buffer
+	__u64 debug_parameter_1;
+	__u64 debug_parameter_2;
+	__u8 *buf; // Pointer to optional in/out data buffer
+};
+
 #define OCXL_PMEM_ERROR_LOG_ACTION_RESET	(1 << (32-32))
 #define OCXL_PMEM_ERROR_LOG_ACTION_CHKFW	(1 << (53-32))
 #define OCXL_PMEM_ERROR_LOG_ACTION_REPLACE	(1 << (54-32))
@@ -66,6 +88,11 @@ struct ioctl_ocxl_pmem_controller_stats {
 	__u64 cache_write_latency; /* nanoseconds */
 };
 
+struct ioctl_ocxl_pmem_mmio {
+	__u64 address; /* Offset in global MMIO space */
+	__u64 val; /* value to write/was read */
+};
+
 struct ioctl_ocxl_pmem_eventfd {
 	__s32 eventfd;
 	__u32 reserved;
@@ -92,4 +119,9 @@ struct ioctl_ocxl_pmem_eventfd {
 #define IOCTL_OCXL_PMEM_EVENT_CHECK			_IOR(OCXL_PMEM_MAGIC, 0x07, __u64)
 #define IOCTL_OCXL_PMEM_REQUEST_HEALTH			_IO(OCXL_PMEM_MAGIC, 0x08)
 
+#define IOCTL_OCXL_PMEM_FWDEBUG		_IOWR(OCXL_PMEM_MAGIC, 0xf0, struct ioctl_ocxl_pmem_fwdebug)
+#define IOCTL_OCXL_PMEM_MMIO_WRITE	_IOW(OCXL_PMEM_MAGIC, 0xf1, struct ioctl_ocxl_pmem_mmio)
+#define IOCTL_OCXL_PMEM_MMIO_READ	_IOWR(OCXL_PMEM_MAGIC, 0xf2, struct ioctl_ocxl_pmem_mmio)
+#define IOCTL_OCXL_PMEM_SHUTDOWN	_IO(OCXL_PMEM_MAGIC, 0xf3)
+
 #endif /* _UAPI_OCXL_SCM_H */
-- 
2.24.1

* [PATCH v3 24/27] powerpc/powernv/pmem: Expose SMART data via ndctl
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (22 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 23/27] powerpc/powernv/pmem: Add debug IOCTLs Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-03-04 15:40   ` Frederic Barrat
  2020-03-05  3:36   ` Andrew Donnellan
  2020-02-21  3:27 ` [PATCH v3 25/27] powerpc/powernv/pmem: Expose the serial number in sysfs Alastair D'Silva
                   ` (3 subsequent siblings)
  27 siblings, 2 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

This patch retrieves proprietary formatted SMART data and makes it
available via ndctl. A later contribution will be made to ndctl to
parse this data.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
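As an illustration of the header arithmetic only: the sketch below decodes
the 64-bit SMART response header the way smart_header_parse() does
(identifier in the top 16 bits, payload length in the low 32 bits) and
derives the attribute count as ndctl_smart() does. The function names are
hypothetical, and -1 stands in for the driver's -EINVAL.

```c
#include <assert.h>
#include <stdint.h>

#define SMART_IDENTIFIER 0x534D /* ASCII 'S', 'M' */

/*
 * Decode the first 64 bits of the SMART response. Returns 0 and stores
 * the payload length in bytes (excluding the header), or -1 when the
 * identifier does not match. Hypothetical helper.
 */
int smart_header_decode(uint64_t header, uint32_t *length)
{
	uint16_t data_identifier = header >> 48;

	if (data_identifier != SMART_IDENTIFIER)
		return -1;

	*length = header & 0xFFFFFFFF;
	return 0;
}

/*
 * Each SMART attribute is 2 * 64 bits. The length reported by firmware
 * is clamped to the caller's output size before counting attributes.
 */
uint32_t smart_attr_count(uint32_t fw_length, uint32_t out_size)
{
	uint32_t length = fw_length < out_size ? fw_length : out_size;

	return length / (2 * sizeof(uint64_t));
}
```

Clamping before dividing means a short output buffer yields fewer whole
attributes rather than a partial one.
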
 arch/powerpc/platforms/powernv/pmem/ocxl.c    | 128 ++++++++++++++++++
 .../platforms/powernv/pmem/ocxl_internal.h    |  18 +++
 include/uapi/linux/ndctl.h                    |   1 +
 3 files changed, 147 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
index d4ce5e9e0521..5cd1b6d78dd6 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
@@ -81,6 +81,129 @@ static int ndctl_config_size(struct nd_cmd_get_config_size *command)
 	return 0;
 }
 
+/**
+ * smart_header_parse() - Parse the first 64 bits of the SMART admin command response
+ * @ocxlpmem: the device metadata
+ * @length: out, returns the number of bytes in the response (excluding the 64 bit header)
+ */
+static int smart_header_parse(struct ocxlpmem *ocxlpmem, u32 *length)
+{
+	int rc;
+	u64 val;
+
+	u16 data_identifier;
+	u32 data_length;
+
+	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+				     ocxlpmem->admin_command.data_offset,
+				     OCXL_LITTLE_ENDIAN, &val);
+	if (rc)
+		return rc;
+
+	data_identifier = val >> 48;
+	data_length = val & 0xFFFFFFFF;
+
+	if (data_identifier != 0x534D) { // 'SM'
+		dev_err(&ocxlpmem->dev,
+			"Bad data identifier for smart data, expected 'SM', got '%-.*s'\n",
+			2, (char *)&data_identifier);
+		return -EINVAL;
+	}
+
+	*length = data_length;
+	return 0;
+}
+
+static int ndctl_smart(struct ocxlpmem *ocxlpmem, struct nd_cmd_pkg *pkg)
+{
+	u32 length, i;
+	struct nd_ocxl_smart *out;
+	int rc;
+
+	mutex_lock(&ocxlpmem->admin_command.lock);
+
+	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_SMART);
+	if (rc)
+		goto out;
+
+	rc = admin_command_execute(ocxlpmem);
+	if (rc)
+		goto out;
+
+	rc = admin_command_complete_timeout(ocxlpmem, ADMIN_COMMAND_SMART);
+	if (rc < 0) {
+		dev_err(&ocxlpmem->dev, "SMART timeout\n");
+		goto out;
+	}
+
+	rc = admin_response(ocxlpmem);
+	if (rc < 0)
+		goto out;
+	if (rc != STATUS_SUCCESS) {
+		warn_status(ocxlpmem, "Unexpected status from SMART", rc);
+		goto out;
+	}
+
+	rc = smart_header_parse(ocxlpmem, &length);
+	if (rc)
+		goto out;
+
+	pkg->nd_fw_size = length;
+
+	length = min(length, pkg->nd_size_out); // bytes
+	out = (struct nd_ocxl_smart *)pkg->nd_payload;
+	// Each SMART attribute is 2 * 64 bits
+	out->count = length / (2 * sizeof(u64)); // attributes
+
+	for (i = 0; i < length; i += sizeof(u64)) {
+		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
+					     ocxlpmem->admin_command.data_offset + sizeof(u64) + i,
+					     OCXL_LITTLE_ENDIAN,
+					     &out->attribs[i/sizeof(u64)]);
+		if (rc)
+			goto out;
+	}
+
+	rc = admin_response_handled(ocxlpmem);
+	if (rc)
+		goto out;
+
+	rc = 0;
+	goto out;
+
+out:
+	mutex_unlock(&ocxlpmem->admin_command.lock);
+	return rc;
+}
+
+static int ndctl_call(struct ocxlpmem *ocxlpmem, void *buf, unsigned int buf_len)
+{
+	struct nd_cmd_pkg *pkg = buf;
+
+	if (buf_len < sizeof(struct nd_cmd_pkg)) {
+		dev_err(&ocxlpmem->dev, "Invalid ND_CALL size=%u\n", buf_len);
+		return -EINVAL;
+	}
+
+	if (pkg->nd_family != NVDIMM_FAMILY_OCXL) {
+		dev_err(&ocxlpmem->dev, "Invalid ND_CALL family=0x%llx\n", pkg->nd_family);
+		return -EINVAL;
+	}
+
+	switch (pkg->nd_command) {
+	case ND_CMD_OCXL_SMART:
+		ndctl_smart(ocxlpmem, pkg);
+		break;
+
+	default:
+		dev_err(&ocxlpmem->dev, "Invalid ND_CALL command=0x%llx\n", pkg->nd_command);
+		return -EINVAL;
+	}
+
+
+	return 0;
+}
+
 static int ndctl(struct nvdimm_bus_descriptor *nd_desc,
 		 struct nvdimm *nvdimm,
 		 unsigned int cmd, void *buf, unsigned int buf_len, int *cmd_rc)
@@ -88,6 +211,10 @@ static int ndctl(struct nvdimm_bus_descriptor *nd_desc,
 	struct ocxlpmem *ocxlpmem = container_of(nd_desc, struct ocxlpmem, bus_desc);
 
 	switch (cmd) {
+	case ND_CMD_CALL:
+		*cmd_rc = ndctl_call(ocxlpmem, buf, buf_len);
+		return 0;
+
 	case ND_CMD_GET_CONFIG_SIZE:
 		*cmd_rc = ndctl_config_size(buf);
 		return 0;
@@ -171,6 +298,7 @@ static int register_lpc_mem(struct ocxlpmem *ocxlpmem)
 	set_bit(ND_CMD_GET_CONFIG_SIZE, &nvdimm_cmd_mask);
 	set_bit(ND_CMD_GET_CONFIG_DATA, &nvdimm_cmd_mask);
 	set_bit(ND_CMD_SET_CONFIG_DATA, &nvdimm_cmd_mask);
+	set_bit(ND_CMD_CALL, &nvdimm_cmd_mask);
 
 	set_bit(NDD_ALIASING, &nvdimm_flags);
 
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
index 927690f4888f..0eb7a35d24ae 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
@@ -7,6 +7,7 @@
 #include <linux/libnvdimm.h>
 #include <uapi/nvdimm/ocxl-pmem.h>
 #include <linux/mm.h>
+#include <linux/ndctl.h>
 
 #define LABEL_AREA_SIZE	(1UL << PA_SECTION_SHIFT)
 #define DEFAULT_TIMEOUT 100
@@ -98,6 +99,23 @@ struct ocxlpmem_function0 {
 	struct ocxl_fn *ocxl_fn;
 };
 
+struct nd_ocxl_smart {
+	__u8 count;
+	__u8 reserved[7];
+	__u64 attribs[0];
+} __packed;
+
+struct nd_pkg_ocxl {
+	struct nd_cmd_pkg gen;
+	union {
+		struct nd_ocxl_smart smart;
+	};
+};
+
+enum nd_cmd_ocxl {
+	ND_CMD_OCXL_SMART = 1,
+};
+
 struct ocxlpmem {
 	struct device dev;
 	struct pci_dev *pdev;
diff --git a/include/uapi/linux/ndctl.h b/include/uapi/linux/ndctl.h
index de5d90212409..2885052e7f40 100644
--- a/include/uapi/linux/ndctl.h
+++ b/include/uapi/linux/ndctl.h
@@ -244,6 +244,7 @@ struct nd_cmd_pkg {
 #define NVDIMM_FAMILY_HPE2 2
 #define NVDIMM_FAMILY_MSFT 3
 #define NVDIMM_FAMILY_HYPERV 4
+#define NVDIMM_FAMILY_OCXL 6
 
 #define ND_IOCTL_CALL			_IOWR(ND_IOCTL, ND_CMD_CALL,\
 					struct nd_cmd_pkg)
-- 
2.24.1

* [PATCH v3 25/27] powerpc/powernv/pmem: Expose the serial number in sysfs
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (23 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 24/27] powerpc/powernv/pmem: Expose SMART data via ndctl Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-02-28  6:25   ` Andrew Donnellan
  2020-02-21  3:27 ` [PATCH v3 26/27] powerpc/powernv/pmem: Expose the firmware version " Alastair D'Silva
                   ` (2 subsequent siblings)
  27 siblings, 1 reply; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

This information will be used by ndctl in userspace to help users identify
the device.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
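The unwind in ocxlpmem_sysfs_add() is an all-or-nothing pattern: if
creating attribute i fails, attributes i-1 down to 0 are removed before
the error is returned. A minimal userspace sketch of that pattern follows;
fake_create(), fake_remove(), created[] and fail_at are stand-ins invented
for illustration, not kernel APIs.

```c
#include <assert.h>

#define NUM_ATTRS 3

int created[NUM_ATTRS];	/* 1 if the stand-in attribute currently exists */
int fail_at = -1;	/* index at which fake_create() fails, -1 for never */

/* Stand-in for device_create_file(): fails at index fail_at */
int fake_create(int i)
{
	if (i == fail_at)
		return -5;

	created[i] = 1;
	return 0;
}

/* Stand-in for device_remove_file() */
void fake_remove(int i)
{
	created[i] = 0;
}

/* Create every attribute, or roll back the ones already created */
int add_all_or_nothing(void)
{
	int i, rc;

	for (i = 0; i < NUM_ATTRS; i++) {
		rc = fake_create(i);
		if (rc) {
			while (--i >= 0)
				fake_remove(i);

			return rc;
		}
	}

	return 0;
}
```

The pre-decrement in the rollback loop skips the attribute that failed,
since it was never created.
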
 arch/powerpc/platforms/powernv/pmem/Makefile  |  2 +-
 arch/powerpc/platforms/powernv/pmem/ocxl.c    |  5 +++
 .../platforms/powernv/pmem/ocxl_internal.h    |  6 +++
 .../platforms/powernv/pmem/ocxl_sysfs.c       | 37 +++++++++++++++++++
 4 files changed, 49 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/platforms/powernv/pmem/ocxl_sysfs.c

diff --git a/arch/powerpc/platforms/powernv/pmem/Makefile b/arch/powerpc/platforms/powernv/pmem/Makefile
index 4ceda25907d4..d02870806f30 100644
--- a/arch/powerpc/platforms/powernv/pmem/Makefile
+++ b/arch/powerpc/platforms/powernv/pmem/Makefile
@@ -4,4 +4,4 @@ ccflags-$(CONFIG_PPC_WERROR)	+= -Werror
 
 obj-$(CONFIG_OCXL_PMEM) += ocxlpmem.o
 
-ocxlpmem-y := ocxl.o ocxl_internal.o
+ocxlpmem-y := ocxl.o ocxl_internal.o ocxl_sysfs.o
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
index 5cd1b6d78dd6..ec73713d05ad 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
@@ -1878,6 +1878,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err;
 	}
 
+	if (ocxlpmem_sysfs_add(ocxlpmem)) {
+		dev_err(&pdev->dev, "Could not create sysfs entries\n");
+		goto err;
+	}
+
 	elapsed = 0;
 	timeout = ocxlpmem->readiness_timeout + ocxlpmem->memory_available_timeout;
 	while (!is_usable(ocxlpmem, false)) {
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
index 0eb7a35d24ae..12304ceace61 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
@@ -246,3 +246,9 @@ int ns_response_handled(const struct ocxlpmem *ocxlpmem);
  */
 void warn_status(const struct ocxlpmem *ocxlpmem, const char *message,
 		 u8 status);
+
+/**
+ * ocxlpmem_sysfs_add() - Create sysfs entries for an OpenCAPI persistent memory device
+ * @ocxlpmem: the device metadata
+ */
+int ocxlpmem_sysfs_add(struct ocxlpmem *ocxlpmem);
diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_sysfs.c b/arch/powerpc/platforms/powernv/pmem/ocxl_sysfs.c
new file mode 100644
index 000000000000..7829e4bc887d
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl_sysfs.c
@@ -0,0 +1,37 @@
+// SPDX-License-Identifier: GPL-2.0+
+// Copyright 2018 IBM Corp.
+
+#include <linux/sysfs.h>
+#include <linux/capability.h>
+#include <linux/limits.h>
+#include <linux/firmware.h>
+#include "ocxl_internal.h"
+
+static ssize_t serial_show(struct device *device, struct device_attribute *attr,
+			   char *buf)
+{
+	struct ocxlpmem *ocxlpmem = container_of(device, struct ocxlpmem, dev);
+	const struct ocxl_fn_config *fn_config = ocxl_function_config(ocxlpmem->ocxl_fn);
+
+	return scnprintf(buf, PAGE_SIZE, "%llu\n", fn_config->serial);
+}
+
+static struct device_attribute attrs[] = {
+	__ATTR_RO(serial),
+};
+
+int ocxlpmem_sysfs_add(struct ocxlpmem *ocxlpmem)
+{
+	int i, rc;
+
+	for (i = 0; i < ARRAY_SIZE(attrs); i++) {
+		rc = device_create_file(&ocxlpmem->dev, &attrs[i]);
+		if (rc) {
+			for (; --i >= 0;)
+				device_remove_file(&ocxlpmem->dev, &attrs[i]);
+
+			return rc;
+		}
+	}
+	return 0;
+}
-- 
2.24.1

* [PATCH v3 26/27] powerpc/powernv/pmem: Expose the firmware version in sysfs
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (24 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 25/27] powerpc/powernv/pmem: Expose the serial number in sysfs Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-03-02  7:35   ` Andrew Donnellan
  2020-02-21  3:27 ` [PATCH v3 27/27] MAINTAINERS: Add myself & nvdimm/ocxl to ocxl Alastair D'Silva
  2020-02-21 16:21 ` [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Dan Williams
  27 siblings, 1 reply; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

This information will be used by ndctl in userspace to help users identify
the device.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 arch/powerpc/platforms/powernv/pmem/ocxl_sysfs.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_sysfs.c b/arch/powerpc/platforms/powernv/pmem/ocxl_sysfs.c
index 7829e4bc887d..84b23cc3e8b7 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl_sysfs.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl_sysfs.c
@@ -16,8 +16,17 @@ static ssize_t serial_show(struct device *device, struct device_attribute *attr,
 	return scnprintf(buf, PAGE_SIZE, "%llu\n", fn_config->serial);
 }
 
+static ssize_t fw_version_show(struct device *device,
+			       struct device_attribute *attr, char *buf)
+{
+	struct ocxlpmem *ocxlpmem = container_of(device, struct ocxlpmem, dev);
+
+	return scnprintf(buf, PAGE_SIZE, "%s\n", ocxlpmem->fw_version);
+}
+
 static struct device_attribute attrs[] = {
 	__ATTR_RO(serial),
+	__ATTR_RO(fw_version),
 };
 
 int ocxlpmem_sysfs_add(struct ocxlpmem *ocxlpmem)
-- 
2.24.1

* [PATCH v3 27/27] MAINTAINERS: Add myself & nvdimm/ocxl to ocxl
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (25 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 26/27] powerpc/powernv/pmem: Expose the firmware version " Alastair D'Silva
@ 2020-02-21  3:27 ` Alastair D'Silva
  2020-02-21  5:35   ` Andrew Donnellan
  2020-02-21 16:21 ` [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Dan Williams
  27 siblings, 1 reply; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-21  3:27 UTC (permalink / raw)
  To: alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy

From: Alastair D'Silva <alastair@d-silva.org>

The OpenCAPI Persistent Memory driver will be maintained as part of
the ppc tree.

I'm also adding myself as an author of the driver & contributor to
the generic ocxl driver.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 MAINTAINERS | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index f8670989ec91..3fb9a9f576a7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12064,13 +12064,16 @@ F:	tools/objtool/
 OCXL (Open Coherent Accelerator Processor Interface OpenCAPI) DRIVER
 M:	Frederic Barrat <fbarrat@linux.ibm.com>
 M:	Andrew Donnellan <ajd@linux.ibm.com>
+M:	Alastair D'Silva <alastair@d-silva.org>
 L:	linuxppc-dev@lists.ozlabs.org
 S:	Supported
 F:	arch/powerpc/platforms/powernv/ocxl.c
+F:	arch/powerpc/platforms/powernv/pmem/*
 F:	arch/powerpc/include/asm/pnv-ocxl.h
 F:	drivers/misc/ocxl/
 F:	include/misc/ocxl*
 F:	include/uapi/misc/ocxl.h
+F:	include/uapi/nvdimm/ocxl-pmem.h
 F:	Documentation/userspace-api/accelerators/ocxl.rst
 
 OMAP AUDIO SUPPORT
-- 
2.24.1

* Re: [PATCH v3 27/27] MAINTAINERS: Add myself & nvdimm/ocxl to ocxl
  2020-02-21  3:27 ` [PATCH v3 27/27] MAINTAINERS: Add myself & nvdimm/ocxl to ocxl Alastair D'Silva
@ 2020-02-21  5:35   ` Andrew Donnellan
  0 siblings, 0 replies; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-21  5:35 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> The OpenCAPI Persistent Memory driver will be maintained as part of
> the ppc tree.
> 
> I'm also adding myself as an author of the driver & contributor to
> the generic ocxl driver.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>

You need to update the title of this patch :)

> ---
>   MAINTAINERS | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index f8670989ec91..3fb9a9f576a7 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -12064,13 +12064,16 @@ F:	tools/objtool/
>   OCXL (Open Coherent Accelerator Processor Interface OpenCAPI) DRIVER
>   M:	Frederic Barrat <fbarrat@linux.ibm.com>
>   M:	Andrew Donnellan <ajd@linux.ibm.com>
> +M:	Alastair D'Silva <alastair@d-silva.org>
>   L:	linuxppc-dev@lists.ozlabs.org
>   S:	Supported
>   F:	arch/powerpc/platforms/powernv/ocxl.c
> +F:	arch/powerpc/platforms/powernv/pmem/*
>   F:	arch/powerpc/include/asm/pnv-ocxl.h
>   F:	drivers/misc/ocxl/
>   F:	include/misc/ocxl*
>   F:	include/uapi/misc/ocxl.h
> +F:	include/uapi/nvdimm/ocxl-pmem.h
>   F:	Documentation/userspace-api/accelerators/ocxl.rst

Should this be part of the ocxl entry or a separate entry? I guess I 
don't care too much either way.

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

* Re: [PATCH v3 04/27] ocxl: Remove unnecessary externs
  2020-02-21  3:26 ` [PATCH v3 04/27] ocxl: Remove unnecessary externs Alastair D'Silva
@ 2020-02-21  6:06   ` Andrew Donnellan
  2020-02-25 13:23   ` Frederic Barrat
  2020-02-26  8:14   ` Baoquan He
  2 siblings, 0 replies; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-21  6:06 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:26 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> Function declarations don't need externs, remove the existing ones
> so they are consistent with newer code
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>

Acked-by: Andrew Donnellan <ajd@linux.ibm.com>


-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

* Re: [PATCH v3 02/27] mm/memory_hotplug: Allow check_hotplug_memory_addressable to be called from drivers
  2020-02-21  3:26 ` [PATCH v3 02/27] mm/memory_hotplug: Allow check_hotplug_memory_addressable to be called from drivers Alastair D'Silva
@ 2020-02-21  7:03   ` Andrew Donnellan
  0 siblings, 0 replies; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-21  7:03 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:26 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> When setting up OpenCAPI connected persistent memory, the range check may
> not be performed until quite late (or perhaps not at all, if the user does
> not establish a DAX device).
> 
> This patch makes the range check callable so we can perform the check while
> probing the OpenCAPI SCM device.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>

Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>


-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

* Re: [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices
  2020-02-21  3:26 [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Alastair D'Silva
                   ` (26 preceding siblings ...)
  2020-02-21  3:27 ` [PATCH v3 27/27] MAINTAINERS: Add myself & nvdimm/ocxl to ocxl Alastair D'Silva
@ 2020-02-21 16:21 ` Dan Williams
  2020-02-21 16:24   ` Dan Williams
  2020-02-24  4:34   ` Alastair D'Silva
  27 siblings, 2 replies; 130+ messages in thread
From: Dan Williams @ 2020-02-21 16:21 UTC (permalink / raw)
  To: Alastair D'Silva
  Cc: alastair, Aneesh Kumar K . V, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Frederic Barrat,
	Andrew Donnellan, Arnd Bergmann, Greg Kroah-Hartman,
	Andrew Morton, Mauro Carvalho Chehab, David S. Miller,
	Rob Herring, Anton Blanchard, Krzysztof Kozlowski,
	Mahesh Salgaonkar, Madhavan Srinivasan, Cédric Le Goater,
	Anju T Sudhakar, Hari Bathini, Thomas Gleixner, Greg Kurz,
	Nicholas Piggin, Masahiro Yamada, Alexey Kardashevskiy,
	Linux Kernel Mailing List, linuxppc-dev, linux-nvdimm, Linux MM

On Thu, Feb 20, 2020 at 7:28 PM Alastair D'Silva <alastair@au1.ibm.com> wrote:
>
> From: Alastair D'Silva <alastair@d-silva.org>
>
> This series adds support for OpenCAPI Persistent Memory devices, exposing
> them as nvdimms so that we can make use of the existing infrastructure.

A single sentence to introduce:

24 files changed, 3029 insertions(+), 97 deletions(-)

...is inadequate. What are OpenCAPI Persistent Memory devices? How do
they compare, in terms relevant to libnvdimm, to other persistent
memory devices? What challenges do they pose to the existing enabling?
What is the overall approach taken with this 27 patch break down? What
are the changes since v2, v1? If you incorporated someone's review
feedback note it in the cover letter changelog, if you didn't
incorporate someone's feedback note that too with an explanation.

In short, provide a bridge document for someone familiar with the
upstream infrastructure, but not necessarily steeped in powernv /
OpenCAPI platform details, to get started with this code.

For now, no need to resend the whole series, just reply to this
message with a fleshed out cover letter and then incorporate it going
forward for v4+.

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices
  2020-02-21 16:21 ` [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Dan Williams
@ 2020-02-21 16:24   ` Dan Williams
  2020-02-24  4:34   ` Alastair D'Silva
  1 sibling, 0 replies; 130+ messages in thread
From: Dan Williams @ 2020-02-21 16:24 UTC (permalink / raw)
  To: Alastair D'Silva
  Cc: alastair, Aneesh Kumar K . V, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Frederic Barrat,
	Andrew Donnellan, Arnd Bergmann, Greg Kroah-Hartman,
	Andrew Morton, Mauro Carvalho Chehab, David S. Miller,
	Rob Herring, Anton Blanchard, Krzysztof Kozlowski,
	Mahesh Salgaonkar, Madhavan Srinivasan, Cédric Le Goater,
	Anju T Sudhakar, Hari Bathini, Thomas Gleixner, Greg Kurz,
	Nicholas Piggin, Masahiro Yamada, Alexey Kardashevskiy,
	Linux Kernel Mailing List, linuxppc-dev, linux-nvdimm, Linux MM

On Fri, Feb 21, 2020 at 8:21 AM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Thu, Feb 20, 2020 at 7:28 PM Alastair D'Silva <alastair@au1.ibm.com> wrote:
> >
> > From: Alastair D'Silva <alastair@d-silva.org>
> >
> > This series adds support for OpenCAPI Persistent Memory devices, exposing
> > them as nvdimms so that we can make use of the existing infrastructure.
>
> A single sentence to introduce:
>
> 24 files changed, 3029 insertions(+), 97 deletions(-)
>
> ...is inadequate. What are OpenCAPI Persistent Memory devices? How do
> they compare, in terms relevant to libnvdimm, to other persistent
> memory devices? What challenges do they pose to the existing enabling?
> What is the overall approach taken with this 27 patch break down? What
> are the changes since v2, v1? If you incorporated someone's review
> feedback note it in the cover letter changelog, if you didn't

Assumptions and tradeoffs the implementation considered are also
critical for reviewing the approach.

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 05/27] ocxl: Address kernel doc errors & warnings
  2020-02-21  3:26 ` [PATCH v3 05/27] ocxl: Address kernel doc errors & warnings Alastair D'Silva
@ 2020-02-24  2:11   ` Andrew Donnellan
  0 siblings, 0 replies; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-24  2:11 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:26 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> This patch addresses warnings and errors from the kernel doc scripts for
> the OpenCAPI driver.
> 
> It also makes minor tweaks to make the docs more consistent.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>

Looks good, fixes all the kerneldoc warnings I get.

Acked-by: Andrew Donnellan <ajd@linux.ibm.com>


-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 03/27] powerpc: Map & release OpenCAPI LPC memory
  2020-02-21  3:26 ` [PATCH v3 03/27] powerpc: Map & release OpenCAPI LPC memory Alastair D'Silva
@ 2020-02-24  2:51   ` Andrew Donnellan
  2020-02-24  5:49     ` Andrew Donnellan
  2020-02-25 10:02   ` Frederic Barrat
  2020-03-03  6:10   ` Andrew Donnellan
  2 siblings, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-24  2:51 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:26 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> This patch adds platform support to map & release LPC memory.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>

Nothing seems obviously wrong here.

Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>


-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices
  2020-02-21 16:21 ` [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices Dan Williams
  2020-02-21 16:24   ` Dan Williams
@ 2020-02-24  4:34   ` Alastair D'Silva
  2020-02-24  4:37     ` Matthew Wilcox
  1 sibling, 1 reply; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-24  4:34 UTC (permalink / raw)
  To: Dan Williams
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy, Linux Kernel Mailing List,
	linuxppc-dev, linux-nvdimm, Linux MM

On Fri, 2020-02-21 at 08:21 -0800, Dan Williams wrote:
> On Thu, Feb 20, 2020 at 7:28 PM Alastair D'Silva <
> alastair@au1.ibm.com> wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > This series adds support for OpenCAPI Persistent Memory devices,
> > exposing
> > them as nvdimms so that we can make use of the existing
> > infrastructure.
> 
> A single sentence to introduce:
> 
> 24 files changed, 3029 insertions(+), 97 deletions(-)
> 
> ...is inadequate. What are OpenCAPI Persistent Memory devices? How do
> they compare, in terms relevant to libnvdimm, to other persistent
> memory devices? What challenges do they pose to the existing
> enabling?
> What is the overall approach taken with this 27 patch break down?
> What
> are the changes since v2, v1? If you incorporated someone's review
> feedback note it in the cover letter changelog, if you didn't
> incorporate someone's feedback note that too with an explanation.
> 
> In short, provide a bridge document for someone familiar with the
> upstream infrastructure, but not necessarily steeped in powernv /
> OpenCAPI platform details, to get started with this code.
> 
> For now, no need to resend the whole series, just reply to this
> message with a fleshed out cover letter and then incorporate it going
> forward for v4+.


Apologies, I was maintaining a changelog, and forgot to include it.
I'll flesh out the cover letter too:

This series adds support for OpenCAPI Persistent Memory devices on bare
metal (arch/powernv), exposing them as nvdimms so that we can make use
of the existing infrastructure. There already exists a driver for the
same devices abstracted through PowerVM (arch/pseries):
arch/powerpc/platforms/pseries/papr_scm.c

These devices are connected via OpenCAPI, and present as LPC (lowest
point of coherency) memory to the system. Practically, that means that
memory on these cards could be treated as conventional, cache-coherent
memory.

Since the devices are connected via OpenCAPI, they are not enumerated
via ACPI. Instead, OpenCAPI links present as pseudo-PCI bridges, with
devices below them.

This series introduces a driver that exposes the memory on these cards
as nvdimms, with each card getting its own bus. This is somewhat
complicated by the fact that the cards do not have out of band
persistent storage for metadata, so 1 SECTION_SIZE's (see SPARSEMEM)
worth of storage is carved out of the top of the card storage to
implement the ndctl_config_* calls.

The driver is not responsible for configuring the NPU (NVLink
Processing Unit) BARs to map the LPC memory from the card into the
system's physical address space; instead, it requests this to be done
via OPAL calls (typically implemented by Skiboot).

The series is structured as follows:
 - Required infrastructure changes & cleanup
 - A minimal driver implementation
 - Implementing additional features within the driver

V3:
  - Rebase against next/next-20200220
  - Move driver to arch/powerpc/platforms/powernv, we now expect this
    driver to go upstream via the powerpc tree
  - "nvdimm/ocxl: Implement the Read Error Log command"
	- Fix bad header path
  - "nvdimm/ocxl: Read the capability registers & wait for device ready"
	- Fix overlapping masks between readiness_timeout &
	  memory_available_timeout
  - "nvdimm: Add driver for OpenCAPI Storage Class Memory"
	- Address minor review comments from Jonathan Cameron
	- Remove attributes
	- Default to module if building LIBNVDIMM
	- Propagate errors up from called functions in probe()
  - "nvdimm/ocxl: Expose SMART data via ndctl"
	- Pack attributes in struct
	- Support different size SMART buffers for compatibility with newer
	  ndctls that may want more SMART attribs than we provide
	- Rework to use ND_CMD_CALL instead of ND_CMD_SMART
  - drop "ocxl: Free detached contexts in ocxl_context_detach_all()"
  - "powerpc: Map & release OpenCAPI LPC memory"
	- Remove 'extern'
	- Only available with CONFIG_MEMORY_HOTPLUG_SPARSE
  - "ocxl: Tally up the LPC memory on a link & allow it to be mapped"
	- Address minor review comments from Jonathan Cameron
  - "ocxl: Add functions to map/unmap LPC memory"
	- Split detected memory message into a separate patch
	- Address minor review comments from Jonathan Cameron
	- Add a comment explaining why unmap_lpc_mem is in deconfigure_afu
  - "nvdimm/ocxl: Add support for Admin commands"
	- use sizeof(u64) rather than 0x08 when iterating u64s
  - "nvdimm/ocxl: Implement the heartbeat command"
	- Fix typo in blurb
  - Address kernel doc issues
  - Ensure all uapi headers use C89 compatible comments
  - Drop patches for firmware update & overwrite, these will be
    submitted later once patches are available for ndctl
  - Rename SCM to OpenCAPI Persistent Memory

V2:
  - "powerpc: Map & release OpenCAPI LPC memory"
      - Fix #if -> #ifdef
      - use pci_dev_id to get the bdfn
      - use __be64 to hold be data
      - indent check_hotplug_memory_addressable correctly 
      - Remove export of check_hotplug_memory_addressable
  - "ocxl: Conditionally bind SCM devices to the generic OCXL driver"
      - Improve patch description and remove redundant default
  - "nvdimm: Add driver for OpenCAPI Storage Class Memory"
      - Mark a few funcs as static as identified by the 0day bot
      - Add OCXL dependencies to OCXL_SCM
      - Use memcpy_mcsafe in scm_ndctl_config_read
      - Rename scm_foo_offset_0x00 to scm_foo_header_parse & add docs
      - Name DIMM attribs "ocxl" rather than "scm"
      - Split out into base + many feature patches
  - "powerpc: Enable OpenCAPI Storage Class Memory driver on bare metal"
      - Build DEV_DAX & friends as modules
  - "ocxl: Conditionally bind SCM devices to the generic OCXL driver"
      - Patch dropped (easy enough to maintain this out of tree for
        development)
  - "ocxl: Tally up the LPC memory on a link & allow it to be mapped"
      - Add a warning if an unmatched lpc_release is called
  - "ocxl: Add functions to map/unmap LPC memory"
      - Use EXPORT_SYMBOL_GPL

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices
  2020-02-24  4:34   ` Alastair D'Silva
@ 2020-02-24  4:37     ` Matthew Wilcox
  2020-02-24  4:42       ` Alastair D'Silva
  0 siblings, 1 reply; 130+ messages in thread
From: Matthew Wilcox @ 2020-02-24  4:37 UTC (permalink / raw)
  To: Alastair D'Silva
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy, Linux Kernel Mailing List,
	linuxppc-dev, linux-nvdimm, Linux MM

On Mon, Feb 24, 2020 at 03:34:07PM +1100, Alastair D'Silva wrote:
> V3:
>   - Rebase against next/next-20200220
>   - Move driver to arch/powerpc/platforms/powernv, we now expect this
>     driver to go upstream via the powerpc tree

That's rather the opposite direction of normal; mostly drivers live under
drivers/ and not in arch/.  It's easier for drivers to get overlooked
when doing tree-wide changes if they're hiding.

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices
  2020-02-24  4:37     ` Matthew Wilcox
@ 2020-02-24  4:42       ` Alastair D'Silva
  2020-02-24  6:51         ` Oliver O'Halloran
  0 siblings, 1 reply; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-24  4:42 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy, Linux Kernel Mailing List,
	linuxppc-dev, linux-nvdimm, Linux MM

On Sun, 2020-02-23 at 20:37 -0800, Matthew Wilcox wrote:
> On Mon, Feb 24, 2020 at 03:34:07PM +1100, Alastair D'Silva wrote:
> > V3:
> >   - Rebase against next/next-20200220
> >   - Move driver to arch/powerpc/platforms/powernv, we now expect this
> >     driver to go upstream via the powerpc tree
> 
> That's rather the opposite direction of normal; mostly drivers live under
> drivers/ and not in arch/.  It's easier for drivers to get overlooked
> when doing tree-wide changes if they're hiding.

This is true; however, given that it was not all that desirable to have
it under drivers/nvdimm, that its sister driver (for the same hardware)
is also under arch, and that we don't expect this driver to be used on
any platform other than powernv, we think this was the most reasonable
place to put it.

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 06/27] ocxl: Tally up the LPC memory on a link & allow it to be mapped
  2020-02-21  3:26 ` [PATCH v3 06/27] ocxl: Tally up the LPC memory on a link & allow it to be mapped Alastair D'Silva
@ 2020-02-24  5:25   ` Andrew Donnellan
  2020-02-24  5:36     ` Alastair D'Silva
  2020-02-25 16:30   ` Frederic Barrat
  1 sibling, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-24  5:25 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:26 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> Tally up the LPC memory on an OpenCAPI link & allow it to be mapped
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>

This commit message is a bit short and could do with some further 
explanation.

In particular - it's worth explaining why the tracking of available LPC 
memory needs to be done at a link level, because a single OpenCAPI card 
can have multiple PCI functions, each with multiple AFUs which define an 
amount of LPC memory they have, even if the common case is expected to 
be a single function with a single AFU and thus one LPC area per link.

Snowpatch has a few checkpatch issues to report:

https://openpower.xyz/job/snowpatch/job/snowpatch-linux-checkpatch/11800//artifact/linux/checkpatch.log

The code generally looks okay to me.

> diff --git a/drivers/misc/ocxl/ocxl_internal.h b/drivers/misc/ocxl/ocxl_internal.h
> index 198e4e4bc51d..d0c8c4838f42 100644
> --- a/drivers/misc/ocxl/ocxl_internal.h
> +++ b/drivers/misc/ocxl/ocxl_internal.h
> @@ -142,4 +142,37 @@ int ocxl_irq_offset_to_id(struct ocxl_context *ctx, u64 offset);
>   u64 ocxl_irq_id_to_offset(struct ocxl_context *ctx, int irq_id);
>   void ocxl_afu_irq_free_all(struct ocxl_context *ctx);
>   
> +/**
> + * ocxl_link_add_lpc_mem() - Increment the amount of memory required by an OpenCAPI link
> + *
> + * @link_handle: The OpenCAPI link handle
> + * @offset: The offset of the memory to add
> + * @size: The amount of memory to increment by
> + *
> + * Returns 0 on success, negative on overflow
> + */

I think "amount of memory required" isn't the best way to express this.

Might as well explicitly say -EINVAL on overflow.

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 06/27] ocxl: Tally up the LPC memory on a link & allow it to be mapped
  2020-02-24  5:25   ` Andrew Donnellan
@ 2020-02-24  5:36     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-24  5:36 UTC (permalink / raw)
  To: Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Mon, 2020-02-24 at 16:25 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:26 pm, Alastair D'Silva wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > Tally up the LPC memory on an OpenCAPI link & allow it to be mapped
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> 
> This commit message is a bit short and could do with some further 
> explanation.
> 
> In particular - it's worth explaining why the tracking of available LPC
> memory needs to be done at a link level, because a single OpenCAPI card
> can have multiple PCI functions, each with multiple AFUs which define an
> amount of LPC memory they have, even if the common case is expected to
> be a single function with a single AFU and thus one LPC area per link.

Ok

> 
> Snowpatch has a few checkpatch issues to report:
> 
> https://openpower.xyz/job/snowpatch/job/snowpatch-linux-checkpatch/11800//artifact/linux/checkpatch.log
> 

Gah, I could have sworn I ran checkpatch against this :/

> The code generally looks okay to me.
> 
> > diff --git a/drivers/misc/ocxl/ocxl_internal.h b/drivers/misc/ocxl/ocxl_internal.h
> > index 198e4e4bc51d..d0c8c4838f42 100644
> > --- a/drivers/misc/ocxl/ocxl_internal.h
> > +++ b/drivers/misc/ocxl/ocxl_internal.h
> > @@ -142,4 +142,37 @@ int ocxl_irq_offset_to_id(struct ocxl_context *ctx, u64 offset);
> >   u64 ocxl_irq_id_to_offset(struct ocxl_context *ctx, int irq_id);
> >   void ocxl_afu_irq_free_all(struct ocxl_context *ctx);
> >   
> > +/**
> > + * ocxl_link_add_lpc_mem() - Increment the amount of memory required by an OpenCAPI link
> > + *
> > + * @link_handle: The OpenCAPI link handle
> > + * @offset: The offset of the memory to add
> > + * @size: The amount of memory to increment by
> > + *
> > + * Returns 0 on success, negative on overflow
> > + */
> 
> I think "amount of memory required" isn't the best way to express
> this.
> 
> Might as well explicitly say -EINVAL on overflow.
> 

Ok

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 01/27] powerpc: Add OPAL calls for LPC memory alloc/release
  2020-02-21  3:26 ` [PATCH v3 01/27] powerpc: Add OPAL calls for LPC memory alloc/release Alastair D'Silva
@ 2020-02-24  5:49   ` Andrew Donnellan
  2020-02-24  5:50     ` Alastair D'Silva
  0 siblings, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-24  5:49 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:26 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> Add OPAL calls for LPC memory alloc/release
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
> Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>

Summary line should be "powerpc/powernv".


-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 03/27] powerpc: Map & release OpenCAPI LPC memory
  2020-02-24  2:51   ` Andrew Donnellan
@ 2020-02-24  5:49     ` Andrew Donnellan
  0 siblings, 0 replies; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-24  5:49 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 24/2/20 1:51 pm, Andrew Donnellan wrote:
> On 21/2/20 2:26 pm, Alastair D'Silva wrote:
>> From: Alastair D'Silva <alastair@d-silva.org>
>>
>> This patch adds platform support to map & release LPC memory.
>>
>> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> 
> Nothing seems obviously wrong here.
> 
> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>

Oh, commit message nitpick :)

Summary should be powerpc/powernv. Commit message should explain that 
this is for the powernv platform and presents an interface that drivers 
can use to make use of the new OPAL calls.

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 01/27] powerpc: Add OPAL calls for LPC memory alloc/release
  2020-02-24  5:49   ` Andrew Donnellan
@ 2020-02-24  5:50     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-24  5:50 UTC (permalink / raw)
  To: Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Mon, 2020-02-24 at 16:49 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:26 pm, Alastair D'Silva wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > Add OPAL calls for LPC memory alloc/release
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
> > Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
> 
> Summary line should be "powerpc/powernv".
> 
> 

Ok

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 07/27] ocxl: Add functions to map/unmap LPC memory
  2020-02-21  3:27 ` [PATCH v3 07/27] ocxl: Add functions to map/unmap LPC memory Alastair D'Silva
@ 2020-02-24  6:02   ` Andrew Donnellan
  2020-02-24  6:08     ` Alastair D'Silva
  2020-02-25 17:01   ` Frederic Barrat
  1 sibling, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-24  6:02 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> Add functions to map/unmap LPC memory
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   drivers/misc/ocxl/core.c          | 51 +++++++++++++++++++++++++++++++
>   drivers/misc/ocxl/ocxl_internal.h |  3 ++
>   include/misc/ocxl.h               | 21 +++++++++++++
>   3 files changed, 75 insertions(+)
> 
> diff --git a/drivers/misc/ocxl/core.c b/drivers/misc/ocxl/core.c
> index 2531c6cf19a0..75ff14e3882a 100644
> --- a/drivers/misc/ocxl/core.c
> +++ b/drivers/misc/ocxl/core.c
> @@ -210,6 +210,56 @@ static void unmap_mmio_areas(struct ocxl_afu *afu)
>   	release_fn_bar(afu->fn, afu->config.global_mmio_bar);
>   }
>   
> +int ocxl_afu_map_lpc_mem(struct ocxl_afu *afu)
> +{
> +	struct pci_dev *dev = to_pci_dev(afu->fn->dev.parent);
> +
> +	if ((afu->config.lpc_mem_size + afu->config.special_purpose_mem_size) == 0)
> +		return 0;

I'd prefer the comparison here to be:

   afu->config.lpc_mem_size == 0 &&
     afu->config.special_purpose_mem_size == 0

so a reader doesn't have to think about what this means.
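
Beyond readability, the two forms are not strictly equivalent if both
sizes are unsigned 64-bit values, since a sum can wrap to zero. A small
user-space sketch of the difference follows; the helper names are
hypothetical and only model the check, they are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper mirroring the suggested form of the check. */
static bool afu_has_no_lpc_mem(uint64_t lpc_mem_size,
			       uint64_t special_purpose_mem_size)
{
	return lpc_mem_size == 0 && special_purpose_mem_size == 0;
}

/* The form used in the patch, kept here only for comparison.
 * Unsigned addition wraps, so two non-zero sizes can sum to zero. */
static bool afu_sum_is_zero(uint64_t lpc_mem_size,
			    uint64_t special_purpose_mem_size)
{
	return (lpc_mem_size + special_purpose_mem_size) == 0;
}
```

Real hardware is unlikely to report sizes large enough to wrap, so this
is mostly an argument that the explicit form says what it means.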

> +
> +	afu->lpc_base_addr = ocxl_link_lpc_map(afu->fn->link, dev);
> +	if (afu->lpc_base_addr == 0)
> +		return -EINVAL;
> +
> +	if (afu->config.lpc_mem_size > 0) {
> +		afu->lpc_res.start = afu->lpc_base_addr + afu->config.lpc_mem_offset;

Maybe not for this series - hmm, I wonder if we should print a warning 
somewhere (maybe in read_afu_lpc_memory_info()?) if we see the case 
where (lpc_mem_offset > 0 && lpc_mem_size == 0). Likewise for special 
purpose?

> +		afu->lpc_res.end = afu->lpc_res.start + afu->config.lpc_mem_size - 1;
> +	}
> +
> +	if (afu->config.special_purpose_mem_size > 0) {
> +		afu->special_purpose_res.start = afu->lpc_base_addr +
> +						 afu->config.special_purpose_mem_offset;
> +		afu->special_purpose_res.end = afu->special_purpose_res.start +
> +					       afu->config.special_purpose_mem_size - 1;
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(ocxl_afu_map_lpc_mem);
> +
> +struct resource *ocxl_afu_lpc_mem(struct ocxl_afu *afu)
> +{
> +	return &afu->lpc_res;
> +}
> +EXPORT_SYMBOL_GPL(ocxl_afu_lpc_mem);

What's the point of this function? A layer of indirection just in case 
we need it in future?

> +
> +static void unmap_lpc_mem(struct ocxl_afu *afu)
> +{
> +	struct pci_dev *dev = to_pci_dev(afu->fn->dev.parent);
> +
> +	if (afu->lpc_res.start || afu->special_purpose_res.start) {
> +		void *link = afu->fn->link;
> +
> +		// only release the link when the the last consumer calls release
> +		ocxl_link_lpc_release(link, dev);
> +
> +		afu->lpc_res.start = 0;
> +		afu->lpc_res.end = 0;
> +		afu->special_purpose_res.start = 0;
> +		afu->special_purpose_res.end = 0;
> +	}
> +}
> +
>   static int configure_afu(struct ocxl_afu *afu, u8 afu_idx, struct pci_dev *dev)
>   {
>   	int rc;
> @@ -251,6 +301,7 @@ static int configure_afu(struct ocxl_afu *afu, u8 afu_idx, struct pci_dev *dev)
>   
>   static void deconfigure_afu(struct ocxl_afu *afu)
>   {
> +	unmap_lpc_mem(afu);
>   	unmap_mmio_areas(afu);
>   	reclaim_afu_pasid(afu);
>   	reclaim_afu_actag(afu);
> diff --git a/drivers/misc/ocxl/ocxl_internal.h b/drivers/misc/ocxl/ocxl_internal.h
> index d0c8c4838f42..ce0cac1da416 100644
> --- a/drivers/misc/ocxl/ocxl_internal.h
> +++ b/drivers/misc/ocxl/ocxl_internal.h
> @@ -52,6 +52,9 @@ struct ocxl_afu {
>   	void __iomem *global_mmio_ptr;
>   	u64 pp_mmio_start;
>   	void *private;
> +	u64 lpc_base_addr; /* Covers both LPC & special purpose memory */
> +	struct resource lpc_res;
> +	struct resource special_purpose_res;
>   };
>   
>   enum ocxl_context_status {
> diff --git a/include/misc/ocxl.h b/include/misc/ocxl.h
> index 357ef1aadbc0..d8b0b4d46bfb 100644
> --- a/include/misc/ocxl.h
> +++ b/include/misc/ocxl.h
> @@ -203,6 +203,27 @@ int ocxl_irq_set_handler(struct ocxl_context *ctx, int irq_id,
>   
>   // AFU Metadata
>   
> +/**
> + * ocxl_afu_map_lpc_mem() - Map the LPC system & special purpose memory for an AFU
> + * Do not call this during device discovery, as there may me multiple

be

> + * devices on a link, and the memory is mapped for the whole link, not
> + * just one device. It should only be called after all devices have
> + * registered their memory on the link.
> + *
> + * @afu: The AFU that has the LPC memory to map
> + *
> + * Returns 0 on success, negative on failure
> + */
> +int ocxl_afu_map_lpc_mem(struct ocxl_afu *afu);
> +
> +/**
> + * ocxl_afu_lpc_mem() - Get the physical address range of LPC memory for an AFU
> + * @afu: The AFU associated with the LPC memory
> + *
> + * Returns a pointer to the resource struct for the physical address range
> + */
> +struct resource *ocxl_afu_lpc_mem(struct ocxl_afu *afu);
> +
>   /**
>    * ocxl_afu_config() - Get a pointer to the config for an AFU
>    * @afu: a pointer to the AFU to get the config for
> 

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 08/27] ocxl: Emit a log message showing how much LPC memory was detected
  2020-02-21  3:27 ` [PATCH v3 08/27] ocxl: Emit a log message showing how much LPC memory was detected Alastair D'Silva
@ 2020-02-24  6:06   ` Andrew Donnellan
  2020-02-24  6:10     ` Alastair D'Silva
  2020-02-25 17:03   ` Frederic Barrat
  1 sibling, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-24  6:06 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> This patch emits a message showing how much LPC memory & special purpose
> memory was detected on an OCXL device.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   drivers/misc/ocxl/config.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
> index a62e3d7db2bf..701ae6216abf 100644
> --- a/drivers/misc/ocxl/config.c
> +++ b/drivers/misc/ocxl/config.c
> @@ -568,6 +568,10 @@ static int read_afu_lpc_memory_info(struct pci_dev *dev,
>   		afu->special_purpose_mem_size =
>   			total_mem_size - lpc_mem_size;
>   	}
> +
> +	dev_info(&dev->dev, "Probed LPC memory of %#llx bytes and special purpose memory of %#llx bytes\n",
> +		afu->lpc_mem_size, afu->special_purpose_mem_size);
> +

Printing this at info level for every single AFU seems a bit noisy. 
Perhaps we can print it only if LPC memory is > 0?

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 07/27] ocxl: Add functions to map/unmap LPC memory
  2020-02-24  6:02   ` Andrew Donnellan
@ 2020-02-24  6:08     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-24  6:08 UTC (permalink / raw)
  To: Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Mon, 2020-02-24 at 17:02 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > Add functions to map/unmap LPC memory
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   drivers/misc/ocxl/core.c          | 51
> > +++++++++++++++++++++++++++++++
> >   drivers/misc/ocxl/ocxl_internal.h |  3 ++
> >   include/misc/ocxl.h               | 21 +++++++++++++
> >   3 files changed, 75 insertions(+)
> > 
> > diff --git a/drivers/misc/ocxl/core.c b/drivers/misc/ocxl/core.c
> > index 2531c6cf19a0..75ff14e3882a 100644
> > --- a/drivers/misc/ocxl/core.c
> > +++ b/drivers/misc/ocxl/core.c
> > @@ -210,6 +210,56 @@ static void unmap_mmio_areas(struct ocxl_afu
> > *afu)
> >   	release_fn_bar(afu->fn, afu->config.global_mmio_bar);
> >   }
> >   
> > +int ocxl_afu_map_lpc_mem(struct ocxl_afu *afu)
> > +{
> > +	struct pci_dev *dev = to_pci_dev(afu->fn->dev.parent);
> > +
> > +	if ((afu->config.lpc_mem_size + afu-
> > >config.special_purpose_mem_size) == 0)
> > +		return 0;
> 
> I'd prefer the comparison here to be:
> 
>    afu->config.lpc_mem_size == 0 &&
>      afu->config.special_purpose_mem_size == 0
> 
> so a reader doesn't have to think about what this means.
> 

Ok

> > +
> > +	afu->lpc_base_addr = ocxl_link_lpc_map(afu->fn->link, dev);
> > +	if (afu->lpc_base_addr == 0)
> > +		return -EINVAL;
> > +
> > +	if (afu->config.lpc_mem_size > 0) {
> > +		afu->lpc_res.start = afu->lpc_base_addr + afu-
> > >config.lpc_mem_offset;
> 
> Maybe not for this series - hmm, I wonder if we should print a
> warning 
> somewhere (maybe in read_afu_lpc_memory_info()?) if we see the case 
> where (lpc_mem_offset > 0 && lpc_mem_size == 0). Likewise for
> special 
> purpose?
> 

Sounds reasonable, might as well add it here since there are other LPC
changes.

> > +		afu->lpc_res.end = afu->lpc_res.start + afu-
> > >config.lpc_mem_size - 1;
> > +	}
> > +
> > +	if (afu->config.special_purpose_mem_size > 0) {
> > +		afu->special_purpose_res.start = afu->lpc_base_addr +
> > +						 afu-
> > >config.special_purpose_mem_offset;
> > +		afu->special_purpose_res.end = afu-
> > >special_purpose_res.start +
> > +					       afu-
> > >config.special_purpose_mem_size - 1;
> > +	}
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(ocxl_afu_map_lpc_mem);
> > +
> > +struct resource *ocxl_afu_lpc_mem(struct ocxl_afu *afu)
> > +{
> > +	return &afu->lpc_res;
> > +}
> > +EXPORT_SYMBOL_GPL(ocxl_afu_lpc_mem);
> 
> What's the point of this function? A layer of indirection just in
> case 
> we need it in future?
> 

struct ocxl_afu is opaque outside the ocxl driver.
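
For illustration, the opaque-struct pattern looks roughly like this in standalone C. The type and function names below are simplified stand-ins rather than the real ocxl definitions: consumers only see the forward declaration and the accessor, while the struct layout stays private.

```c
#include <stdint.h>

/* Simplified stand-in for struct resource; only the fields used here. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* What a public header would expose: a forward declaration and an accessor. */
struct afu;
struct range *afu_lpc_mem(struct afu *afu);

/* What the driver's internal header would hold: the actual layout. */
struct afu {
	struct range lpc_res;
};

/* Accessor implemented inside the driver, where the layout is known. */
struct range *afu_lpc_mem(struct afu *afu)
{
	return &afu->lpc_res;
}
```

Code outside the driver cannot dereference the struct directly, so it needs an accessor like this to reach the resource.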

> > +
> > +static void unmap_lpc_mem(struct ocxl_afu *afu)
> > +{
> > +	struct pci_dev *dev = to_pci_dev(afu->fn->dev.parent);
> > +
> > +	if (afu->lpc_res.start || afu->special_purpose_res.start) {
> > +		void *link = afu->fn->link;
> > +
> > +		// only release the link when the the last consumer
> > calls release
> > +		ocxl_link_lpc_release(link, dev);
> > +
> > +		afu->lpc_res.start = 0;
> > +		afu->lpc_res.end = 0;
> > +		afu->special_purpose_res.start = 0;
> > +		afu->special_purpose_res.end = 0;
> > +	}
> > +}
> > +
> >   static int configure_afu(struct ocxl_afu *afu, u8 afu_idx, struct
> > pci_dev *dev)
> >   {
> >   	int rc;
> > @@ -251,6 +301,7 @@ static int configure_afu(struct ocxl_afu *afu,
> > u8 afu_idx, struct pci_dev *dev)
> >   
> >   static void deconfigure_afu(struct ocxl_afu *afu)
> >   {
> > +	unmap_lpc_mem(afu);
> >   	unmap_mmio_areas(afu);
> >   	reclaim_afu_pasid(afu);
> >   	reclaim_afu_actag(afu);
> > diff --git a/drivers/misc/ocxl/ocxl_internal.h
> > b/drivers/misc/ocxl/ocxl_internal.h
> > index d0c8c4838f42..ce0cac1da416 100644
> > --- a/drivers/misc/ocxl/ocxl_internal.h
> > +++ b/drivers/misc/ocxl/ocxl_internal.h
> > @@ -52,6 +52,9 @@ struct ocxl_afu {
> >   	void __iomem *global_mmio_ptr;
> >   	u64 pp_mmio_start;
> >   	void *private;
> > +	u64 lpc_base_addr; /* Covers both LPC & special purpose memory
> > */
> > +	struct resource lpc_res;
> > +	struct resource special_purpose_res;
> >   };
> >   
> >   enum ocxl_context_status {
> > diff --git a/include/misc/ocxl.h b/include/misc/ocxl.h
> > index 357ef1aadbc0..d8b0b4d46bfb 100644
> > --- a/include/misc/ocxl.h
> > +++ b/include/misc/ocxl.h
> > @@ -203,6 +203,27 @@ int ocxl_irq_set_handler(struct ocxl_context
> > *ctx, int irq_id,
> >   
> >   // AFU Metadata
> >   
> > +/**
> > + * ocxl_afu_map_lpc_mem() - Map the LPC system & special purpose
> > memory for an AFU
> > + * Do not call this during device discovery, as there may me
> > multiple
> 
> be
> 
> > + * devices on a link, and the memory is mapped for the whole link,
> > not
> > + * just one device. It should only be called after all devices
> > have
> > + * registered their memory on the link.
> > + *
> > + * @afu: The AFU that has the LPC memory to map
> > + *
> > + * Returns 0 on success, negative on failure
> > + */
> > +int ocxl_afu_map_lpc_mem(struct ocxl_afu *afu);
> > +
> > +/**
> > + * ocxl_afu_lpc_mem() - Get the physical address range of LPC
> > memory for an AFU
> > + * @afu: The AFU associated with the LPC memory
> > + *
> > + * Returns a pointer to the resource struct for the physical
> > address range
> > + */
> > +struct resource *ocxl_afu_lpc_mem(struct ocxl_afu *afu);
> > +
> >   /**
> >    * ocxl_afu_config() - Get a pointer to the config for an AFU
> >    * @afu: a pointer to the AFU to get the config for
> > 
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 08/27] ocxl: Emit a log message showing how much LPC memory was detected
  2020-02-24  6:06   ` Andrew Donnellan
@ 2020-02-24  6:10     ` Alastair D'Silva
  2020-02-24  6:13       ` Andrew Donnellan
  0 siblings, 1 reply; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-24  6:10 UTC (permalink / raw)
  To: Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Mon, 2020-02-24 at 17:06 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > This patch emits a message showing how much LPC memory & special
> > purpose
> > memory was detected on an OCXL device.
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   drivers/misc/ocxl/config.c | 4 ++++
> >   1 file changed, 4 insertions(+)
> > 
> > diff --git a/drivers/misc/ocxl/config.c
> > b/drivers/misc/ocxl/config.c
> > index a62e3d7db2bf..701ae6216abf 100644
> > --- a/drivers/misc/ocxl/config.c
> > +++ b/drivers/misc/ocxl/config.c
> > @@ -568,6 +568,10 @@ static int read_afu_lpc_memory_info(struct
> > pci_dev *dev,
> >   		afu->special_purpose_mem_size =
> >   			total_mem_size - lpc_mem_size;
> >   	}
> > +
> > +	dev_info(&dev->dev, "Probed LPC memory of %#llx bytes and
> > special purpose memory of %#llx bytes\n",
> > +		afu->lpc_mem_size, afu->special_purpose_mem_size);
> > +
> 
> Printing this at info level for every single AFU seems a bit noisy. 
> Perhaps we can print it only if LPC memory is > 0?
> 

There is an early exit before this if there is no LPC memory.

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 08/27] ocxl: Emit a log message showing how much LPC memory was detected
  2020-02-24  6:10     ` Alastair D'Silva
@ 2020-02-24  6:13       ` Andrew Donnellan
  0 siblings, 0 replies; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-24  6:13 UTC (permalink / raw)
  To: Alastair D'Silva
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 24/2/20 5:10 pm, Alastair D'Silva wrote:
>> Printing this at info level for every single AFU seems a bit noisy.
>> Perhaps we can print it only if LPC memory is > 0?
>>
> 
> There is an early exit before this if there is no LPC memory.
> 

Noted, I'd missed that amidst all the early returns for errors.

In that case

Acked-by: Andrew Donnellan <ajd@linux.ibm.com>

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices
  2020-02-24  4:42       ` Alastair D'Silva
@ 2020-02-24  6:51         ` Oliver O'Halloran
  2020-02-26  0:13           ` Alastair D'Silva
  0 siblings, 1 reply; 130+ messages in thread
From: Oliver O'Halloran @ 2020-02-24  6:51 UTC (permalink / raw)
  To: Alastair D'Silva
  Cc: Matthew Wilcox, Aneesh Kumar K . V, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Frederic Barrat,
	Andrew Donnellan, Arnd Bergmann, Greg Kroah-Hartman,
	Andrew Morton, Mauro Carvalho Chehab, David S. Miller,
	Rob Herring, Anton Blanchard, Krzysztof Kozlowski,
	Mahesh Salgaonkar, Madhavan Srinivasan, Cédric Le Goater,
	Anju T Sudhakar, Hari Bathini, Thomas Gleixner, Greg Kurz,
	Nicholas Piggin, Masahiro Yamada, Alexey Kardashevskiy,
	Linux Kernel Mailing List, linuxppc-dev, linux-nvdimm, Linux MM

On Mon, Feb 24, 2020 at 3:43 PM Alastair D'Silva <alastair@au1.ibm.com> wrote:
>
> On Sun, 2020-02-23 at 20:37 -0800, Matthew Wilcox wrote:
> > On Mon, Feb 24, 2020 at 03:34:07PM +1100, Alastair D'Silva wrote:
> > > V3:
> > >   - Rebase against next/next-20200220
> > >   - Move driver to arch/powerpc/platforms/powernv, we now expect
> > > this
> > >     driver to go upstream via the powerpc tree
> >
> > That's rather the opposite direction of normal; mostly drivers live
> > under
> > drivers/ and not in arch/.  It's easier for drivers to get overlooked
> > when doing tree-wide changes if they're hiding.
>
> This is true, however, given that it was not all that desirable to have
> it under drivers/nvdimm, its sister driver (for the same hardware) is
> also under arch, and that we don't expect this driver to be used on any
> platform other than powernv, we think this was the most reasonable
> place to put it.

Historically powernv specific platform drivers go in their respective
subsystem trees rather than in arch/ and I'd prefer we kept it that
way. When I added the papr_scm driver I put it in the pseries platform
directory because most of the pseries paravirt code lives there for
some reason; I don't know why. Luckily for me that followed the same
model that Dan used when he put the NFIT driver in drivers/acpi/ and
the libnvdimm core in drivers/nvdimm/ so we didn't have anything to
argue about. However, as Matthew pointed out, it is at odds with how
most subsystems operate. Is there any particular reason we're doing
things this way or should we think about moving libnvdimm users to
drivers/nvdimm/?

Oliver

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 11/27] powerpc: Enable the OpenCAPI Persistent Memory driver for powernv_defconfig
  2020-02-21  3:27 ` [PATCH v3 11/27] powerpc: Enable the OpenCAPI Persistent Memory driver for powernv_defconfig Alastair D'Silva
@ 2020-02-25  3:01   ` Andrew Donnellan
  0 siblings, 0 replies; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-25  3:01 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> This patch enables the OpenCAPI Persistent Memory driver, as well
> as DAX support, for the 'powernv' platform.

defconfig, not platform

> 
> DAX is not a strict requirement for the functioning of the driver, but it
> is likely that a user will want to create a DAX device on top of their
> persistent memory device.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>

Otherwise

Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 03/27] powerpc: Map & release OpenCAPI LPC memory
  2020-02-21  3:26 ` [PATCH v3 03/27] powerpc: Map & release OpenCAPI LPC memory Alastair D'Silva
  2020-02-24  2:51   ` Andrew Donnellan
@ 2020-02-25 10:02   ` Frederic Barrat
  2020-02-26  0:19     ` Alastair D'Silva
  2020-03-03  6:10   ` Andrew Donnellan
  2 siblings, 1 reply; 130+ messages in thread
From: Frederic Barrat @ 2020-02-25 10:02 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel



On 21/02/2020 at 04:26, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> This patch adds platform support to map & release LPC memory.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/include/asm/pnv-ocxl.h   |  4 +++
>   arch/powerpc/platforms/powernv/ocxl.c | 43 +++++++++++++++++++++++++++
>   2 files changed, 47 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/pnv-ocxl.h b/arch/powerpc/include/asm/pnv-ocxl.h
> index 7de82647e761..0b2a6707e555 100644
> --- a/arch/powerpc/include/asm/pnv-ocxl.h
> +++ b/arch/powerpc/include/asm/pnv-ocxl.h
> @@ -32,5 +32,9 @@ extern int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle)
>   
>   extern int pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr);
>   extern void pnv_ocxl_free_xive_irq(u32 irq);
> +#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
> +u64 pnv_ocxl_platform_lpc_setup(struct pci_dev *pdev, u64 size);
> +void pnv_ocxl_platform_lpc_release(struct pci_dev *pdev);
> +#endif


This breaks the compilation of the ocxl driver if CONFIG_MEMORY_HOTPLUG=n.

Those functions still make sense even without memory hotplug, for 
example with the implementation you had that accessed OpenCAPI LPC 
memory through mmap(). The #ifdef is only really needed around the 
check_hotplug_memory_addressable() call.
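
Something along these lines, as a standalone sketch rather than the actual patch. The OPAL call is replaced by a hypothetical stub, the PAGE_SHIFT value is an assumption for illustration, and the check_hotplug_memory_addressable() prototype is inferred from the call in the patch. The point is only where the #ifdef sits.

```c
#include <stdint.h>

#define PAGE_SHIFT 16 /* assumption: 64K pages, for illustration only */

/* Hypothetical stub standing in for the OPAL allocation call. */
static int stub_mem_alloc(uint64_t size, uint64_t *base)
{
	if (size == 0)
		return -1;
	*base = 0x2000000000ULL; /* arbitrary made-up base address */
	return 0;
}

#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
/* Prototype assumed from how the patch calls it. */
int check_hotplug_memory_addressable(unsigned long pfn, unsigned long nr_pages);
#endif

/* Always built; only the addressability check depends on memory hotplug. */
static uint64_t lpc_setup_sketch(uint64_t size)
{
	uint64_t base;

	if (stub_mem_alloc(size, &base))
		return 0;

#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
	if (check_hotplug_memory_addressable(base >> PAGE_SHIFT,
					     size >> PAGE_SHIFT))
		return 0;
#endif

	return base;
}
```

With this layout the prototypes in the header would no longer need a guard, so the ocxl driver would still link when memory hotplug is disabled.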

   Fred


>   #endif /* _ASM_PNV_OCXL_H */
> diff --git a/arch/powerpc/platforms/powernv/ocxl.c b/arch/powerpc/platforms/powernv/ocxl.c
> index 8c65aacda9c8..f2edbcc67361 100644
> --- a/arch/powerpc/platforms/powernv/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/ocxl.c
> @@ -475,6 +475,49 @@ void pnv_ocxl_spa_release(void *platform_data)
>   }
>   EXPORT_SYMBOL_GPL(pnv_ocxl_spa_release);
>   
> +#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
> +u64 pnv_ocxl_platform_lpc_setup(struct pci_dev *pdev, u64 size)
> +{
> +	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> +	struct pnv_phb *phb = hose->private_data;
> +	u32 bdfn = pci_dev_id(pdev);
> +	__be64 base_addr_be64;
> +	u64 base_addr;
> +	int rc;
> +
> +	rc = opal_npu_mem_alloc(phb->opal_id, bdfn, size, &base_addr_be64);
> +	if (rc) {
> +		dev_warn(&pdev->dev,
> +			 "OPAL could not allocate LPC memory, rc=%d\n", rc);
> +		return 0;
> +	}
> +
> +	base_addr = be64_to_cpu(base_addr_be64);
> +
> +	rc = check_hotplug_memory_addressable(base_addr >> PAGE_SHIFT,
> +					      size >> PAGE_SHIFT);
> +	if (rc)
> +		return 0;
> +
> +	return base_addr;
> +}
> +EXPORT_SYMBOL_GPL(pnv_ocxl_platform_lpc_setup);
> +
> +void pnv_ocxl_platform_lpc_release(struct pci_dev *pdev)
> +{
> +	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> +	struct pnv_phb *phb = hose->private_data;
> +	u32 bdfn = pci_dev_id(pdev);
> +	int rc;
> +
> +	rc = opal_npu_mem_release(phb->opal_id, bdfn);
> +	if (rc)
> +		dev_warn(&pdev->dev,
> +			 "OPAL reported rc=%d when releasing LPC memory\n", rc);
> +}
> +EXPORT_SYMBOL_GPL(pnv_ocxl_platform_lpc_release);
> +#endif
> +
>   int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle)
>   {
>   	struct spa_data *data = (struct spa_data *) platform_data;
> 

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 04/27] ocxl: Remove unnecessary externs
  2020-02-21  3:26 ` [PATCH v3 04/27] ocxl: Remove unnecessary externs Alastair D'Silva
  2020-02-21  6:06   ` Andrew Donnellan
@ 2020-02-25 13:23   ` Frederic Barrat
  2020-02-26  8:14   ` Baoquan He
  2 siblings, 0 replies; 130+ messages in thread
From: Frederic Barrat @ 2020-02-25 13:23 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel



On 21/02/2020 at 04:26, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> Function declarations don't need externs, remove the existing ones
> so they are consistent with newer code
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---

Thanks for the cleanup!
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>




>   arch/powerpc/include/asm/pnv-ocxl.h | 32 ++++++++++++++---------------
>   include/misc/ocxl.h                 |  6 +++---
>   2 files changed, 18 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/pnv-ocxl.h b/arch/powerpc/include/asm/pnv-ocxl.h
> index 0b2a6707e555..b23c99bc0c84 100644
> --- a/arch/powerpc/include/asm/pnv-ocxl.h
> +++ b/arch/powerpc/include/asm/pnv-ocxl.h
> @@ -9,29 +9,27 @@
>   #define PNV_OCXL_TL_BITS_PER_RATE       4
>   #define PNV_OCXL_TL_RATE_BUF_SIZE       ((PNV_OCXL_TL_MAX_TEMPLATE+1) * PNV_OCXL_TL_BITS_PER_RATE / 8)
>   
> -extern int pnv_ocxl_get_actag(struct pci_dev *dev, u16 *base, u16 *enabled,
> -			u16 *supported);
> -extern int pnv_ocxl_get_pasid_count(struct pci_dev *dev, int *count);
> +int pnv_ocxl_get_actag(struct pci_dev *dev, u16 *base, u16 *enabled, u16 *supported);
> +int pnv_ocxl_get_pasid_count(struct pci_dev *dev, int *count);
>   
> -extern int pnv_ocxl_get_tl_cap(struct pci_dev *dev, long *cap,
> +int pnv_ocxl_get_tl_cap(struct pci_dev *dev, long *cap,
>   			char *rate_buf, int rate_buf_size);
> -extern int pnv_ocxl_set_tl_conf(struct pci_dev *dev, long cap,
> +int pnv_ocxl_set_tl_conf(struct pci_dev *dev, long cap,
>   			uint64_t rate_buf_phys, int rate_buf_size);
>   
> -extern int pnv_ocxl_get_xsl_irq(struct pci_dev *dev, int *hwirq);
> -extern void pnv_ocxl_unmap_xsl_regs(void __iomem *dsisr, void __iomem *dar,
> -				void __iomem *tfc, void __iomem *pe_handle);
> -extern int pnv_ocxl_map_xsl_regs(struct pci_dev *dev, void __iomem **dsisr,
> -				void __iomem **dar, void __iomem **tfc,
> -				void __iomem **pe_handle);
> +int pnv_ocxl_get_xsl_irq(struct pci_dev *dev, int *hwirq);
> +void pnv_ocxl_unmap_xsl_regs(void __iomem *dsisr, void __iomem *dar,
> +			     void __iomem *tfc, void __iomem *pe_handle);
> +int pnv_ocxl_map_xsl_regs(struct pci_dev *dev, void __iomem **dsisr,
> +			  void __iomem **dar, void __iomem **tfc,
> +			  void __iomem **pe_handle);
>   
> -extern int pnv_ocxl_spa_setup(struct pci_dev *dev, void *spa_mem, int PE_mask,
> -			void **platform_data);
> -extern void pnv_ocxl_spa_release(void *platform_data);
> -extern int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle);
> +int pnv_ocxl_spa_setup(struct pci_dev *dev, void *spa_mem, int PE_mask, void **platform_data);
> +void pnv_ocxl_spa_release(void *platform_data);
> +int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle);
>   
> -extern int pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr);
> -extern void pnv_ocxl_free_xive_irq(u32 irq);
> +int pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr);
> +void pnv_ocxl_free_xive_irq(u32 irq);
>   #ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
>   u64 pnv_ocxl_platform_lpc_setup(struct pci_dev *pdev, u64 size);
>   void pnv_ocxl_platform_lpc_release(struct pci_dev *pdev);
> diff --git a/include/misc/ocxl.h b/include/misc/ocxl.h
> index 06dd5839e438..0a762e387418 100644
> --- a/include/misc/ocxl.h
> +++ b/include/misc/ocxl.h
> @@ -173,7 +173,7 @@ int ocxl_context_detach(struct ocxl_context *ctx);
>    *
>    * Returns 0 on success, negative on failure
>    */
> -extern int ocxl_afu_irq_alloc(struct ocxl_context *ctx, int *irq_id);
> +int ocxl_afu_irq_alloc(struct ocxl_context *ctx, int *irq_id);
>   
>   /**
>    * Frees an IRQ associated with an AFU context
> @@ -182,7 +182,7 @@ extern int ocxl_afu_irq_alloc(struct ocxl_context *ctx, int *irq_id);
>    *
>    * Returns 0 on success, negative on failure
>    */
> -extern int ocxl_afu_irq_free(struct ocxl_context *ctx, int irq_id);
> +int ocxl_afu_irq_free(struct ocxl_context *ctx, int irq_id);
>   
>   /**
>    * Gets the address of the trigger page for an IRQ
> @@ -193,7 +193,7 @@ extern int ocxl_afu_irq_free(struct ocxl_context *ctx, int irq_id);
>    *
>    * returns the trigger page address, or 0 if the IRQ is not valid
>    */
> -extern u64 ocxl_afu_irq_get_addr(struct ocxl_context *ctx, int irq_id);
> +u64 ocxl_afu_irq_get_addr(struct ocxl_context *ctx, int irq_id);
>   
>   /**
>    * Provide a callback to be called when an IRQ is triggered
> 

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 06/27] ocxl: Tally up the LPC memory on a link & allow it to be mapped
  2020-02-21  3:26 ` [PATCH v3 06/27] ocxl: Tally up the LPC memory on a link & allow it to be mapped Alastair D'Silva
  2020-02-24  5:25   ` Andrew Donnellan
@ 2020-02-25 16:30   ` Frederic Barrat
  2020-02-26  0:29     ` Alastair D'Silva
  1 sibling, 1 reply; 130+ messages in thread
From: Frederic Barrat @ 2020-02-25 16:30 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel



On 21/02/2020 at 04:26, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> Tally up the LPC memory on an OpenCAPI link & allow it to be mapped
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   drivers/misc/ocxl/core.c          | 10 ++++++
>   drivers/misc/ocxl/link.c          | 53 +++++++++++++++++++++++++++++++
>   drivers/misc/ocxl/ocxl_internal.h | 33 +++++++++++++++++++
>   3 files changed, 96 insertions(+)
> 
> diff --git a/drivers/misc/ocxl/core.c b/drivers/misc/ocxl/core.c
> index b7a09b21ab36..2531c6cf19a0 100644
> --- a/drivers/misc/ocxl/core.c
> +++ b/drivers/misc/ocxl/core.c
> @@ -230,8 +230,18 @@ static int configure_afu(struct ocxl_afu *afu, u8 afu_idx, struct pci_dev *dev)
>   	if (rc)
>   		goto err_free_pasid;
>   
> +	if (afu->config.lpc_mem_size || afu->config.special_purpose_mem_size) {
> +		rc = ocxl_link_add_lpc_mem(afu->fn->link, afu->config.lpc_mem_offset,
> +					   afu->config.lpc_mem_size +
> +					   afu->config.special_purpose_mem_size);
> +		if (rc)
> +			goto err_free_mmio;
> +	}
> +
>   	return 0;
>   
> +err_free_mmio:
> +	unmap_mmio_areas(afu);
>   err_free_pasid:
>   	reclaim_afu_pasid(afu);
>   err_free_actag:
> diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
> index 58d111afd9f6..1e039cc5ebe5 100644
> --- a/drivers/misc/ocxl/link.c
> +++ b/drivers/misc/ocxl/link.c
> @@ -84,6 +84,11 @@ struct ocxl_link {
>   	int dev;
>   	atomic_t irq_available;
>   	struct spa *spa;
> +	struct mutex lpc_mem_lock; /* protects lpc_mem & lpc_mem_sz */
> +	u64 lpc_mem_sz; /* Total amount of LPC memory presented on the link */
> +	u64 lpc_mem;
> +	int lpc_consumers;
> +
>   	void *platform_data;
>   };
>   static struct list_head links_list = LIST_HEAD_INIT(links_list);
> @@ -396,6 +401,8 @@ static int alloc_link(struct pci_dev *dev, int PE_mask, struct ocxl_link **out_l
>   	if (rc)
>   		goto err_spa;
>   
> +	mutex_init(&link->lpc_mem_lock);
> +
>   	/* platform specific hook */
>   	rc = pnv_ocxl_spa_setup(dev, link->spa->spa_mem, PE_mask,
>   				&link->platform_data);
> @@ -711,3 +718,49 @@ void ocxl_link_free_irq(void *link_handle, int hw_irq)
>   	atomic_inc(&link->irq_available);
>   }
>   EXPORT_SYMBOL_GPL(ocxl_link_free_irq);
> +
> +int ocxl_link_add_lpc_mem(void *link_handle, u64 offset, u64 size)
> +{
> +	struct ocxl_link *link = (struct ocxl_link *) link_handle;
> +
> +	// Check for overflow
> +	if (offset > (offset + size))
> +		return -EINVAL;
> +
> +	mutex_lock(&link->lpc_mem_lock);
> +	link->lpc_mem_sz = max(link->lpc_mem_sz, offset + size);
> +
> +	mutex_unlock(&link->lpc_mem_lock);
> +
> +	return 0;
> +}
> +
> +u64 ocxl_link_lpc_map(void *link_handle, struct pci_dev *pdev)
> +{
> +	struct ocxl_link *link = (struct ocxl_link *) link_handle;
> +
> +	mutex_lock(&link->lpc_mem_lock);
> +
> +	if(!link->lpc_mem)
> +		link->lpc_mem = pnv_ocxl_platform_lpc_setup(pdev, link->lpc_mem_sz);
> +
> +	if(link->lpc_mem)
> +		link->lpc_consumers++;
> +	mutex_unlock(&link->lpc_mem_lock);
> +
> +	return link->lpc_mem;
> +}
> +
> +void ocxl_link_lpc_release(void *link_handle, struct pci_dev *pdev)
> +{
> +	struct ocxl_link *link = (struct ocxl_link *) link_handle;
> +
> +	mutex_lock(&link->lpc_mem_lock);
> +	WARN_ON(--link->lpc_consumers < 0);


Here, we always decrement the lpc_consumers count. However, it was only
incremented if the mapping was set up correctly in OPAL.

We could arguably claim that ocxl_link_lpc_release() should only be
called if ocxl_link_lpc_map() succeeded, but error path handling would
be easier if we only decremented the lpc_consumers count when
link->lpc_mem is set. That way, we can just call ocxl_link_lpc_release()
in error paths without having to worry about triggering the WARN_ON message.
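
To make the suggestion concrete, here is a minimal sketch written as a
userspace model rather than as driver code. The struct, the function names
and the platform_releases counter are all hypothetical stand-ins (the counter
replaces the call to pnv_ocxl_platform_lpc_release()), and the mutex is left
out for brevity. It only illustrates the idea of making release a no-op when
no mapping exists:

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the LPC fields of struct ocxl_link (no locking). */
struct link_model {
	uint64_t lpc_mem;       /* base address, 0 when nothing is mapped */
	int lpc_consumers;      /* successful map calls not yet released */
	int platform_releases;  /* hypothetical: counts platform release calls */
};

/* Mirrors the map logic in the patch: count a consumer only on success. */
static uint64_t model_lpc_map(struct link_model *link, uint64_t platform_addr)
{
	if (!link->lpc_mem)
		link->lpc_mem = platform_addr; /* 0 models a failed setup */

	if (link->lpc_mem)
		link->lpc_consumers++;

	return link->lpc_mem;
}

/* Suggested release: do nothing unless a mapping was actually set up. */
static void model_lpc_release(struct link_model *link)
{
	if (!link->lpc_mem)
		return; /* map failed or already released: safe in error paths */

	assert(link->lpc_consumers > 0);
	if (--link->lpc_consumers == 0) {
		link->platform_releases++;
		link->lpc_mem = 0;
	}
}
```

With this shape, an error path can call the release function even when the
earlier map attempt returned 0, and the consumer count never goes negative.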

   Fred



> +	if (link->lpc_consumers == 0) {
> +		pnv_ocxl_platform_lpc_release(pdev);
> +		link->lpc_mem = 0;
> +	}
> +
> +	mutex_unlock(&link->lpc_mem_lock);
> +}
> diff --git a/drivers/misc/ocxl/ocxl_internal.h b/drivers/misc/ocxl/ocxl_internal.h
> index 198e4e4bc51d..d0c8c4838f42 100644
> --- a/drivers/misc/ocxl/ocxl_internal.h
> +++ b/drivers/misc/ocxl/ocxl_internal.h
> @@ -142,4 +142,37 @@ int ocxl_irq_offset_to_id(struct ocxl_context *ctx, u64 offset);
>   u64 ocxl_irq_id_to_offset(struct ocxl_context *ctx, int irq_id);
>   void ocxl_afu_irq_free_all(struct ocxl_context *ctx);
>   
> +/**
> + * ocxl_link_add_lpc_mem() - Increment the amount of memory required by an OpenCAPI link
> + *
> + * @link_handle: The OpenCAPI link handle
> + * @offset: The offset of the memory to add
> + * @size: The amount of memory to increment by
> + *
> + * Returns 0 on success, negative on overflow
> + */
> +int ocxl_link_add_lpc_mem(void *link_handle, u64 offset, u64 size);
> +
> +/**
> + * ocxl_link_lpc_map() - Map the LPC memory for an OpenCAPI device
> + * Since LPC memory belongs to a link, the whole LPC memory available
> + * on the link must be mapped in order to make it accessible to a device.
> + * @link_handle: The OpenCAPI link handle
> + * @pdev: A device that is on the link
> + *
> + * Returns the address of the mapped LPC memory, or 0 on error
> + */
> +u64 ocxl_link_lpc_map(void *link_handle, struct pci_dev *pdev);
> +
> +/**
> + * ocxl_link_lpc_release() - Release the LPC memory device for an OpenCAPI device
> + *
> + * Offlines LPC memory on an OpenCAPI link for a device. If this is the
> + * last device on the link to release the memory, unmap it from the link.
> + *
> + * @link_handle: The OpenCAPI link handle
> + * @pdev: A device that is on the link
> + */
> +void ocxl_link_lpc_release(void *link_handle, struct pci_dev *pdev);
> +
>   #endif /* _OCXL_INTERNAL_H_ */
> 
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org


* Re: [PATCH v3 07/27] ocxl: Add functions to map/unmap LPC memory
  2020-02-21  3:27 ` [PATCH v3 07/27] ocxl: Add functions to map/unmap LPC memory Alastair D'Silva
  2020-02-24  6:02   ` Andrew Donnellan
@ 2020-02-25 17:01   ` Frederic Barrat
  1 sibling, 0 replies; 130+ messages in thread
From: Frederic Barrat @ 2020-02-25 17:01 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel



On 21/02/2020 at 04:27, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> Add functions to map/unmap LPC memory
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---


It looks ok to me.
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>



>   drivers/misc/ocxl/core.c          | 51 +++++++++++++++++++++++++++++++
>   drivers/misc/ocxl/ocxl_internal.h |  3 ++
>   include/misc/ocxl.h               | 21 +++++++++++++
>   3 files changed, 75 insertions(+)
> 
> diff --git a/drivers/misc/ocxl/core.c b/drivers/misc/ocxl/core.c
> index 2531c6cf19a0..75ff14e3882a 100644
> --- a/drivers/misc/ocxl/core.c
> +++ b/drivers/misc/ocxl/core.c
> @@ -210,6 +210,56 @@ static void unmap_mmio_areas(struct ocxl_afu *afu)
>   	release_fn_bar(afu->fn, afu->config.global_mmio_bar);
>   }
>   
> +int ocxl_afu_map_lpc_mem(struct ocxl_afu *afu)
> +{
> +	struct pci_dev *dev = to_pci_dev(afu->fn->dev.parent);
> +
> +	if ((afu->config.lpc_mem_size + afu->config.special_purpose_mem_size) == 0)
> +		return 0;
> +
> +	afu->lpc_base_addr = ocxl_link_lpc_map(afu->fn->link, dev);
> +	if (afu->lpc_base_addr == 0)
> +		return -EINVAL;
> +
> +	if (afu->config.lpc_mem_size > 0) {
> +		afu->lpc_res.start = afu->lpc_base_addr + afu->config.lpc_mem_offset;
> +		afu->lpc_res.end = afu->lpc_res.start + afu->config.lpc_mem_size - 1;
> +	}
> +
> +	if (afu->config.special_purpose_mem_size > 0) {
> +		afu->special_purpose_res.start = afu->lpc_base_addr +
> +						 afu->config.special_purpose_mem_offset;
> +		afu->special_purpose_res.end = afu->special_purpose_res.start +
> +					       afu->config.special_purpose_mem_size - 1;
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(ocxl_afu_map_lpc_mem);
> +
> +struct resource *ocxl_afu_lpc_mem(struct ocxl_afu *afu)
> +{
> +	return &afu->lpc_res;
> +}
> +EXPORT_SYMBOL_GPL(ocxl_afu_lpc_mem);
> +
> +static void unmap_lpc_mem(struct ocxl_afu *afu)
> +{
> +	struct pci_dev *dev = to_pci_dev(afu->fn->dev.parent);
> +
> +	if (afu->lpc_res.start || afu->special_purpose_res.start) {
> +		void *link = afu->fn->link;
> +
> +		// only release the link when the last consumer calls release
> +		ocxl_link_lpc_release(link, dev);
> +
> +		afu->lpc_res.start = 0;
> +		afu->lpc_res.end = 0;
> +		afu->special_purpose_res.start = 0;
> +		afu->special_purpose_res.end = 0;
> +	}
> +}
> +
>   static int configure_afu(struct ocxl_afu *afu, u8 afu_idx, struct pci_dev *dev)
>   {
>   	int rc;
> @@ -251,6 +301,7 @@ static int configure_afu(struct ocxl_afu *afu, u8 afu_idx, struct pci_dev *dev)
>   
>   static void deconfigure_afu(struct ocxl_afu *afu)
>   {
> +	unmap_lpc_mem(afu);
>   	unmap_mmio_areas(afu);
>   	reclaim_afu_pasid(afu);
>   	reclaim_afu_actag(afu);
> diff --git a/drivers/misc/ocxl/ocxl_internal.h b/drivers/misc/ocxl/ocxl_internal.h
> index d0c8c4838f42..ce0cac1da416 100644
> --- a/drivers/misc/ocxl/ocxl_internal.h
> +++ b/drivers/misc/ocxl/ocxl_internal.h
> @@ -52,6 +52,9 @@ struct ocxl_afu {
>   	void __iomem *global_mmio_ptr;
>   	u64 pp_mmio_start;
>   	void *private;
> +	u64 lpc_base_addr; /* Covers both LPC & special purpose memory */
> +	struct resource lpc_res;
> +	struct resource special_purpose_res;
>   };
>   
>   enum ocxl_context_status {
> diff --git a/include/misc/ocxl.h b/include/misc/ocxl.h
> index 357ef1aadbc0..d8b0b4d46bfb 100644
> --- a/include/misc/ocxl.h
> +++ b/include/misc/ocxl.h
> @@ -203,6 +203,27 @@ int ocxl_irq_set_handler(struct ocxl_context *ctx, int irq_id,
>   
>   // AFU Metadata
>   
> +/**
> + * ocxl_afu_map_lpc_mem() - Map the LPC system & special purpose memory for an AFU
> + * Do not call this during device discovery, as there may be multiple
> + * devices on a link, and the memory is mapped for the whole link, not
> + * just one device. It should only be called after all devices have
> + * registered their memory on the link.
> + *
> + * @afu: The AFU that has the LPC memory to map
> + *
> + * Returns 0 on success, negative on failure
> + */
> +int ocxl_afu_map_lpc_mem(struct ocxl_afu *afu);
> +
> +/**
> + * ocxl_afu_lpc_mem() - Get the physical address range of LPC memory for an AFU
> + * @afu: The AFU associated with the LPC memory
> + *
> + * Returns a pointer to the resource struct for the physical address range
> + */
> +struct resource *ocxl_afu_lpc_mem(struct ocxl_afu *afu);
> +
>   /**
>    * ocxl_afu_config() - Get a pointer to the config for an AFU
>    * @afu: a pointer to the AFU to get the config for
> 

* Re: [PATCH v3 08/27] ocxl: Emit a log message showing how much LPC memory was detected
  2020-02-21  3:27 ` [PATCH v3 08/27] ocxl: Emit a log message showing how much LPC memory was detected Alastair D'Silva
  2020-02-24  6:06   ` Andrew Donnellan
@ 2020-02-25 17:03   ` Frederic Barrat
  1 sibling, 0 replies; 130+ messages in thread
From: Frederic Barrat @ 2020-02-25 17:03 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel



On 21/02/2020 at 04:27, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> This patch emits a message showing how much LPC memory & special purpose
> memory was detected on an OCXL device.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---


Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>



>   drivers/misc/ocxl/config.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
> index a62e3d7db2bf..701ae6216abf 100644
> --- a/drivers/misc/ocxl/config.c
> +++ b/drivers/misc/ocxl/config.c
> @@ -568,6 +568,10 @@ static int read_afu_lpc_memory_info(struct pci_dev *dev,
>   		afu->special_purpose_mem_size =
>   			total_mem_size - lpc_mem_size;
>   	}
> +
> +	dev_info(&dev->dev, "Probed LPC memory of %#llx bytes and special purpose memory of %#llx bytes\n",
> +		afu->lpc_mem_size, afu->special_purpose_mem_size);
> +
>   	return 0;
>   }
>   
> 

* RE: [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices
  2020-02-24  6:51         ` Oliver O'Halloran
@ 2020-02-26  0:13           ` Alastair D'Silva
  2020-02-26  0:32             ` Dan Williams
  0 siblings, 1 reply; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-26  0:13 UTC (permalink / raw)
  To: Oliver O'Halloran
  Cc: Matthew Wilcox, Aneesh Kumar K . V, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Frederic Barrat,
	Andrew Donnellan, Arnd Bergmann, Greg Kroah-Hartman,
	Andrew Morton, Mauro Carvalho Chehab, David S. Miller,
	Rob Herring, Anton Blanchard, Krzysztof Kozlowski,
	Mahesh Salgaonkar, Madhavan Srinivasan, Cédric Le Goater,
	Anju T Sudhakar, Hari Bathini, Thomas Gleixner, Greg Kurz,
	Nicholas Piggin, Masahiro Yamada, Alexey Kardashevskiy,
	Linux Kernel Mailing List, linuxppc-dev, linux-nvdimm, Linux MM

On Mon, 2020-02-24 at 17:51 +1100, Oliver O'Halloran wrote:
> On Mon, Feb 24, 2020 at 3:43 PM Alastair D'Silva <
> alastair@au1.ibm.com> wrote:
> > On Sun, 2020-02-23 at 20:37 -0800, Matthew Wilcox wrote:
> > > On Mon, Feb 24, 2020 at 03:34:07PM +1100, Alastair D'Silva wrote:
> > > > V3:
> > > >   - Rebase against next/next-20200220
> > > >   - Move driver to arch/powerpc/platforms/powernv, we now
> > > > expect
> > > > this
> > > >     driver to go upstream via the powerpc tree
> > > 
> > > That's rather the opposite direction of normal; mostly drivers
> > > live
> > > under
> > > drivers/ and not in arch/.  It's easier for drivers to get
> > > overlooked
> > > when doing tree-wide changes if they're hiding.
> > 
> > This is true, however, given that it was not all that desirable to
> > have
> > > it under drivers/nvdimm, its sister driver (for the same hardware)
> > is
> > also under arch, and that we don't expect this driver to be used on
> > any
> > platform other than powernv, we think this was the most reasonable
> > place to put it.
> 
> Historically powernv specific platform drivers go in their respective
> subsystem trees rather than in arch/ and I'd prefer we kept it that
> way. When I added the papr_scm driver I put it in the pseries
> platform
> directory because most of the pseries paravirt code lives there for
> some reason; I don't know why. Luckily for me that followed the same
> model that Dan used when he put the NFIT driver in drivers/acpi/ and
> the libnvdimm core in drivers/nvdimm/ so we didn't have anything to
> argue about. However, as Matthew pointed out, it is at odds with how
> most subsystems operate. Is there any particular reason we're doing
> things this way or should we think about moving libnvdimm users to
> drivers/nvdimm/?
> 
> Oliver


I'm not too fussed where it ends up, as long as it ends up somewhere :)

From what I can tell, the issue is that we have both "infrastructure"
drivers and end-device drivers. To me, it feels like drivers/nvdimm
should contain both.

I could move it back to drivers/nvdimm/ocxl, but I felt that it was
only tolerated there, not desired. This could be cleared up with a
response from Dan Williams, and if it is indeed desired, this is my
preferred location.

I think a case could also be made for drivers/ocxl, simply because we
don't expect more than a handful of drivers to ever live there (I
expect most users will drive their devices from userspace via libocxl).

In defence of keeping it in arch/powerpc/platforms/powernv, I highly
doubt this driver will end up being used on any other platform. Even
though OpenCAPI was engineered as an open standard, there is some
competition from industry giants with a competing standard on a much
more popular platform.

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

* Re: [PATCH v3 03/27] powerpc: Map & release OpenCAPI LPC memory
  2020-02-25 10:02   ` Frederic Barrat
@ 2020-02-26  0:19     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-26  0:19 UTC (permalink / raw)
  To: Frederic Barrat
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Tue, 2020-02-25 at 11:02 +0100, Frederic Barrat wrote:
> 
> On 21/02/2020 at 04:26, Alastair D'Silva wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > This patch adds platform support to map & release LPC memory.
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   arch/powerpc/include/asm/pnv-ocxl.h   |  4 +++
> >   arch/powerpc/platforms/powernv/ocxl.c | 43
> > +++++++++++++++++++++++++++
> >   2 files changed, 47 insertions(+)
> > 
> > diff --git a/arch/powerpc/include/asm/pnv-ocxl.h
> > b/arch/powerpc/include/asm/pnv-ocxl.h
> > index 7de82647e761..0b2a6707e555 100644
> > --- a/arch/powerpc/include/asm/pnv-ocxl.h
> > +++ b/arch/powerpc/include/asm/pnv-ocxl.h
> > @@ -32,5 +32,9 @@ extern int pnv_ocxl_spa_remove_pe_from_cache(void
> > *platform_data, int pe_handle)
> >   
> >   extern int pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr);
> >   extern void pnv_ocxl_free_xive_irq(u32 irq);
> > +#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
> > +u64 pnv_ocxl_platform_lpc_setup(struct pci_dev *pdev, u64 size);
> > +void pnv_ocxl_platform_lpc_release(struct pci_dev *pdev);
> > +#endif
> 
> This breaks the compilation of the ocxl driver if
> CONFIG_MEMORY_HOTPLUG=n
> 
> Those functions still make sense even without memory hotplug, for 
> example in the context of the implementation you had to access
> opencapi 
> LPC memory through mmap(). The #ifdef is really needed only around
> the 
> check_hotplug_memory_addressable() call.
> 
>    Fred

Hmm, we do still need sparsemem though. Let me think about this some
more.
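
For reference, the structure Fred describes could look roughly like the
sketch below. It is a userspace model, not the real platform code: the
MODEL_* macros, the function names, the page shift and the PFN limit are all
made up for illustration, and it does not settle the sparsemem question
raised above. The point is only that the setup function is always built,
while the addressability check alone is conditional:

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 16 /* assumption: 64K pages, illustration only */

/* Define this to model a build with CONFIG_MEMORY_HOTPLUG_SPARSE=y. */
/* #define MODEL_MEMORY_HOTPLUG_SPARSE */

#ifdef MODEL_MEMORY_HOTPLUG_SPARSE
/* Hypothetical stand-in for check_hotplug_memory_addressable(). */
static int model_check_addressable(uint64_t start_pfn, uint64_t nr_pages)
{
	const uint64_t max_pfn = UINT64_C(1) << 36; /* made-up limit */

	return (start_pfn + nr_pages > max_pfn) ? -1 : 0;
}
#endif

/* Always built; only the addressability check depends on the config. */
static uint64_t model_lpc_setup(uint64_t base_addr, uint64_t size)
{
#ifdef MODEL_MEMORY_HOTPLUG_SPARSE
	if (model_check_addressable(base_addr >> MODEL_PAGE_SHIFT,
				    size >> MODEL_PAGE_SHIFT))
		return 0; /* 0 signals failure, as in the patch */
#endif
	(void)size; /* unused when the check is compiled out */
	return base_addr;
}
```

A caller sees the same prototype in both configurations, so code that uses
the setup function would not need its own #ifdef.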

> 
> 
> >   #endif /* _ASM_PNV_OCXL_H */
> > diff --git a/arch/powerpc/platforms/powernv/ocxl.c
> > b/arch/powerpc/platforms/powernv/ocxl.c
> > index 8c65aacda9c8..f2edbcc67361 100644
> > --- a/arch/powerpc/platforms/powernv/ocxl.c
> > +++ b/arch/powerpc/platforms/powernv/ocxl.c
> > @@ -475,6 +475,49 @@ void pnv_ocxl_spa_release(void *platform_data)
> >   }
> >   EXPORT_SYMBOL_GPL(pnv_ocxl_spa_release);
> >   
> > +#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
> > +u64 pnv_ocxl_platform_lpc_setup(struct pci_dev *pdev, u64 size)
> > +{
> > +	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> > +	struct pnv_phb *phb = hose->private_data;
> > +	u32 bdfn = pci_dev_id(pdev);
> > +	__be64 base_addr_be64;
> > +	u64 base_addr;
> > +	int rc;
> > +
> > +	rc = opal_npu_mem_alloc(phb->opal_id, bdfn, size,
> > &base_addr_be64);
> > +	if (rc) {
> > +		dev_warn(&pdev->dev,
> > +			 "OPAL could not allocate LPC memory, rc=%d\n",
> > rc);
> > +		return 0;
> > +	}
> > +
> > +	base_addr = be64_to_cpu(base_addr_be64);
> > +
> > +	rc = check_hotplug_memory_addressable(base_addr >> PAGE_SHIFT,
> > +					      size >> PAGE_SHIFT);
> > +	if (rc)
> > +		return 0;
> > +
> > +	return base_addr;
> > +}
> > +EXPORT_SYMBOL_GPL(pnv_ocxl_platform_lpc_setup);
> > +
> > +void pnv_ocxl_platform_lpc_release(struct pci_dev *pdev)
> > +{
> > +	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> > +	struct pnv_phb *phb = hose->private_data;
> > +	u32 bdfn = pci_dev_id(pdev);
> > +	int rc;
> > +
> > +	rc = opal_npu_mem_release(phb->opal_id, bdfn);
> > +	if (rc)
> > +		dev_warn(&pdev->dev,
> > +			 "OPAL reported rc=%d when releasing LPC
> > memory\n", rc);
> > +}
> > +EXPORT_SYMBOL_GPL(pnv_ocxl_platform_lpc_release);
> > +#endif
> > +
> >   int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int
> > pe_handle)
> >   {
> >   	struct spa_data *data = (struct spa_data *) platform_data;
> > 
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

* Re: [PATCH v3 06/27] ocxl: Tally up the LPC memory on a link & allow it to be mapped
  2020-02-25 16:30   ` Frederic Barrat
@ 2020-02-26  0:29     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-26  0:29 UTC (permalink / raw)
  To: Frederic Barrat
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Tue, 2020-02-25 at 17:30 +0100, Frederic Barrat wrote:
> 
> On 21/02/2020 at 04:26, Alastair D'Silva wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > Tally up the LPC memory on an OpenCAPI link & allow it to be mapped
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   drivers/misc/ocxl/core.c          | 10 ++++++
> >   drivers/misc/ocxl/link.c          | 53
> > +++++++++++++++++++++++++++++++
> >   drivers/misc/ocxl/ocxl_internal.h | 33 +++++++++++++++++++
> >   3 files changed, 96 insertions(+)
> > 
> > diff --git a/drivers/misc/ocxl/core.c b/drivers/misc/ocxl/core.c
> > index b7a09b21ab36..2531c6cf19a0 100644
> > --- a/drivers/misc/ocxl/core.c
> > +++ b/drivers/misc/ocxl/core.c
> > @@ -230,8 +230,18 @@ static int configure_afu(struct ocxl_afu *afu,
> > u8 afu_idx, struct pci_dev *dev)
> >   	if (rc)
> >   		goto err_free_pasid;
> >   
> > +	if (afu->config.lpc_mem_size || afu-
> > >config.special_purpose_mem_size) {
> > +		rc = ocxl_link_add_lpc_mem(afu->fn->link, afu-
> > >config.lpc_mem_offset,
> > +					   afu->config.lpc_mem_size +
> > +					   afu-
> > >config.special_purpose_mem_size);
> > +		if (rc)
> > +			goto err_free_mmio;
> > +	}
> > +
> >   	return 0;
> >   
> > +err_free_mmio:
> > +	unmap_mmio_areas(afu);
> >   err_free_pasid:
> >   	reclaim_afu_pasid(afu);
> >   err_free_actag:
> > diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
> > index 58d111afd9f6..1e039cc5ebe5 100644
> > --- a/drivers/misc/ocxl/link.c
> > +++ b/drivers/misc/ocxl/link.c
> > @@ -84,6 +84,11 @@ struct ocxl_link {
> >   	int dev;
> >   	atomic_t irq_available;
> >   	struct spa *spa;
> > +	struct mutex lpc_mem_lock; /* protects lpc_mem & lpc_mem_sz */
> > +	u64 lpc_mem_sz; /* Total amount of LPC memory presented on the
> > link */
> > +	u64 lpc_mem;
> > +	int lpc_consumers;
> > +
> >   	void *platform_data;
> >   };
> >   static struct list_head links_list = LIST_HEAD_INIT(links_list);
> > @@ -396,6 +401,8 @@ static int alloc_link(struct pci_dev *dev, int
> > PE_mask, struct ocxl_link **out_l
> >   	if (rc)
> >   		goto err_spa;
> >   
> > +	mutex_init(&link->lpc_mem_lock);
> > +
> >   	/* platform specific hook */
> >   	rc = pnv_ocxl_spa_setup(dev, link->spa->spa_mem, PE_mask,
> >   				&link->platform_data);
> > @@ -711,3 +718,49 @@ void ocxl_link_free_irq(void *link_handle, int
> > hw_irq)
> >   	atomic_inc(&link->irq_available);
> >   }
> >   EXPORT_SYMBOL_GPL(ocxl_link_free_irq);
> > +
> > +int ocxl_link_add_lpc_mem(void *link_handle, u64 offset, u64 size)
> > +{
> > +	struct ocxl_link *link = (struct ocxl_link *) link_handle;
> > +
> > +	// Check for overflow
> > +	if (offset > (offset + size))
> > +		return -EINVAL;
> > +
> > +	mutex_lock(&link->lpc_mem_lock);
> > +	link->lpc_mem_sz = max(link->lpc_mem_sz, offset + size);
> > +
> > +	mutex_unlock(&link->lpc_mem_lock);
> > +
> > +	return 0;
> > +}
> > +
> > +u64 ocxl_link_lpc_map(void *link_handle, struct pci_dev *pdev)
> > +{
> > +	struct ocxl_link *link = (struct ocxl_link *) link_handle;
> > +
> > +	mutex_lock(&link->lpc_mem_lock);
> > +
> > +	if(!link->lpc_mem)
> > +		link->lpc_mem = pnv_ocxl_platform_lpc_setup(pdev, link-
> > >lpc_mem_sz);
> > +
> > +	if(link->lpc_mem)
> > +		link->lpc_consumers++;
> > +	mutex_unlock(&link->lpc_mem_lock);
> > +
> > +	return link->lpc_mem;
> > +}
> > +
> > +void ocxl_link_lpc_release(void *link_handle, struct pci_dev
> > *pdev)
> > +{
> > +	struct ocxl_link *link = (struct ocxl_link *) link_handle;
> > +
> > +	mutex_lock(&link->lpc_mem_lock);
> > +	WARN_ON(--link->lpc_consumers < 0);
> 
> Here, we always decrement the lpc_consumers count. However, it was
> only 
> incremented if the mapping was setup correctly in opal.
> 
> We could arguably claim that ocxl_link_lpc_release() should only be 
> called if ocxl_link_lpc_map() succeeded, but it would make error
> path 
> handling easier if we only decrement the lpc_consumers count if 
> link->lpc_mem is set. So that we can just call
> ocxl_link_lpc_release() 
> in error paths without having to worry about triggering the WARN_ON
> message.
> 
>    Fred
> 
> 

Ok, this makes sense.

> 
> > +	if (link->lpc_consumers == 0) {
> > +		pnv_ocxl_platform_lpc_release(pdev);
> > +		link->lpc_mem = 0;
> > +	}
> > +
> > +	mutex_unlock(&link->lpc_mem_lock);
> > +}
> > diff --git a/drivers/misc/ocxl/ocxl_internal.h
> > b/drivers/misc/ocxl/ocxl_internal.h
> > index 198e4e4bc51d..d0c8c4838f42 100644
> > --- a/drivers/misc/ocxl/ocxl_internal.h
> > +++ b/drivers/misc/ocxl/ocxl_internal.h
> > @@ -142,4 +142,37 @@ int ocxl_irq_offset_to_id(struct ocxl_context
> > *ctx, u64 offset);
> >   u64 ocxl_irq_id_to_offset(struct ocxl_context *ctx, int irq_id);
> >   void ocxl_afu_irq_free_all(struct ocxl_context *ctx);
> >   
> > +/**
> > + * ocxl_link_add_lpc_mem() - Increment the amount of memory
> > required by an OpenCAPI link
> > + *
> > + * @link_handle: The OpenCAPI link handle
> > + * @offset: The offset of the memory to add
> > + * @size: The amount of memory to increment by
> > + *
> > + * Returns 0 on success, negative on overflow
> > + */
> > +int ocxl_link_add_lpc_mem(void *link_handle, u64 offset, u64
> > size);
> > +
> > +/**
> > + * ocxl_link_lpc_map() - Map the LPC memory for an OpenCAPI device
> > + * Since LPC memory belongs to a link, the whole LPC memory
> > available
> > + * on the link must be mapped in order to make it accessible to a
> > device.
> > + * @link_handle: The OpenCAPI link handle
> > + * @pdev: A device that is on the link
> > + *
> > + * Returns the address of the mapped LPC memory, or 0 on error
> > + */
> > +u64 ocxl_link_lpc_map(void *link_handle, struct pci_dev *pdev);
> > +
> > +/**
> > + * ocxl_link_lpc_release() - Release the LPC memory device for an
> > OpenCAPI device
> > + *
> > + * Offlines LPC memory on an OpenCAPI link for a device. If this
> > is the
> > + * last device on the link to release the memory, unmap it from
> > the link.
> > + *
> > + * @link_handle: The OpenCAPI link handle
> > + * @pdev: A device that is on the link
> > + */
> > +void ocxl_link_lpc_release(void *link_handle, struct pci_dev
> > *pdev);
> > +
> >   #endif /* _OCXL_INTERNAL_H_ */
> > 
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

* Re: [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices
  2020-02-26  0:13           ` Alastair D'Silva
@ 2020-02-26  0:32             ` Dan Williams
  2020-02-26  0:35               ` Alastair D'Silva
  0 siblings, 1 reply; 130+ messages in thread
From: Dan Williams @ 2020-02-26  0:32 UTC (permalink / raw)
  To: Alastair D'Silva
  Cc: Matthew Wilcox, Aneesh Kumar K . V, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Frederic Barrat,
	Andrew Donnellan, Arnd Bergmann, Greg Kroah-Hartman,
	Andrew Morton, Mauro Carvalho Chehab, David S. Miller,
	Rob Herring, Anton Blanchard, Krzysztof Kozlowski,
	Mahesh Salgaonkar, Madhavan Srinivasan, Cédric Le Goater,
	Anju T Sudhakar, Hari Bathini, Thomas Gleixner, Greg Kurz,
	Nicholas Piggin, Masahiro Yamada, Alexey Kardashevskiy,
	Linux Kernel Mailing List, linuxppc-dev, linux-nvdimm, Linux MM

On Tue, Feb 25, 2020 at 4:14 PM Alastair D'Silva <alastair@au1.ibm.com> wrote:
>
> On Mon, 2020-02-24 at 17:51 +1100, Oliver O'Halloran wrote:
> > On Mon, Feb 24, 2020 at 3:43 PM Alastair D'Silva <
> > alastair@au1.ibm.com> wrote:
> > > On Sun, 2020-02-23 at 20:37 -0800, Matthew Wilcox wrote:
> > > > On Mon, Feb 24, 2020 at 03:34:07PM +1100, Alastair D'Silva wrote:
> > > > > V3:
> > > > >   - Rebase against next/next-20200220
> > > > >   - Move driver to arch/powerpc/platforms/powernv, we now
> > > > > expect
> > > > > this
> > > > >     driver to go upstream via the powerpc tree
> > > >
> > > > That's rather the opposite direction of normal; mostly drivers
> > > > live
> > > > under
> > > > drivers/ and not in arch/.  It's easier for drivers to get
> > > > overlooked
> > > > when doing tree-wide changes if they're hiding.
> > >
> > > This is true, however, given that it was not all that desirable to
> > > have
> > > > it under drivers/nvdimm, its sister driver (for the same hardware)
> > > is
> > > also under arch, and that we don't expect this driver to be used on
> > > any
> > > platform other than powernv, we think this was the most reasonable
> > > place to put it.
> >
> > Historically powernv specific platform drivers go in their respective
> > subsystem trees rather than in arch/ and I'd prefer we kept it that
> > way. When I added the papr_scm driver I put it in the pseries
> > platform
> > directory because most of the pseries paravirt code lives there for
> > some reason; I don't know why. Luckily for me that followed the same
> > model that Dan used when he put the NFIT driver in drivers/acpi/ and
> > the libnvdimm core in drivers/nvdimm/ so we didn't have anything to
> > argue about. However, as Matthew pointed out, it is at odds with how
> > most subsystems operate. Is there any particular reason we're doing
> > things this way or should we think about moving libnvdimm users to
> > drivers/nvdimm/?
> >
> > Oliver
>
>
> I'm not too fussed where it ends up, as long as it ends up somewhere :)
>
> From what I can tell, the issue is that we have both "infrastructure"
> drivers, and end-device drivers. To me, it feels like drivers/nvdimm
> should contain both, and I think this feels like the right approach.
>
> I could move it back to drivers/nvdimm/ocxl, but I felt that it was
> only tolerated there, not desired. This could be cleared up with a
> response from Dan Williams, and if it is indeed desired, this is my
> preferred location.

Apologies if I gave the impression it was only tolerated. I'm ok with
drivers/nvdimm/ocxl/, and to the larger point I'd also be ok with a
drivers/{acpi => nvdimm}/nfit and {arch/powerpc/platforms/pseries =>
drivers/nvdimm}/papr_scm.c move as well to keep all the consumers of
the nvdimm related code together with the core.

* RE: [PATCH v3 00/27] Add support for OpenCAPI Persistent Memory devices
  2020-02-26  0:32             ` Dan Williams
@ 2020-02-26  0:35               ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-26  0:35 UTC (permalink / raw)
  To: Dan Williams
  Cc: Matthew Wilcox, Aneesh Kumar K . V, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Frederic Barrat,
	Andrew Donnellan, Arnd Bergmann, Greg Kroah-Hartman,
	Andrew Morton, Mauro Carvalho Chehab, David S. Miller,
	Rob Herring, Anton Blanchard, Krzysztof Kozlowski,
	Mahesh Salgaonkar, Madhavan Srinivasan, Cédric Le Goater,
	Anju T Sudhakar, Hari Bathini, Thomas Gleixner, Greg Kurz,
	Nicholas Piggin, Masahiro Yamada, Alexey Kardashevskiy,
	Linux Kernel Mailing List, linuxppc-dev, linux-nvdimm, Linux MM

On Tue, 2020-02-25 at 16:32 -0800, Dan Williams wrote:
> On Tue, Feb 25, 2020 at 4:14 PM Alastair D'Silva <
> alastair@au1.ibm.com> wrote:
> > On Mon, 2020-02-24 at 17:51 +1100, Oliver O'Halloran wrote:
> > > On Mon, Feb 24, 2020 at 3:43 PM Alastair D'Silva <
> > > alastair@au1.ibm.com> wrote:
> > > > On Sun, 2020-02-23 at 20:37 -0800, Matthew Wilcox wrote:
> > > > > On Mon, Feb 24, 2020 at 03:34:07PM +1100, Alastair D'Silva
> > > > > wrote:
> > > > > > V3:
> > > > > >   - Rebase against next/next-20200220
> > > > > >   - Move driver to arch/powerpc/platforms/powernv, we now
> > > > > > expect
> > > > > > this
> > > > > >     driver to go upstream via the powerpc tree
> > > > > 
> > > > > That's rather the opposite direction of normal; mostly
> > > > > drivers
> > > > > live
> > > > > under
> > > > > drivers/ and not in arch/.  It's easier for drivers to get
> > > > > overlooked
> > > > > when doing tree-wide changes if they're hiding.
> > > > 
> > > > This is true, however, given that it was not all that desirable
> > > > to
> > > > have
> > > > > it under drivers/nvdimm, its sister driver (for the same
> > > > hardware)
> > > > is
> > > > also under arch, and that we don't expect this driver to be
> > > > used on
> > > > any
> > > > platform other than powernv, we think this was the most
> > > > reasonable
> > > > place to put it.
> > > 
> > > Historically powernv specific platform drivers go in their
> > > respective
> > > subsystem trees rather than in arch/ and I'd prefer we kept it
> > > that
> > > way. When I added the papr_scm driver I put it in the pseries
> > > platform
> > > directory because most of the pseries paravirt code lives there
> > > for
> > > some reason; I don't know why. Luckily for me that followed the
> > > same
> > > model that Dan used when he put the NFIT driver in drivers/acpi/
> > > and
> > > the libnvdimm core in drivers/nvdimm/ so we didn't have anything
> > > to
> > > argue about. However, as Matthew pointed out, it is at odds with
> > > how
> > > most subsystems operate. Is there any particular reason we're
> > > doing
> > > things this way or should we think about moving libnvdimm users
> > > to
> > > drivers/nvdimm/?
> > > 
> > > Oliver
> > 
> > I'm not too fussed where it ends up, as long as it ends up
> > somewhere :)
> > 
> > From what I can tell, the issue is that we have both
> > "infrastructure"
> > drivers, and end-device drivers. To me, it feels like
> > drivers/nvdimm
> > should contain both, and I think this feels like the right
> > approach.
> > 
> > I could move it back to drivers/nvdimm/ocxl, but I felt that it was
> > only tolerated there, not desired. This could be cleared up with a
> > response from Dan Williams, and if it is indeed desired, this is
> > my
> > preferred location.
> 
> Apologies if I gave the impression it was only tolerated. I'm ok with
> drivers/nvdimm/ocxl/, and to the larger point I'd also be ok with a
> drivers/{acpi => nvdimm}/nfit and {arch/powerpc/platforms/pseries =>
> drivers/nvdimm}/papr_scm.c move as well to keep all the consumers of
> the nvdimm related code together with the core.

Great, thanks for clarifying, text is so imprecise when it comes to
nuance :)

I'll move it back to drivers/nvdimm/ocxl then.

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

* Re: [PATCH v3 10/27] powerpc: Add driver for OpenCAPI Persistent Memory
  2020-02-21  3:27 ` [PATCH v3 10/27] powerpc: Add driver for OpenCAPI Persistent Memory Alastair D'Silva
@ 2020-02-26  5:07   ` Andrew Donnellan
  2020-02-26  5:49     ` Alastair D'Silva
  2020-02-27 20:44   ` Frederic Barrat
  2020-02-28 18:32   ` Frederic Barrat
  2 siblings, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-26  5:07 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> This driver exposes LPC memory on OpenCAPI pmem cards
> as an NVDIMM, allowing the existing nvdimm infrastructure
> to be used.
> 
> Namespace metadata is stored on the media itself, so
> scm_reserve_metadata() maps 1 section's worth of PMEM storage
> at the start to hold this. The rest of the PMEM range is registered
> with libnvdimm as an nvdimm. scm_ndctl_config_read/write/size() provide
> callbacks to libnvdimm to access the metadata.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>

I'm not particularly familiar with the nvdimm subsystem, so the scope of 
my review is more on the ocxl + misc issues side.

A few minor checkpatch warnings that don't matter all that much:

https://openpower.xyz/job/snowpatch/job/snowpatch-linux-checkpatch/11786//artifact/linux/checkpatch.log

A few other comments below.

> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> new file mode 100644
> index 000000000000..3c4eeb5dcc0f
> --- /dev/null
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -0,0 +1,473 @@
> +// SPDX-License-Id
> +// Copyright 2019 IBM Corp.
> +
> +/*
> + * A driver for OpenCAPI devices that implement the Storage Class
> + * Memory specification.
> + */
> +
> +#include <linux/module.h>
> +#include <misc/ocxl.h>
> +#include <linux/ndctl.h>
> +#include <linux/mm_types.h>
> +#include <linux/memory_hotplug.h>
> +#include "ocxl_internal.h"
> +
> +
> +static const struct pci_device_id ocxlpmem_pci_tbl[] = {
> +	{ PCI_DEVICE(PCI_VENDOR_ID_IBM, 0x0625), },
> +	{ }
> +};
> +
> +MODULE_DEVICE_TABLE(pci, ocxlpmem_pci_tbl);
> +
> +#define NUM_MINORS 256 // Total to reserve
> +
> +static dev_t ocxlpmem_dev;
> +static struct class *ocxlpmem_class;
> +static struct mutex minors_idr_lock;
> +static struct idr minors_idr;
> +
> +/**
> + * ndctl_config_write() - Handle a ND_CMD_SET_CONFIG_DATA command from ndctl
> + * @ocxlpmem: the device metadata
> + * @command: the incoming data to write
> + * Return: 0 on success, negative on failure
> + */
> +static int ndctl_config_write(struct ocxlpmem *ocxlpmem,
> +			      struct nd_cmd_set_config_hdr *command)
> +{
> +	if (command->in_offset + command->in_length > LABEL_AREA_SIZE)
> +		return -EINVAL;
> +
> +	memcpy_flushcache(ocxlpmem->metadata_addr + command->in_offset, command->in_buf,
> +			  command->in_length);

Out of scope for this patch - given that we use memcpy_mcsafe in the 
config read, does it make sense to change memcpy_flushcache to be mcsafe 
as well?

> +
> +	return 0;
> +}
> +
> +/**
> + * ndctl_config_read() - Handle a ND_CMD_GET_CONFIG_DATA command from ndctl
> + * @ocxlpmem: the device metadata
> + * @command: the read request
> + * Return: 0 on success, negative on failure
> + */
> +static int ndctl_config_read(struct ocxlpmem *ocxlpmem,
> +			     struct nd_cmd_get_config_data_hdr *command)
> +{
> +	if (command->in_offset + command->in_length > LABEL_AREA_SIZE)
> +		return -EINVAL;
> +
> +	memcpy_mcsafe(command->out_buf, ocxlpmem->metadata_addr + command->in_offset,
> +		      command->in_length);
> +
> +	return 0;
> +}
> +
> +/**
> + * ndctl_config_size() - Handle a ND_CMD_GET_CONFIG_SIZE command from ndctl
> + * @command: the read request
> + * Return: 0 on success, negative on failure
> + */
> +static int ndctl_config_size(struct nd_cmd_get_config_size *command)
> +{
> +	command->status = 0;
> +	command->config_size = LABEL_AREA_SIZE;
> +	command->max_xfer = PAGE_SIZE;
> +
> +	return 0;
> +}
> +
> +static int ndctl(struct nvdimm_bus_descriptor *nd_desc,
> +		 struct nvdimm *nvdimm,
> +		 unsigned int cmd, void *buf, unsigned int buf_len, int *cmd_rc)
> +{
> +	struct ocxlpmem *ocxlpmem = container_of(nd_desc, struct ocxlpmem, bus_desc);
> +
> +	switch (cmd) {
> +	case ND_CMD_GET_CONFIG_SIZE:
> +		*cmd_rc = ndctl_config_size(buf);
> +		return 0;
> +
> +	case ND_CMD_GET_CONFIG_DATA:
> +		*cmd_rc = ndctl_config_read(ocxlpmem, buf);
> +		return 0;
> +
> +	case ND_CMD_SET_CONFIG_DATA:
> +		*cmd_rc = ndctl_config_write(ocxlpmem, buf);
> +		return 0;
> +
> +	default:
> +		return -ENOTTY;
> +	}
> +}
> +
> +/**
> + * reserve_metadata() - Reserve space for nvdimm metadata
> + * @ocxlpmem: the device metadata
> + * @lpc_mem: The resource representing the LPC memory of the OpenCAPI device
> + */
> +static int reserve_metadata(struct ocxlpmem *ocxlpmem,
> +			    struct resource *lpc_mem)
> +{
> +	ocxlpmem->metadata_addr = devm_memremap(&ocxlpmem->dev, lpc_mem->start,
> +						LABEL_AREA_SIZE, MEMREMAP_WB);
> +	if (IS_ERR(ocxlpmem->metadata_addr))
> +		return PTR_ERR(ocxlpmem->metadata_addr);
> +
> +	return 0;
> +}
> +
> +/**
> + * register_lpc_mem() - Discover persistent memory on a device and register it with the NVDIMM subsystem
> + * @ocxlpmem: the device metadata
> + * Return: 0 on success
> + */
> +static int register_lpc_mem(struct ocxlpmem *ocxlpmem)
> +{
> +	struct nd_region_desc region_desc;
> +	struct nd_mapping_desc nd_mapping_desc;
> +	struct resource *lpc_mem;
> +	const struct ocxl_afu_config *config;
> +	const struct ocxl_fn_config *fn_config;
> +	int rc;
> +	unsigned long nvdimm_cmd_mask = 0;
> +	unsigned long nvdimm_flags = 0;
> +	int target_node;
> +	char serial[16+1];

inb4 mpe tells you to Reverse Christmas Tree

> +
> +	// Set up the reserved metadata area
> +	rc = ocxl_afu_map_lpc_mem(ocxlpmem->ocxl_afu);
> +	if (rc < 0)
> +		return rc;
> +
> +	lpc_mem = ocxl_afu_lpc_mem(ocxlpmem->ocxl_afu);
> +	if (lpc_mem == NULL || lpc_mem->start == 0)
> +		return -EINVAL;
> +
> +	config = ocxl_afu_config(ocxlpmem->ocxl_afu);
> +	fn_config = ocxl_function_config(ocxlpmem->ocxl_fn);
> +
> +	rc = reserve_metadata(ocxlpmem, lpc_mem);
> +	if (rc)
> +		return rc;
> +
> +	ocxlpmem->bus_desc.provider_name = "ocxl-pmem";
> +	ocxlpmem->bus_desc.ndctl = ndctl;
> +	ocxlpmem->bus_desc.module = THIS_MODULE;
> +
> +	ocxlpmem->nvdimm_bus = nvdimm_bus_register(&ocxlpmem->dev,
> +						   &ocxlpmem->bus_desc);
> +	if (!ocxlpmem->nvdimm_bus)
> +		return -EINVAL;
> +
> +	ocxlpmem->pmem_res.start = (u64)lpc_mem->start + LABEL_AREA_SIZE;
> +	ocxlpmem->pmem_res.end = (u64)lpc_mem->start + config->lpc_mem_size - 1;
> +	ocxlpmem->pmem_res.name = "OpenCAPI persistent memory";
> +
> +	set_bit(ND_CMD_GET_CONFIG_SIZE, &nvdimm_cmd_mask);
> +	set_bit(ND_CMD_GET_CONFIG_DATA, &nvdimm_cmd_mask);
> +	set_bit(ND_CMD_SET_CONFIG_DATA, &nvdimm_cmd_mask);
> +
> +	set_bit(NDD_ALIASING, &nvdimm_flags);
> +
> +	snprintf(serial, sizeof(serial), "%llx", fn_config->serial);
> +	nd_mapping_desc.nvdimm = nvdimm_create(ocxlpmem->nvdimm_bus, ocxlpmem,
> +				 NULL, nvdimm_flags, nvdimm_cmd_mask,
> +				 0, NULL);
> +	if (!nd_mapping_desc.nvdimm)
> +		return -ENOMEM;
> +
> +	if (nvdimm_bus_check_dimm_count(ocxlpmem->nvdimm_bus, 1))
> +		return -EINVAL;
> +
> +	nd_mapping_desc.start = ocxlpmem->pmem_res.start;
> +	nd_mapping_desc.size = resource_size(&ocxlpmem->pmem_res);
> +	nd_mapping_desc.position = 0;
> +
> +	ocxlpmem->nd_set.cookie1 = fn_config->serial + 1; // allow for empty serial
> +	ocxlpmem->nd_set.cookie2 = fn_config->serial + 1;
> +
> +	target_node = of_node_to_nid(ocxlpmem->pdev->dev.of_node);
> +
> +	memset(&region_desc, 0, sizeof(region_desc));
> +	region_desc.res = &ocxlpmem->pmem_res;
> +	region_desc.numa_node = NUMA_NO_NODE;
> +	region_desc.target_node = target_node;
> +	region_desc.num_mappings = 1;
> +	region_desc.mapping = &nd_mapping_desc;
> +	region_desc.nd_set = &ocxlpmem->nd_set;
> +
> +	set_bit(ND_REGION_PAGEMAP, &region_desc.flags);
> +	/*
> +	 * NB: libnvdimm copies the data from ndr_desc into its own
> +	 * structures so passing a stack pointer is fine.
> +	 */
> +	ocxlpmem->nd_region = nvdimm_pmem_region_create(ocxlpmem->nvdimm_bus,
> +							&region_desc);
> +	if (!ocxlpmem->nd_region)
> +		return -EINVAL;
> +
> +	dev_info(&ocxlpmem->dev,
> +		 "Onlining %lluMB of persistent memory\n",
> +		 nd_mapping_desc.size / SZ_1M);
> +
> +	return 0;
> +}
> +
> +/**
> + * allocate_minor() - Allocate a minor number to use for an OpenCAPI pmem device
> + * @ocxlpmem: the device metadata
> + * Return: the allocated minor number
> + */
> +static int allocate_minor(struct ocxlpmem *ocxlpmem)
> +{
> +	int minor;
> +
> +	mutex_lock(&minors_idr_lock);
> +	minor = idr_alloc(&minors_idr, ocxlpmem, 0, NUM_MINORS, GFP_KERNEL);
> +	mutex_unlock(&minors_idr_lock);
> +	return minor;
> +}
> +
> +static void free_minor(struct ocxlpmem *ocxlpmem)

The lack of a kerneldoc comment here is inconsistent :)

> +{
> +	mutex_lock(&minors_idr_lock);
> +	idr_remove(&minors_idr, MINOR(ocxlpmem->dev.devt));
> +	mutex_unlock(&minors_idr_lock);
> +}
> +
> +/**
> + * free_ocxlpmem() - Free all members of an ocxlpmem struct
> + * @ocxlpmem: the device struct to clear
> + */
> +static void free_ocxlpmem(struct ocxlpmem *ocxlpmem)
> +{
> +	int rc;
> +
> +	if (ocxlpmem->nvdimm_bus)
> +		nvdimm_bus_unregister(ocxlpmem->nvdimm_bus);
> +
> +	free_minor(ocxlpmem);
> +
> +	if (ocxlpmem->metadata_addr)
> +		devm_memunmap(&ocxlpmem->dev, ocxlpmem->metadata_addr);
> +
> +	if (ocxlpmem->ocxl_context) {
> +		rc = ocxl_context_detach(ocxlpmem->ocxl_context);
> +		if (rc == -EBUSY)
> +			dev_warn(&ocxlpmem->dev, "Timeout detaching ocxl context\n");
> +		else
> +			ocxl_context_free(ocxlpmem->ocxl_context);
> +
> +	}
> +
> +	if (ocxlpmem->ocxl_afu)
> +		ocxl_afu_put(ocxlpmem->ocxl_afu);
> +
> +	if (ocxlpmem->ocxl_fn)
> +		ocxl_function_close(ocxlpmem->ocxl_fn);
> +
> +	kfree(ocxlpmem);
> +}
> +
> +/**
> + * free_ocxlpmem_dev() - Free an OpenCAPI persistent memory device
> + * @dev: The device struct
> + */
> +static void free_ocxlpmem_dev(struct device *dev)
> +{
> +	struct ocxlpmem *ocxlpmem = container_of(dev, struct ocxlpmem, dev);
> +
> +	free_ocxlpmem(ocxlpmem);
> +}
> +
> +/**
> + * ocxlpmem_register() - Register an OpenCAPI pmem device with the kernel
> + * @ocxlpmem: the device metadata
> + * Return: 0 on success, negative on failure
> + */
> +static int ocxlpmem_register(struct ocxlpmem *ocxlpmem)
> +{
> +	int rc;
> +	int minor = allocate_minor(ocxlpmem);
> +
> +	if (minor < 0)
> +		return minor;
> +
> +	ocxlpmem->dev.release = free_ocxlpmem_dev;
> +	rc = dev_set_name(&ocxlpmem->dev, "ocxlpmem%d", minor);
> +	if (rc < 0)
> +		return rc;
> +
> +	ocxlpmem->dev.devt = MKDEV(MAJOR(ocxlpmem_dev), minor);
> +	ocxlpmem->dev.class = ocxlpmem_class;
> +	ocxlpmem->dev.parent = &ocxlpmem->pdev->dev;
> +
> +	return device_register(&ocxlpmem->dev);
> +}
> +
> +/**
> + * ocxlpmem_remove() - Free an OpenCAPI persistent memory device
> + * @pdev: the PCI device information struct
> + */
> +static void ocxlpmem_remove(struct pci_dev *pdev)
> +{
> +	if (PCI_FUNC(pdev->devfn) == 0) {
> +		struct ocxlpmem_function0 *func0 = pci_get_drvdata(pdev);
> +
> +		if (func0) {
> +			ocxl_function_close(func0->ocxl_fn);
> +			func0->ocxl_fn = NULL;
> +		}
> +	} else {
> +		struct ocxlpmem *ocxlpmem = pci_get_drvdata(pdev);
> +
> +		if (ocxlpmem)
> +			device_unregister(&ocxlpmem->dev);
> +	}
> +}
> +
> +/**
> + * probe_function0() - Set up function 0 for an OpenCAPI persistent memory device
> + * This is important as it enables templates higher than 0 across all other functions,
> + * which in turn enables higher bandwidth accesses
> + * @pdev: the PCI device information struct
> + * Return: 0 on success, negative on failure
> + */
> +static int probe_function0(struct pci_dev *pdev)
> +{
> +	struct ocxlpmem_function0 *func0 = NULL;
> +	struct ocxl_fn *fn;
> +
> +	func0 = kzalloc(sizeof(*func0), GFP_KERNEL);
> +	if (!func0)
> +		return -ENOMEM;
> +
> +	func0->pdev = pdev;
> +	fn = ocxl_function_open(pdev);
> +	if (IS_ERR(fn)) {
> +		kfree(func0);
> +		dev_err(&pdev->dev, "failed to open OCXL function\n");
> +		return PTR_ERR(fn);
> +	}
> +	func0->ocxl_fn = fn;
> +
> +	pci_set_drvdata(pdev, func0);
> +
> +	return 0;
> +}
> +
> +/**
> + * probe() - Init an OpenCAPI persistent memory device
> + * @pdev: the PCI device information struct
> + * @ent: The entry from ocxlpmem_pci_tbl
> + * Return: 0 on success, negative on failure
> + */
> +static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> +{
> +	struct ocxlpmem *ocxlpmem;
> +	int rc;
> +
> +	if (PCI_FUNC(pdev->devfn) == 0)
> +		return probe_function0(pdev);
> +	else if (PCI_FUNC(pdev->devfn) != 1)
> +		return 0;
> +
> +	ocxlpmem = kzalloc(sizeof(*ocxlpmem), GFP_KERNEL);
> +	if (!ocxlpmem) {
> +		dev_err(&pdev->dev, "Could not allocate OpenCAPI persistent memory metadata\n");
> +		rc = -ENOMEM;
> +		goto err;
> +	}
> +	ocxlpmem->pdev = pdev;
> +
> +	pci_set_drvdata(pdev, ocxlpmem);
> +
> +	ocxlpmem->ocxl_fn = ocxl_function_open(pdev);
> +	if (IS_ERR(ocxlpmem->ocxl_fn)) {
> +		kfree(ocxlpmem);

You can't free this yet...

> +		pci_set_drvdata(pdev, NULL);
> +		dev_err(&pdev->dev, "failed to open OCXL function\n");
> +		rc = PTR_ERR(ocxlpmem->ocxl_fn);
> +		goto err;
> +	}
> +
> +	ocxlpmem->ocxl_afu = ocxl_function_fetch_afu(ocxlpmem->ocxl_fn, 0);
> +	if (ocxlpmem->ocxl_afu == NULL) {
> +		dev_err(&pdev->dev, "Could not get OCXL AFU from function\n");
> +		rc = -ENXIO;
> +		goto err;

Meanwhile in this case, I think ocxlpmem and ocxlpmem->ocxl_fn get 
leaked (it's before ocxlpmem_register(), so the cleanup handler hasn't 
been registered).

Add some new error labels for these two paths?
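
A minimal userspace sketch of what separate labels could look like. Everything here is a stand-in: the stub_* helpers, the fail_* knobs, the label names and the live_allocs counter are hypothetical, and the stubs return NULL where the real ocxl_function_open() returns an ERR_PTR. The point is only the ordering: the error code is set before the struct is freed, and each label undoes exactly what had succeeded so far.

```c
#include <errno.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel/ocxl objects; not the real API. */
struct ocxl_fn { int open; };
struct ocxl_afu { int refs; };

struct ocxlpmem {
	struct ocxl_fn *ocxl_fn;
	struct ocxl_afu *ocxl_afu;
};

static struct ocxl_fn the_fn;
static struct ocxl_afu the_afu;
static int fail_open, fail_fetch;	/* hypothetical knobs to force a failure */
static int live_allocs;			/* outstanding ocxlpmem allocations */

static struct ocxl_fn *stub_function_open(void)
{
	if (fail_open)
		return NULL;
	the_fn.open = 1;
	return &the_fn;
}

static void stub_function_close(struct ocxl_fn *fn)
{
	fn->open = 0;
}

static struct ocxl_afu *stub_fetch_afu(struct ocxl_fn *fn)
{
	(void)fn;
	return fail_fetch ? NULL : &the_afu;
}

/* Early part of probe(), up to the point where the release handler takes over. */
static int probe_sketch(struct ocxlpmem **out)
{
	struct ocxlpmem *ocxlpmem;
	int rc;

	ocxlpmem = calloc(1, sizeof(*ocxlpmem));
	if (!ocxlpmem)
		return -ENOMEM;
	live_allocs++;

	ocxlpmem->ocxl_fn = stub_function_open();
	if (!ocxlpmem->ocxl_fn) {
		rc = -ENODEV;	/* capture the error before anything is freed */
		goto err_free;
	}

	ocxlpmem->ocxl_afu = stub_fetch_afu(ocxlpmem->ocxl_fn);
	if (!ocxlpmem->ocxl_afu) {
		rc = -ENXIO;
		goto err_close_fn;
	}

	*out = ocxlpmem;
	return 0;

err_close_fn:
	stub_function_close(ocxlpmem->ocxl_fn);
err_free:
	free(ocxlpmem);
	live_allocs--;
	return rc;
}
```

In the real driver these labels would only cover failures before ocxlpmem_register(); after that point the release handler owns the cleanup.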

> +	}
> +
> +	ocxl_afu_get(ocxlpmem->ocxl_afu);
> +
> +	// Resources allocated below here are cleaned up in the release handler
> +
> +	rc = ocxlpmem_register(ocxlpmem);
> +	if (rc) {
> +		dev_err(&pdev->dev, "Could not register OpenCAPI persistent memory device with the kernel\n");
> +		goto err;
> +	}
> +
> +	rc = ocxl_context_alloc(&ocxlpmem->ocxl_context, ocxlpmem->ocxl_afu, NULL);
> +	if (rc) {
> +		dev_err(&pdev->dev, "Could not allocate OCXL context\n");
> +		goto err;
> +	}
> +
> +	rc = ocxl_context_attach(ocxlpmem->ocxl_context, 0, NULL);
> +	if (rc) {
> +		dev_err(&pdev->dev, "Could not attach ocxl context\n");
> +		goto err;
> +	}
> +
> +	rc = register_lpc_mem(ocxlpmem);
> +	if (rc) {
> +		dev_err(&pdev->dev, "Could not register OpenCAPI persistent memory with libnvdimm\n");
> +		goto err;
> +	}
> +
> +	return 0;
> +
> +err:
> +	/*
> +	 * Further cleanup is done in the release handler via free_ocxlpmem()
> +	 * This allows us to keep the character device live to handle IOCTLs to
> +	 * investigate issues if the card has an error
> +	 */
> +
> +	dev_err(&pdev->dev,
> +		"Error detected, will not register OpenCAPI persistent memory\n");
> +	return rc;
> +}
> +
> +static struct pci_driver pci_driver = {
> +	.name = "ocxl-pmem",
> +	.id_table = ocxlpmem_pci_tbl,
> +	.probe = probe,
> +	.remove = ocxlpmem_remove,
> +	.shutdown = ocxlpmem_remove,
> +};
> +
> +static int __init ocxlpmem_init(void)
> +{
> +	int rc = 0;
> +
> +	rc = pci_register_driver(&pci_driver);
> +	if (rc)
> +		return rc;
> +
> +	return 0;
> +}
> +
> +static void ocxlpmem_exit(void)
> +{
> +	pci_unregister_driver(&pci_driver);
> +}
> +
> +module_init(ocxlpmem_init);
> +module_exit(ocxlpmem_exit);
> +
> +MODULE_DESCRIPTION("OpenCAPI Persistent Memory");
> +MODULE_LICENSE("GPL");
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> new file mode 100644
> index 000000000000..0faf3740e9b8
> --- /dev/null
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> @@ -0,0 +1,28 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +// Copyright 2019 IBM Corp.
> +
> +#include <linux/pci.h>
> +#include <misc/ocxl.h>
> +#include <linux/libnvdimm.h>
> +#include <linux/mm.h>
> +
> +#define LABEL_AREA_SIZE	(1UL << PA_SECTION_SHIFT)
> +
> +struct ocxlpmem_function0 {
> +	struct pci_dev *pdev;
> +	struct ocxl_fn *ocxl_fn;
> +};

Hmm, given this struct serves no purpose other than associating an 
ocxl_fn with a pci_dev, and special handling for function 0 is probably 
going to be a common pattern in ocxl dependent drivers, perhaps we 
should export a function from ocxl that converts ocxl_fn -> pci_dev, and 
then you can just point the drvdata straight at the ocxl_fn?
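
A rough sketch of that idea, built on simplified stand-ins rather than the kernel headers. ocxl_fn_to_pci_dev() is a hypothetical name, not an existing ocxl export, and it assumes the ocxl_fn embeds a struct device whose parent is the PCI device's struct device. The drvdata helpers and stub_function_open() are likewise cut-down stand-ins.

```c
#include <stddef.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Cut-down stand-ins for the kernel structures. */
struct device {
	struct device *parent;
	void *driver_data;
};

struct pci_dev {
	unsigned int devfn;
	struct device dev;
};

struct ocxl_fn {
	struct device dev;	/* assumed: parent points at the PCI device */
};

/* Hypothetical export: recover the PCI device behind an ocxl function. */
static struct pci_dev *ocxl_fn_to_pci_dev(struct ocxl_fn *fn)
{
	return container_of(fn->dev.parent, struct pci_dev, dev);
}

static void pci_set_drvdata(struct pci_dev *pdev, void *data)
{
	pdev->dev.driver_data = data;
}

static void *pci_get_drvdata(struct pci_dev *pdev)
{
	return pdev->dev.driver_data;
}

static struct ocxl_fn fn0;

/* Stand-in for ocxl_function_open(): links the function to its PCI device. */
static struct ocxl_fn *stub_function_open(struct pci_dev *pdev)
{
	fn0.dev.parent = &pdev->dev;
	return &fn0;
}

/* Function 0 probe with no wrapper struct: drvdata is the ocxl_fn itself. */
static int probe_function0_sketch(struct pci_dev *pdev)
{
	struct ocxl_fn *fn = stub_function_open(pdev);

	pci_set_drvdata(pdev, fn);
	return 0;
}
```

With this shape the remove path would fetch the ocxl_fn from drvdata and close it directly, and struct ocxlpmem_function0 would no longer be needed.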


> +
> +struct ocxlpmem {
> +	struct device dev;
> +	struct pci_dev *pdev;
> +	struct ocxl_fn *ocxl_fn;
> +	struct nd_interleave_set nd_set;
> +	struct nvdimm_bus_descriptor bus_desc;
> +	struct nvdimm_bus *nvdimm_bus;
> +	struct ocxl_afu *ocxl_afu;
> +	struct ocxl_context *ocxl_context;
> +	void *metadata_addr;
> +	struct resource pmem_res;
> +	struct nd_region *nd_region;
> +};
> 

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

* Re: [PATCH v3 10/27] powerpc: Add driver for OpenCAPI Persistent Memory
  2020-02-26  5:07   ` Andrew Donnellan
@ 2020-02-26  5:49     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-26  5:49 UTC (permalink / raw)
  To: Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Wed, 2020-02-26 at 16:07 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > This driver exposes LPC memory on OpenCAPI pmem cards
> > as an NVDIMM, allowing the existing nvdimm infrastructure
> > to be used.
> > 
> > Namespace metadata is stored on the media itself, so
> > scm_reserve_metadata() maps 1 section's worth of PMEM storage
> > at the start to hold this. The rest of the PMEM range is registered
> > with libnvdimm as an nvdimm. scm_ndctl_config_read/write/size()
> > provide
> > callbacks to libnvdimm to access the metadata.
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> 
> I'm not particularly familiar with the nvdimm subsystem, so the scope
> of 
> my review is more on the ocxl + misc issues side.
> 
> A few minor checkpatch warnings that don't matter all that much:
> 
> https://openpower.xyz/job/snowpatch/job/snowpatch-linux-checkpatch/11786//artifact/linux/checkpatch.log
> 
> A few other comments below.
> 
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > new file mode 100644
> > index 000000000000..3c4eeb5dcc0f
> > --- /dev/null
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > @@ -0,0 +1,473 @@
> > +// SPDX-License-Id
> > +// Copyright 2019 IBM Corp.
> > +
> > +/*
> > + * A driver for OpenCAPI devices that implement the Storage Class
> > + * Memory specification.
> > + */
> > +
> > +#include <linux/module.h>
> > +#include <misc/ocxl.h>
> > +#include <linux/ndctl.h>
> > +#include <linux/mm_types.h>
> > +#include <linux/memory_hotplug.h>
> > +#include "ocxl_internal.h"
> > +
> > +
> > +static const struct pci_device_id ocxlpmem_pci_tbl[] = {
> > +	{ PCI_DEVICE(PCI_VENDOR_ID_IBM, 0x0625), },
> > +	{ }
> > +};
> > +
> > +MODULE_DEVICE_TABLE(pci, ocxlpmem_pci_tbl);
> > +
> > +#define NUM_MINORS 256 // Total to reserve
> > +
> > +static dev_t ocxlpmem_dev;
> > +static struct class *ocxlpmem_class;
> > +static struct mutex minors_idr_lock;
> > +static struct idr minors_idr;
> > +
> > +/**
> > + * ndctl_config_write() - Handle a ND_CMD_SET_CONFIG_DATA command
> > from ndctl
> > + * @ocxlpmem: the device metadata
> > + * @command: the incoming data to write
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int ndctl_config_write(struct ocxlpmem *ocxlpmem,
> > +			      struct nd_cmd_set_config_hdr *command)
> > +{
> > +	if (command->in_offset + command->in_length > LABEL_AREA_SIZE)
> > +		return -EINVAL;
> > +
> > +	memcpy_flushcache(ocxlpmem->metadata_addr + command->in_offset, 
> > command->in_buf,
> > +			  command->in_length);
> 
> Out of scope for this patch - given that we use memcpy_mcsafe in the 
> config read, does it make sense to change memcpy_flushcache to be
> mcsafe 
> as well?
> 

Aneesh has confirmed that stores don't generate machine checks.

> > +
> > +	return 0;
> > +}
> > +
> > +/**
> > + * ndctl_config_read() - Handle a ND_CMD_GET_CONFIG_DATA command
> > from ndctl
> > + * @ocxlpmem: the device metadata
> > + * @command: the read request
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int ndctl_config_read(struct ocxlpmem *ocxlpmem,
> > +			     struct nd_cmd_get_config_data_hdr
> > *command)
> > +{
> > +	if (command->in_offset + command->in_length > LABEL_AREA_SIZE)
> > +		return -EINVAL;
> > +
> > +	memcpy_mcsafe(command->out_buf, ocxlpmem->metadata_addr +
> > command->in_offset,
> > +		      command->in_length);
> > +
> > +	return 0;
> > +}
> > +
> > +/**
> > + * ndctl_config_size() - Handle a ND_CMD_GET_CONFIG_SIZE command
> > from ndctl
> > + * @command: the read request
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int ndctl_config_size(struct nd_cmd_get_config_size
> > *command)
> > +{
> > +	command->status = 0;
> > +	command->config_size = LABEL_AREA_SIZE;
> > +	command->max_xfer = PAGE_SIZE;
> > +
> > +	return 0;
> > +}
> > +
> > +static int ndctl(struct nvdimm_bus_descriptor *nd_desc,
> > +		 struct nvdimm *nvdimm,
> > +		 unsigned int cmd, void *buf, unsigned int buf_len, int
> > *cmd_rc)
> > +{
> > +	struct ocxlpmem *ocxlpmem = container_of(nd_desc, struct
> > ocxlpmem, bus_desc);
> > +
> > +	switch (cmd) {
> > +	case ND_CMD_GET_CONFIG_SIZE:
> > +		*cmd_rc = ndctl_config_size(buf);
> > +		return 0;
> > +
> > +	case ND_CMD_GET_CONFIG_DATA:
> > +		*cmd_rc = ndctl_config_read(ocxlpmem, buf);
> > +		return 0;
> > +
> > +	case ND_CMD_SET_CONFIG_DATA:
> > +		*cmd_rc = ndctl_config_write(ocxlpmem, buf);
> > +		return 0;
> > +
> > +	default:
> > +		return -ENOTTY;
> > +	}
> > +}
> > +
> > +/**
> > + * reserve_metadata() - Reserve space for nvdimm metadata
> > + * @ocxlpmem: the device metadata
> > + * @lpc_mem: The resource representing the LPC memory of the
> > OpenCAPI device
> > + */
> > +static int reserve_metadata(struct ocxlpmem *ocxlpmem,
> > +			    struct resource *lpc_mem)
> > +{
> > +	ocxlpmem->metadata_addr = devm_memremap(&ocxlpmem->dev,
> > lpc_mem->start,
> > +						LABEL_AREA_SIZE,
> > MEMREMAP_WB);
> > +	if (IS_ERR(ocxlpmem->metadata_addr))
> > +		return PTR_ERR(ocxlpmem->metadata_addr);
> > +
> > +	return 0;
> > +}
> > +
> > +/**
> > + * register_lpc_mem() - Discover persistent memory on a device and
> > register it with the NVDIMM subsystem
> > + * @ocxlpmem: the device metadata
> > + * Return: 0 on success
> > + */
> > +static int register_lpc_mem(struct ocxlpmem *ocxlpmem)
> > +{
> > +	struct nd_region_desc region_desc;
> > +	struct nd_mapping_desc nd_mapping_desc;
> > +	struct resource *lpc_mem;
> > +	const struct ocxl_afu_config *config;
> > +	const struct ocxl_fn_config *fn_config;
> > +	int rc;
> > +	unsigned long nvdimm_cmd_mask = 0;
> > +	unsigned long nvdimm_flags = 0;
> > +	int target_node;
> > +	char serial[16+1];
> 
> inb4 mpe tells you to Reverse Christmas Tree
> 

-1, petty :P

> > +
> > +	// Set up the reserved metadata area
> > +	rc = ocxl_afu_map_lpc_mem(ocxlpmem->ocxl_afu);
> > +	if (rc < 0)
> > +		return rc;
> > +
> > +	lpc_mem = ocxl_afu_lpc_mem(ocxlpmem->ocxl_afu);
> > +	if (lpc_mem == NULL || lpc_mem->start == 0)
> > +		return -EINVAL;
> > +
> > +	config = ocxl_afu_config(ocxlpmem->ocxl_afu);
> > +	fn_config = ocxl_function_config(ocxlpmem->ocxl_fn);
> > +
> > +	rc = reserve_metadata(ocxlpmem, lpc_mem);
> > +	if (rc)
> > +		return rc;
> > +
> > +	ocxlpmem->bus_desc.provider_name = "ocxl-pmem";
> > +	ocxlpmem->bus_desc.ndctl = ndctl;
> > +	ocxlpmem->bus_desc.module = THIS_MODULE;
> > +
> > +	ocxlpmem->nvdimm_bus = nvdimm_bus_register(&ocxlpmem->dev,
> > +						   &ocxlpmem-
> > >bus_desc);
> > +	if (!ocxlpmem->nvdimm_bus)
> > +		return -EINVAL;
> > +
> > +	ocxlpmem->pmem_res.start = (u64)lpc_mem->start +
> > LABEL_AREA_SIZE;
> > +	ocxlpmem->pmem_res.end = (u64)lpc_mem->start + config-
> > >lpc_mem_size - 1;
> > +	ocxlpmem->pmem_res.name = "OpenCAPI persistent memory";
> > +
> > +	set_bit(ND_CMD_GET_CONFIG_SIZE, &nvdimm_cmd_mask);
> > +	set_bit(ND_CMD_GET_CONFIG_DATA, &nvdimm_cmd_mask);
> > +	set_bit(ND_CMD_SET_CONFIG_DATA, &nvdimm_cmd_mask);
> > +
> > +	set_bit(NDD_ALIASING, &nvdimm_flags);
> > +
> > +	snprintf(serial, sizeof(serial), "%llx", fn_config->serial);
> > +	nd_mapping_desc.nvdimm = nvdimm_create(ocxlpmem->nvdimm_bus,
> > ocxlpmem,
> > +				 NULL, nvdimm_flags, nvdimm_cmd_mask,
> > +				 0, NULL);
> > +	if (!nd_mapping_desc.nvdimm)
> > +		return -ENOMEM;
> > +
> > +	if (nvdimm_bus_check_dimm_count(ocxlpmem->nvdimm_bus, 1))
> > +		return -EINVAL;
> > +
> > +	nd_mapping_desc.start = ocxlpmem->pmem_res.start;
> > +	nd_mapping_desc.size = resource_size(&ocxlpmem->pmem_res);
> > +	nd_mapping_desc.position = 0;
> > +
> > +	ocxlpmem->nd_set.cookie1 = fn_config->serial + 1; // allow for
> > empty serial
> > +	ocxlpmem->nd_set.cookie2 = fn_config->serial + 1;
> > +
> > +	target_node = of_node_to_nid(ocxlpmem->pdev->dev.of_node);
> > +
> > +	memset(&region_desc, 0, sizeof(region_desc));
> > +	region_desc.res = &ocxlpmem->pmem_res;
> > +	region_desc.numa_node = NUMA_NO_NODE;
> > +	region_desc.target_node = target_node;
> > +	region_desc.num_mappings = 1;
> > +	region_desc.mapping = &nd_mapping_desc;
> > +	region_desc.nd_set = &ocxlpmem->nd_set;
> > +
> > +	set_bit(ND_REGION_PAGEMAP, &region_desc.flags);
> > +	/*
> > > +	 * NB: libnvdimm copies the data from ndr_desc into its own
> > +	 * structures so passing a stack pointer is fine.
> > +	 */
> > +	ocxlpmem->nd_region = nvdimm_pmem_region_create(ocxlpmem-
> > >nvdimm_bus,
> > +							&region_desc);
> > +	if (!ocxlpmem->nd_region)
> > +		return -EINVAL;
> > +
> > +	dev_info(&ocxlpmem->dev,
> > +		 "Onlining %lluMB of persistent memory\n",
> > +		 nd_mapping_desc.size / SZ_1M);
> > +
> > +	return 0;
> > +}
> > +
> > +/**
> > + * allocate_minor() - Allocate a minor number to use for an
> > OpenCAPI pmem device
> > + * @ocxlpmem: the device metadata
> > + * Return: the allocated minor number
> > + */
> > +static int allocate_minor(struct ocxlpmem *ocxlpmem)
> > +{
> > +	int minor;
> > +
> > +	mutex_lock(&minors_idr_lock);
> > +	minor = idr_alloc(&minors_idr, ocxlpmem, 0, NUM_MINORS,
> > GFP_KERNEL);
> > +	mutex_unlock(&minors_idr_lock);
> > +	return minor;
> > +}
> > +
> > +static void free_minor(struct ocxlpmem *ocxlpmem)
> 
> The lack of a kerneldoc comment here is inconsistent :)
> 
> > +{
> > +	mutex_lock(&minors_idr_lock);
> > +	idr_remove(&minors_idr, MINOR(ocxlpmem->dev.devt));
> > +	mutex_unlock(&minors_idr_lock);
> > +}
> > +
> > +/**
> > + * free_ocxlpmem() - Free all members of an ocxlpmem struct
> > + * @ocxlpmem: the device struct to clear
> > + */
> > +static void free_ocxlpmem(struct ocxlpmem *ocxlpmem)
> > +{
> > +	int rc;
> > +
> > +	if (ocxlpmem->nvdimm_bus)
> > +		nvdimm_bus_unregister(ocxlpmem->nvdimm_bus);
> > +
> > +	free_minor(ocxlpmem);
> > +
> > +	if (ocxlpmem->metadata_addr)
> > +		devm_memunmap(&ocxlpmem->dev, ocxlpmem->metadata_addr);
> > +
> > +	if (ocxlpmem->ocxl_context) {
> > +		rc = ocxl_context_detach(ocxlpmem->ocxl_context);
> > +		if (rc == -EBUSY)
> > +			dev_warn(&ocxlpmem->dev, "Timeout detaching ocxl context\n");
> > +		else
> > +			ocxl_context_free(ocxlpmem->ocxl_context);
> > +
> > +	}
> > +
> > +	if (ocxlpmem->ocxl_afu)
> > +		ocxl_afu_put(ocxlpmem->ocxl_afu);
> > +
> > +	if (ocxlpmem->ocxl_fn)
> > +		ocxl_function_close(ocxlpmem->ocxl_fn);
> > +
> > +	kfree(ocxlpmem);
> > +}
> > +
> > +/**
> > + * free_ocxlpmem_dev() - Free an OpenCAPI persistent memory device
> > + * @dev: The device struct
> > + */
> > +static void free_ocxlpmem_dev(struct device *dev)
> > +{
> > +	struct ocxlpmem *ocxlpmem = container_of(dev, struct ocxlpmem, dev);
> > +
> > +	free_ocxlpmem(ocxlpmem);
> > +}
> > +
> > +/**
> > + * ocxlpmem_register() - Register an OpenCAPI pmem device with the kernel
> > + * @ocxlpmem: the device metadata
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int ocxlpmem_register(struct ocxlpmem *ocxlpmem)
> > +{
> > +	int rc;
> > +	int minor = allocate_minor(ocxlpmem);
> > +
> > +	if (minor < 0)
> > +		return minor;
> > +
> > +	ocxlpmem->dev.release = free_ocxlpmem_dev;
> > +	rc = dev_set_name(&ocxlpmem->dev, "ocxlpmem%d", minor);
> > +	if (rc < 0)
> > +		return rc;
> > +
> > +	ocxlpmem->dev.devt = MKDEV(MAJOR(ocxlpmem_dev), minor);
> > +	ocxlpmem->dev.class = ocxlpmem_class;
> > +	ocxlpmem->dev.parent = &ocxlpmem->pdev->dev;
> > +
> > +	return device_register(&ocxlpmem->dev);
> > +}
> > +
> > +/**
> > + * ocxlpmem_remove() - Free an OpenCAPI persistent memory device
> > + * @pdev: the PCI device information struct
> > + */
> > +static void ocxlpmem_remove(struct pci_dev *pdev)
> > +{
> > +	if (PCI_FUNC(pdev->devfn) == 0) {
> > +		struct ocxlpmem_function0 *func0 = pci_get_drvdata(pdev);
> > +
> > +		if (func0) {
> > +			ocxl_function_close(func0->ocxl_fn);
> > +			func0->ocxl_fn = NULL;
> > +		}
> > +	} else {
> > +		struct ocxlpmem *ocxlpmem = pci_get_drvdata(pdev);
> > +
> > +		if (ocxlpmem)
> > +			device_unregister(&ocxlpmem->dev);
> > +	}
> > +}
> > +
> > +/**
> > + * probe_function0() - Set up function 0 for an OpenCAPI persistent memory device
> > + * This is important as it enables templates higher than 0 across all other functions,
> > + * which in turn enables higher bandwidth accesses
> > + * @pdev: the PCI device information struct
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int probe_function0(struct pci_dev *pdev)
> > +{
> > +	struct ocxlpmem_function0 *func0 = NULL;
> > +	struct ocxl_fn *fn;
> > +
> > +	func0 = kzalloc(sizeof(*func0), GFP_KERNEL);
> > +	if (!func0)
> > +		return -ENOMEM;
> > +
> > +	func0->pdev = pdev;
> > +	fn = ocxl_function_open(pdev);
> > +	if (IS_ERR(fn)) {
> > +		kfree(func0);
> > +		dev_err(&pdev->dev, "failed to open OCXL function\n");
> > +		return PTR_ERR(fn);
> > +	}
> > +	func0->ocxl_fn = fn;
> > +
> > +	pci_set_drvdata(pdev, func0);
> > +
> > +	return 0;
> > +}
> > +
> > +/**
> > + * probe() - Init an OpenCAPI persistent memory device
> > + * @pdev: the PCI device information struct
> > + * @ent: The entry from ocxlpmem_pci_tbl
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> > +{
> > +	struct ocxlpmem *ocxlpmem;
> > +	int rc;
> > +
> > +	if (PCI_FUNC(pdev->devfn) == 0)
> > +		return probe_function0(pdev);
> > +	else if (PCI_FUNC(pdev->devfn) != 1)
> > +		return 0;
> > +
> > +	ocxlpmem = kzalloc(sizeof(*ocxlpmem), GFP_KERNEL);
> > +	if (!ocxlpmem) {
> > +		dev_err(&pdev->dev, "Could not allocate OpenCAPI persistent memory metadata\n");
> > +		rc = -ENOMEM;
> > +		goto err;
> > +	}
> > +	ocxlpmem->pdev = pdev;
> > +
> > +	pci_set_drvdata(pdev, ocxlpmem);
> > +
> > +	ocxlpmem->ocxl_fn = ocxl_function_open(pdev);
> > +	if (IS_ERR(ocxlpmem->ocxl_fn)) {
> > +		kfree(ocxlpmem);
> 
> You can't free this yet...

Whoops

> 
> > +		pci_set_drvdata(pdev, NULL);
> > +		dev_err(&pdev->dev, "failed to open OCXL function\n");
> > +		rc = PTR_ERR(ocxlpmem->ocxl_fn);
> > +		goto err;
> > +	}
> > +
> > +	ocxlpmem->ocxl_afu = ocxl_function_fetch_afu(ocxlpmem->ocxl_fn, 0);
> > +	if (ocxlpmem->ocxl_afu == NULL) {
> > +		dev_err(&pdev->dev, "Could not get OCXL AFU from function\n");
> > +		rc = -ENXIO;
> > +		goto err;
> 
> Meanwhile in this case, I think ocxlpmem and ocxlpmem->ocxl_fn get
> leaked (it's before ocxlpmem_register(), so the cleanup handler
> hasn't been registered).
> 
> Add some new error labels for these two paths?
> 

Ok
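
A minimal sketch of what the extra error labels could look like, written
as a self-contained userspace mock rather than the actual fix. All demo_*
names are hypothetical, calloc()/free() stand in for kzalloc()/kfree(),
and the mock open returns NULL on failure where ocxl_function_open()
returns an ERR_PTR. The point is only the ordering: capture the error
code before freeing anything, and unwind in reverse order of acquisition.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver state; not the real ocxl types. */
struct demo_fn { int unused; };
struct demo_pmem { struct demo_fn *fn; };

int demo_live_objects;	/* allocations that have not been released yet */

struct demo_fn *demo_function_open(int fail)
{
	struct demo_fn *fn = fail ? NULL : calloc(1, sizeof(*fn));

	if (fn)
		demo_live_objects++;
	return fn;
}

void demo_function_close(struct demo_fn *fn)
{
	free(fn);
	demo_live_objects--;
}

/* fail_at: 0 = succeed, 1 = function open fails, 2 = AFU fetch fails */
int demo_probe(int fail_at, struct demo_pmem **out)
{
	struct demo_pmem *pmem;
	int rc;

	pmem = calloc(1, sizeof(*pmem));
	if (!pmem)
		return -ENOMEM;
	demo_live_objects++;

	pmem->fn = demo_function_open(fail_at == 1);
	if (!pmem->fn) {
		rc = -ENODEV;	/* record the error before pmem is freed */
		goto err_free;
	}

	if (fail_at == 2) {	/* simulated AFU fetch failure */
		rc = -ENXIO;
		goto err_close;
	}

	*out = pmem;
	return 0;

err_close:
	demo_function_close(pmem->fn);
err_free:
	free(pmem);
	demo_live_objects--;
	return rc;
}
```

With labels like these, neither early failure touches freed memory and
neither leaks the function handle or the metadata struct.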

> > +	}
> > +
> > +	ocxl_afu_get(ocxlpmem->ocxl_afu);
> > +
> > +	// Resources allocated below here are cleaned up in the release handler
> > +
> > +	rc = ocxlpmem_register(ocxlpmem);
> > +	if (rc) {
> > +		dev_err(&pdev->dev, "Could not register OpenCAPI persistent memory device with the kernel\n");
> > +		goto err;
> > +	}
> > +
> > +	rc = ocxl_context_alloc(&ocxlpmem->ocxl_context, ocxlpmem->ocxl_afu, NULL);
> > +	if (rc) {
> > +		dev_err(&pdev->dev, "Could not allocate OCXL context\n");
> > +		goto err;
> > +	}
> > +
> > +	rc = ocxl_context_attach(ocxlpmem->ocxl_context, 0, NULL);
> > +	if (rc) {
> > +		dev_err(&pdev->dev, "Could not attach ocxl context\n");
> > +		goto err;
> > +	}
> > +
> > +	rc = register_lpc_mem(ocxlpmem);
> > +	if (rc) {
> > +		dev_err(&pdev->dev, "Could not register OpenCAPI persistent memory with libnvdimm\n");
> > +		goto err;
> > +	}
> > +
> > +	return 0;
> > +
> > +err:
> > +	/*
> > +	 * Further cleanup is done in the release handler via free_ocxlpmem()
> > +	 * This allows us to keep the character device live to handle IOCTLs to
> > +	 * investigate issues if the card has an error
> > +	 */
> > +
> > +	dev_err(&pdev->dev,
> > +		"Error detected, will not register OpenCAPI persistent memory\n");
> > +	return rc;
> > +}
> > +
> > +static struct pci_driver pci_driver = {
> > +	.name = "ocxl-pmem",
> > +	.id_table = ocxlpmem_pci_tbl,
> > +	.probe = probe,
> > +	.remove = ocxlpmem_remove,
> > +	.shutdown = ocxlpmem_remove,
> > +};
> > +
> > +static int __init ocxlpmem_init(void)
> > +{
> > +	int rc = 0;
> > +
> > +	rc = pci_register_driver(&pci_driver);
> > +	if (rc)
> > +		return rc;
> > +
> > +	return 0;
> > +}
> > +
> > +static void ocxlpmem_exit(void)
> > +{
> > +	pci_unregister_driver(&pci_driver);
> > +}
> > +
> > +module_init(ocxlpmem_init);
> > +module_exit(ocxlpmem_exit);
> > +
> > +MODULE_DESCRIPTION("OpenCAPI Persistent Memory");
> > +MODULE_LICENSE("GPL");
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > new file mode 100644
> > index 000000000000..0faf3740e9b8
> > --- /dev/null
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > @@ -0,0 +1,28 @@
> > +// SPDX-License-Identifier: GPL-2.0+
> > +// Copyright 2019 IBM Corp.
> > +
> > +#include <linux/pci.h>
> > +#include <misc/ocxl.h>
> > +#include <linux/libnvdimm.h>
> > +#include <linux/mm.h>
> > +
> > +#define LABEL_AREA_SIZE	(1UL << PA_SECTION_SHIFT)
> > +
> > +struct ocxlpmem_function0 {
> > +	struct pci_dev *pdev;
> > +	struct ocxl_fn *ocxl_fn;
> > +};
> 
> Hmm, given this struct serves no purpose other than associating an
> ocxl_fn with a pci_dev, and special handling for function 0 is
> probably going to be a common pattern in ocxl dependent drivers,
> perhaps we should export a function from ocxl that converts
> ocxl_fn -> pci_dev, and then you can just point the drvdata straight
> at the ocxl_fn?
> 

Yup, the struct is vestigial, I'll remove it and deal directly with the
ocxl_fn instead.
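
As a rough sketch of that direction (a userspace mock, assuming the ocxl
core would track the owning pci_dev itself): the demo_* types and
demo_fn_to_pci_dev() below are hypothetical and are not existing ocxl
APIs. The drvdata points straight at the function handle, so no wrapper
struct is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct pci_dev and struct ocxl_fn. */
struct demo_pci_dev { void *drvdata; };
struct demo_ocxl_fn { struct demo_pci_dev *pdev; };

void demo_set_drvdata(struct demo_pci_dev *pdev, void *data)
{
	pdev->drvdata = data;
}

void *demo_get_drvdata(struct demo_pci_dev *pdev)
{
	return pdev->drvdata;
}

/* Hypothetical exported helper: convert a function handle to its pci_dev. */
struct demo_pci_dev *demo_fn_to_pci_dev(struct demo_ocxl_fn *fn)
{
	return fn->pdev;
}

/* Function 0 probe: store the function handle directly as drvdata. */
int demo_probe_function0(struct demo_pci_dev *pdev, struct demo_ocxl_fn *fn)
{
	fn->pdev = pdev;
	demo_set_drvdata(pdev, fn);
	return 0;
}

/* Function 0 remove: drop the association; the real driver would also close fn. */
void demo_remove_function0(struct demo_pci_dev *pdev)
{
	struct demo_ocxl_fn *fn = demo_get_drvdata(pdev);

	if (fn) {
		fn->pdev = NULL;
		demo_set_drvdata(pdev, NULL);
	}
}
```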

> 
> > +
> > +struct ocxlpmem {
> > +	struct device dev;
> > +	struct pci_dev *pdev;
> > +	struct ocxl_fn *ocxl_fn;
> > +	struct nd_interleave_set nd_set;
> > +	struct nvdimm_bus_descriptor bus_desc;
> > +	struct nvdimm_bus *nvdimm_bus;
> > +	struct ocxl_afu *ocxl_afu;
> > +	struct ocxl_context *ocxl_context;
> > +	void *metadata_addr;
> > +	struct resource pmem_res;
> > +	struct nd_region *nd_region;
> > +};
> > 
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 04/27] ocxl: Remove unnecessary externs
  2020-02-21  3:26 ` [PATCH v3 04/27] ocxl: Remove unnecessary externs Alastair D'Silva
  2020-02-21  6:06   ` Andrew Donnellan
  2020-02-25 13:23   ` Frederic Barrat
@ 2020-02-26  8:14   ` Baoquan He
  2020-02-26  8:26     ` Alastair D'Silva
  2 siblings, 1 reply; 130+ messages in thread
From: Baoquan He @ 2020-02-26  8:14 UTC (permalink / raw)
  To: Alastair D'Silva
  Cc: alastair, Aneesh Kumar K . V, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Frederic Barrat,
	Andrew Donnellan, Arnd Bergmann, Greg Kroah-Hartman,
	Andrew Morton, Mauro Carvalho Chehab, David S. Miller,
	Rob Herring, Anton Blanchard, Krzysztof Kozlowski,
	Mahesh Salgaonkar, Madhavan Srinivasan, Cédric Le Goater,
	Anju T Sudhakar, Hari Bathini, Thomas Gleixner, Greg Kurz,
	Nicholas Piggin, Masahiro Yamada, Alexey Kardashevskiy,
	linux-kernel, linuxppc-dev, linux-nvdimm, linux-mm

On 02/21/20 at 02:26pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> Function declarations don't need externs, remove the existing ones
> so they are consistent with newer code
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>  arch/powerpc/include/asm/pnv-ocxl.h | 32 ++++++++++++++---------------
>  include/misc/ocxl.h                 |  6 +++---
>  2 files changed, 18 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/pnv-ocxl.h b/arch/powerpc/include/asm/pnv-ocxl.h
> index 0b2a6707e555..b23c99bc0c84 100644
> --- a/arch/powerpc/include/asm/pnv-ocxl.h
> +++ b/arch/powerpc/include/asm/pnv-ocxl.h
> @@ -9,29 +9,27 @@
>  #define PNV_OCXL_TL_BITS_PER_RATE       4
>  #define PNV_OCXL_TL_RATE_BUF_SIZE       ((PNV_OCXL_TL_MAX_TEMPLATE+1) * PNV_OCXL_TL_BITS_PER_RATE / 8)
>  
> -extern int pnv_ocxl_get_actag(struct pci_dev *dev, u16 *base, u16 *enabled,
> -			u16 *supported);

Function declarations work with or without 'extern'. Searching for
'extern' under include/ still finds many function declarations that
carry it. Do we have an explicit standard on whether 'extern' should be
added to or removed from function declarations?

I have no objection to this patch; I just want to be clear on this so
that I can handle it without confusion.

Thanks
Baoquan

* RE: [PATCH v3 04/27] ocxl: Remove unnecessary externs
  2020-02-26  8:14   ` Baoquan He
@ 2020-02-26  8:26     ` Alastair D'Silva
  2020-02-26  9:01       ` Greg Kurz
  0 siblings, 1 reply; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-26  8:26 UTC (permalink / raw)
  To: 'Baoquan He', 'Alastair D'Silva'
  Cc: 'Aneesh Kumar K . V', 'Benjamin Herrenschmidt',
	'Paul Mackerras', 'Michael Ellerman',
	'Frederic Barrat', 'Andrew Donnellan',
	'Arnd Bergmann', 'Greg Kroah-Hartman',
	'Andrew Morton', 'Mauro Carvalho Chehab',
	'David S. Miller', 'Rob Herring',
	'Anton Blanchard', 'Krzysztof Kozlowski',
	'Mahesh Salgaonkar', 'Madhavan Srinivasan',
	'Cédric Le Goater', 'Anju T Sudhakar',
	'Hari Bathini', 'Thomas Gleixner',
	'Greg Kurz', 'Nicholas Piggin',
	'Masahiro Yamada'

> -----Original Message-----
> From: Baoquan He <bhe@redhat.com>
> Sent: Wednesday, 26 February 2020 7:15 PM
> To: Alastair D'Silva <alastair@au1.ibm.com>
> Cc: alastair@d-silva.org; Aneesh Kumar K . V
> <aneesh.kumar@linux.ibm.com>; Oliver O'Halloran <oohall@gmail.com>;
> Benjamin Herrenschmidt <benh@kernel.crashing.org>; Paul Mackerras
> <paulus@samba.org>; Michael Ellerman <mpe@ellerman.id.au>; Frederic
> Barrat <fbarrat@linux.ibm.com>; Andrew Donnellan <ajd@linux.ibm.com>;
> Arnd Bergmann <arnd@arndb.de>; Greg Kroah-Hartman
> <gregkh@linuxfoundation.org>; Dan Williams <dan.j.williams@intel.com>;
> Vishal Verma <vishal.l.verma@intel.com>; Dave Jiang
> <dave.jiang@intel.com>; Ira Weiny <ira.weiny@intel.com>; Andrew Morton
> <akpm@linux-foundation.org>; Mauro Carvalho Chehab
> <mchehab+samsung@kernel.org>; David S. Miller <davem@davemloft.net>;
> Rob Herring <robh@kernel.org>; Anton Blanchard <anton@ozlabs.org>;
> Krzysztof Kozlowski <krzk@kernel.org>; Mahesh Salgaonkar
> <mahesh@linux.vnet.ibm.com>; Madhavan Srinivasan
> <maddy@linux.vnet.ibm.com>; Cédric Le Goater <clg@kaod.org>; Anju T
> Sudhakar <anju@linux.vnet.ibm.com>; Hari Bathini
> <hbathini@linux.ibm.com>; Thomas Gleixner <tglx@linutronix.de>; Greg
> Kurz <groug@kaod.org>; Nicholas Piggin <npiggin@gmail.com>; Masahiro
> Yamada <yamada.masahiro@socionext.com>; Alexey Kardashevskiy
> <aik@ozlabs.ru>; linux-kernel@vger.kernel.org; linuxppc-
> dev@lists.ozlabs.org; linux-nvdimm@lists.01.org; linux-mm@kvack.org
> Subject: Re: [PATCH v3 04/27] ocxl: Remove unnecessary externs
> 
> On 02/21/20 at 02:26pm, Alastair D'Silva wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> >
> > Function declarations don't need externs, remove the existing ones so
> > they are consistent with newer code
> >
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >  arch/powerpc/include/asm/pnv-ocxl.h | 32 ++++++++++++++---------------
> >  include/misc/ocxl.h                 |  6 +++---
> >  2 files changed, 18 insertions(+), 20 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/pnv-ocxl.h
> > b/arch/powerpc/include/asm/pnv-ocxl.h
> > index 0b2a6707e555..b23c99bc0c84 100644
> > --- a/arch/powerpc/include/asm/pnv-ocxl.h
> > +++ b/arch/powerpc/include/asm/pnv-ocxl.h
> > @@ -9,29 +9,27 @@
> >  #define PNV_OCXL_TL_BITS_PER_RATE       4
> >  #define PNV_OCXL_TL_RATE_BUF_SIZE       ((PNV_OCXL_TL_MAX_TEMPLATE+1) * PNV_OCXL_TL_BITS_PER_RATE / 8)
> >
> > -extern int pnv_ocxl_get_actag(struct pci_dev *dev, u16 *base, u16 *enabled,
> > -			u16 *supported);
>
> Function declarations work with or without 'extern'. Searching for
> 'extern' under include/ still finds many function declarations that
> carry it. Do we have an explicit standard on whether 'extern' should be
> added to or removed from function declarations?
>
> I have no objection to this patch; I just want to be clear on this so
> that I can handle it without confusion.
> 
> Thanks
> Baoquan
> 

For the OpenCAPI driver, we have settled on not having 'extern' on
functions.

I don't think I've seen a standard that supports or refutes this, but
the keyword adds no value.
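
To make the equivalence concrete, here is a small standalone sketch
(demo_add() is a hypothetical name, not part of the ocxl headers). In C a
function declaration without a storage-class specifier has external
linkage by default, so the two prototypes below declare exactly the same
thing and can coexist in one translation unit.

```c
#include <assert.h>

/* With the keyword, as in the old header style. */
extern int demo_add(int a, int b);

/* Without the keyword: same function, same external linkage. */
int demo_add(int a, int b);

/* A single definition satisfies both declarations. */
int demo_add(int a, int b)
{
	return a + b;
}
```

Dropping 'extern' from the prototypes therefore changes nothing for
callers or for the linker; it only shortens the lines.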

-- 
Alastair D'Silva           mob: 0423 762 819
skype: alastair_dsilva     msn: alastair@d-silva.org
blog: http://alastair.d-silva.org    Twitter: @EvilDeece




* Re: [PATCH v3 04/27] ocxl: Remove unnecessary externs
  2020-02-26  8:26     ` Alastair D'Silva
@ 2020-02-26  9:01       ` Greg Kurz
  2020-02-26 14:15         ` 'Baoquan He'
  0 siblings, 1 reply; 130+ messages in thread
From: Greg Kurz @ 2020-02-26  9:01 UTC (permalink / raw)
  To: Alastair D'Silva
  Cc: 'Baoquan He', 'Alastair D'Silva',
	'Aneesh Kumar K . V', 'Benjamin Herrenschmidt',
	'Paul Mackerras', 'Michael Ellerman',
	'Frederic Barrat'  <fbarrat@linux.ibm.com>,
	 'Andrew Donnellan'  <ajd@linux.ibm.com>,
	'Arnd Bergmann', 'Greg Kroah-Hartman',
	'Andrew Morton'  <akpm@linux-foundation.org>,
	 'Mauro Carvalho Chehab' , 'David S. Miller',
	'Rob Herring', 'Anton Blanchard',
	'Krzysztof Kozlowski', 'Mahesh Salgaonkar',
	'Madhavan Srinivasan', 'Cédric Le Goater',
	'Anju T Sudhakar',
	'Hari Bathini'  <hbathini@linux.ibm.com>,
	 'Thomas Gleixner'

On Wed, 26 Feb 2020 19:26:34 +1100
"Alastair D'Silva" <alastair@d-silva.org> wrote:

> > -----Original Message-----
> > From: Baoquan He <bhe@redhat.com>
> > Sent: Wednesday, 26 February 2020 7:15 PM
> > Subject: Re: [PATCH v3 04/27] ocxl: Remove unnecessary externs
> > 
> > On 02/21/20 at 02:26pm, Alastair D'Silva wrote:
> > > From: Alastair D'Silva <alastair@d-silva.org>
> > >
> > > Function declarations don't need externs, remove the existing ones so
> > > they are consistent with newer code
> > >
> > > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > > ---
> > >  arch/powerpc/include/asm/pnv-ocxl.h | 32 ++++++++++++++---------------
> > >  include/misc/ocxl.h                 |  6 +++---
> > >  2 files changed, 18 insertions(+), 20 deletions(-)
> > >
> > > diff --git a/arch/powerpc/include/asm/pnv-ocxl.h
> > > b/arch/powerpc/include/asm/pnv-ocxl.h
> > > index 0b2a6707e555..b23c99bc0c84 100644
> > > --- a/arch/powerpc/include/asm/pnv-ocxl.h
> > > +++ b/arch/powerpc/include/asm/pnv-ocxl.h
> > > @@ -9,29 +9,27 @@
> > >  #define PNV_OCXL_TL_BITS_PER_RATE       4
> > >  #define PNV_OCXL_TL_RATE_BUF_SIZE       ((PNV_OCXL_TL_MAX_TEMPLATE+1) * PNV_OCXL_TL_BITS_PER_RATE / 8)
> > >
> > > -extern int pnv_ocxl_get_actag(struct pci_dev *dev, u16 *base, u16 *enabled,
> > > -			u16 *supported);
> >
> > Function declarations work with or without 'extern'. Searching for
> > 'extern' under include/ still finds many function declarations that
> > carry it. Do we have an explicit standard on whether 'extern' should be
> > added to or removed from function declarations?
> >
> > I have no objection to this patch; I just want to be clear on this so
> > that I can handle it without confusion.
> > 
> > Thanks
> > Baoquan
> > 
> 
> For the OpenCAPI driver, we have settled on not having 'extern' on
> functions.
> 
> I don't think I've seen a standard that supports or refutes this, but it
> does not value add.
> 

FWIW checkpatch reports this as a CHECK when run with --strict:

$ ./scripts/checkpatch.pl --strict -f include/misc/ocxl.h

[...]

CHECK: extern prototypes should be avoided in .h files
#176: FILE: include/misc/ocxl.h:176:
+extern int ocxl_afu_irq_alloc(struct ocxl_context *ctx, int *irq_id);

[...]

* Re: [PATCH v3 04/27] ocxl: Remove unnecessary externs
  2020-02-26  9:01       ` Greg Kurz
@ 2020-02-26 14:15         ` 'Baoquan He'
  2020-02-26 14:20           ` Greg Kurz
  0 siblings, 1 reply; 130+ messages in thread
From: 'Baoquan He' @ 2020-02-26 14:15 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Alastair D'Silva, 'Alastair D'Silva',
	'Aneesh Kumar K . V', 'Benjamin Herrenschmidt',
	'Paul Mackerras', 'Michael Ellerman',
	'Frederic Barrat', 'Andrew Donnellan',
	'Arnd Bergmann', 'Greg Kroah-Hartman',
	'Andrew Morton', 'Mauro Carvalho Chehab',
	'David S. Miller', 'Rob Herring',
	'Anton Blanchard', 'Krzysztof Kozlowski',
	'Mahesh Salgaonkar', 'Madhavan Srinivasan',
	'Cédric Le Goater', 'Anju T Sudhakar',
	'Hari Bathini', 'Thomas Gleixner',
	'Nicholas Piggin', 'Masahiro Yamada',
	'Alexey Kardashevskiy',
	linux-kernel, linuxppc-dev, linux-nvdimm, linux-mm

On 02/26/20 at 10:01am, Greg Kurz wrote:
> On Wed, 26 Feb 2020 19:26:34 +1100
> "Alastair D'Silva" <alastair@d-silva.org> wrote:
> 
> > > -----Original Message-----
> > > From: Baoquan He <bhe@redhat.com>
> > > Sent: Wednesday, 26 February 2020 7:15 PM
> > > Subject: Re: [PATCH v3 04/27] ocxl: Remove unnecessary externs
> > > 
> > > On 02/21/20 at 02:26pm, Alastair D'Silva wrote:
> > > > From: Alastair D'Silva <alastair@d-silva.org>
> > > >
> > > > Function declarations don't need externs, remove the existing ones so
> > > > they are consistent with newer code
> > > >
> > > > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > > > ---
> > > >  arch/powerpc/include/asm/pnv-ocxl.h | 32 ++++++++++++++---------------
> > > >  include/misc/ocxl.h                 |  6 +++---
> > > >  2 files changed, 18 insertions(+), 20 deletions(-)
> > > >
> > > > diff --git a/arch/powerpc/include/asm/pnv-ocxl.h
> > > > b/arch/powerpc/include/asm/pnv-ocxl.h
> > > > index 0b2a6707e555..b23c99bc0c84 100644
> > > > --- a/arch/powerpc/include/asm/pnv-ocxl.h
> > > > +++ b/arch/powerpc/include/asm/pnv-ocxl.h
> > > > @@ -9,29 +9,27 @@
> > > >  #define PNV_OCXL_TL_BITS_PER_RATE       4
> > > >  #define PNV_OCXL_TL_RATE_BUF_SIZE       ((PNV_OCXL_TL_MAX_TEMPLATE+1) * PNV_OCXL_TL_BITS_PER_RATE / 8)
> > > >
> > > > -extern int pnv_ocxl_get_actag(struct pci_dev *dev, u16 *base, u16 *enabled,
> > > > -			u16 *supported);
> > >
> > > Function declarations work with or without 'extern'. Searching for
> > > 'extern' under include/ still finds many function declarations that
> > > carry it. Do we have an explicit standard on whether 'extern' should be
> > > added to or removed from function declarations?
> > >
> > > I have no objection to this patch; I just want to be clear on this so
> > > that I can handle it without confusion.
> > > 
> > > Thanks
> > > Baoquan
> > > 
> > 
> > For the OpenCAPI driver, we have settled on not having 'extern' on
> > functions.
> > 
> > I don't think I've seen a standard that supports or refutes this, but it
> > does not value add.
> > 
> 
> FWIW this is a warning condition for checkpatch:
> 
> $ ./scripts/checkpatch.pl --strict -f include/misc/ocxl.h

Good to know, thanks.

I didn't know checkpatch.pl could be run on a header file directly. I
tried checking the patch with '--strict -f' and the message below did
not appear, but it does show up when checkpatch is run on the header
file.

> 
> [...]
> 
> CHECK: extern prototypes should be avoided in .h files
> #176: FILE: include/misc/ocxl.h:176:
> +extern int ocxl_afu_irq_alloc(struct ocxl_context *ctx, int *irq_id);
> 
> [...]
> 

* Re: [PATCH v3 04/27] ocxl: Remove unnecessary externs
  2020-02-26 14:15         ` 'Baoquan He'
@ 2020-02-26 14:20           ` Greg Kurz
  2020-02-26 14:54             ` 'Baoquan He'
  0 siblings, 1 reply; 130+ messages in thread
From: Greg Kurz @ 2020-02-26 14:20 UTC (permalink / raw)
  To: 'Baoquan He'
  Cc: Alastair D'Silva, 'Alastair D'Silva',
	'Aneesh Kumar K . V', 'Benjamin Herrenschmidt',
	'Paul Mackerras', 'Michael Ellerman',
	'Frederic Barrat', 'Andrew Donnellan',
	'Arnd Bergmann', 'Greg Kroah-Hartman',
	'Andrew Morton', 'Mauro Carvalho Chehab',
	'David S. Miller', 'Rob Herring',
	'Anton Blanchard', 'Krzysztof Kozlowski',
	'Mahesh Salgaonkar', 'Madhavan Srinivasan',
	'Cédric Le Goater', 'Anju T Sudhakar',
	'Hari Bathini', 'Thomas Gleixner',
	'Nicholas Piggin'

On Wed, 26 Feb 2020 22:15:23 +0800
'Baoquan He' <bhe@redhat.com> wrote:

> On 02/26/20 at 10:01am, Greg Kurz wrote:
> > On Wed, 26 Feb 2020 19:26:34 +1100
> > "Alastair D'Silva" <alastair@d-silva.org> wrote:
> > 
> > > > -----Original Message-----
> > > > From: Baoquan He <bhe@redhat.com>
> > > > Sent: Wednesday, 26 February 2020 7:15 PM
> > > > Subject: Re: [PATCH v3 04/27] ocxl: Remove unnecessary externs
> > > > 
> > > > On 02/21/20 at 02:26pm, Alastair D'Silva wrote:
> > > > > From: Alastair D'Silva <alastair@d-silva.org>
> > > > >
> > > > > Function declarations don't need externs, remove the existing ones so
> > > > > they are consistent with newer code
> > > > >
> > > > > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > > > > ---
> > > > >  arch/powerpc/include/asm/pnv-ocxl.h | 32 ++++++++++++++---------------
> > > > >  include/misc/ocxl.h                 |  6 +++---
> > > > >  2 files changed, 18 insertions(+), 20 deletions(-)
> > > > >
> > > > > diff --git a/arch/powerpc/include/asm/pnv-ocxl.h
> > > > > b/arch/powerpc/include/asm/pnv-ocxl.h
> > > > > index 0b2a6707e555..b23c99bc0c84 100644
> > > > > --- a/arch/powerpc/include/asm/pnv-ocxl.h
> > > > > +++ b/arch/powerpc/include/asm/pnv-ocxl.h
> > > > > @@ -9,29 +9,27 @@
> > > > >  #define PNV_OCXL_TL_BITS_PER_RATE       4
> > > > >  #define PNV_OCXL_TL_RATE_BUF_SIZE       ((PNV_OCXL_TL_MAX_TEMPLATE+1) * PNV_OCXL_TL_BITS_PER_RATE / 8)
> > > > >
> > > > > -extern int pnv_ocxl_get_actag(struct pci_dev *dev, u16 *base, u16 *enabled,
> > > > > -			u16 *supported);
> > > >
> > > > Function declarations work with or without 'extern'. Searching for
> > > > 'extern' under include/ still finds many function declarations that
> > > > carry it. Do we have an explicit standard on whether 'extern' should be
> > > > added to or removed from function declarations?
> > > >
> > > > I have no objection to this patch; I just want to be clear on this so
> > > > that I can handle it without confusion.
> > > > 
> > > > Thanks
> > > > Baoquan
> > > > 
> > > 
> > > For the OpenCAPI driver, we have settled on not having 'extern' on
> > > functions.
> > > 
> > > I don't think I've seen a standard that supports or refutes this, but it
> > > does not value add.
> > > 
> > 
> > FWIW this is a warning condition for checkpatch:
> > 
> > $ ./scripts/checkpatch.pl --strict -f include/misc/ocxl.h
> 
> Good to know, thanks.
> 
> I didn't know checkpatch.pl can run on header file directly. Tried to
> check patch with '--strict -f', the below info doesn't appear. But it

Hmm... -f is for checking a source file, not a patch... What exactly
did you try?

> does give out below information when run on header file.
> 
> > 
> > [...]
> > 
> > CHECK: extern prototypes should be avoided in .h files
> > #176: FILE: include/misc/ocxl.h:176:
> > +extern int ocxl_afu_irq_alloc(struct ocxl_context *ctx, int *irq_id);
> > 
> > [...]
> > 
> 

* Re: [PATCH v3 04/27] ocxl: Remove unnecessary externs
  2020-02-26 14:20           ` Greg Kurz
@ 2020-02-26 14:54             ` 'Baoquan He'
  0 siblings, 0 replies; 130+ messages in thread
From: 'Baoquan He' @ 2020-02-26 14:54 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Alastair D'Silva, 'Alastair D'Silva',
	'Aneesh Kumar K . V', 'Benjamin Herrenschmidt',
	'Paul Mackerras', 'Michael Ellerman',
	'Frederic Barrat', 'Andrew Donnellan',
	'Arnd Bergmann', 'Greg Kroah-Hartman',
	'Andrew Morton', 'Mauro Carvalho Chehab',
	'David S. Miller', 'Rob Herring',
	'Anton Blanchard', 'Krzysztof Kozlowski',
	'Mahesh Salgaonkar', 'Madhavan Srinivasan',
	'Cédric Le Goater', 'Anju T Sudhakar',
	'Hari Bathini', 'Thomas Gleixner',
	'Nicholas Piggin', 'Masahiro Yamada',
	'Alexey Kardashevskiy',
	linux-kernel, linuxppc-dev, linux-nvdimm, linux-mm

On 02/26/20 at 03:20pm, Greg Kurz wrote:
> On Wed, 26 Feb 2020 22:15:23 +0800
> 'Baoquan He' <bhe@redhat.com> wrote:
> 
> > On 02/26/20 at 10:01am, Greg Kurz wrote:
> > > On Wed, 26 Feb 2020 19:26:34 +1100
> > > "Alastair D'Silva" <alastair@d-silva.org> wrote:
> > > 
> > > > > -----Original Message-----
> > > > > From: Baoquan He <bhe@redhat.com>
> > > > > Sent: Wednesday, 26 February 2020 7:15 PM
> > > > > To: Alastair D'Silva <alastair@au1.ibm.com>
> > > > > Cc: alastair@d-silva.org; Aneesh Kumar K . V
> > > > > <aneesh.kumar@linux.ibm.com>; Oliver O'Halloran <oohall@gmail.com>;
> > > > > Benjamin Herrenschmidt <benh@kernel.crashing.org>; Paul Mackerras
> > > > > <paulus@samba.org>; Michael Ellerman <mpe@ellerman.id.au>; Frederic
> > > > > Barrat <fbarrat@linux.ibm.com>; Andrew Donnellan <ajd@linux.ibm.com>;
> > > > > Arnd Bergmann <arnd@arndb.de>; Greg Kroah-Hartman
> > > > > <gregkh@linuxfoundation.org>; Dan Williams <dan.j.williams@intel.com>;
> > > > > Vishal Verma <vishal.l.verma@intel.com>; Dave Jiang
> > > > > <dave.jiang@intel.com>; Ira Weiny <ira.weiny@intel.com>; Andrew Morton
> > > > > <akpm@linux-foundation.org>; Mauro Carvalho Chehab
> > > > > <mchehab+samsung@kernel.org>; David S. Miller <davem@davemloft.net>;
> > > > > Rob Herring <robh@kernel.org>; Anton Blanchard <anton@ozlabs.org>;
> > > > > Krzysztof Kozlowski <krzk@kernel.org>; Mahesh Salgaonkar
> > > > > <mahesh@linux.vnet.ibm.com>; Madhavan Srinivasan
> > > > > <maddy@linux.vnet.ibm.com>; Cédric Le Goater <clg@kaod.org>; Anju T
> > > > > Sudhakar <anju@linux.vnet.ibm.com>; Hari Bathini
> > > > > <hbathini@linux.ibm.com>; Thomas Gleixner <tglx@linutronix.de>; Greg
> > > > > Kurz <groug@kaod.org>; Nicholas Piggin <npiggin@gmail.com>; Masahiro
> > > > > Yamada <yamada.masahiro@socionext.com>; Alexey Kardashevskiy
> > > > > <aik@ozlabs.ru>; linux-kernel@vger.kernel.org; linuxppc-
> > > > > dev@lists.ozlabs.org; linux-nvdimm@lists.01.org; linux-mm@kvack.org
> > > > > Subject: Re: [PATCH v3 04/27] ocxl: Remove unnecessary externs
> > > > > 
> > > > > On 02/21/20 at 02:26pm, Alastair D'Silva wrote:
> > > > > > From: Alastair D'Silva <alastair@d-silva.org>
> > > > > >
> > > > > > Function declarations don't need externs, remove the existing ones so
> > > > > > they are consistent with newer code
> > > > > >
> > > > > > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > > > > > ---
> > > > > >  arch/powerpc/include/asm/pnv-ocxl.h | 32 ++++++++++++++---------------
> > > > > >  include/misc/ocxl.h                 |  6 +++---
> > > > > >  2 files changed, 18 insertions(+), 20 deletions(-)
> > > > > >
> > > > > > diff --git a/arch/powerpc/include/asm/pnv-ocxl.h
> > > > > > b/arch/powerpc/include/asm/pnv-ocxl.h
> > > > > > index 0b2a6707e555..b23c99bc0c84 100644
> > > > > > --- a/arch/powerpc/include/asm/pnv-ocxl.h
> > > > > > +++ b/arch/powerpc/include/asm/pnv-ocxl.h
> > > > > > @@ -9,29 +9,27 @@
> > > > > >  #define PNV_OCXL_TL_BITS_PER_RATE       4
> > > > > > >  #define PNV_OCXL_TL_RATE_BUF_SIZE       ((PNV_OCXL_TL_MAX_TEMPLATE+1) * PNV_OCXL_TL_BITS_PER_RATE / 8)
> > > > > > >
> > > > > > > -extern int pnv_ocxl_get_actag(struct pci_dev *dev, u16 *base, u16 *enabled,
> > > > > > > -			u16 *supported);
> > > > > 
> > > > > > It works with or without extern when declaring functions. Searching for
> > > > > > 'extern' under include finds many functions declared with it. Do we have
> > > > > > an explicit standard on whether to add or remove 'extern' in function
> > > > > > declarations?
> > > > > > 
> > > > > > I have no objection to this patch, I just want to be clear so that I can
> > > > > > handle it without confusion.
> > > > > 
> > > > > Thanks
> > > > > Baoquan
> > > > > 
> > > > 
> > > > For the OpenCAPI driver, we have settled on not having 'extern' on
> > > > functions.
> > > > 
> > > > I don't think I've seen a standard that supports or refutes this, but it
> > > > does not value add.
> > > > 
> > > 
> > > FWIW this is a warning condition for checkpatch:
> > > 
> > > $ ./scripts/checkpatch.pl --strict -f include/misc/ocxl.h
> > 
> > Good to know, thanks.
> > 
> > I didn't know checkpatch.pl can run on header file directly. Tried to
> > check patch with '--strict -f', the below info doesn't appear. But it
> 
> Hmm... -f is to check a source file, not a patch... What did you try
> exactly ?

OK, that's it. I can see the 'CHECK' line when running checkpatch.pl on
the patch with only '--strict'. I think this is a good reason not to
add extern when adding function declarations to a header file.
Thanks.
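
To illustrate why the keyword is redundant, here is a minimal sketch. The
function name demo_irq_alloc() is hypothetical and is not taken from the
ocxl headers; the point is only that C gives function declarations
external linkage by default, so both declarations below are compatible.

```c
/* Sketch only: demo_irq_alloc() is a hypothetical name, not an ocxl API. */

/* These two declarations are equivalent; 'extern' adds nothing here. */
extern int demo_irq_alloc(int ctx, int *irq_id);
int demo_irq_alloc(int ctx, int *irq_id);

/* A single definition satisfies both declarations. */
int demo_irq_alloc(int ctx, int *irq_id)
{
	if (!irq_id)
		return -1;

	*irq_id = ctx + 1;
	return 0;
}
```

A compiler accepts the repeated declaration without complaint, which is
why dropping 'extern' from prototypes in a header does not change behaviour.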

> 
> > does give out below information when run on header file.
> > 
> > > 
> > > [...]
> > > 
> > > CHECK: extern prototypes should be avoided in .h files
> > > #176: FILE: include/misc/ocxl.h:176:
> > > +extern int ocxl_afu_irq_alloc(struct ocxl_context *ctx, int *irq_id);
> > > 
> > > [...]
> > > 
> > 
> 
> 

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 13/27] powerpc/powernv/pmem: Read the capability registers & wait for device ready
  2020-02-21  3:27 ` [PATCH v3 13/27] powerpc/powernv/pmem: Read the capability registers & wait for device ready Alastair D'Silva
@ 2020-02-27  3:54   ` Andrew Donnellan
  2020-02-27  3:58     ` Alastair D'Silva
  2020-03-02 17:51   ` Frederic Barrat
  1 sibling, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-27  3:54 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> +/**
> + * read_device_metadata() - Retrieve config information from the AFU and save it for future use
> + * @ocxlpmem: the device metadata
> + * Return: 0 on success, negative on failure
> + */
> +static int read_device_metadata(struct ocxlpmem *ocxlpmem)
> +{
> +	u64 val;
> +	int rc;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CCAP0,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		return rc;
> +
> +	ocxlpmem->scm_revision = val & 0xFFFF;
> +	ocxlpmem->read_latency = (val >> 32) & 0xFF;

This field is 16 bits in the spec, so the mask should be 0xFFFF I think?

Maybe we should generalise the EXTRACT_BITS macro we use in ocxl :)
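
As a rough sketch of what a generalised helper could look like (this is
not the existing ocxl macro; extract_field() is a hypothetical name, and
it assumes bit 0 is the least significant bit), the field width becomes
an explicit argument instead of a hand-written mask:

```c
#include <stdint.h>

/*
 * Hypothetical helper: return 'width' bits of 'val' starting at bit
 * 'shift', where bit 0 is the least significant bit.
 * Valid for 1 <= width <= 64 and shift < 64.
 */
static inline uint64_t extract_field(uint64_t val, unsigned int shift,
				     unsigned int width)
{
	/* Avoid shifting by 64, which is undefined for a 64-bit type. */
	uint64_t mask = (width >= 64) ? ~UINT64_C(0)
				      : ((UINT64_C(1) << width) - 1);

	return (val >> shift) & mask;
}
```

With a register value of 0x0000123400005678, extract_field(val, 32, 16)
yields 0x1234, whereas (val >> 32) & 0xFF would silently drop the upper
byte and yield 0x34.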

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 13/27] powerpc/powernv/pmem: Read the capability registers & wait for device ready
  2020-02-27  3:54   ` Andrew Donnellan
@ 2020-02-27  3:58     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-27  3:58 UTC (permalink / raw)
  To: Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Thu, 2020-02-27 at 14:54 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > +/**
> > + * read_device_metadata() - Retrieve config information from the AFU and save it for future use
> > + * @ocxlpmem: the device metadata
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int read_device_metadata(struct ocxlpmem *ocxlpmem)
> > +{
> > +	u64 val;
> > +	int rc;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CCAP0,
> > +				     OCXL_LITTLE_ENDIAN, &val);
> > +	if (rc)
> > +		return rc;
> > +
> > +	ocxlpmem->scm_revision = val & 0xFFFF;
> > +	ocxlpmem->read_latency = (val >> 32) & 0xFF;
> 
> This field is 16 bits in the spec, so the mask should be 0xFFFF I
> think?
> 

You're right, I'll fix it.

> Maybe we should generalise the EXTRACT_BITS macro we use in ocxl :)
> 
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 12/27] powerpc/powernv/pmem: Add register addresses & status values to the header
  2020-02-21  3:27 ` [PATCH v3 12/27] powerpc/powernv/pmem: Add register addresses & status values to the header Alastair D'Silva
@ 2020-02-27  5:08   ` Andrew Donnellan
  2020-02-27  5:16     ` Alastair D'Silva
  0 siblings, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-27  5:08 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> These values have been taken from the device specifications.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>

I've compared these values against the internal version of the device 
specifications that I have access to, and they appear to match.

A few minor comments below, otherwise:

Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>

> +#define GLOBAL_MMIO_HCI_ACRW				BIT_ULL(0)
> +#define GLOBAL_MMIO_HCI_NSCRW				BIT_ULL(1)
> +#define GLOBAL_MMIO_HCI_AFU_RESET			BIT_ULL(2)
> +#define GLOBAL_MMIO_HCI_FW_DEBUG			BIT_ULL(3)
> +#define GLOBAL_MMIO_HCI_CONTROLLER_DUMP			BIT_ULL(4)
> +#define GLOBAL_MMIO_HCI_CONTROLLER_DUMP_COLLECTED	BIT_ULL(5)
> +#define GLOBAL_MMIO_HCI_REQ_HEALTH_PERF			BIT_ULL(6)

The labelling of some of these bits deviates from the standard 
abbreviations in the spec, which is fine I guess as these names are more 
descriptive, but maybe add a brief comment with the original abbreviation?

> +
> +#define ADMIN_COMMAND_HEARTBEAT		0x00u
> +#define ADMIN_COMMAND_SHUTDOWN		0x01u
> +#define ADMIN_COMMAND_FW_UPDATE		0x02u
> +#define ADMIN_COMMAND_FW_DEBUG		0x03u
> +#define ADMIN_COMMAND_ERRLOG		0x04u
> +#define ADMIN_COMMAND_SMART		0x05u
> +#define ADMIN_COMMAND_CONTROLLER_STATS	0x06u
> +#define ADMIN_COMMAND_CONTROLLER_DUMP	0x07u
> +#define ADMIN_COMMAND_CMD_CAPS		0x08u
> +#define ADMIN_COMMAND_MAX		0x08u
> +
> +#define STATUS_SUCCESS		0x00
> +#define STATUS_MEM_UNAVAILABLE	0x20

There's also a "blocked on account of background task" code, 0x21.

> +#define STATUS_BAD_OPCODE	0x50
> +#define STATUS_BAD_REQUEST_PARM	0x51
> +#define STATUS_BAD_DATA_PARM	0x52
> +#define STATUS_DEBUG_BLOCKED	0x70
> +#define STATUS_FAIL		0xFF
> +
> +#define STATUS_FW_UPDATE_BLOCKED 0x21
> +#define STATUS_FW_ARG_INVALID	0x51
> +#define STATUS_FW_INVALID	0x52

These status codes seem, from the specification, to correspond to the 
generic error codes above, so perhaps they're not needed.


-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 12/27] powerpc/powernv/pmem: Add register addresses & status values to the header
  2020-02-27  5:08   ` Andrew Donnellan
@ 2020-02-27  5:16     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-27  5:16 UTC (permalink / raw)
  To: Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Thu, 2020-02-27 at 16:08 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > These values have been taken from the device specifications.
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> 
> I've compared these values against the internal version of the
> device 
> specifications that I have access to, and they appear to match.
> 
> A few minor comments below, otherwise:
> 
> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
> 
> > > +#define GLOBAL_MMIO_HCI_ACRW				BIT_ULL(0)
> > > +#define GLOBAL_MMIO_HCI_NSCRW				BIT_ULL(1)
> > > +#define GLOBAL_MMIO_HCI_AFU_RESET			BIT_ULL(2)
> > > +#define GLOBAL_MMIO_HCI_FW_DEBUG			BIT_ULL(3)
> > > +#define GLOBAL_MMIO_HCI_CONTROLLER_DUMP			BIT_ULL(4)
> > > +#define GLOBAL_MMIO_HCI_CONTROLLER_DUMP_COLLECTED	BIT_ULL(5)
> > > +#define GLOBAL_MMIO_HCI_REQ_HEALTH_PERF			BIT_ULL(6)
> 
> The labelling of some of these bits deviates from the standard 
> abbreviations in the spec, which is fine I guess as these names are
> more 
> descriptive, but maybe add a brief comment with the original
> abbreviation?
> 

Ok

> > +
> > +#define ADMIN_COMMAND_HEARTBEAT		0x00u
> > +#define ADMIN_COMMAND_SHUTDOWN		0x01u
> > +#define ADMIN_COMMAND_FW_UPDATE		0x02u
> > +#define ADMIN_COMMAND_FW_DEBUG		0x03u
> > +#define ADMIN_COMMAND_ERRLOG		0x04u
> > +#define ADMIN_COMMAND_SMART		0x05u
> > +#define ADMIN_COMMAND_CONTROLLER_STATS	0x06u
> > +#define ADMIN_COMMAND_CONTROLLER_DUMP	0x07u
> > +#define ADMIN_COMMAND_CMD_CAPS		0x08u
> > +#define ADMIN_COMMAND_MAX		0x08u
> > +
> > +#define STATUS_SUCCESS		0x00
> > +#define STATUS_MEM_UNAVAILABLE	0x20
> 
> There's also a "blocked on account of background task" code, 0x21.
> 

Ok

> > +#define STATUS_BAD_OPCODE	0x50
> > +#define STATUS_BAD_REQUEST_PARM	0x51
> > +#define STATUS_BAD_DATA_PARM	0x52
> > +#define STATUS_DEBUG_BLOCKED	0x70
> > +#define STATUS_FAIL		0xFF
> > +
> > +#define STATUS_FW_UPDATE_BLOCKED 0x21
> > +#define STATUS_FW_ARG_INVALID	0x51
> > +#define STATUS_FW_INVALID	0x52
> 
> These status codes seem, from the specification, to correspond to
> the 
> generic error codes above, so perhaps they're not needed.
> 

These will be used in warn_status_fw_update() later, but I'll alias
them to make it clear that they are shadowing values
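
A sketch of what that aliasing could look like, assuming the generic 0x21
code gets a name such as STATUS_BLOCKED_BG_TASK (that name is made up
here; the other names and values are the ones quoted above):

```c
/* Generic status codes, values as quoted from the patch. */
#define STATUS_BAD_REQUEST_PARM		0x51
#define STATUS_BAD_DATA_PARM		0x52
/* Hypothetical name for the 0x21 "blocked on account of background task" code. */
#define STATUS_BLOCKED_BG_TASK		0x21

/*
 * Firmware-update specific names defined in terms of the generic codes,
 * so it is obvious that they shadow existing values.
 */
#define STATUS_FW_UPDATE_BLOCKED	STATUS_BLOCKED_BG_TASK
#define STATUS_FW_ARG_INVALID		STATUS_BAD_REQUEST_PARM
#define STATUS_FW_INVALID		STATUS_BAD_DATA_PARM
```

Defining the aliases this way keeps the numeric values in one place, so a
later correction to a generic code carries over to the alias.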

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 14/27] powerpc/powernv/pmem: Add support for Admin commands
  2020-02-21  3:27 ` [PATCH v3 14/27] powerpc/powernv/pmem: Add support for Admin commands Alastair D'Silva
@ 2020-02-27  8:22   ` Andrew Donnellan
  2020-02-27  8:27     ` Andrew Donnellan
  2020-02-27 23:51     ` Alastair D'Silva
  2020-02-27 17:01   ` Dan Williams
  1 sibling, 2 replies; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-27  8:22 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> This patch requests the metadata required to issue admin commands, as well
> as some helper functions to construct and check the completion of the
> commands.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/platforms/powernv/pmem/ocxl.c    |  65 ++++++++
>   .../platforms/powernv/pmem/ocxl_internal.c    | 153 ++++++++++++++++++
>   .../platforms/powernv/pmem/ocxl_internal.h    |  61 +++++++
>   3 files changed, 279 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> index 431212c9f0cc..4e782d22605b 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -216,6 +216,58 @@ static int register_lpc_mem(struct ocxlpmem *ocxlpmem)
>   	return 0;
>   }
>   
> +/**
> + * extract_command_metadata() - Extract command data from MMIO & save it for further use
> + * @ocxlpmem: the device metadata
> + * @offset: The base address of the command data structures (address of CREQO)
> + * @command_metadata: A pointer to the command metadata to populate
> + * Return: 0 on success, negative on failure
> + */
> +static int extract_command_metadata(struct ocxlpmem *ocxlpmem, u32 offset,
> +					struct command_metadata *command_metadata)
> +{
> +	int rc;
> +	u64 tmp;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, offset, OCXL_LITTLE_ENDIAN,
> +				     &tmp);
> +	if (rc)
> +		return rc;
> +
> +	command_metadata->request_offset = tmp >> 32;
> +	command_metadata->response_offset = tmp & 0xFFFFFFFF;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, offset + 8, OCXL_LITTLE_ENDIAN,
> +				     &tmp);
> +	if (rc)
> +		return rc;
> +
> +	command_metadata->data_offset = tmp >> 32;
> +	command_metadata->data_size = tmp & 0xFFFFFFFF;
> +
> +	command_metadata->id = 0;
> +
> +	return 0;
> +}
> +
> +/**
> + * setup_command_metadata() - Set up the command metadata
> + * @ocxlpmem: the device metadata
> + */
> +static int setup_command_metadata(struct ocxlpmem *ocxlpmem)
> +{
> +	int rc;
> +
> +	mutex_init(&ocxlpmem->admin_command.lock);
> +
> +	rc = extract_command_metadata(ocxlpmem, GLOBAL_MMIO_ACMA_CREQO,
> +				      &ocxlpmem->admin_command);
> +	if (rc)
> +		return rc;
> +
> +	return 0;
> +}
> +
>   /**
>    * is_usable() - Is a controller usable?
>    * @ocxlpmem: the device metadata
> @@ -456,6 +508,14 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>   	}
>   	ocxlpmem->pdev = pdev;
>   
> +	ocxlpmem->timeouts[ADMIN_COMMAND_ERRLOG] = 2000; // ms
> +	ocxlpmem->timeouts[ADMIN_COMMAND_HEARTBEAT] = 100; // ms
> +	ocxlpmem->timeouts[ADMIN_COMMAND_SMART] = 100; // ms
> +	ocxlpmem->timeouts[ADMIN_COMMAND_CONTROLLER_DUMP] = 1000; // ms
> +	ocxlpmem->timeouts[ADMIN_COMMAND_CONTROLLER_STATS] = 100; // ms
> +	ocxlpmem->timeouts[ADMIN_COMMAND_SHUTDOWN] = 1000; // ms
> +	ocxlpmem->timeouts[ADMIN_COMMAND_FW_UPDATE] = 16000; // ms

Why are we keeping these timeouts in a per device struct? I can't see 
anywhere where we change these values.
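
If the values really are constant, one possible alternative (a sketch
only; the table and helper names are hypothetical, and the opcode and
timeout values are copied from the quoted patches) is a single static
table shared by all devices, with unset entries falling back to the
default:

```c
#include <stdint.h>

/* Opcode values as quoted from the header patch. */
#define ADMIN_COMMAND_HEARTBEAT		0x00u
#define ADMIN_COMMAND_SHUTDOWN		0x01u
#define ADMIN_COMMAND_FW_UPDATE		0x02u
#define ADMIN_COMMAND_FW_DEBUG		0x03u
#define ADMIN_COMMAND_ERRLOG		0x04u
#define ADMIN_COMMAND_SMART		0x05u
#define ADMIN_COMMAND_CONTROLLER_STATS	0x06u
#define ADMIN_COMMAND_CONTROLLER_DUMP	0x07u
#define ADMIN_COMMAND_MAX		0x08u

#define DEFAULT_TIMEOUT			100 /* ms, as in the quoted patch */

/* Hypothetical shared table; entries left out are zero-initialised. */
static const uint32_t admin_command_timeouts_ms[ADMIN_COMMAND_MAX + 1] = {
	[ADMIN_COMMAND_ERRLOG]		 = 2000,
	[ADMIN_COMMAND_HEARTBEAT]	 = 100,
	[ADMIN_COMMAND_SMART]		 = 100,
	[ADMIN_COMMAND_CONTROLLER_DUMP]	 = 1000,
	[ADMIN_COMMAND_CONTROLLER_STATS] = 100,
	[ADMIN_COMMAND_SHUTDOWN]	 = 1000,
	[ADMIN_COMMAND_FW_UPDATE]	 = 16000,
};

/* Hypothetical lookup: unknown or unset commands get the default. */
static uint32_t admin_command_timeout_ms(unsigned int command)
{
	uint32_t t = 0;

	if (command <= ADMIN_COMMAND_MAX)
		t = admin_command_timeouts_ms[command];

	return t ? t : DEFAULT_TIMEOUT;
}
```

This would only be appropriate if no later patch needs to adjust the
timeouts per device.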

> +
>   	pci_set_drvdata(pdev, ocxlpmem);
>   
>   	ocxlpmem->ocxl_fn = ocxl_function_open(pdev);
> @@ -501,6 +561,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>   		goto err;
>   	}
>   
> +	if (setup_command_metadata(ocxlpmem)) {
> +		dev_err(&pdev->dev, "Could not read OCXL command matada\n");

metadata

Also, "OCXL command metadata" is misleading, this is a pmem specific 
thing, not an OpenCAPI thing, I would prefer just "command metadata".

> +		goto err;
> +	}
> +
>   	elapsed = 0;
>   	timeout = ocxlpmem->readiness_timeout + ocxlpmem->memory_available_timeout;
>   	while (!is_usable(ocxlpmem, false)) {
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
> index 617ca943b1b8..583f48023025 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
> @@ -17,3 +17,156 @@ int ocxlpmem_chi(const struct ocxlpmem *ocxlpmem, u64 *chi)
>   
>   	return 0;
>   }
> +
> +#define COMMAND_REQUEST_SIZE (8 * sizeof(u64))
> +static int scm_command_request(const struct ocxlpmem *ocxlpmem,
> +			       struct command_metadata *cmd, u8 op_code)
> +{
> +	u64 val = op_code;
> +	int rc;
> +	u8 i;
> +
> +	cmd->op_code = op_code;
> +	cmd->id++;
> +
> +	val |= ((u64)cmd->id) << 16;
> +
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, cmd->request_offset,
> +				      OCXL_LITTLE_ENDIAN, val);
> +	if (rc)
> +		return rc;
> +
> +	for (i = sizeof(u64); i < COMMAND_REQUEST_SIZE; i += sizeof(u64)) {
> +		rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> +					      cmd->request_offset + i,
> +					      OCXL_LITTLE_ENDIAN, 0);
> +		if (rc)
> +			return rc;
> +	}
> +
> +	return 0;
> +}
> +
> +int admin_command_request(struct ocxlpmem *ocxlpmem, u8 op_code)
> +{
> +	u64 val;
> +	int rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHI,
> +					 OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		return rc;

Is ignoring the value here expected? Are you just trying to verify that you
don't see an error on the read?

> +
> +	return scm_command_request(ocxlpmem, &ocxlpmem->admin_command, op_code);
> +}
> +
> +static int command_response(const struct ocxlpmem *ocxlpmem,
> +			    const struct command_metadata *cmd)
> +{
> +	u64 val;
> +	u16 id;
> +	u8 status;
> +	int rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +					 cmd->response_offset,
> +					 OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		return rc;
> +
> +	status = val & 0xff;
> +	id = (val >> 16) & 0xffff;
> +
> +	if (id != cmd->id) {
> +		dev_warn(&ocxlpmem->dev,
> +			 "Expected response for command %d, but received response for command %d instead.\n",
> +			 cmd->id, id);

If this happens I imagine something's gone pretty wrong - this should 
probably be a dev_err()? And perhaps we want to make sure we return an 
error code rather than whatever status code we get from the MMIO?
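
Roughly what I have in mind, as a standalone sketch: decode_response() is
a hypothetical name, the field positions follow the quoted code (status in
bits 0-7, id in bits 16-31), and -EINVAL is just one possible choice of
error code.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical decoder for a response word.
 * Returns the device status (0..255) when the id matches the command
 * that was issued, or a negative errno when it does not.
 */
static int decode_response(uint64_t val, uint16_t expected_id)
{
	uint8_t status = val & 0xff;
	uint16_t id = (val >> 16) & 0xffff;

	/* A mismatched id means the status belongs to another command. */
	if (id != expected_id)
		return -EINVAL;

	return status;
}
```

Because device status codes are non-negative, the caller can tell a
protocol error apart from a status reported by the device.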

> +	}
> +
> +	return status;
> +}
> +
> +int admin_response(const struct ocxlpmem *ocxlpmem)
> +{
> +	return command_response(ocxlpmem, &ocxlpmem->admin_command);
> +}
> +
> +
> +int admin_command_execute(const struct ocxlpmem *ocxlpmem)
> +{
> +	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCI,
> +				      OCXL_LITTLE_ENDIAN, GLOBAL_MMIO_HCI_ACRW);
> +}
> +
> +static bool admin_command_complete(const struct ocxlpmem *ocxlpmem)
> +{
> +	u64 val = 0;
> +
> +	int rc = ocxlpmem_chi(ocxlpmem, &val);
> +
> +	WARN_ON(rc);
> +
> +	return (val & GLOBAL_MMIO_CHI_ACRA) != 0;
> +}
> +
> +int admin_command_complete_timeout(const struct ocxlpmem *ocxlpmem,
> +				   int command)
> +{
> +	u32 timeout = ocxlpmem->timeouts[command];
> +	// 32 is the next power of 2 greater than the 20ms minimum for msleep
> +#define TIMEOUT_SLEEP_MILLIS 32
> +	timeout /= TIMEOUT_SLEEP_MILLIS;
> +	if (!timeout)
> +		timeout = DEFAULT_TIMEOUT / TIMEOUT_SLEEP_MILLIS;
> +
> +	while (timeout-- > 0) {
> +		if (admin_command_complete(ocxlpmem))
> +			return 0;
> +		msleep(TIMEOUT_SLEEP_MILLIS);
> +	}

I think the more traditional way to implement timeouts is something more 
like:

   unsigned long timeout = jiffies + msecs_to_jiffies(<timeout period>);
   do {
     <check>
     <sleep>
   } while (time_before(jiffies, timeout));
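
Spelled out as a userspace sketch so that it can be compiled on its own:
fake_now_ms and fake_sleep_ms() are hypothetical stand-ins for jiffies and
msleep(), and the plain '<' comparison does not handle counter wraparound
the way time_before() does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for jiffies and msleep(). */
static uint64_t fake_now_ms;

static void fake_sleep_ms(uint64_t ms)
{
	fake_now_ms += ms;
}

/*
 * Poll check(arg) until it returns true or timeout_ms has elapsed.
 * Returns 0 on success, -1 on timeout.
 */
static int poll_until(bool (*check)(void *), void *arg,
		      uint64_t timeout_ms, uint64_t sleep_ms)
{
	uint64_t deadline = fake_now_ms + timeout_ms;

	do {
		if (check(arg))
			return 0;
		fake_sleep_ms(sleep_ms);
	} while (fake_now_ms < deadline);

	/* Final check: the condition may have become true during the last sleep. */
	return check(arg) ? 0 : -1;
}

/* Example condition: true once the fake clock reaches *arg. */
static bool ready_at(void *arg)
{
	return fake_now_ms >= *(uint64_t *)arg;
}
```

The deadline is computed once, so the loop stays correct even if an
individual sleep overruns, which a countdown of fixed-size iterations
does not guarantee.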

> +
> +	if (admin_command_complete(ocxlpmem))
> +		return 0;
> +
> +	return -EBUSY;
> +}
> +
> +int admin_response_handled(const struct ocxlpmem *ocxlpmem)
> +{
> +	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIC,
> +				      OCXL_LITTLE_ENDIAN, GLOBAL_MMIO_CHI_ACRA);
> +}

This looks wrong? My reading of the spec is that you're meant to *clear* 
ACRA upon completion of handling, this looks like it's setting ACRA to 1.

> +
> +void warn_status(const struct ocxlpmem *ocxlpmem, const char *message,
> +		     u8 status)
> +{
> +	const char *text = "Unknown";
> +
> +	switch (status) {
> +	case STATUS_SUCCESS:
> +		text = "Success";
> +		break;
> +
> +	case STATUS_MEM_UNAVAILABLE:
> +		text = "Persistent memory unavailable";
> +		break;
> +
> +	case STATUS_BAD_OPCODE:
> +		text = "Bad opcode";
> +		break;
> +
> +	case STATUS_BAD_REQUEST_PARM:
> +		text = "Bad request parameter";
> +		break;
> +
> +	case STATUS_BAD_DATA_PARM:
> +		text = "Bad data parameter";
> +		break;
> +
> +	case STATUS_DEBUG_BLOCKED:
> +		text = "Debug action blocked";
> +		break;
> +
> +	case STATUS_FAIL:
> +		text = "Failed";
> +		break;
> +	}
> +
> +	dev_warn(&ocxlpmem->dev, "%s: %s (%x)\n", message, text, status);
> +}
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> index ba0301533d00..2fef68c71271 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> @@ -7,6 +7,7 @@
>   #include <linux/mm.h>
>   
>   #define LABEL_AREA_SIZE	(1UL << PA_SECTION_SHIFT)
> +#define DEFAULT_TIMEOUT 100
>   
>   #define GLOBAL_MMIO_CHI		0x000
>   #define GLOBAL_MMIO_CHIC	0x008
> @@ -80,6 +81,16 @@
>   #define STATUS_FW_ARG_INVALID	0x51
>   #define STATUS_FW_INVALID	0x52
>   
> +struct command_metadata {
> +	u32 request_offset;
> +	u32 response_offset;
> +	u32 data_offset;
> +	u32 data_size;
> +	struct mutex lock;
> +	u16 id;
> +	u8 op_code;
> +};
> +
>   struct ocxlpmem_function0 {
>   	struct pci_dev *pdev;
>   	struct ocxl_fn *ocxl_fn;
> @@ -95,9 +106,11 @@ struct ocxlpmem {
>   	struct ocxl_afu *ocxl_afu;
>   	struct ocxl_context *ocxl_context;
>   	void *metadata_addr;
> +	struct command_metadata admin_command;
>   	struct resource pmem_res;
>   	struct nd_region *nd_region;
>   	char fw_version[8+1];
> +	u32 timeouts[ADMIN_COMMAND_MAX+1];
>   
>   	u32 max_controller_dump_size;
>   	u16 scm_revision; // major/minor
> @@ -122,3 +135,51 @@ struct ocxlpmem {
>    * Returns 0 on success, negative on error
>    */
>   int ocxlpmem_chi(const struct ocxlpmem *ocxlpmem, u64 *chi);
> +
> +/**
> + * admin_command_request() - Issue an admin command request
> + * @ocxlpmem: the device metadata
> + * @op_code: The op-code for the command
> + *
> + * Returns an identifier for the command, or negative on error
> + */
> +int admin_command_request(struct ocxlpmem *ocxlpmem, u8 op_code);
> +
> +/**
> + * admin_response() - Validate an admin response
> + * @ocxlpmem: the device metadata
> + * Returns the status code of the command, or negative on error
> + */
> +int admin_response(const struct ocxlpmem *ocxlpmem);
> +
> +/**
> + * admin_command_execute() - Notify the controller to start processing a pending admin command
> + * @ocxlpmem: the device metadata
> + * Returns 0 on success, negative on error
> + */
> +int admin_command_execute(const struct ocxlpmem *ocxlpmem);
> +
> +/**
> + * admin_command_complete_timeout() - Wait for an admin command to finish executing
> + * @ocxlpmem: the device metadata
> + * @command: the admin command to wait for completion (determines the timeout)
> + * Returns 0 on success, -EBUSY on timeout
> + */
> +int admin_command_complete_timeout(const struct ocxlpmem *ocxlpmem,
> +				   int command);
> +
> +/**
> + * admin_response_handled() - Notify the controller that the admin response has been handled
> + * @ocxlpmem: the device metadata
> + * Returns 0 on success, negative on failure
> + */
> +int admin_response_handled(const struct ocxlpmem *ocxlpmem);
> +
> +/**
> + * warn_status() - Emit a kernel warning showing a command status.
> + * @ocxlpmem: the device metadata
> + * @message: A message to accompany the warning
> + * @status: The command status
> + */
> +void warn_status(const struct ocxlpmem *ocxlpmem, const char *message,
> +		 u8 status);
> 

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 14/27] powerpc/powernv/pmem: Add support for Admin commands
  2020-02-27  8:22   ` Andrew Donnellan
@ 2020-02-27  8:27     ` Andrew Donnellan
  2020-02-27 23:54       ` Alastair D'Silva
  2020-02-27 23:51     ` Alastair D'Silva
  1 sibling, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-27  8:27 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 27/2/20 7:22 pm, Andrew Donnellan wrote:
>> +int admin_command_request(struct ocxlpmem *ocxlpmem, u8 op_code)
>> +{
>> +    u64 val;
>> +    int rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, 
>> GLOBAL_MMIO_CHI,
>> +                     OCXL_LITTLE_ENDIAN, &val);
>> +    if (rc)
>> +        return rc;
> 
> Is ignoring the value here expected? Are you just trying to verify that you 
> don't see an error on the read?

I see that in the next patch, in ns_command_request() you check that 
NSCRA is 1 - did you mean to check that ACRA = 1 here?


-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 15/27] powerpc/powernv/pmem: Add support for near storage commands
  2020-02-21  3:27 ` [PATCH v3 15/27] powerpc/powernv/pmem: Add support for near storage commands Alastair D'Silva
@ 2020-02-27  8:30   ` Andrew Donnellan
  2020-02-27 23:56     ` Alastair D'Silva
  2020-02-27 17:02   ` Dan Williams
  2020-03-02 17:58   ` Frederic Barrat
  2 siblings, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-27  8:30 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> +int ns_response_handled(const struct ocxlpmem *ocxlpmem)
> +{
> +	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIC,
> +				      OCXL_LITTLE_ENDIAN, GLOBAL_MMIO_CHI_NSCRA);
> +}

Same comment as on the last patch - I think we're meant to be clearing 
this bit, not setting it to 1.


-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 14/27] powerpc/powernv/pmem: Add support for Admin commands
  2020-02-21  3:27 ` [PATCH v3 14/27] powerpc/powernv/pmem: Add support for Admin commands Alastair D'Silva
  2020-02-27  8:22   ` Andrew Donnellan
@ 2020-02-27 17:01   ` Dan Williams
  2020-02-27 23:57     ` Alastair D'Silva
  1 sibling, 1 reply; 130+ messages in thread
From: Dan Williams @ 2020-02-27 17:01 UTC (permalink / raw)
  To: Alastair D'Silva
  Cc: alastair, Aneesh Kumar K . V, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Frederic Barrat,
	Andrew Donnellan, Arnd Bergmann, Greg Kroah-Hartman,
	Andrew Morton, Mauro Carvalho Chehab, David S. Miller,
	Rob Herring, Anton Blanchard, Krzysztof Kozlowski,
	Mahesh Salgaonkar, Madhavan Srinivasan, Cédric Le Goater,
	Anju T Sudhakar, Hari Bathini, Thomas Gleixner, Greg Kurz,
	Nicholas Piggin, Masahiro Yamada, Alexey Kardashevskiy,
	Linux Kernel Mailing List, linuxppc-dev, linux-nvdimm, Linux MM

On Thu, Feb 20, 2020 at 7:28 PM Alastair D'Silva <alastair@au1.ibm.com> wrote:
>
> From: Alastair D'Silva <alastair@d-silva.org>
>
> This patch requests the metadata required to issue admin commands, as well
> as some helper functions to construct and check the completion of the
> commands.

What are the admin commands? Any pointer to a spec? Why does Linux
need to support these commands?

* Re: [PATCH v3 15/27] powerpc/powernv/pmem: Add support for near storage commands
  2020-02-21  3:27 ` [PATCH v3 15/27] powerpc/powernv/pmem: Add support for near storage commands Alastair D'Silva
  2020-02-27  8:30   ` Andrew Donnellan
@ 2020-02-27 17:02   ` Dan Williams
  2020-03-02 17:58   ` Frederic Barrat
  2 siblings, 0 replies; 130+ messages in thread
From: Dan Williams @ 2020-02-27 17:02 UTC (permalink / raw)
  To: Alastair D'Silva
  Cc: alastair, Aneesh Kumar K . V, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Frederic Barrat,
	Andrew Donnellan, Arnd Bergmann, Greg Kroah-Hartman,
	Andrew Morton, Mauro Carvalho Chehab, David S. Miller,
	Rob Herring, Anton Blanchard, Krzysztof Kozlowski,
	Mahesh Salgaonkar, Madhavan Srinivasan, Cédric Le Goater,
	Anju T Sudhakar, Hari Bathini, Thomas Gleixner, Greg Kurz,
	Nicholas Piggin, Masahiro Yamada, Alexey Kardashevskiy,
	Linux Kernel Mailing List, linuxppc-dev, linux-nvdimm, Linux MM

On Thu, Feb 20, 2020 at 7:28 PM Alastair D'Silva <alastair@au1.ibm.com> wrote:
>
> From: Alastair D'Silva <alastair@d-silva.org>
>
> Similar to the previous patch, this adds support for near storage commands.

Similar comment as the last patch. This changelog does not give the
reviewer any frame of reference to review the patch.

* Re: [PATCH v3 10/27] powerpc: Add driver for OpenCAPI Persistent Memory
  2020-02-21  3:27 ` [PATCH v3 10/27] powerpc: Add driver for OpenCAPI Persistent Memory Alastair D'Silva
  2020-02-26  5:07   ` Andrew Donnellan
@ 2020-02-27 20:44   ` Frederic Barrat
  2020-02-28  0:54     ` Alastair D'Silva
  2020-02-28 18:32   ` Frederic Barrat
  2 siblings, 1 reply; 130+ messages in thread
From: Frederic Barrat @ 2020-02-27 20:44 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel



On 21/02/2020 at 04:27, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> This driver exposes LPC memory on OpenCAPI pmem cards
> as an NVDIMM, allowing the existing nvram infrastructure
> to be used.
> 
> Namespace metadata is stored on the media itself, so
> scm_reserve_metadata() maps 1 section's worth of PMEM storage
> at the start to hold this. The rest of the PMEM range is registered
> with libnvdimm as an nvdimm. scm_ndctl_config_read/write/size() provide
> callbacks to libnvdimm to access the metadata.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/platforms/powernv/Kconfig        |   3 +
>   arch/powerpc/platforms/powernv/Makefile       |   1 +
>   arch/powerpc/platforms/powernv/pmem/Kconfig   |  15 +
>   arch/powerpc/platforms/powernv/pmem/Makefile  |   7 +
>   arch/powerpc/platforms/powernv/pmem/ocxl.c    | 473 ++++++++++++++++++
>   .../platforms/powernv/pmem/ocxl_internal.h    |  28 ++
>   6 files changed, 527 insertions(+)
>   create mode 100644 arch/powerpc/platforms/powernv/pmem/Kconfig
>   create mode 100644 arch/powerpc/platforms/powernv/pmem/Makefile
>   create mode 100644 arch/powerpc/platforms/powernv/pmem/ocxl.c
>   create mode 100644 arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> 
> diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
> index 938803eab0ad..fc8976af0e52 100644
> --- a/arch/powerpc/platforms/powernv/Kconfig
> +++ b/arch/powerpc/platforms/powernv/Kconfig
> @@ -50,3 +50,6 @@ config PPC_VAS
>   config SCOM_DEBUGFS
>   	bool "Expose SCOM controllers via debugfs"
>   	depends on DEBUG_FS
> +
> +source "arch/powerpc/platforms/powernv/pmem/Kconfig"
> +
> diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile
> index c0f8120045c3..0bbd72988b6f 100644
> --- a/arch/powerpc/platforms/powernv/Makefile
> +++ b/arch/powerpc/platforms/powernv/Makefile
> @@ -21,3 +21,4 @@ obj-$(CONFIG_PPC_VAS)	+= vas.o vas-window.o vas-debug.o
>   obj-$(CONFIG_OCXL_BASE)	+= ocxl.o
>   obj-$(CONFIG_SCOM_DEBUGFS) += opal-xscom.o
>   obj-$(CONFIG_PPC_SECURE_BOOT) += opal-secvar.o
> +obj-$(CONFIG_LIBNVDIMM) += pmem/
> diff --git a/arch/powerpc/platforms/powernv/pmem/Kconfig b/arch/powerpc/platforms/powernv/pmem/Kconfig
> new file mode 100644
> index 000000000000..c5d927520920
> --- /dev/null
> +++ b/arch/powerpc/platforms/powernv/pmem/Kconfig
> @@ -0,0 +1,15 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +if LIBNVDIMM
> +
> +config OCXL_PMEM
> +	tristate "OpenCAPI Persistent Memory"
> +	depends on LIBNVDIMM && PPC_POWERNV && PCI && EEH && ZONE_DEVICE && OCXL
> +	help
> +	  Exposes devices that implement the OpenCAPI Storage Class Memory
> +	  specification as persistent memory regions. You may also want
> +	  DEV_DAX, DEV_DAX_PMEM & FS_DAX if you plan on using DAX devices
> +	  stacked on top of this driver.
> +
> +	  Select N if unsure.
> +
> +endif
> diff --git a/arch/powerpc/platforms/powernv/pmem/Makefile b/arch/powerpc/platforms/powernv/pmem/Makefile
> new file mode 100644
> index 000000000000..1c55c4193175
> --- /dev/null
> +++ b/arch/powerpc/platforms/powernv/pmem/Makefile
> @@ -0,0 +1,7 @@
> +# SPDX-License-Identifier: GPL-2.0
> +
> +ccflags-$(CONFIG_PPC_WERROR)	+= -Werror
> +
> +obj-$(CONFIG_OCXL_PMEM) += ocxlpmem.o
> +
> +ocxlpmem-y := ocxl.o
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> new file mode 100644
> index 000000000000..3c4eeb5dcc0f
> --- /dev/null
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -0,0 +1,473 @@
> +// SPDX-License-Id
> +// Copyright 2019 IBM Corp.
> +
> +/*
> + * A driver for OpenCAPI devices that implement the Storage Class
> + * Memory specification.
> + */
> +
> +#include <linux/module.h>
> +#include <misc/ocxl.h>
> +#include <linux/ndctl.h>
> +#include <linux/mm_types.h>
> +#include <linux/memory_hotplug.h>
> +#include "ocxl_internal.h"
> +
> +
> +static const struct pci_device_id ocxlpmem_pci_tbl[] = {
> +	{ PCI_DEVICE(PCI_VENDOR_ID_IBM, 0x0625), },
> +	{ }
> +};
> +
> +MODULE_DEVICE_TABLE(pci, ocxlpmem_pci_tbl);
> +
> +#define NUM_MINORS 256 // Total to reserve
> +
> +static dev_t ocxlpmem_dev;
> +static struct class *ocxlpmem_class;
> +static struct mutex minors_idr_lock;
> +static struct idr minors_idr;
> +
> +/**
> + * ndctl_config_write() - Handle a ND_CMD_SET_CONFIG_DATA command from ndctl
> + * @ocxlpmem: the device metadata
> + * @command: the incoming data to write
> + * Return: 0 on success, negative on failure
> + */
> +static int ndctl_config_write(struct ocxlpmem *ocxlpmem,
> +			      struct nd_cmd_set_config_hdr *command)
> +{
> +	if (command->in_offset + command->in_length > LABEL_AREA_SIZE)
> +		return -EINVAL;
> +
> +	memcpy_flushcache(ocxlpmem->metadata_addr + command->in_offset, command->in_buf,
> +			  command->in_length);
> +
> +	return 0;
> +}
> +
> +/**
> + * ndctl_config_read() - Handle a ND_CMD_GET_CONFIG_DATA command from ndctl
> + * @ocxlpmem: the device metadata
> + * @command: the read request
> + * Return: 0 on success, negative on failure
> + */
> +static int ndctl_config_read(struct ocxlpmem *ocxlpmem,
> +			     struct nd_cmd_get_config_data_hdr *command)
> +{
> +	if (command->in_offset + command->in_length > LABEL_AREA_SIZE)
> +		return -EINVAL;
> +
> +	memcpy_mcsafe(command->out_buf, ocxlpmem->metadata_addr + command->in_offset,
> +		      command->in_length);
> +
> +	return 0;
> +}
> +
> +/**
> + * ndctl_config_size() - Handle a ND_CMD_GET_CONFIG_SIZE command from ndctl
> + * @command: the read request
> + * Return: 0 on success, negative on failure
> + */
> +static int ndctl_config_size(struct nd_cmd_get_config_size *command)
> +{
> +	command->status = 0;
> +	command->config_size = LABEL_AREA_SIZE;
> +	command->max_xfer = PAGE_SIZE;
> +
> +	return 0;
> +}
> +
> +static int ndctl(struct nvdimm_bus_descriptor *nd_desc,
> +		 struct nvdimm *nvdimm,
> +		 unsigned int cmd, void *buf, unsigned int buf_len, int *cmd_rc)
> +{
> +	struct ocxlpmem *ocxlpmem = container_of(nd_desc, struct ocxlpmem, bus_desc);
> +
> +	switch (cmd) {
> +	case ND_CMD_GET_CONFIG_SIZE:
> +		*cmd_rc = ndctl_config_size(buf);
> +		return 0;
> +
> +	case ND_CMD_GET_CONFIG_DATA:
> +		*cmd_rc = ndctl_config_read(ocxlpmem, buf);
> +		return 0;
> +
> +	case ND_CMD_SET_CONFIG_DATA:
> +		*cmd_rc = ndctl_config_write(ocxlpmem, buf);
> +		return 0;
> +
> +	default:
> +		return -ENOTTY;
> +	}
> +}
> +
> +/**
> + * reserve_metadata() - Reserve space for nvdimm metadata
> + * @ocxlpmem: the device metadata
> + * @lpc_mem: The resource representing the LPC memory of the OpenCAPI device
> + */
> +static int reserve_metadata(struct ocxlpmem *ocxlpmem,
> +			    struct resource *lpc_mem)
> +{
> +	ocxlpmem->metadata_addr = devm_memremap(&ocxlpmem->dev, lpc_mem->start,
> +						LABEL_AREA_SIZE, MEMREMAP_WB);
> +	if (IS_ERR(ocxlpmem->metadata_addr))
> +		return PTR_ERR(ocxlpmem->metadata_addr);
> +
> +	return 0;
> +}
> +
> +/**
> + * register_lpc_mem() - Discover persistent memory on a device and register it with the NVDIMM subsystem
> + * @ocxlpmem: the device metadata
> + * Return: 0 on success
> + */
> +static int register_lpc_mem(struct ocxlpmem *ocxlpmem)
> +{
> +	struct nd_region_desc region_desc;
> +	struct nd_mapping_desc nd_mapping_desc;
> +	struct resource *lpc_mem;
> +	const struct ocxl_afu_config *config;
> +	const struct ocxl_fn_config *fn_config;
> +	int rc;
> +	unsigned long nvdimm_cmd_mask = 0;
> +	unsigned long nvdimm_flags = 0;
> +	int target_node;
> +	char serial[16+1];
> +
> +	// Set up the reserved metadata area
> +	rc = ocxl_afu_map_lpc_mem(ocxlpmem->ocxl_afu);
> +	if (rc < 0)
> +		return rc;
> +
> +	lpc_mem = ocxl_afu_lpc_mem(ocxlpmem->ocxl_afu);
> +	if (lpc_mem == NULL || lpc_mem->start == 0)
> +		return -EINVAL;
> +
> +	config = ocxl_afu_config(ocxlpmem->ocxl_afu);
> +	fn_config = ocxl_function_config(ocxlpmem->ocxl_fn);
> +
> +	rc = reserve_metadata(ocxlpmem, lpc_mem);
> +	if (rc)
> +		return rc;
> +
> +	ocxlpmem->bus_desc.provider_name = "ocxl-pmem";
> +	ocxlpmem->bus_desc.ndctl = ndctl;
> +	ocxlpmem->bus_desc.module = THIS_MODULE;
> +
> +	ocxlpmem->nvdimm_bus = nvdimm_bus_register(&ocxlpmem->dev,
> +						   &ocxlpmem->bus_desc);
> +	if (!ocxlpmem->nvdimm_bus)
> +		return -EINVAL;
> +
> +	ocxlpmem->pmem_res.start = (u64)lpc_mem->start + LABEL_AREA_SIZE;
> +	ocxlpmem->pmem_res.end = (u64)lpc_mem->start + config->lpc_mem_size - 1;
> +	ocxlpmem->pmem_res.name = "OpenCAPI persistent memory";
> +
> +	set_bit(ND_CMD_GET_CONFIG_SIZE, &nvdimm_cmd_mask);
> +	set_bit(ND_CMD_GET_CONFIG_DATA, &nvdimm_cmd_mask);
> +	set_bit(ND_CMD_SET_CONFIG_DATA, &nvdimm_cmd_mask);
> +
> +	set_bit(NDD_ALIASING, &nvdimm_flags);
> +
> +	snprintf(serial, sizeof(serial), "%llx", fn_config->serial);
> +	nd_mapping_desc.nvdimm = nvdimm_create(ocxlpmem->nvdimm_bus, ocxlpmem,
> +				 NULL, nvdimm_flags, nvdimm_cmd_mask,
> +				 0, NULL);
> +	if (!nd_mapping_desc.nvdimm)
> +		return -ENOMEM;
> +
> +	if (nvdimm_bus_check_dimm_count(ocxlpmem->nvdimm_bus, 1))
> +		return -EINVAL;
> +
> +	nd_mapping_desc.start = ocxlpmem->pmem_res.start;
> +	nd_mapping_desc.size = resource_size(&ocxlpmem->pmem_res);
> +	nd_mapping_desc.position = 0;
> +
> +	ocxlpmem->nd_set.cookie1 = fn_config->serial + 1; // allow for empty serial
> +	ocxlpmem->nd_set.cookie2 = fn_config->serial + 1;
> +
> +	target_node = of_node_to_nid(ocxlpmem->pdev->dev.of_node);
> +
> +	memset(&region_desc, 0, sizeof(region_desc));
> +	region_desc.res = &ocxlpmem->pmem_res;
> +	region_desc.numa_node = NUMA_NO_NODE;
> +	region_desc.target_node = target_node;
> +	region_desc.num_mappings = 1;
> +	region_desc.mapping = &nd_mapping_desc;
> +	region_desc.nd_set = &ocxlpmem->nd_set;
> +
> +	set_bit(ND_REGION_PAGEMAP, &region_desc.flags);
> +	/*
> +	 * NB: libnvdimm copies the data from ndr_desc into it's own
> +	 * structures so passing a stack pointer is fine.
> +	 */
> +	ocxlpmem->nd_region = nvdimm_pmem_region_create(ocxlpmem->nvdimm_bus,
> +							&region_desc);
> +	if (!ocxlpmem->nd_region)
> +		return -EINVAL;
> +
> +	dev_info(&ocxlpmem->dev,
> +		 "Onlining %lluMB of persistent memory\n",
> +		 nd_mapping_desc.size / SZ_1M);
> +
> +	return 0;
> +}
> +
> +/**
> + * allocate_minor() - Allocate a minor number to use for an OpenCAPI pmem device
> + * @ocxlpmem: the device metadata
> + * Return: the allocated minor number
> + */
> +static int allocate_minor(struct ocxlpmem *ocxlpmem)
> +{
> +	int minor;
> +
> +	mutex_lock(&minors_idr_lock);
> +	minor = idr_alloc(&minors_idr, ocxlpmem, 0, NUM_MINORS, GFP_KERNEL);
> +	mutex_unlock(&minors_idr_lock);
> +	return minor;
> +}
> +
> +static void free_minor(struct ocxlpmem *ocxlpmem)
> +{
> +	mutex_lock(&minors_idr_lock);
> +	idr_remove(&minors_idr, MINOR(ocxlpmem->dev.devt));
> +	mutex_unlock(&minors_idr_lock);
> +}
> +
> +/**
> + * free_ocxlpmem() - Free all members of an ocxlpmem struct
> + * @ocxlpmem: the device struct to clear
> + */
> +static void free_ocxlpmem(struct ocxlpmem *ocxlpmem)
> +{
> +	int rc;
> +
> +	if (ocxlpmem->nvdimm_bus)
> +		nvdimm_bus_unregister(ocxlpmem->nvdimm_bus);
> +
> +	free_minor(ocxlpmem);
> +
> +	if (ocxlpmem->metadata_addr)
> +		devm_memunmap(&ocxlpmem->dev, ocxlpmem->metadata_addr);
> +
> +	if (ocxlpmem->ocxl_context) {
> +		rc = ocxl_context_detach(ocxlpmem->ocxl_context);
> +		if (rc == -EBUSY)
> +			dev_warn(&ocxlpmem->dev, "Timeout detaching ocxl context\n");
> +		else
> +			ocxl_context_free(ocxlpmem->ocxl_context);
> +
> +	}
> +
> +	if (ocxlpmem->ocxl_afu)
> +		ocxl_afu_put(ocxlpmem->ocxl_afu);
> +
> +	if (ocxlpmem->ocxl_fn)
> +		ocxl_function_close(ocxlpmem->ocxl_fn);
> +
> +	kfree(ocxlpmem);
> +}
> +
> +/**
> + * free_ocxlpmem_dev() - Free an OpenCAPI persistent memory device
> + * @dev: The device struct
> + */
> +static void free_ocxlpmem_dev(struct device *dev)
> +{
> +	struct ocxlpmem *ocxlpmem = container_of(dev, struct ocxlpmem, dev);
> +
> +	free_ocxlpmem(ocxlpmem);
> +}
> +
> +/**
> + * ocxlpmem_register() - Register an OpenCAPI pmem device with the kernel
> + * @ocxlpmem: the device metadata
> + * Return: 0 on success, negative on failure
> + */
> +static int ocxlpmem_register(struct ocxlpmem *ocxlpmem)
> +{
> +	int rc;
> +	int minor = allocate_minor(ocxlpmem);
> +
> +	if (minor < 0)
> +		return minor;
> +
> +	ocxlpmem->dev.release = free_ocxlpmem_dev;
> +	rc = dev_set_name(&ocxlpmem->dev, "ocxlpmem%d", minor);
> +	if (rc < 0)
> +		return rc;
> +
> +	ocxlpmem->dev.devt = MKDEV(MAJOR(ocxlpmem_dev), minor);
> +	ocxlpmem->dev.class = ocxlpmem_class;



This function, as well as allocate_minor() and free_minor() above, 
references resources (the IDR, the file class, ...) which are not 
initialized yet. The function file_init() is coming in a later patch.



> +	ocxlpmem->dev.parent = &ocxlpmem->pdev->dev;
> +
> +	return device_register(&ocxlpmem->dev);
> +}
> +
> +/**
> + * ocxlpmem_remove() - Free an OpenCAPI persistent memory device
> + * @pdev: the PCI device information struct
> + */
> +static void ocxlpmem_remove(struct pci_dev *pdev)
> +{
> +	if (PCI_FUNC(pdev->devfn) == 0) {
> +		struct ocxlpmem_function0 *func0 = pci_get_drvdata(pdev);
> +
> +		if (func0) {
> +			ocxl_function_close(func0->ocxl_fn);
> +			func0->ocxl_fn = NULL;
> +		}



The struct ocxlpmem_function0 allocated on probe() should be freed.



> +	} else {
> +		struct ocxlpmem *ocxlpmem = pci_get_drvdata(pdev);
> +
> +		if (ocxlpmem)
> +			device_unregister(&ocxlpmem->dev);
> +	}
> +}
> +
> +/**
> + * probe_function0() - Set up function 0 for an OpenCAPI persistent memory device
> + * This is important as it enables templates higher than 0 across all other functions,
> + * which in turn enables higher bandwidth accesses
> + * @pdev: the PCI device information struct
> + * Return: 0 on success, negative on failure
> + */
> +static int probe_function0(struct pci_dev *pdev)
> +{
> +	struct ocxlpmem_function0 *func0 = NULL;
> +	struct ocxl_fn *fn;
> +
> +	func0 = kzalloc(sizeof(*func0), GFP_KERNEL);
> +	if (!func0)
> +		return -ENOMEM;
> +
> +	func0->pdev = pdev;



Storing the struct pci_dev for function 0 appears to be useless.



> +	fn = ocxl_function_open(pdev);
> +	if (IS_ERR(fn)) {
> +		kfree(func0);
> +		dev_err(&pdev->dev, "failed to open OCXL function\n");
> +		return PTR_ERR(fn);
> +	}
> +	func0->ocxl_fn = fn;
> +
> +	pci_set_drvdata(pdev, func0);
> +
> +	return 0;
> +}
> +
> +/**
> + * probe() - Init an OpenCAPI persistent memory device
> + * @pdev: the PCI device information struct
> + * @ent: The entry from ocxlpmem_pci_tbl
> + * Return: 0 on success, negative on failure
> + */
> +static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> +{
> +	struct ocxlpmem *ocxlpmem;
> +	int rc;
> +
> +	if (PCI_FUNC(pdev->devfn) == 0)
> +		return probe_function0(pdev);
> +	else if (PCI_FUNC(pdev->devfn) != 1)
> +		return 0;
> +
> +	ocxlpmem = kzalloc(sizeof(*ocxlpmem), GFP_KERNEL);
> +	if (!ocxlpmem) {
> +		dev_err(&pdev->dev, "Could not allocate OpenCAPI persistent memory metadata\n");
> +		rc = -ENOMEM;
> +		goto err;
> +	}
> +	ocxlpmem->pdev = pdev;


We should probably call pci_dev_get() here if we store the struct 
pci_dev pointer. We could debate how useful it really is, considering 
we're registering a device, which will also take a reference, but it 
looks like the safe thing to do considering all those resources don't 
have exactly the same life cycle and it is standard practice to 
guarantee that we won't have a dangling pointer.



> +
> +	pci_set_drvdata(pdev, ocxlpmem);
> +
> +	ocxlpmem->ocxl_fn = ocxl_function_open(pdev);
> +	if (IS_ERR(ocxlpmem->ocxl_fn)) {
> +		kfree(ocxlpmem);


ocxlpmem is freed...


> +		pci_set_drvdata(pdev, NULL);
> +		dev_err(&pdev->dev, "failed to open OCXL function\n");
> +		rc = PTR_ERR(ocxlpmem->ocxl_fn);


... and then referenced.
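
To illustrate the ordering problem flagged above, here is a minimal userspace sketch of the corrected error path. It is only a model, not the driver code: `struct fake_dev`, its `fn_err` field (standing in for `PTR_ERR(ocxlpmem->ocxl_fn)`) and `fail_open()` are hypothetical names, and `free()` stands in for `kfree()`.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ocxlpmem. */
struct fake_dev {
	int fn_err;	/* models PTR_ERR(ocxlpmem->ocxl_fn) */
};

/*
 * Error path with the ordering fixed: copy the error code out of the
 * structure first, release the structure second.
 */
static int fail_open(struct fake_dev *dev)
{
	int rc = dev->fn_err;	/* read while dev is still valid */

	free(dev);		/* dev must not be touched after this */
	return rc;
}
```

The only change from the quoted hunk is that the error code is captured before the structure is released.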



> +		goto err;
> +	}
> +
> +	ocxlpmem->ocxl_afu = ocxl_function_fetch_afu(ocxlpmem->ocxl_fn, 0);
> +	if (ocxlpmem->ocxl_afu == NULL) {
> +		dev_err(&pdev->dev, "Could not get OCXL AFU from function\n");


The error path here should match the above, to free struct ocxlpmem.


> +		rc = -ENXIO;
> +		goto err;
> +	}
> +
> +	ocxl_afu_get(ocxlpmem->ocxl_afu);
> +
> +	// Resources allocated below here are cleaned up in the release handler
> +
> +	rc = ocxlpmem_register(ocxlpmem);
> +	if (rc) {
> +		dev_err(&pdev->dev, "Could not register OpenCAPI persistent memory device with the kernel\n");
> +		goto err;
> +	}
> +
> +	rc = ocxl_context_alloc(&ocxlpmem->ocxl_context, ocxlpmem->ocxl_afu, NULL);
> +	if (rc) {
> +		dev_err(&pdev->dev, "Could not allocate OCXL context\n");
> +		goto err;
> +	}
> +
> +	rc = ocxl_context_attach(ocxlpmem->ocxl_context, 0, NULL);
> +	if (rc) {
> +		dev_err(&pdev->dev, "Could not attach ocxl context\n");
> +		goto err;
> +	}
> +
> +	rc = register_lpc_mem(ocxlpmem);
> +	if (rc) {
> +		dev_err(&pdev->dev, "Could not register OpenCAPI persistent memory with libnvdimm\n");
> +		goto err;
> +	}
> +
> +	return 0;
> +
> +err:
> +	/*
> +	 * Further cleanup is done in the release handler via free_ocxlpmem()
> +	 * This allows us to keep the character device live to handle IOCTLs to
> +	 * investigate issues if the card has an error
> +	 */


If we fail probe, we don't call device_unregister() and the data 
structures will never be freed. The comment seems to indicate it's done 
on purpose, but that looks surprising and wrong. If we fail probe, the 
kernel thinks the driver is _not_ handling the device, so we need to 
exit probe() cleanly. We're not supposed to be able to make debug 
ioctl calls at that point. Once we fail probe, the kernel is free to do 
whatever it wants with the PCI device. If you manage to extract some 
debug info during development, then fine, but it's not something we can 
rely on or upstream.
If the card enters an error state after probe(), then we don't need that 
anyway. We have all the time in the world to call ioctls, as long as we 
don't call the remove callback of the driver.
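
One common way to make probe() exit cleanly is goto-based unwinding, where each failure jumps to a label that releases only what has been acquired so far. The sketch below models that in userspace under stated assumptions: `struct fake_dev`, `fake_alloc()`, `fake_free()`, `fake_probe()`, `fake_remove()` and the `live_allocs` counter are all hypothetical, and the two members merely stand in for resources such as the ocxl context and the nvdimm region.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical device with two resources acquired during probe. */
struct fake_dev {
	void *ctx;
	void *region;
};

/* Count of live allocations, so a caller can check nothing leaked. */
static int live_allocs;

static void *fake_alloc(size_t size, int fail)
{
	void *p = fail ? NULL : calloc(1, size);

	if (p)
		live_allocs++;
	return p;
}

static void fake_free(void *p)
{
	free(p);
	live_allocs--;
}

/* fail_step selects which acquisition fails (0 = none). */
static int fake_probe(int fail_step, struct fake_dev **out)
{
	struct fake_dev *dev;

	dev = fake_alloc(sizeof(*dev), fail_step == 1);
	if (!dev)
		return -ENOMEM;

	dev->ctx = fake_alloc(16, fail_step == 2);
	if (!dev->ctx)
		goto err_free_dev;

	dev->region = fake_alloc(16, fail_step == 3);
	if (!dev->region)
		goto err_free_ctx;

	*out = dev;
	return 0;

	/* Unwind in reverse order of acquisition. */
err_free_ctx:
	fake_free(dev->ctx);
err_free_dev:
	fake_free(dev);
	return -ENOMEM;
}

/* Counterpart of a successful probe: release everything. */
static void fake_remove(struct fake_dev *dev)
{
	fake_free(dev->region);
	fake_free(dev->ctx);
	fake_free(dev);
}
```

With this shape, a failed probe leaves nothing allocated and nothing registered, so the release handler is only relied upon once probe has succeeded.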




> +
> +	dev_err(&pdev->dev,
> +		"Error detected, will not register OpenCAPI persistent memory\n");
> +	return rc;
> +}
> +
> +static struct pci_driver pci_driver = {
> +	.name = "ocxl-pmem",
> +	.id_table = ocxlpmem_pci_tbl,
> +	.probe = probe,
> +	.remove = ocxlpmem_remove,
> +	.shutdown = ocxlpmem_remove,



nitpick: why doesn't the probe callback follow the same naming 
convention? It's all static and doesn't really matter, but...

   Fred



> +};
> +
> +static int __init ocxlpmem_init(void)
> +{
> +	int rc = 0;
> +
> +	rc = pci_register_driver(&pci_driver);
> +	if (rc)
> +		return rc;
> +
> +	return 0;
> +}
> +
> +static void ocxlpmem_exit(void)
> +{
> +	pci_unregister_driver(&pci_driver);
> +}
> +
> +module_init(ocxlpmem_init);
> +module_exit(ocxlpmem_exit);
> +
> +MODULE_DESCRIPTION("OpenCAPI Persistent Memory");
> +MODULE_LICENSE("GPL");
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> new file mode 100644
> index 000000000000..0faf3740e9b8
> --- /dev/null
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> @@ -0,0 +1,28 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +// Copyright 2019 IBM Corp.
> +
> +#include <linux/pci.h>
> +#include <misc/ocxl.h>
> +#include <linux/libnvdimm.h>
> +#include <linux/mm.h>
> +
> +#define LABEL_AREA_SIZE	(1UL << PA_SECTION_SHIFT)
> +
> +struct ocxlpmem_function0 {
> +	struct pci_dev *pdev;
> +	struct ocxl_fn *ocxl_fn;
> +};
> +
> +struct ocxlpmem {
> +	struct device dev;
> +	struct pci_dev *pdev;
> +	struct ocxl_fn *ocxl_fn;
> +	struct nd_interleave_set nd_set;
> +	struct nvdimm_bus_descriptor bus_desc;
> +	struct nvdimm_bus *nvdimm_bus;
> +	struct ocxl_afu *ocxl_afu;
> +	struct ocxl_context *ocxl_context;
> +	void *metadata_addr;
> +	struct resource pmem_res;
> +	struct nd_region *nd_region;
> +};
> 

* Re: [PATCH v3 14/27] powerpc/powernv/pmem: Add support for Admin commands
  2020-02-27  8:22   ` Andrew Donnellan
  2020-02-27  8:27     ` Andrew Donnellan
@ 2020-02-27 23:51     ` Alastair D'Silva
  1 sibling, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-27 23:51 UTC (permalink / raw)
  To: Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Thu, 2020-02-27 at 19:22 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > This patch requests the metadata required to issue admin commands, as well
> > as some helper functions to construct and check the completion of the
> > commands.
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   arch/powerpc/platforms/powernv/pmem/ocxl.c    |  65 ++++++++
> >   .../platforms/powernv/pmem/ocxl_internal.c    | 153 ++++++++++++++++++
> >   .../platforms/powernv/pmem/ocxl_internal.h    |  61 +++++++
> >   3 files changed, 279 insertions(+)
> > 
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > index 431212c9f0cc..4e782d22605b 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > @@ -216,6 +216,58 @@ static int register_lpc_mem(struct ocxlpmem *ocxlpmem)
> >   	return 0;
> >   }
> >   
> > +/**
> > + * extract_command_metadata() - Extract command data from MMIO & save it for further use
> > + * @ocxlpmem: the device metadata
> > + * @offset: The base address of the command data structures (address of CREQO)
> > + * @command_metadata: A pointer to the command metadata to populate
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int extract_command_metadata(struct ocxlpmem *ocxlpmem, u32 offset,
> > +					struct command_metadata *command_metadata)
> > +{
> > +	int rc;
> > +	u64 tmp;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, offset, OCXL_LITTLE_ENDIAN,
> > +				     &tmp);
> > +	if (rc)
> > +		return rc;
> > +
> > +	command_metadata->request_offset = tmp >> 32;
> > +	command_metadata->response_offset = tmp & 0xFFFFFFFF;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, offset + 8, OCXL_LITTLE_ENDIAN,
> > +				     &tmp);
> > +	if (rc)
> > +		return rc;
> > +
> > +	command_metadata->data_offset = tmp >> 32;
> > +	command_metadata->data_size = tmp & 0xFFFFFFFF;
> > +
> > +	command_metadata->id = 0;
> > +
> > +	return 0;
> > +}
> > +
> > +/**
> > + * setup_command_metadata() - Set up the command metadata
> > + * @ocxlpmem: the device metadata
> > + */
> > +static int setup_command_metadata(struct ocxlpmem *ocxlpmem)
> > +{
> > +	int rc;
> > +
> > +	mutex_init(&ocxlpmem->admin_command.lock);
> > +
> > +	rc = extract_command_metadata(ocxlpmem, GLOBAL_MMIO_ACMA_CREQO,
> > +				      &ocxlpmem->admin_command);
> > +	if (rc)
> > +		return rc;
> > +
> > +	return 0;
> > +}
> > +
> >   /**
> >    * is_usable() - Is a controller usable?
> >    * @ocxlpmem: the device metadata
> > @@ -456,6 +508,14 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> >   	}
> >   	ocxlpmem->pdev = pdev;
> >   
> > +	ocxlpmem->timeouts[ADMIN_COMMAND_ERRLOG] = 2000; // ms
> > +	ocxlpmem->timeouts[ADMIN_COMMAND_HEARTBEAT] = 100; // ms
> > +	ocxlpmem->timeouts[ADMIN_COMMAND_SMART] = 100; // ms
> > +	ocxlpmem->timeouts[ADMIN_COMMAND_CONTROLLER_DUMP] = 1000; // ms
> > +	ocxlpmem->timeouts[ADMIN_COMMAND_CONTROLLER_STATS] = 100; // ms
> > +	ocxlpmem->timeouts[ADMIN_COMMAND_SHUTDOWN] = 1000; // ms
> > +	ocxlpmem->timeouts[ADMIN_COMMAND_FW_UPDATE] = 16000; // ms
> 
> Why are we keeping these timeouts in a per device struct? I can't see
> anywhere where we change these values.
> 

These are overwritten in a later patch, which I've missed! Thanks for
pointing this out.

These initial values will be overwritten by card specific timeouts.

> > +
> >   	pci_set_drvdata(pdev, ocxlpmem);
> >   
> >   	ocxlpmem->ocxl_fn = ocxl_function_open(pdev);
> > @@ -501,6 +561,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> >   		goto err;
> >   	}
> >   
> > +	if (setup_command_metadata(ocxlpmem)) {
> > +		dev_err(&pdev->dev, "Could not read OCXL command matada\n");
> 
> metadata

Wow, not sure how that happened.

> 
> Also, "OCXL command metadata" is misleading, this is a pmem specific 
> thing, not an OpenCAPI thing, I would prefer just "command metadata".
> 

Ok

> > +		goto err;
> > +	}
> > +
> >   	elapsed = 0;
> >   	timeout = ocxlpmem->readiness_timeout + ocxlpmem->memory_available_timeout;
> >   	while (!is_usable(ocxlpmem, false)) {
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
> > index 617ca943b1b8..583f48023025 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
> > @@ -17,3 +17,156 @@ int ocxlpmem_chi(const struct ocxlpmem *ocxlpmem, u64 *chi)
> >   
> >   	return 0;
> >   }
> > +
> > +#define COMMAND_REQUEST_SIZE (8 * sizeof(u64))
> > +static int scm_command_request(const struct ocxlpmem *ocxlpmem,
> > +			       struct command_metadata *cmd, u8 op_code)
> > +{
> > +	u64 val = op_code;
> > +	int rc;
> > +	u8 i;
> > +
> > +	cmd->op_code = op_code;
> > +	cmd->id++;
> > +
> > +	val |= ((u64)cmd->id) << 16;
> > +
> > +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, cmd->request_offset,
> > +				      OCXL_LITTLE_ENDIAN, val);
> > +	if (rc)
> > +		return rc;
> > +
> > +	for (i = sizeof(u64); i < COMMAND_REQUEST_SIZE; i += sizeof(u64)) {
> > +		rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > +					      cmd->request_offset + i,
> > +					      OCXL_LITTLE_ENDIAN, 0);
> > +		if (rc)
> > +			return rc;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +int admin_command_request(struct ocxlpmem *ocxlpmem, u8 op_code)
> > +{
> > +	u64 val;
> > +	int rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHI,
> > +					 OCXL_LITTLE_ENDIAN, &val);
> > +	if (rc)
> > +		return rc;
> 
> Is ignoring the value here expected? You're just trying to verify that you
> don't see an error on the read?
> 

This was some vestigial code that should be removed.

> > +
> > +	return scm_command_request(ocxlpmem, &ocxlpmem->admin_command, op_code);
> > +}
> > +
> > +static int command_response(const struct ocxlpmem *ocxlpmem,
> > +			    const struct command_metadata *cmd)
> > +{
> > +	u64 val;
> > +	u16 id;
> > +	u8 status;
> > +	int rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +					 cmd->response_offset,
> > +					 OCXL_LITTLE_ENDIAN, &val);
> > +	if (rc)
> > +		return rc;
> > +
> > +	status = val & 0xff;
> > +	id = (val >> 16) & 0xffff;
> > +
> > +	if (id != cmd->id) {
> > +		dev_warn(&ocxlpmem->dev,
> > +			 "Expected response for command %d, but received response for command %d instead.\n",
> > +			 cmd->id, id);
> 
> If this happens I imagine something's gone pretty wrong - this should
> probably be a dev_err()? And perhaps we want to make sure we return an
> error code rather than whatever status code we get from the MMIO?
> 

Ok
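
A rough userspace sketch of what that could look like is below. The bit layout (status in bits 0-7, command id in bits 16-31) is taken from the quoted command_response(); the helper name `decode_response()` and the choice of -EINVAL for the mismatch case are assumptions made for illustration only.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Decode a response word: status in bits 0-7, command id in bits 16-31.
 * Returns the non-negative status when the id matches, or -EINVAL when
 * the response belongs to a different command.
 */
static int decode_response(uint64_t val, uint16_t expected_id)
{
	uint8_t status = val & 0xff;
	uint16_t id = (val >> 16) & 0xffff;

	if (id != expected_id)
		return -EINVAL;	/* assumed error code, not from the patch */

	return status;
}
```

Callers can then tell a device status (zero or positive) apart from a protocol error (negative) without inspecting the id themselves.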

> > +	}
> > +
> > +	return status;
> > +}
> > +
> > +int admin_response(const struct ocxlpmem *ocxlpmem)
> > +{
> > +	return command_response(ocxlpmem, &ocxlpmem->admin_command);
> > +}
> > +
> > +
> > +int admin_command_execute(const struct ocxlpmem *ocxlpmem)
> > +{
> > +	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCI,
> > +				      OCXL_LITTLE_ENDIAN, GLOBAL_MMIO_HCI_ACRW);
> > +}
> > +
> > +static bool admin_command_complete(const struct ocxlpmem *ocxlpmem)
> > +{
> > +	u64 val = 0;
> > +
> > +	int rc = ocxlpmem_chi(ocxlpmem, &val);
> > +
> > +	WARN_ON(rc);
> > +
> > +	return (val & GLOBAL_MMIO_CHI_ACRA) != 0;
> > +}
> > +
> > +int admin_command_complete_timeout(const struct ocxlpmem *ocxlpmem,
> > +				   int command)
> > +{
> > +	u32 timeout = ocxlpmem->timeouts[command];
> > +	// 32 is the next power of 2 greater than the 20ms minimum for msleep
> > +#define TIMEOUT_SLEEP_MILLIS 32
> > +	timeout /= TIMEOUT_SLEEP_MILLIS;
> > +	if (!timeout)
> > +		timeout = DEFAULT_TIMEOUT / TIMEOUT_SLEEP_MILLIS;
> > +
> > +	while (timeout-- > 0) {
> > +		if (admin_command_complete(ocxlpmem))
> > +			return 0;
> > +		msleep(TIMEOUT_SLEEP_MILLIS);
> > +	}
> 
> I think the more traditional way to implement timeouts is something
> more 
> like:
> 
>    unsigned long timeout = jiffies + msecs_to_jiffies(<timeout period>);
>    do {
>      <check>
>      <sleep>
>    } while (time_before(jiffies, timeout));
> 

ok
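
As a rough userspace model of the deadline-style loop suggested above
(an assumption-laden sketch, not the driver code): now_ms() stands in
for jiffies, nanosleep() for msleep(), and poll_timeout() and
countdown_done() are hypothetical names. A final check after the loop
is kept so a completion that lands during the last sleep is not
reported as a timeout.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in milliseconds; plays the role of jiffies here. */
uint64_t now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000;
}

/* Poll check(ctx) until it returns true or timeout_ms has elapsed. */
int poll_timeout(bool (*check)(void *ctx), void *ctx,
		 unsigned int timeout_ms, unsigned int sleep_ms)
{
	uint64_t deadline = now_ms() + timeout_ms;
	struct timespec ts = {
		.tv_sec = sleep_ms / 1000,
		.tv_nsec = (long)(sleep_ms % 1000) * 1000000L,
	};

	do {
		if (check(ctx))
			return 0;
		nanosleep(&ts, NULL);
	} while (now_ms() < deadline);

	/* One last look in case completion raced with the final sleep. */
	return check(ctx) ? 0 : -EBUSY;
}

/* Hypothetical completion check: reports done after *ctx polls. */
bool countdown_done(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0)
		(*remaining)--;
	return *remaining == 0;
}
```

Compared with counting iterations, the deadline does not drift when an
individual sleep overshoots its requested duration.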

> > +
> > +	if (admin_command_complete(ocxlpmem))
> > +		return 0;
> > +
> > +	return -EBUSY;
> > +}
> > +
> > +int admin_response_handled(const struct ocxlpmem *ocxlpmem)
> > +{
> > +	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIC,
> > +				      OCXL_LITTLE_ENDIAN, GLOBAL_MMIO_CHI_ACRA);
> > +}
> 
> This looks wrong? My reading of the spec is that you're meant to *clear*
> ACRA upon completion of handling, this looks like it's setting ACRA to 1.
> 

Writing a 1 to the CHIC register clears the respective bit in the CHI
register. I'll add a comment.
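
To make the write-1-to-clear behaviour concrete, here is a small
userspace model. It is only a sketch: struct chi_model and chic_write()
are hypothetical, and the CHI_ACRA bit position below is a placeholder,
not the value from the spec.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit position; the real one comes from the device spec. */
#define CHI_ACRA (1ULL << 0)

struct chi_model {
	uint64_t chi;	/* status bits latched by the controller */
};

/* Model of a write to CHIC: every 1 bit clears the matching CHI bit. */
void chic_write(struct chi_model *m, uint64_t mask)
{
	m->chi &= ~mask;
}
```

Writing 0 bits leaves the corresponding CHI bits untouched, so other
pending status bits survive the acknowledgement.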

> > +
> > +void warn_status(const struct ocxlpmem *ocxlpmem, const char
> > *message,
> > +		     u8 status)
> > +{
> > +	const char *text = "Unknown";
> > +
> > +	switch (status) {
> > +	case STATUS_SUCCESS:
> > +		text = "Success";
> > +		break;
> > +
> > +	case STATUS_MEM_UNAVAILABLE:
> > +		text = "Persistent memory unavailable";
> > +		break;
> > +
> > +	case STATUS_BAD_OPCODE:
> > +		text = "Bad opcode";
> > +		break;
> > +
> > +	case STATUS_BAD_REQUEST_PARM:
> > +		text = "Bad request parameter";
> > +		break;
> > +
> > +	case STATUS_BAD_DATA_PARM:
> > +		text = "Bad data parameter";
> > +		break;
> > +
> > +	case STATUS_DEBUG_BLOCKED:
> > +		text = "Debug action blocked";
> > +		break;
> > +
> > +	case STATUS_FAIL:
> > +		text = "Failed";
> > +		break;
> > +	}
> > +
> > +	dev_warn(&ocxlpmem->dev, "%s: %s (%x)\n", message, text,
> > status);
> > +}
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > index ba0301533d00..2fef68c71271 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > @@ -7,6 +7,7 @@
> >   #include <linux/mm.h>
> >   
> >   #define LABEL_AREA_SIZE	(1UL << PA_SECTION_SHIFT)
> > +#define DEFAULT_TIMEOUT 100
> >   
> >   #define GLOBAL_MMIO_CHI		0x000
> >   #define GLOBAL_MMIO_CHIC	0x008
> > @@ -80,6 +81,16 @@
> >   #define STATUS_FW_ARG_INVALID	0x51
> >   #define STATUS_FW_INVALID	0x52
> >   
> > +struct command_metadata {
> > +	u32 request_offset;
> > +	u32 response_offset;
> > +	u32 data_offset;
> > +	u32 data_size;
> > +	struct mutex lock;
> > +	u16 id;
> > +	u8 op_code;
> > +};
> > +
> >   struct ocxlpmem_function0 {
> >   	struct pci_dev *pdev;
> >   	struct ocxl_fn *ocxl_fn;
> > @@ -95,9 +106,11 @@ struct ocxlpmem {
> >   	struct ocxl_afu *ocxl_afu;
> >   	struct ocxl_context *ocxl_context;
> >   	void *metadata_addr;
> > +	struct command_metadata admin_command;
> >   	struct resource pmem_res;
> >   	struct nd_region *nd_region;
> >   	char fw_version[8+1];
> > +	u32 timeouts[ADMIN_COMMAND_MAX+1];
> >   
> >   	u32 max_controller_dump_size;
> >   	u16 scm_revision; // major/minor
> > @@ -122,3 +135,51 @@ struct ocxlpmem {
> >    * Returns 0 on success, negative on error
> >    */
> >   int ocxlpmem_chi(const struct ocxlpmem *ocxlpmem, u64 *chi);
> > +
> > +/**
> > + * admin_command_request() - Issue an admin command request
> > + * @ocxlpmem: the device metadata
> > + * @op_code: The op-code for the command
> > + *
> > + * Returns an identifier for the command, or negative on error
> > + */
> > +int admin_command_request(struct ocxlpmem *ocxlpmem, u8 op_code);
> > +
> > +/**
> > + * admin_response() - Validate an admin response
> > + * @ocxlpmem: the device metadata
> > + * Returns the status code of the command, or negative on error
> > + */
> > +int admin_response(const struct ocxlpmem *ocxlpmem);
> > +
> > +/**
> > + * admin_command_execute() - Notify the controller to start
> > processing a pending admin command
> > + * @ocxlpmem: the device metadata
> > + * Returns 0 on success, negative on error
> > + */
> > +int admin_command_execute(const struct ocxlpmem *ocxlpmem);
> > +
> > +/**
> > + * admin_command_complete_timeout() - Wait for an admin command to
> > finish executing
> > + * @ocxlpmem: the device metadata
> > + * @command: the admin command to wait for completion (determines
> > the timeout)
> > + * Returns 0 on success, -EBUSY on timeout
> > + */
> > +int admin_command_complete_timeout(const struct ocxlpmem
> > *ocxlpmem,
> > +				   int command);
> > +
> > +/**
> > + * admin_response_handled() - Notify the controller that the admin
> > response has been handled
> > + * @ocxlpmem: the device metadata
> > + * Returns 0 on success, negative on failure
> > + */
> > +int admin_response_handled(const struct ocxlpmem *ocxlpmem);
> > +
> > +/**
> > + * warn_status() - Emit a kernel warning showing a command status.
> > + * @ocxlpmem: the device metadata
> > + * @message: A message to accompany the warning
> > + * @status: The command status
> > + */
> > +void warn_status(const struct ocxlpmem *ocxlpmem, const char
> > *message,
> > +		 u8 status);
> > 
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org


* Re: [PATCH v3 14/27] powerpc/powernv/pmem: Add support for Admin commands
  2020-02-27  8:27     ` Andrew Donnellan
@ 2020-02-27 23:54       ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-27 23:54 UTC (permalink / raw)
  To: Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Thu, 2020-02-27 at 19:27 +1100, Andrew Donnellan wrote:
> On 27/2/20 7:22 pm, Andrew Donnellan wrote:
> > > +int admin_command_request(struct ocxlpmem *ocxlpmem, u8 op_code)
> > > +{
> > > +    u64 val;
> > > +    int rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHI,
> > > +                     OCXL_LITTLE_ENDIAN, &val);
> > > +    if (rc)
> > > +        return rc;
> > 
> > Ignoring the value here expected, you're just trying to verify that
> > you 
> > don't see an error on the read?
> 
> I see that in the next patch, in ns_command_request() you check that 
> NSCRA is 1 - did you mean to check that ACRA = 1 here?
> 
> 

I was in one version, but that was causing problems at startup since
there was no successful prior command to assert ACRA.

I should remove the NSCRA check too.

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819


* Re: [PATCH v3 15/27] powerpc/powernv/pmem: Add support for near storage commands
  2020-02-27  8:30   ` Andrew Donnellan
@ 2020-02-27 23:56     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-27 23:56 UTC (permalink / raw)
  To: Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Thu, 2020-02-27 at 19:30 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > +int ns_response_handled(const struct ocxlpmem *ocxlpmem)
> > +{
> > +	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIC,
> > +				      OCXL_LITTLE_ENDIAN, GLOBAL_MMIO_CHI_NSCRA);
> > +}
> 
> Same comment as on the last patch - I think we're meant to be
> clearing 
> this bit, not setting it to 1,
> 

Same reply :) Writing to the CHIC register clears the bit in CHI.

> 
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819


* RE: [PATCH v3 14/27] powerpc/powernv/pmem: Add support for Admin commands
  2020-02-27 17:01   ` Dan Williams
@ 2020-02-27 23:57     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-27 23:57 UTC (permalink / raw)
  To: Dan Williams
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Andrew Donnellan,
	Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy, Linux Kernel Mailing List,
	linuxppc-dev, linux-nvdimm, Linux MM

On Thu, 2020-02-27 at 09:01 -0800, Dan Williams wrote:
> On Thu, Feb 20, 2020 at 7:28 PM Alastair D'Silva <alastair@au1.ibm.com> wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> >
> > This patch requests the metadata required to issue admin commands, as well
> > as some helper functions to construct and check the completion of the
> > commands.
> 
> What are the admin commands? Any pointer to a spec? Why does Linux
> need to support these commands?


I'll flesh these out for the next spin, thanks.

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819


* Re: [PATCH v3 10/27] powerpc: Add driver for OpenCAPI Persistent Memory
  2020-02-27 20:44   ` Frederic Barrat
@ 2020-02-28  0:54     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-02-28  0:54 UTC (permalink / raw)
  To: Frederic Barrat
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Thu, 2020-02-27 at 21:44 +0100, Frederic Barrat wrote:
> 
> Le 21/02/2020 à 04:27, Alastair D'Silva a écrit :
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > This driver exposes LPC memory on OpenCAPI pmem cards
> > as an NVDIMM, allowing the existing nvram infrastructure
> > to be used.
> > 
> > Namespace metadata is stored on the media itself, so
> > scm_reserve_metadata() maps 1 section's worth of PMEM storage
> > at the start to hold this. The rest of the PMEM range is registered
> > with libnvdimm as an nvdimm. scm_ndctl_config_read/write/size()
> > provide
> > callbacks to libnvdimm to access the metadata.
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   arch/powerpc/platforms/powernv/Kconfig        |   3 +
> >   arch/powerpc/platforms/powernv/Makefile       |   1 +
> >   arch/powerpc/platforms/powernv/pmem/Kconfig   |  15 +
> >   arch/powerpc/platforms/powernv/pmem/Makefile  |   7 +
> >   arch/powerpc/platforms/powernv/pmem/ocxl.c    | 473
> > ++++++++++++++++++
> >   .../platforms/powernv/pmem/ocxl_internal.h    |  28 ++
> >   6 files changed, 527 insertions(+)
> >   create mode 100644 arch/powerpc/platforms/powernv/pmem/Kconfig
> >   create mode 100644 arch/powerpc/platforms/powernv/pmem/Makefile
> >   create mode 100644 arch/powerpc/platforms/powernv/pmem/ocxl.c
> >   create mode 100644
> > arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > 
> > diff --git a/arch/powerpc/platforms/powernv/Kconfig
> > b/arch/powerpc/platforms/powernv/Kconfig
> > index 938803eab0ad..fc8976af0e52 100644
> > --- a/arch/powerpc/platforms/powernv/Kconfig
> > +++ b/arch/powerpc/platforms/powernv/Kconfig
> > @@ -50,3 +50,6 @@ config PPC_VAS
> >   config SCOM_DEBUGFS
> >   	bool "Expose SCOM controllers via debugfs"
> >   	depends on DEBUG_FS
> > +
> > +source "arch/powerpc/platforms/powernv/pmem/Kconfig"
> > +
> > diff --git a/arch/powerpc/platforms/powernv/Makefile
> > b/arch/powerpc/platforms/powernv/Makefile
> > index c0f8120045c3..0bbd72988b6f 100644
> > --- a/arch/powerpc/platforms/powernv/Makefile
> > +++ b/arch/powerpc/platforms/powernv/Makefile
> > @@ -21,3 +21,4 @@ obj-$(CONFIG_PPC_VAS)	+= vas.o vas-window.o
> > vas-debug.o
> >   obj-$(CONFIG_OCXL_BASE)	+= ocxl.o
> >   obj-$(CONFIG_SCOM_DEBUGFS) += opal-xscom.o
> >   obj-$(CONFIG_PPC_SECURE_BOOT) += opal-secvar.o
> > +obj-$(CONFIG_LIBNVDIMM) += pmem/
> > diff --git a/arch/powerpc/platforms/powernv/pmem/Kconfig
> > b/arch/powerpc/platforms/powernv/pmem/Kconfig
> > new file mode 100644
> > index 000000000000..c5d927520920
> > --- /dev/null
> > +++ b/arch/powerpc/platforms/powernv/pmem/Kconfig
> > @@ -0,0 +1,15 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +if LIBNVDIMM
> > +
> > +config OCXL_PMEM
> > +	tristate "OpenCAPI Persistent Memory"
> > +	depends on LIBNVDIMM && PPC_POWERNV && PCI && EEH &&
> > ZONE_DEVICE && OCXL
> > +	help
> > +	  Exposes devices that implement the OpenCAPI Storage Class
> > Memory
> > +	  specification as persistent memory regions. You may also want
> > +	  DEV_DAX, DEV_DAX_PMEM & FS_DAX if you plan on using DAX
> > devices
> > +	  stacked on top of this driver.
> > +
> > +	  Select N if unsure.
> > +
> > +endif
> > diff --git a/arch/powerpc/platforms/powernv/pmem/Makefile
> > b/arch/powerpc/platforms/powernv/pmem/Makefile
> > new file mode 100644
> > index 000000000000..1c55c4193175
> > --- /dev/null
> > +++ b/arch/powerpc/platforms/powernv/pmem/Makefile
> > @@ -0,0 +1,7 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +
> > +ccflags-$(CONFIG_PPC_WERROR)	+= -Werror
> > +
> > +obj-$(CONFIG_OCXL_PMEM) += ocxlpmem.o
> > +
> > +ocxlpmem-y := ocxl.o
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > new file mode 100644
> > index 000000000000..3c4eeb5dcc0f
> > --- /dev/null
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > @@ -0,0 +1,473 @@
> > +// SPDX-License-Id
> > +// Copyright 2019 IBM Corp.
> > +
> > +/*
> > + * A driver for OpenCAPI devices that implement the Storage Class
> > + * Memory specification.
> > + */
> > +
> > +#include <linux/module.h>
> > +#include <misc/ocxl.h>
> > +#include <linux/ndctl.h>
> > +#include <linux/mm_types.h>
> > +#include <linux/memory_hotplug.h>
> > +#include "ocxl_internal.h"
> > +
> > +
> > +static const struct pci_device_id ocxlpmem_pci_tbl[] = {
> > +	{ PCI_DEVICE(PCI_VENDOR_ID_IBM, 0x0625), },
> > +	{ }
> > +};
> > +
> > +MODULE_DEVICE_TABLE(pci, ocxlpmem_pci_tbl);
> > +
> > +#define NUM_MINORS 256 // Total to reserve
> > +
> > +static dev_t ocxlpmem_dev;
> > +static struct class *ocxlpmem_class;
> > +static struct mutex minors_idr_lock;
> > +static struct idr minors_idr;
> > +
> > +/**
> > + * ndctl_config_write() - Handle a ND_CMD_SET_CONFIG_DATA command
> > from ndctl
> > + * @ocxlpmem: the device metadata
> > + * @command: the incoming data to write
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int ndctl_config_write(struct ocxlpmem *ocxlpmem,
> > +			      struct nd_cmd_set_config_hdr *command)
> > +{
> > +	if (command->in_offset + command->in_length > LABEL_AREA_SIZE)
> > +		return -EINVAL;
> > +
> > +	memcpy_flushcache(ocxlpmem->metadata_addr + command->in_offset, 
> > command->in_buf,
> > +			  command->in_length);
> > +
> > +	return 0;
> > +}
> > +
> > +/**
> > + * ndctl_config_read() - Handle a ND_CMD_GET_CONFIG_DATA command
> > from ndctl
> > + * @ocxlpmem: the device metadata
> > + * @command: the read request
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int ndctl_config_read(struct ocxlpmem *ocxlpmem,
> > +			     struct nd_cmd_get_config_data_hdr
> > *command)
> > +{
> > +	if (command->in_offset + command->in_length > LABEL_AREA_SIZE)
> > +		return -EINVAL;
> > +
> > +	memcpy_mcsafe(command->out_buf, ocxlpmem->metadata_addr +
> > command->in_offset,
> > +		      command->in_length);
> > +
> > +	return 0;
> > +}
> > +
> > +/**
> > + * ndctl_config_size() - Handle a ND_CMD_GET_CONFIG_SIZE command
> > from ndctl
> > + * @command: the read request
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int ndctl_config_size(struct nd_cmd_get_config_size
> > *command)
> > +{
> > +	command->status = 0;
> > +	command->config_size = LABEL_AREA_SIZE;
> > +	command->max_xfer = PAGE_SIZE;
> > +
> > +	return 0;
> > +}
> > +
> > +static int ndctl(struct nvdimm_bus_descriptor *nd_desc,
> > +		 struct nvdimm *nvdimm,
> > +		 unsigned int cmd, void *buf, unsigned int buf_len, int
> > *cmd_rc)
> > +{
> > +	struct ocxlpmem *ocxlpmem = container_of(nd_desc, struct
> > ocxlpmem, bus_desc);
> > +
> > +	switch (cmd) {
> > +	case ND_CMD_GET_CONFIG_SIZE:
> > +		*cmd_rc = ndctl_config_size(buf);
> > +		return 0;
> > +
> > +	case ND_CMD_GET_CONFIG_DATA:
> > +		*cmd_rc = ndctl_config_read(ocxlpmem, buf);
> > +		return 0;
> > +
> > +	case ND_CMD_SET_CONFIG_DATA:
> > +		*cmd_rc = ndctl_config_write(ocxlpmem, buf);
> > +		return 0;
> > +
> > +	default:
> > +		return -ENOTTY;
> > +	}
> > +}
> > +
> > +/**
> > + * reserve_metadata() - Reserve space for nvdimm metadata
> > + * @ocxlpmem: the device metadata
> > + * @lpc_mem: The resource representing the LPC memory of the
> > OpenCAPI device
> > + */
> > +static int reserve_metadata(struct ocxlpmem *ocxlpmem,
> > +			    struct resource *lpc_mem)
> > +{
> > +	ocxlpmem->metadata_addr = devm_memremap(&ocxlpmem->dev,
> > lpc_mem->start,
> > +						LABEL_AREA_SIZE,
> > MEMREMAP_WB);
> > +	if (IS_ERR(ocxlpmem->metadata_addr))
> > +		return PTR_ERR(ocxlpmem->metadata_addr);
> > +
> > +	return 0;
> > +}
> > +
> > +/**
> > + * register_lpc_mem() - Discover persistent memory on a device and
> > register it with the NVDIMM subsystem
> > + * @ocxlpmem: the device metadata
> > + * Return: 0 on success
> > + */
> > +static int register_lpc_mem(struct ocxlpmem *ocxlpmem)
> > +{
> > +	struct nd_region_desc region_desc;
> > +	struct nd_mapping_desc nd_mapping_desc;
> > +	struct resource *lpc_mem;
> > +	const struct ocxl_afu_config *config;
> > +	const struct ocxl_fn_config *fn_config;
> > +	int rc;
> > +	unsigned long nvdimm_cmd_mask = 0;
> > +	unsigned long nvdimm_flags = 0;
> > +	int target_node;
> > +	char serial[16+1];
> > +
> > +	// Set up the reserved metadata area
> > +	rc = ocxl_afu_map_lpc_mem(ocxlpmem->ocxl_afu);
> > +	if (rc < 0)
> > +		return rc;
> > +
> > +	lpc_mem = ocxl_afu_lpc_mem(ocxlpmem->ocxl_afu);
> > +	if (lpc_mem == NULL || lpc_mem->start == 0)
> > +		return -EINVAL;
> > +
> > +	config = ocxl_afu_config(ocxlpmem->ocxl_afu);
> > +	fn_config = ocxl_function_config(ocxlpmem->ocxl_fn);
> > +
> > +	rc = reserve_metadata(ocxlpmem, lpc_mem);
> > +	if (rc)
> > +		return rc;
> > +
> > +	ocxlpmem->bus_desc.provider_name = "ocxl-pmem";
> > +	ocxlpmem->bus_desc.ndctl = ndctl;
> > +	ocxlpmem->bus_desc.module = THIS_MODULE;
> > +
> > +	ocxlpmem->nvdimm_bus = nvdimm_bus_register(&ocxlpmem->dev,
> > +						   &ocxlpmem-
> > >bus_desc);
> > +	if (!ocxlpmem->nvdimm_bus)
> > +		return -EINVAL;
> > +
> > +	ocxlpmem->pmem_res.start = (u64)lpc_mem->start +
> > LABEL_AREA_SIZE;
> > +	ocxlpmem->pmem_res.end = (u64)lpc_mem->start + config-
> > >lpc_mem_size - 1;
> > +	ocxlpmem->pmem_res.name = "OpenCAPI persistent memory";
> > +
> > +	set_bit(ND_CMD_GET_CONFIG_SIZE, &nvdimm_cmd_mask);
> > +	set_bit(ND_CMD_GET_CONFIG_DATA, &nvdimm_cmd_mask);
> > +	set_bit(ND_CMD_SET_CONFIG_DATA, &nvdimm_cmd_mask);
> > +
> > +	set_bit(NDD_ALIASING, &nvdimm_flags);
> > +
> > +	snprintf(serial, sizeof(serial), "%llx", fn_config->serial);
> > +	nd_mapping_desc.nvdimm = nvdimm_create(ocxlpmem->nvdimm_bus,
> > ocxlpmem,
> > +				 NULL, nvdimm_flags, nvdimm_cmd_mask,
> > +				 0, NULL);
> > +	if (!nd_mapping_desc.nvdimm)
> > +		return -ENOMEM;
> > +
> > +	if (nvdimm_bus_check_dimm_count(ocxlpmem->nvdimm_bus, 1))
> > +		return -EINVAL;
> > +
> > +	nd_mapping_desc.start = ocxlpmem->pmem_res.start;
> > +	nd_mapping_desc.size = resource_size(&ocxlpmem->pmem_res);
> > +	nd_mapping_desc.position = 0;
> > +
> > +	ocxlpmem->nd_set.cookie1 = fn_config->serial + 1; // allow for
> > empty serial
> > +	ocxlpmem->nd_set.cookie2 = fn_config->serial + 1;
> > +
> > +	target_node = of_node_to_nid(ocxlpmem->pdev->dev.of_node);
> > +
> > +	memset(&region_desc, 0, sizeof(region_desc));
> > +	region_desc.res = &ocxlpmem->pmem_res;
> > +	region_desc.numa_node = NUMA_NO_NODE;
> > +	region_desc.target_node = target_node;
> > +	region_desc.num_mappings = 1;
> > +	region_desc.mapping = &nd_mapping_desc;
> > +	region_desc.nd_set = &ocxlpmem->nd_set;
> > +
> > +	set_bit(ND_REGION_PAGEMAP, &region_desc.flags);
> > +	/*
> > +	 * NB: libnvdimm copies the data from ndr_desc into it's own
> > +	 * structures so passing a stack pointer is fine.
> > +	 */
> > +	ocxlpmem->nd_region = nvdimm_pmem_region_create(ocxlpmem-
> > >nvdimm_bus,
> > +							&region_desc);
> > +	if (!ocxlpmem->nd_region)
> > +		return -EINVAL;
> > +
> > +	dev_info(&ocxlpmem->dev,
> > +		 "Onlining %lluMB of persistent memory\n",
> > +		 nd_mapping_desc.size / SZ_1M);
> > +
> > +	return 0;
> > +}
> > +
> > +/**
> > + * allocate_minor() - Allocate a minor number to use for an
> > OpenCAPI pmem device
> > + * @ocxlpmem: the device metadata
> > + * Return: the allocated minor number
> > + */
> > +static int allocate_minor(struct ocxlpmem *ocxlpmem)
> > +{
> > +	int minor;
> > +
> > +	mutex_lock(&minors_idr_lock);
> > +	minor = idr_alloc(&minors_idr, ocxlpmem, 0, NUM_MINORS,
> > GFP_KERNEL);
> > +	mutex_unlock(&minors_idr_lock);
> > +	return minor;
> > +}
> > +
> > +static void free_minor(struct ocxlpmem *ocxlpmem)
> > +{
> > +	mutex_lock(&minors_idr_lock);
> > +	idr_remove(&minors_idr, MINOR(ocxlpmem->dev.devt));
> > +	mutex_unlock(&minors_idr_lock);
> > +}
> > +
> > +/**
> > + * free_ocxlpmem() - Free all members of an ocxlpmem struct
> > + * @ocxlpmem: the device struct to clear
> > + */
> > +static void free_ocxlpmem(struct ocxlpmem *ocxlpmem)
> > +{
> > +	int rc;
> > +
> > +	if (ocxlpmem->nvdimm_bus)
> > +		nvdimm_bus_unregister(ocxlpmem->nvdimm_bus);
> > +
> > +	free_minor(ocxlpmem);
> > +
> > +	if (ocxlpmem->metadata_addr)
> > +		devm_memunmap(&ocxlpmem->dev, ocxlpmem->metadata_addr);
> > +
> > +	if (ocxlpmem->ocxl_context) {
> > +		rc = ocxl_context_detach(ocxlpmem->ocxl_context);
> > +		if (rc == -EBUSY)
> > +			dev_warn(&ocxlpmem->dev, "Timeout detaching
> > ocxl context\n");
> > +		else
> > +			ocxl_context_free(ocxlpmem->ocxl_context);
> > +
> > +	}
> > +
> > +	if (ocxlpmem->ocxl_afu)
> > +		ocxl_afu_put(ocxlpmem->ocxl_afu);
> > +
> > +	if (ocxlpmem->ocxl_fn)
> > +		ocxl_function_close(ocxlpmem->ocxl_fn);
> > +
> > +	kfree(ocxlpmem);
> > +}
> > +
> > +/**
> > + * free_ocxlpmem_dev() - Free an OpenCAPI persistent memory device
> > + * @dev: The device struct
> > + */
> > +static void free_ocxlpmem_dev(struct device *dev)
> > +{
> > +	struct ocxlpmem *ocxlpmem = container_of(dev, struct ocxlpmem,
> > dev);
> > +
> > +	free_ocxlpmem(ocxlpmem);
> > +}
> > +
> > +/**
> > + * ocxlpmem_register() - Register an OpenCAPI pmem device with the
> > kernel
> > + * @ocxlpmem: the device metadata
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int ocxlpmem_register(struct ocxlpmem *ocxlpmem)
> > +{
> > +	int rc;
> > +	int minor = allocate_minor(ocxlpmem);
> > +
> > +	if (minor < 0)
> > +		return minor;
> > +
> > +	ocxlpmem->dev.release = free_ocxlpmem_dev;
> > +	rc = dev_set_name(&ocxlpmem->dev, "ocxlpmem%d", minor);
> > +	if (rc < 0)
> > +		return rc;
> > +
> > +	ocxlpmem->dev.devt = MKDEV(MAJOR(ocxlpmem_dev), minor);
> > +	ocxlpmem->dev.class = ocxlpmem_class;
> 
> 
> This function, as well as allocate_minor() and free_minor() above 
> reference resources (the IDR, the file class, ...) which are not 
> initialized yet. The function file_init() is coming in a later patch.
> 

Thanks, I caught this at runtime when I booted a kernel with (just)
this patch :) Fixed in v4.

> 
> 
> > +	ocxlpmem->dev.parent = &ocxlpmem->pdev->dev;
> > +
> > +	return device_register(&ocxlpmem->dev);
> > +}
> > +
> > +/**
> > + * ocxlpmem_remove() - Free an OpenCAPI persistent memory device
> > + * @pdev: the PCI device information struct
> > + */
> > +static void ocxlpmem_remove(struct pci_dev *pdev)
> > +{
> > +	if (PCI_FUNC(pdev->devfn) == 0) {
> > +		struct ocxlpmem_function0 *func0 = pci_get_drvdata(pdev);
> > +
> > +		if (func0) {
> > +			ocxl_function_close(func0->ocxl_fn);
> > +			func0->ocxl_fn = NULL;
> > +		}
> 
> 
> The struct ocxlpmem_function0 allocated on probe() should be freed.
> 

I've dropped the struct as per the thread from Andrew Donnellan.

> 
> 
> > +	} else {
> > +		struct ocxlpmem *ocxlpmem = pci_get_drvdata(pdev);
> > +
> > +		if (ocxlpmem)
> > +			device_unregister(&ocxlpmem->dev);
> > +	}
> > +}
> > +
> > +/**
> > + * probe_function0() - Set up function 0 for an OpenCAPI
> > persistent memory device
> > + * This is important as it enables templates higher than 0 across
> > all other functions,
> > + * which in turn enables higher bandwidth accesses
> > + * @pdev: the PCI device information struct
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int probe_function0(struct pci_dev *pdev)
> > +{
> > +	struct ocxlpmem_function0 *func0 = NULL;
> > +	struct ocxl_fn *fn;
> > +
> > +	func0 = kzalloc(sizeof(*func0), GFP_KERNEL);
> > +	if (!func0)
> > +		return -ENOMEM;
> > +
> > +	func0->pdev = pdev;
> 
> 
> Storing the struct pci_dev for function 0 appears to be useless.
> 

Yup

> 
> 
> > +	fn = ocxl_function_open(pdev);
> > +	if (IS_ERR(fn)) {
> > +		kfree(func0);
> > +		dev_err(&pdev->dev, "failed to open OCXL function\n");
> > +		return PTR_ERR(fn);
> > +	}
> > +	func0->ocxl_fn = fn;
> > +
> > +	pci_set_drvdata(pdev, func0);
> > +
> > +	return 0;
> > +}
> > +
> > +/**
> > + * probe() - Init an OpenCAPI persistent memory device
> > + * @pdev: the PCI device information struct
> > + * @ent: The entry from ocxlpmem_pci_tbl
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int probe(struct pci_dev *pdev, const struct pci_device_id
> > *ent)
> > +{
> > +	struct ocxlpmem *ocxlpmem;
> > +	int rc;
> > +
> > +	if (PCI_FUNC(pdev->devfn) == 0)
> > +		return probe_function0(pdev);
> > +	else if (PCI_FUNC(pdev->devfn) != 1)
> > +		return 0;
> > +
> > +	ocxlpmem = kzalloc(sizeof(*ocxlpmem), GFP_KERNEL);
> > +	if (!ocxlpmem) {
> > +		dev_err(&pdev->dev, "Could not allocate OpenCAPI
> > persistent memory metadata\n");
> > +		rc = -ENOMEM;
> > +		goto err;
> > +	}
> > +	ocxlpmem->pdev = pdev;
> 
> We should probably call pci_dev_get() here if we store the struct
> pci_dev pointer. We could debate how useful it really is, considering
> we're registering a device, which will also take a reference, but it
> looks like the safe thing to do considering all those resources don't
> have exactly the same life cycle and it is standard practice to
> guarantee that we won't have a dangling pointer.
> 

Ok
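
A toy model of the reference-holding pattern being asked for, as a
sketch only. In the kernel the calls would be pci_dev_get() and
pci_dev_put(); here struct ref_obj, obj_get(), obj_put() and struct
holder are hypothetical stand-ins so the example runs in userspace.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a refcounted object such as struct pci_dev. */
struct ref_obj {
	int refcount;
};

/* Take a reference and hand the pointer back, like pci_dev_get(). */
struct ref_obj *obj_get(struct ref_obj *o)
{
	if (o)
		o->refcount++;
	return o;
}

/* Drop a reference; returns 1 when the last reference went away. */
int obj_put(struct ref_obj *o)
{
	assert(o != NULL && o->refcount > 0);
	return --o->refcount == 0;
}

/* Stand-in for struct ocxlpmem, which stores the pointer long term. */
struct holder {
	struct ref_obj *pdev;
};

void holder_init(struct holder *h, struct ref_obj *pdev)
{
	/* Whoever stores the pointer also owns a reference to it. */
	h->pdev = obj_get(pdev);
}

int holder_release(struct holder *h)
{
	int last = obj_put(h->pdev);

	h->pdev = NULL;
	return last;
}
```

The stored pointer stays valid for as long as the holder exists, no
matter when the other owners drop their references.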

> 
> 
> > +
> > +	pci_set_drvdata(pdev, ocxlpmem);
> > +
> > +	ocxlpmem->ocxl_fn = ocxl_function_open(pdev);
> > +	if (IS_ERR(ocxlpmem->ocxl_fn)) {
> > +		kfree(ocxlpmem);
> 
> ocxlpmem is freed...
> 
> 
> > +		pci_set_drvdata(pdev, NULL);
> > +		dev_err(&pdev->dev, "failed to open OCXL function\n");
> > +		rc = PTR_ERR(ocxlpmem->ocxl_fn);
> 
> ... and then referenced.
> 

Ok
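
The fix amounts to reading the error out of the structure before
freeing it. A minimal userspace sketch of that ordering, with
hypothetical names (struct dev_model, open_dev(), and fn_err standing
in for PTR_ERR(ocxlpmem->ocxl_fn)):

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct dev_model {
	int fn_err;	/* stand-in for the error carried by ocxl_fn */
};

/*
 * Returns 0 and stores the object in *out on success, or a negative
 * errno after releasing the object on failure.
 */
int open_dev(int simulated_err, struct dev_model **out)
{
	struct dev_model *d = calloc(1, sizeof(*d));
	int rc;

	assert(out != NULL);
	*out = NULL;
	if (!d)
		return -ENOMEM;

	d->fn_err = simulated_err;
	if (d->fn_err) {
		rc = d->fn_err;	/* read the error first ... */
		free(d);	/* ... then release the object */
		return rc;
	}

	*out = d;
	return 0;
}
```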

> 
> 
> > +		goto err;
> > +	}
> > +
> > +	ocxlpmem->ocxl_afu = ocxl_function_fetch_afu(ocxlpmem->ocxl_fn, 0);
> > +	if (ocxlpmem->ocxl_afu == NULL) {
> > +		dev_err(&pdev->dev, "Could not get OCXL AFU from function\n");
> 
> The error path here should match the above, to free struct ocxlpmem.
> 

Yup, I've factored out err_unregistered to unify the error paths.
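
For readers unfamiliar with the pattern, a unified error path usually
means goto labels that unwind in reverse order of acquisition. The
sketch below is a userspace model under stated assumptions, not the v4
code: struct res_model, probe_model(), remove_model() and the label
names are all hypothetical, and malloc() stands in for opening the
function and fetching the AFU.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct res_model {
	void *fn;	/* stand-in for ocxl_fn */
	void *afu;	/* stand-in for ocxl_afu */
};

/* fail_step: 0 = succeed, 1 = fail acquiring fn, 2 = fail acquiring afu */
int probe_model(int fail_step, struct res_model **out)
{
	struct res_model *r = calloc(1, sizeof(*r));
	int rc;

	assert(out != NULL);
	*out = NULL;
	if (!r)
		return -ENOMEM;

	r->fn = fail_step == 1 ? NULL : malloc(1);
	if (!r->fn) {
		rc = -ENODEV;
		goto err_free;
	}

	r->afu = fail_step == 2 ? NULL : malloc(1);
	if (!r->afu) {
		rc = -ENXIO;
		goto err_fn;
	}

	*out = r;
	return 0;

	/* Unwind in reverse order; each label releases one more resource. */
err_fn:
	free(r->fn);
err_free:
	free(r);
	return rc;
}

/* Mirror of the success path, for the remove side. */
void remove_model(struct res_model *r)
{
	free(r->afu);
	free(r->fn);
	free(r);
}
```

Every failure after an allocation jumps to the label that frees exactly
what has been acquired so far, so no exit path leaks or double-frees.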

> 
> > +		rc = -ENXIO;
> > +		goto err;
> > +	}
> > +
> > +	ocxl_afu_get(ocxlpmem->ocxl_afu);
> > +
> > +	// Resources allocated below here are cleaned up in the release
> > handler
> > +
> > +	rc = ocxlpmem_register(ocxlpmem);
> > +	if (rc) {
> > +		dev_err(&pdev->dev, "Could not register OpenCAPI
> > persistent memory device with the kernel\n");
> > +		goto err;
> > +	}
> > +
> > +	rc = ocxl_context_alloc(&ocxlpmem->ocxl_context, ocxlpmem-
> > >ocxl_afu, NULL);
> > +	if (rc) {
> > +		dev_err(&pdev->dev, "Could not allocate OCXL
> > context\n");
> > +		goto err;
> > +	}
> > +
> > +	rc = ocxl_context_attach(ocxlpmem->ocxl_context, 0, NULL);
> > +	if (rc) {
> > +		dev_err(&pdev->dev, "Could not attach ocxl context\n");
> > +		goto err;
> > +	}
> > +
> > +	rc = register_lpc_mem(ocxlpmem);
> > +	if (rc) {
> > +		dev_err(&pdev->dev, "Could not register OpenCAPI
> > persistent memory with libnvdimm\n");
> > +		goto err;
> > +	}
> > +
> > +	return 0;
> > +
> > +err:
> > +	/*
> > +	 * Further cleanup is done in the release handler via
> > free_ocxlpmem()
> > +	 * This allows us to keep the character device live to handle
> > IOCTLs to
> > +	 * investigate issues if the card has an error
> > +	 */
> 
> If we fail probe, we don't call device_unregister() and the data
> structures will never be freed. The comment seems to indicate it's done
> on purpose but that looks surprising and wrong. If we fail probe, the
> kernel thinks the driver is _not_ handling the device, so we need to
> exit probe() cleanly. We're not supposed to be able to make some debug
> ioctl calls. Once we fail probe, the kernel is free to do whatever it
> wants with the pci device. If you manage to extract some debug info
> during development, then fine, but it's not something we can rely on and
> upstream.
> If the card enters an error state after probe(), then we don't need that
> anyway. We have all the time in the world to call ioctl's, as long as we
> don't call the remove callback of the driver.
> 
> 

Ok

> 
> 
> > +
> > +	dev_err(&pdev->dev,
> > +		"Error detected, will not register OpenCAPI persistent
> > memory\n");
> > +	return rc;
> > +}
> > +
> > +static struct pci_driver pci_driver = {
> > +	.name = "ocxl-pmem",
> > +	.id_table = ocxlpmem_pci_tbl,
> > +	.probe = probe,
> > +	.remove = ocxlpmem_remove,
> > +	.shutdown = ocxlpmem_remove,
> 
> 
> nitpick: why doesn't the probe callback follow the same naming 
> convention? It's all static and doesn't really matter, but...

I had dropped the prefix when I renamed from scm as it doesn't really
add value, but clearly, I missed some :) I'll fix it.

> 
>    Fred
> 
> 
> 
> > +};
> > +
> > +static int __init ocxlpmem_init(void)
> > +{
> > +	int rc = 0;
> > +
> > +	rc = pci_register_driver(&pci_driver);
> > +	if (rc)
> > +		return rc;
> > +
> > +	return 0;
> > +}
> > +
> > +static void ocxlpmem_exit(void)
> > +{
> > +	pci_unregister_driver(&pci_driver);
> > +}
> > +
> > +module_init(ocxlpmem_init);
> > +module_exit(ocxlpmem_exit);
> > +
> > +MODULE_DESCRIPTION("OpenCAPI Persistent Memory");
> > +MODULE_LICENSE("GPL");
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > new file mode 100644
> > index 000000000000..0faf3740e9b8
> > --- /dev/null
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > @@ -0,0 +1,28 @@
> > +// SPDX-License-Identifier: GPL-2.0+
> > +// Copyright 2019 IBM Corp.
> > +
> > +#include <linux/pci.h>
> > +#include <misc/ocxl.h>
> > +#include <linux/libnvdimm.h>
> > +#include <linux/mm.h>
> > +
> > +#define LABEL_AREA_SIZE	(1UL << PA_SECTION_SHIFT)
> > +
> > +struct ocxlpmem_function0 {
> > +	struct pci_dev *pdev;
> > +	struct ocxl_fn *ocxl_fn;
> > +};
> > +
> > +struct ocxlpmem {
> > +	struct device dev;
> > +	struct pci_dev *pdev;
> > +	struct ocxl_fn *ocxl_fn;
> > +	struct nd_interleave_set nd_set;
> > +	struct nvdimm_bus_descriptor bus_desc;
> > +	struct nvdimm_bus *nvdimm_bus;
> > +	struct ocxl_afu *ocxl_afu;
> > +	struct ocxl_context *ocxl_context;
> > +	void *metadata_addr;
> > +	struct resource pmem_res;
> > +	struct nd_region *nd_region;
> > +};
> > 
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 21/27] powerpc/powernv/pmem: Add an IOCTL to request controller health & perf data
  2020-02-21  3:27 ` [PATCH v3 21/27] powerpc/powernv/pmem: Add an IOCTL to request controller health & perf data Alastair D'Silva
@ 2020-02-28  6:12   ` Andrew Donnellan
  2020-03-02  5:40     ` Alastair D'Silva
  2020-03-04 11:06     ` Frederic Barrat
  0 siblings, 2 replies; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-28  6:12 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> When health & performance data is requested from the controller,
> it responds with an error log containing the requested information.
> 
> This patch allows the request to be issued via an IOCTL.

A better explanation would be good - this IOCTL triggers a request to 
the controller to collect controller health/perf data, and the 
controller will later respond with an error log that can be picked up 
via the error log IOCTL that you've defined earlier.


-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

* Re: [PATCH v3 22/27] powerpc/powernv/pmem: Implement the heartbeat command
  2020-02-21  3:27 ` [PATCH v3 22/27] powerpc/powernv/pmem: Implement the heartbeat command Alastair D'Silva
@ 2020-02-28  6:20   ` Andrew Donnellan
  2020-03-04 14:25   ` Frederic Barrat
  1 sibling, 0 replies; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-28  6:20 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> The heartbeat admin command is a simple admin command that exercises
> the communication mechanisms within the controller.
> 
> This patch issues a heartbeat command to the card during init to ensure
> we can communicate with the card's controller.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>

Looks okay.

Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

* Re: [PATCH v3 25/27] powerpc/powernv/pmem: Expose the serial number in sysfs
  2020-02-21  3:27 ` [PATCH v3 25/27] powerpc/powernv/pmem: Expose the serial number in sysfs Alastair D'Silva
@ 2020-02-28  6:25   ` Andrew Donnellan
  2020-02-28  7:15     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-02-28  6:25 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> +int ocxlpmem_sysfs_add(struct ocxlpmem *ocxlpmem)
> +{
> +	int i, rc;
> +
> +	for (i = 0; i < ARRAY_SIZE(attrs); i++) {
> +		rc = device_create_file(&ocxlpmem->dev, &attrs[i]);
> +		if (rc) {
> +			for (; --i >= 0;)
> +				device_remove_file(&ocxlpmem->dev, &attrs[i]);

I'd rather avoid weird for loop constructs if possible.

Is it actually dangerous to call device_remove_file() on an attr that 
hasn't been added? If not then I'd rather define an err: label and loop 
over the whole array there.
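
Something like the sketch below is the shape I have in mind. It is a
userspace model, not kernel code: create_file() and remove_file() are
hypothetical stand-ins for device_create_file() and device_remove_file(),
and fail_at is an invented knob to force a failure. Since I don't know
whether removing a never-added attr is safe, this version only walks back
over the entries that were actually created.

```c
#include <assert.h>
#include <errno.h>

#define NUM_ATTRS 4

static int created[NUM_ATTRS];	/* 1 while attribute i exists */
static int fail_at = -1;	/* index whose creation fails, -1 for none */

/* Hypothetical stand-in for device_create_file(). */
static int create_file(int i)
{
	if (i == fail_at)
		return -ENOMEM;
	created[i] = 1;
	return 0;
}

/* Hypothetical stand-in for device_remove_file(). */
static void remove_file(int i)
{
	assert(created[i]);	/* only ever remove what was created */
	created[i] = 0;
}

/* Create all attributes or none: on failure, undo in reverse order. */
static int sysfs_add(void)
{
	int i, rc;

	for (i = 0; i < NUM_ATTRS; i++) {
		rc = create_file(i);
		if (rc)
			goto err;
	}
	return 0;

err:
	/* i is the index that failed, so entries 0..i-1 exist */
	while (i--)
		remove_file(i);
	return rc;
}

/* Number of attributes that currently exist. */
static int count_created(void)
{
	int i, n = 0;

	for (i = 0; i < NUM_ATTRS; i++)
		n += created[i];
	return n;
}
```

The while (i--) loop does the same job as the for (; --i >= 0;) in the
patch, but it sits under a single err: label and reads as plain cleanup.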

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

* Re: [PATCH v3 25/27] powerpc/powernv/pmem: Expose the serial number in sysfs
  2020-02-28  6:25   ` Andrew Donnellan
@ 2020-02-28  7:15     ` Greg Kroah-Hartman
  2020-03-01 23:42       ` Alastair D'Silva
  0 siblings, 1 reply; 130+ messages in thread
From: Greg Kroah-Hartman @ 2020-02-28  7:15 UTC (permalink / raw)
  To: Andrew Donnellan
  Cc: Alastair D'Silva, alastair, Aneesh Kumar K . V,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Frederic Barrat, Arnd Bergmann, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy, linux-kernel

On Fri, Feb 28, 2020 at 05:25:31PM +1100, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > +int ocxlpmem_sysfs_add(struct ocxlpmem *ocxlpmem)
> > +{
> > +	int i, rc;
> > +
> > +	for (i = 0; i < ARRAY_SIZE(attrs); i++) {
> > +		rc = device_create_file(&ocxlpmem->dev, &attrs[i]);
> > +		if (rc) {
> > +			for (; --i >= 0;)
> > +				device_remove_file(&ocxlpmem->dev, &attrs[i]);
> 
> I'd rather avoid weird for loop constructs if possible.
> 
> Is it actually dangerous to call device_remove_file() on an attr that hasn't
> been added? If not then I'd rather define an err: label and loop over the
> whole array there.

None of this should be used at all, just use attribute groups properly
and the driver core will handle this all for you.

device_create/remove_file should never be called by anyone anymore if at all
possible.
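
To make the division of labour concrete, here is a small userspace model
of the idea: the driver only declares a NULL-terminated table, and the
core creates and removes the files. It is a sketch, not the kernel API:
struct model_device, model_device_register() and
model_device_unregister() are invented for illustration. In the kernel
the declaration is built with DEVICE_ATTR_RO() and ATTRIBUTE_GROUPS(),
and the resulting array is assigned to the groups pointer of struct
device before device_register().

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FILES 8

/* Model of a device: the attribute files that currently exist for it. */
struct model_device {
	const char *files[MAX_FILES];
	size_t nfiles;
	const char *const *attrs;	/* NULL-terminated, set by the driver */
};

/* "Core" side: create every declared attribute, rolling back on failure. */
static int model_device_register(struct model_device *dev)
{
	const char *const *a;

	assert(dev->nfiles == 0);	/* a fresh device has no files yet */

	for (a = dev->attrs; a && *a; a++) {
		if (dev->nfiles == MAX_FILES) {
			dev->nfiles = 0;	/* drop everything created so far */
			return -1;
		}
		dev->files[dev->nfiles++] = *a;
	}
	return 0;
}

/* "Core" side: remove everything that registration created. */
static void model_device_unregister(struct model_device *dev)
{
	dev->nfiles = 0;
}

/* "Driver" side: a declaration only, with no create or remove calls. */
static const char *const ocxlpmem_attrs[] = { "serial", NULL };
```

The driver never loops over its attributes and has no error path of its
own, which is why the unwinding question above goes away.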

thanks,

greg k-h

* Re: [PATCH v3 10/27] powerpc: Add driver for OpenCAPI Persistent Memory
  2020-02-21  3:27 ` [PATCH v3 10/27] powerpc: Add driver for OpenCAPI Persistent Memory Alastair D'Silva
  2020-02-26  5:07   ` Andrew Donnellan
  2020-02-27 20:44   ` Frederic Barrat
@ 2020-02-28 18:32   ` Frederic Barrat
  2 siblings, 0 replies; 130+ messages in thread
From: Frederic Barrat @ 2020-02-28 18:32 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel



Le 21/02/2020 à 04:27, Alastair D'Silva a écrit :
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> This driver exposes LPC memory on OpenCAPI pmem cards
> as an NVDIMM, allowing the existing nvdimm infrastructure
> to be used.
> 
> Namespace metadata is stored on the media itself, so
> reserve_metadata() maps 1 section's worth of PMEM storage
> at the start to hold this. The rest of the PMEM range is registered
> with libnvdimm as an nvdimm. ndctl_config_read/write/size() provide
> callbacks to libnvdimm to access the metadata.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/platforms/powernv/Kconfig        |   3 +
>   arch/powerpc/platforms/powernv/Makefile       |   1 +
>   arch/powerpc/platforms/powernv/pmem/Kconfig   |  15 +
>   arch/powerpc/platforms/powernv/pmem/Makefile  |   7 +
>   arch/powerpc/platforms/powernv/pmem/ocxl.c    | 473 ++++++++++++++++++
>   .../platforms/powernv/pmem/ocxl_internal.h    |  28 ++
>   6 files changed, 527 insertions(+)
>   create mode 100644 arch/powerpc/platforms/powernv/pmem/Kconfig
>   create mode 100644 arch/powerpc/platforms/powernv/pmem/Makefile
>   create mode 100644 arch/powerpc/platforms/powernv/pmem/ocxl.c
>   create mode 100644 arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> 
> diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
> index 938803eab0ad..fc8976af0e52 100644
> --- a/arch/powerpc/platforms/powernv/Kconfig
> +++ b/arch/powerpc/platforms/powernv/Kconfig
> @@ -50,3 +50,6 @@ config PPC_VAS
>   config SCOM_DEBUGFS
>   	bool "Expose SCOM controllers via debugfs"
>   	depends on DEBUG_FS
> +
> +source "arch/powerpc/platforms/powernv/pmem/Kconfig"
> +
> diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile
> index c0f8120045c3..0bbd72988b6f 100644
> --- a/arch/powerpc/platforms/powernv/Makefile
> +++ b/arch/powerpc/platforms/powernv/Makefile
> @@ -21,3 +21,4 @@ obj-$(CONFIG_PPC_VAS)	+= vas.o vas-window.o vas-debug.o
>   obj-$(CONFIG_OCXL_BASE)	+= ocxl.o
>   obj-$(CONFIG_SCOM_DEBUGFS) += opal-xscom.o
>   obj-$(CONFIG_PPC_SECURE_BOOT) += opal-secvar.o
> +obj-$(CONFIG_LIBNVDIMM) += pmem/
> diff --git a/arch/powerpc/platforms/powernv/pmem/Kconfig b/arch/powerpc/platforms/powernv/pmem/Kconfig
> new file mode 100644
> index 000000000000..c5d927520920
> --- /dev/null
> +++ b/arch/powerpc/platforms/powernv/pmem/Kconfig
> @@ -0,0 +1,15 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +if LIBNVDIMM
> +
> +config OCXL_PMEM
> +	tristate "OpenCAPI Persistent Memory"
> +	depends on LIBNVDIMM && PPC_POWERNV && PCI && EEH && ZONE_DEVICE && OCXL
> +	help
> +	  Exposes devices that implement the OpenCAPI Storage Class Memory
> +	  specification as persistent memory regions. You may also want
> +	  DEV_DAX, DEV_DAX_PMEM & FS_DAX if you plan on using DAX devices
> +	  stacked on top of this driver.
> +
> +	  Select N if unsure.
> +
> +endif
> diff --git a/arch/powerpc/platforms/powernv/pmem/Makefile b/arch/powerpc/platforms/powernv/pmem/Makefile
> new file mode 100644
> index 000000000000..1c55c4193175
> --- /dev/null
> +++ b/arch/powerpc/platforms/powernv/pmem/Makefile
> @@ -0,0 +1,7 @@
> +# SPDX-License-Identifier: GPL-2.0
> +
> +ccflags-$(CONFIG_PPC_WERROR)	+= -Werror
> +
> +obj-$(CONFIG_OCXL_PMEM) += ocxlpmem.o
> +
> +ocxlpmem-y := ocxl.o
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> new file mode 100644
> index 000000000000..3c4eeb5dcc0f
> --- /dev/null
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -0,0 +1,473 @@
> +// SPDX-License-Id
> +// Copyright 2019 IBM Corp.
> +
> +/*
> + * A driver for OpenCAPI devices that implement the Storage Class
> + * Memory specification.
> + */
> +
> +#include <linux/module.h>
> +#include <misc/ocxl.h>
> +#include <linux/ndctl.h>
> +#include <linux/mm_types.h>
> +#include <linux/memory_hotplug.h>
> +#include "ocxl_internal.h"
> +
> +
> +static const struct pci_device_id ocxlpmem_pci_tbl[] = {
> +	{ PCI_DEVICE(PCI_VENDOR_ID_IBM, 0x0625), },
> +	{ }
> +};
> +
> +MODULE_DEVICE_TABLE(pci, ocxlpmem_pci_tbl);
> +
> +#define NUM_MINORS 256 // Total to reserve
> +
> +static dev_t ocxlpmem_dev;
> +static struct class *ocxlpmem_class;
> +static struct mutex minors_idr_lock;
> +static struct idr minors_idr;
> +
> +/**
> + * ndctl_config_write() - Handle a ND_CMD_SET_CONFIG_DATA command from ndctl
> + * @ocxlpmem: the device metadata
> + * @command: the incoming data to write
> + * Return: 0 on success, negative on failure
> + */
> +static int ndctl_config_write(struct ocxlpmem *ocxlpmem,
> +			      struct nd_cmd_set_config_hdr *command)
> +{
> +	if (command->in_offset + command->in_length > LABEL_AREA_SIZE)
> +		return -EINVAL;
> +
> +	memcpy_flushcache(ocxlpmem->metadata_addr + command->in_offset, command->in_buf,
> +			  command->in_length);
> +
> +	return 0;
> +}
> +
> +/**
> + * ndctl_config_read() - Handle a ND_CMD_GET_CONFIG_DATA command from ndctl
> + * @ocxlpmem: the device metadata
> + * @command: the read request
> + * Return: 0 on success, negative on failure
> + */
> +static int ndctl_config_read(struct ocxlpmem *ocxlpmem,
> +			     struct nd_cmd_get_config_data_hdr *command)
> +{
> +	if (command->in_offset + command->in_length > LABEL_AREA_SIZE)
> +		return -EINVAL;
> +
> +	memcpy_mcsafe(command->out_buf, ocxlpmem->metadata_addr + command->in_offset,
> +		      command->in_length);
> +
> +	return 0;
> +}
> +
> +/**
> + * ndctl_config_size() - Handle a ND_CMD_GET_CONFIG_SIZE command from ndctl
> + * @command: the read request
> + * Return: 0 on success, negative on failure
> + */
> +static int ndctl_config_size(struct nd_cmd_get_config_size *command)
> +{
> +	command->status = 0;
> +	command->config_size = LABEL_AREA_SIZE;
> +	command->max_xfer = PAGE_SIZE;
> +
> +	return 0;
> +}
> +
> +static int ndctl(struct nvdimm_bus_descriptor *nd_desc,
> +		 struct nvdimm *nvdimm,
> +		 unsigned int cmd, void *buf, unsigned int buf_len, int *cmd_rc)
> +{
> +	struct ocxlpmem *ocxlpmem = container_of(nd_desc, struct ocxlpmem, bus_desc);
> +
> +	switch (cmd) {
> +	case ND_CMD_GET_CONFIG_SIZE:
> +		*cmd_rc = ndctl_config_size(buf);
> +		return 0;
> +
> +	case ND_CMD_GET_CONFIG_DATA:
> +		*cmd_rc = ndctl_config_read(ocxlpmem, buf);
> +		return 0;
> +
> +	case ND_CMD_SET_CONFIG_DATA:
> +		*cmd_rc = ndctl_config_write(ocxlpmem, buf);
> +		return 0;
> +
> +	default:
> +		return -ENOTTY;
> +	}
> +}
> +
> +/**
> + * reserve_metadata() - Reserve space for nvdimm metadata
> + * @ocxlpmem: the device metadata
> + * @lpc_mem: The resource representing the LPC memory of the OpenCAPI device
> + */
> +static int reserve_metadata(struct ocxlpmem *ocxlpmem,
> +			    struct resource *lpc_mem)
> +{
> +	ocxlpmem->metadata_addr = devm_memremap(&ocxlpmem->dev, lpc_mem->start,
> +						LABEL_AREA_SIZE, MEMREMAP_WB);
> +	if (IS_ERR(ocxlpmem->metadata_addr))
> +		return PTR_ERR(ocxlpmem->metadata_addr);
> +
> +	return 0;
> +}
> +
> +/**
> + * register_lpc_mem() - Discover persistent memory on a device and register it with the NVDIMM subsystem
> + * @ocxlpmem: the device metadata
> + * Return: 0 on success
> + */
> +static int register_lpc_mem(struct ocxlpmem *ocxlpmem)
> +{
> +	struct nd_region_desc region_desc;
> +	struct nd_mapping_desc nd_mapping_desc;
> +	struct resource *lpc_mem;
> +	const struct ocxl_afu_config *config;
> +	const struct ocxl_fn_config *fn_config;
> +	int rc;
> +	unsigned long nvdimm_cmd_mask = 0;
> +	unsigned long nvdimm_flags = 0;
> +	int target_node;
> +	char serial[16+1];
> +
> +	// Set up the reserved metadata area
> +	rc = ocxl_afu_map_lpc_mem(ocxlpmem->ocxl_afu);
> +	if (rc < 0)
> +		return rc;
> +
> +	lpc_mem = ocxl_afu_lpc_mem(ocxlpmem->ocxl_afu);
> +	if (lpc_mem == NULL || lpc_mem->start == 0)
> +		return -EINVAL;
> +
> +	config = ocxl_afu_config(ocxlpmem->ocxl_afu);
> +	fn_config = ocxl_function_config(ocxlpmem->ocxl_fn);
> +
> +	rc = reserve_metadata(ocxlpmem, lpc_mem);
> +	if (rc)
> +		return rc;
> +
> +	ocxlpmem->bus_desc.provider_name = "ocxl-pmem";
> +	ocxlpmem->bus_desc.ndctl = ndctl;
> +	ocxlpmem->bus_desc.module = THIS_MODULE;
> +
> +	ocxlpmem->nvdimm_bus = nvdimm_bus_register(&ocxlpmem->dev,
> +						   &ocxlpmem->bus_desc);
> +	if (!ocxlpmem->nvdimm_bus)
> +		return -EINVAL;
> +
> +	ocxlpmem->pmem_res.start = (u64)lpc_mem->start + LABEL_AREA_SIZE;
> +	ocxlpmem->pmem_res.end = (u64)lpc_mem->start + config->lpc_mem_size - 1;
> +	ocxlpmem->pmem_res.name = "OpenCAPI persistent memory";
> +
> +	set_bit(ND_CMD_GET_CONFIG_SIZE, &nvdimm_cmd_mask);
> +	set_bit(ND_CMD_GET_CONFIG_DATA, &nvdimm_cmd_mask);
> +	set_bit(ND_CMD_SET_CONFIG_DATA, &nvdimm_cmd_mask);
> +
> +	set_bit(NDD_ALIASING, &nvdimm_flags);
> +
> +	snprintf(serial, sizeof(serial), "%llx", fn_config->serial);
> +	nd_mapping_desc.nvdimm = nvdimm_create(ocxlpmem->nvdimm_bus, ocxlpmem,
> +				 NULL, nvdimm_flags, nvdimm_cmd_mask,
> +				 0, NULL);
> +	if (!nd_mapping_desc.nvdimm)
> +		return -ENOMEM;
> +
> +	if (nvdimm_bus_check_dimm_count(ocxlpmem->nvdimm_bus, 1))
> +		return -EINVAL;
> +
> +	nd_mapping_desc.start = ocxlpmem->pmem_res.start;
> +	nd_mapping_desc.size = resource_size(&ocxlpmem->pmem_res);
> +	nd_mapping_desc.position = 0;
> +
> +	ocxlpmem->nd_set.cookie1 = fn_config->serial + 1; // allow for empty serial
> +	ocxlpmem->nd_set.cookie2 = fn_config->serial + 1;
> +
> +	target_node = of_node_to_nid(ocxlpmem->pdev->dev.of_node);
> +
> +	memset(&region_desc, 0, sizeof(region_desc));
> +	region_desc.res = &ocxlpmem->pmem_res;
> +	region_desc.numa_node = NUMA_NO_NODE;
> +	region_desc.target_node = target_node;
> +	region_desc.num_mappings = 1;
> +	region_desc.mapping = &nd_mapping_desc;
> +	region_desc.nd_set = &ocxlpmem->nd_set;
> +
> +	set_bit(ND_REGION_PAGEMAP, &region_desc.flags);
> +	/*
> +	 * NB: libnvdimm copies the data from region_desc into its own
> +	 * structures so passing a stack pointer is fine.
> +	 */
> +	ocxlpmem->nd_region = nvdimm_pmem_region_create(ocxlpmem->nvdimm_bus,
> +							&region_desc);
> +	if (!ocxlpmem->nd_region)
> +		return -EINVAL;
> +
> +	dev_info(&ocxlpmem->dev,
> +		 "Onlining %lluMB of persistent memory\n",
> +		 nd_mapping_desc.size / SZ_1M);
> +
> +	return 0;
> +}



There seem to be a lot of nvdimm-related operations done here, and the
undo part in free_ocxlpmem() is a lot shorter. Are we okay? Does the
driver support being unloaded and reloaded, and therefore reinitializing
the same resources again?

   Fred




> +
> +/**
> + * allocate_minor() - Allocate a minor number to use for an OpenCAPI pmem device
> + * @ocxlpmem: the device metadata
> + * Return: the allocated minor number
> + */
> +static int allocate_minor(struct ocxlpmem *ocxlpmem)
> +{
> +	int minor;
> +
> +	mutex_lock(&minors_idr_lock);
> +	minor = idr_alloc(&minors_idr, ocxlpmem, 0, NUM_MINORS, GFP_KERNEL);
> +	mutex_unlock(&minors_idr_lock);
> +	return minor;
> +}
> +
> +static void free_minor(struct ocxlpmem *ocxlpmem)
> +{
> +	mutex_lock(&minors_idr_lock);
> +	idr_remove(&minors_idr, MINOR(ocxlpmem->dev.devt));
> +	mutex_unlock(&minors_idr_lock);
> +}
> +
> +/**
> + * free_ocxlpmem() - Free all members of an ocxlpmem struct
> + * @ocxlpmem: the device struct to clear
> + */
> +static void free_ocxlpmem(struct ocxlpmem *ocxlpmem)
> +{
> +	int rc;
> +
> +	if (ocxlpmem->nvdimm_bus)
> +		nvdimm_bus_unregister(ocxlpmem->nvdimm_bus);
> +
> +	free_minor(ocxlpmem);
> +
> +	if (ocxlpmem->metadata_addr)
> +		devm_memunmap(&ocxlpmem->dev, ocxlpmem->metadata_addr);
> +
> +	if (ocxlpmem->ocxl_context) {
> +		rc = ocxl_context_detach(ocxlpmem->ocxl_context);
> +		if (rc == -EBUSY)
> +			dev_warn(&ocxlpmem->dev, "Timeout detaching ocxl context\n");
> +		else
> +			ocxl_context_free(ocxlpmem->ocxl_context);
> +
> +	}
> +
> +	if (ocxlpmem->ocxl_afu)
> +		ocxl_afu_put(ocxlpmem->ocxl_afu);
> +
> +	if (ocxlpmem->ocxl_fn)
> +		ocxl_function_close(ocxlpmem->ocxl_fn);
> +
> +	kfree(ocxlpmem);
> +}
> +
> +/**
> + * free_ocxlpmem_dev() - Free an OpenCAPI persistent memory device
> + * @dev: The device struct
> + */
> +static void free_ocxlpmem_dev(struct device *dev)
> +{
> +	struct ocxlpmem *ocxlpmem = container_of(dev, struct ocxlpmem, dev);
> +
> +	free_ocxlpmem(ocxlpmem);
> +}
> +
> +/**
> + * ocxlpmem_register() - Register an OpenCAPI pmem device with the kernel
> + * @ocxlpmem: the device metadata
> + * Return: 0 on success, negative on failure
> + */
> +static int ocxlpmem_register(struct ocxlpmem *ocxlpmem)
> +{
> +	int rc;
> +	int minor = allocate_minor(ocxlpmem);
> +
> +	if (minor < 0)
> +		return minor;
> +
> +	ocxlpmem->dev.release = free_ocxlpmem_dev;
> +	rc = dev_set_name(&ocxlpmem->dev, "ocxlpmem%d", minor);
> +	if (rc < 0)
> +		return rc;
> +
> +	ocxlpmem->dev.devt = MKDEV(MAJOR(ocxlpmem_dev), minor);
> +	ocxlpmem->dev.class = ocxlpmem_class;
> +	ocxlpmem->dev.parent = &ocxlpmem->pdev->dev;
> +
> +	return device_register(&ocxlpmem->dev);
> +}
> +
> +/**
> + * ocxlpmem_remove() - Free an OpenCAPI persistent memory device
> + * @pdev: the PCI device information struct
> + */
> +static void ocxlpmem_remove(struct pci_dev *pdev)
> +{
> +	if (PCI_FUNC(pdev->devfn) == 0) {
> +		struct ocxlpmem_function0 *func0 = pci_get_drvdata(pdev);
> +
> +		if (func0) {
> +			ocxl_function_close(func0->ocxl_fn);
> +			func0->ocxl_fn = NULL;
> +		}
> +	} else {
> +		struct ocxlpmem *ocxlpmem = pci_get_drvdata(pdev);
> +
> +		if (ocxlpmem)
> +			device_unregister(&ocxlpmem->dev);
> +	}
> +}
> +
> +/**
> + * probe_function0() - Set up function 0 for an OpenCAPI persistent memory device
> + * This is important as it enables templates higher than 0 across all other functions,
> + * which in turn enables higher bandwidth accesses
> + * @pdev: the PCI device information struct
> + * Return: 0 on success, negative on failure
> + */
> +static int probe_function0(struct pci_dev *pdev)
> +{
> +	struct ocxlpmem_function0 *func0 = NULL;
> +	struct ocxl_fn *fn;
> +
> +	func0 = kzalloc(sizeof(*func0), GFP_KERNEL);
> +	if (!func0)
> +		return -ENOMEM;
> +
> +	func0->pdev = pdev;
> +	fn = ocxl_function_open(pdev);
> +	if (IS_ERR(fn)) {
> +		kfree(func0);
> +		dev_err(&pdev->dev, "failed to open OCXL function\n");
> +		return PTR_ERR(fn);
> +	}
> +	func0->ocxl_fn = fn;
> +
> +	pci_set_drvdata(pdev, func0);
> +
> +	return 0;
> +}
> +
> +/**
> + * probe() - Init an OpenCAPI persistent memory device
> + * @pdev: the PCI device information struct
> + * @ent: The entry from ocxlpmem_pci_tbl
> + * Return: 0 on success, negative on failure
> + */
> +static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> +{
> +	struct ocxlpmem *ocxlpmem;
> +	int rc;
> +
> +	if (PCI_FUNC(pdev->devfn) == 0)
> +		return probe_function0(pdev);
> +	else if (PCI_FUNC(pdev->devfn) != 1)
> +		return 0;
> +
> +	ocxlpmem = kzalloc(sizeof(*ocxlpmem), GFP_KERNEL);
> +	if (!ocxlpmem) {
> +		dev_err(&pdev->dev, "Could not allocate OpenCAPI persistent memory metadata\n");
> +		rc = -ENOMEM;
> +		goto err;
> +	}
> +	ocxlpmem->pdev = pdev;
> +
> +	pci_set_drvdata(pdev, ocxlpmem);
> +
> +	ocxlpmem->ocxl_fn = ocxl_function_open(pdev);
> +	if (IS_ERR(ocxlpmem->ocxl_fn)) {
> +		rc = PTR_ERR(ocxlpmem->ocxl_fn);
> +		kfree(ocxlpmem);
> +		pci_set_drvdata(pdev, NULL);
> +		dev_err(&pdev->dev, "failed to open OCXL function\n");
> +		goto err;
> +	}
> +
> +	ocxlpmem->ocxl_afu = ocxl_function_fetch_afu(ocxlpmem->ocxl_fn, 0);
> +	if (ocxlpmem->ocxl_afu == NULL) {
> +		dev_err(&pdev->dev, "Could not get OCXL AFU from function\n");
> +		rc = -ENXIO;
> +		goto err;
> +	}
> +
> +	ocxl_afu_get(ocxlpmem->ocxl_afu);
> +
> +	// Resources allocated below here are cleaned up in the release handler
> +
> +	rc = ocxlpmem_register(ocxlpmem);
> +	if (rc) {
> +		dev_err(&pdev->dev, "Could not register OpenCAPI persistent memory device with the kernel\n");
> +		goto err;
> +	}
> +
> +	rc = ocxl_context_alloc(&ocxlpmem->ocxl_context, ocxlpmem->ocxl_afu, NULL);
> +	if (rc) {
> +		dev_err(&pdev->dev, "Could not allocate OCXL context\n");
> +		goto err;
> +	}
> +
> +	rc = ocxl_context_attach(ocxlpmem->ocxl_context, 0, NULL);
> +	if (rc) {
> +		dev_err(&pdev->dev, "Could not attach ocxl context\n");
> +		goto err;
> +	}
> +
> +	rc = register_lpc_mem(ocxlpmem);
> +	if (rc) {
> +		dev_err(&pdev->dev, "Could not register OpenCAPI persistent memory with libnvdimm\n");
> +		goto err;
> +	}
> +
> +	return 0;
> +
> +err:
> +	/*
> +	 * Further cleanup is done in the release handler via free_ocxlpmem()
> +	 * This allows us to keep the character device live to handle IOCTLs to
> +	 * investigate issues if the card has an error
> +	 */
> +
> +	dev_err(&pdev->dev,
> +		"Error detected, will not register OpenCAPI persistent memory\n");
> +	return rc;
> +}
> +
> +static struct pci_driver pci_driver = {
> +	.name = "ocxl-pmem",
> +	.id_table = ocxlpmem_pci_tbl,
> +	.probe = probe,
> +	.remove = ocxlpmem_remove,
> +	.shutdown = ocxlpmem_remove,
> +};
> +
> +static int __init ocxlpmem_init(void)
> +{
> +	int rc = 0;
> +
> +	rc = pci_register_driver(&pci_driver);
> +	if (rc)
> +		return rc;
> +
> +	return 0;
> +}
> +
> +static void ocxlpmem_exit(void)
> +{
> +	pci_unregister_driver(&pci_driver);
> +}
> +
> +module_init(ocxlpmem_init);
> +module_exit(ocxlpmem_exit);
> +
> +MODULE_DESCRIPTION("OpenCAPI Persistent Memory");
> +MODULE_LICENSE("GPL");
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> new file mode 100644
> index 000000000000..0faf3740e9b8
> --- /dev/null
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> @@ -0,0 +1,28 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +// Copyright 2019 IBM Corp.
> +
> +#include <linux/pci.h>
> +#include <misc/ocxl.h>
> +#include <linux/libnvdimm.h>
> +#include <linux/mm.h>
> +
> +#define LABEL_AREA_SIZE	(1UL << PA_SECTION_SHIFT)
> +
> +struct ocxlpmem_function0 {
> +	struct pci_dev *pdev;
> +	struct ocxl_fn *ocxl_fn;
> +};
> +
> +struct ocxlpmem {
> +	struct device dev;
> +	struct pci_dev *pdev;
> +	struct ocxl_fn *ocxl_fn;
> +	struct nd_interleave_set nd_set;
> +	struct nvdimm_bus_descriptor bus_desc;
> +	struct nvdimm_bus *nvdimm_bus;
> +	struct ocxl_afu *ocxl_afu;
> +	struct ocxl_context *ocxl_context;
> +	void *metadata_addr;
> +	struct resource pmem_res;
> +	struct nd_region *nd_region;
> +};
> 

* RE: [PATCH v3 25/27] powerpc/powernv/pmem: Expose the serial number in sysfs
  2020-02-28  7:15     ` Greg Kroah-Hartman
@ 2020-03-01 23:42       ` Alastair D'Silva
  2020-03-02  5:38         ` Alastair D'Silva
  0 siblings, 1 reply; 130+ messages in thread
From: Alastair D'Silva @ 2020-03-01 23:42 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy, linux-kernel,
	linuxppc-dev, linux-nvdimm

On Fri, 2020-02-28 at 08:15 +0100, Greg Kroah-Hartman wrote:
> On Fri, Feb 28, 2020 at 05:25:31PM +1100, Andrew Donnellan wrote:
> > On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > > +int ocxlpmem_sysfs_add(struct ocxlpmem *ocxlpmem)
> > > +{
> > > +	int i, rc;
> > > +
> > > +	for (i = 0; i < ARRAY_SIZE(attrs); i++) {
> > > +		rc = device_create_file(&ocxlpmem->dev, &attrs[i]);
> > > +		if (rc) {
> > > +			for (; --i >= 0;)
> > > +				device_remove_file(&ocxlpmem->dev, &attrs[i]);
> > 
> > I'd rather avoid weird for loop constructs if possible.
> > 
> > Is it actually dangerous to call device_remove_file() on an attr that hasn't
> > been added? If not then I'd rather define an err: label and loop over the
> > whole array there.
> 
> None of this should be used at all, just use attribute groups properly
> and the driver core will handle this all for you.
> 
> device_create/remove_file should never be called by anyone anymore if at all
> possible.
> 
> thanks,
> 
> greg k-h


Thanks, I'll rework it to use the .groups member of struct pci_driver.

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

* Re: [PATCH v3 16/27] powerpc/powernv/pmem: Register a character device for userspace to interact with
  2020-02-21  3:27 ` [PATCH v3 16/27] powerpc/powernv/pmem: Register a character device for userspace to interact with Alastair D'Silva
@ 2020-03-02  5:34   ` Andrew Donnellan
  2020-03-02  6:05     ` Alastair D'Silva
  2020-03-03  9:28   ` Frederic Barrat
  1 sibling, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-03-02  5:34 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> This patch introduces a character device (/dev/ocxl-scmX) which further
> patches will use to interact with userspace.

As with the comments on other patches in this series, this commit 
message is lacking in explanation. What's the purpose of this device?

> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/platforms/powernv/pmem/ocxl.c    | 116 +++++++++++++++++-
>   .../platforms/powernv/pmem/ocxl_internal.h    |   2 +
>   2 files changed, 116 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> index b8bd7e703b19..63109a870d2c 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -10,6 +10,7 @@
>   #include <misc/ocxl.h>
>   #include <linux/delay.h>
>   #include <linux/ndctl.h>
> +#include <linux/fs.h>
>   #include <linux/mm_types.h>
>   #include <linux/memory_hotplug.h>
>   #include "ocxl_internal.h"
> @@ -339,6 +340,9 @@ static void free_ocxlpmem(struct ocxlpmem *ocxlpmem)
>   
>   	free_minor(ocxlpmem);
>   
> +	if (ocxlpmem->cdev.owner)
> +		cdev_del(&ocxlpmem->cdev);
> +
>   	if (ocxlpmem->metadata_addr)
>   		devm_memunmap(&ocxlpmem->dev, ocxlpmem->metadata_addr);
>   
> @@ -396,6 +400,70 @@ static int ocxlpmem_register(struct ocxlpmem *ocxlpmem)
>   	return device_register(&ocxlpmem->dev);
>   }
>   
> +static void ocxlpmem_put(struct ocxlpmem *ocxlpmem)
> +{
> +	put_device(&ocxlpmem->dev);
> +}
> +
> +static struct ocxlpmem *ocxlpmem_get(struct ocxlpmem *ocxlpmem)
> +{
> +	return (get_device(&ocxlpmem->dev) == NULL) ? NULL : ocxlpmem;
> +}
> +
> +static struct ocxlpmem *find_and_get_ocxlpmem(dev_t devno)
> +{
> +	struct ocxlpmem *ocxlpmem;
> +	int minor = MINOR(devno);
> +	/*
> +	 * We don't declare an RCU critical section here, as our AFU
> +	 * is protected by a reference counter on the device. By the time the
> +	 * minor number of a device is removed from the idr, the ref count of
> +	 * the device is already at 0, so no user API will access that AFU and
> +	 * this function can't return it.
> +	 */
> +	ocxlpmem = idr_find(&minors_idr, minor);
> +	if (ocxlpmem)
> +		ocxlpmem_get(ocxlpmem);
> +	return ocxlpmem;
> +}
> +
> +static int file_open(struct inode *inode, struct file *file)
> +{
> +	struct ocxlpmem *ocxlpmem;
> +
> +	ocxlpmem = find_and_get_ocxlpmem(inode->i_rdev);
> +	if (!ocxlpmem)
> +		return -ENODEV;
> +
> +	file->private_data = ocxlpmem;
> +	return 0;
> +}
> +
> +static int file_release(struct inode *inode, struct file *file)
> +{
> +	struct ocxlpmem *ocxlpmem = file->private_data;
> +
> +	ocxlpmem_put(ocxlpmem);
> +	return 0;
> +}
> +
> +static const struct file_operations fops = {
> +	.owner		= THIS_MODULE,
> +	.open		= file_open,
> +	.release	= file_release,
> +};
> +
> +/**
> + * create_cdev() - Create the chardev in /dev for the device
> + * @ocxlpmem: the SCM metadata
> + * Return: 0 on success, negative on failure
> + */
> +static int create_cdev(struct ocxlpmem *ocxlpmem)
> +{
> +	cdev_init(&ocxlpmem->cdev, &fops);
> +	return cdev_add(&ocxlpmem->cdev, ocxlpmem->dev.devt, 1);
> +}
> +
>   /**
>    * ocxlpmem_remove() - Free an OpenCAPI persistent memory device
>    * @pdev: the PCI device information struct
> @@ -572,6 +640,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>   		goto err;
>   	}
>   
> +	if (create_cdev(ocxlpmem)) {
> +		dev_err(&pdev->dev, "Could not create character device\n");
> +		goto err;
> +	}
> +
>   	elapsed = 0;
>   	timeout = ocxlpmem->readiness_timeout + ocxlpmem->memory_available_timeout;
>   	while (!is_usable(ocxlpmem, false)) {
> @@ -613,20 +686,59 @@ static struct pci_driver pci_driver = {
>   	.shutdown = ocxlpmem_remove,
>   };
>   
> +static int file_init(void)
> +{
> +	int rc;
> +
> +	mutex_init(&minors_idr_lock);
> +	idr_init(&minors_idr);
> +
> +	rc = alloc_chrdev_region(&ocxlpmem_dev, 0, NUM_MINORS, "ocxl-pmem");

If the driver is going to be called "ocxlpmem" can we standardise on 
that without the extra hyphen?

> +	if (rc) {
> +		idr_destroy(&minors_idr);
> +		pr_err("Unable to allocate OpenCAPI persistent memory major number: %d\n", rc);
> +		return rc;
> +	}
> +
> +	ocxlpmem_class = class_create(THIS_MODULE, "ocxl-pmem");
> +	if (IS_ERR(ocxlpmem_class)) {
> +		idr_destroy(&minors_idr);
> +		pr_err("Unable to create ocxl-pmem class\n");
> +		unregister_chrdev_region(ocxlpmem_dev, NUM_MINORS);
> +		return PTR_ERR(ocxlpmem_class);
> +	}
> +
> +	return 0;
> +}
> +
> +static void file_exit(void)
> +{
> +	class_destroy(ocxlpmem_class);
> +	unregister_chrdev_region(ocxlpmem_dev, NUM_MINORS);
> +	idr_destroy(&minors_idr);
> +}
> +
>   static int __init ocxlpmem_init(void)
>   {
> -	int rc = 0;
> +	int rc;
>   
> -	rc = pci_register_driver(&pci_driver);
> +	rc = file_init();
>   	if (rc)
>   		return rc;
>   
> +	rc = pci_register_driver(&pci_driver);
> +	if (rc) {
> +		file_exit();
> +		return rc;
> +	}
> +
>   	return 0;
>   }
>   
>   static void ocxlpmem_exit(void)
>   {
>   	pci_unregister_driver(&pci_driver);
> +	file_exit();
>   }
>   
>   module_init(ocxlpmem_init);
-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* RE: [PATCH v3 25/27] powerpc/powernv/pmem: Expose the serial number in sysfs
  2020-03-01 23:42       ` Alastair D'Silva
@ 2020-03-02  5:38         ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-03-02  5:38 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann, Andrew Morton,
	Mauro Carvalho Chehab, David S. Miller, Rob Herring,
	Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
	Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar,
	Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin,
	Masahiro Yamada, Alexey Kardashevskiy, linux-kernel,
	linuxppc-dev, linux-nvdimm

On Mon, 2020-03-02 at 10:42 +1100, Alastair D'Silva wrote:
> On Fri, 2020-02-28 at 08:15 +0100, Greg Kroah-Hartman wrote:
> > On Fri, Feb 28, 2020 at 05:25:31PM +1100, Andrew Donnellan wrote:
> > > On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > > > +int ocxlpmem_sysfs_add(struct ocxlpmem *ocxlpmem)
> > > > +{
> > > > +	int i, rc;
> > > > +
> > > > +	for (i = 0; i < ARRAY_SIZE(attrs); i++) {
> > > > +		rc = device_create_file(&ocxlpmem->dev,
> > > > &attrs[i]);
> > > > +		if (rc) {
> > > > +			for (; --i >= 0;)
> > > > +				device_remove_file(&ocxlpmem-
> > > > >dev,
> > > > &attrs[i]);
> > > 
> > > I'd rather avoid weird for loop constructs if possible.
> > > 
> > > Is it actually dangerous to call device_remove_file() on an attr
> > > that hasn't
> > > been added? If not then I'd rather define an err: label and loop
> > > over the
> > > whole array there.
> > 
> > None of this should be used at all, just use attribute groups
> > properly
> > and the driver core will handle this all for you.
> > 
> > device_create/remove_file should never be called by anyone anymore
> > if
> > at all
> > possible.
> > 
> > thanks,
> > 
> > greg k-h
> 
> Thanks, I'll rework it to use the .groups member of struct
> pci_driver.
> 

I ended up making these available as DIMM attributes instead.

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 21/27] powerpc/powernv/pmem: Add an IOCTL to request controller health & perf data
  2020-02-28  6:12   ` Andrew Donnellan
@ 2020-03-02  5:40     ` Alastair D'Silva
  2020-03-04 11:06     ` Frederic Barrat
  1 sibling, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-03-02  5:40 UTC (permalink / raw)
  To: Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Fri, 2020-02-28 at 17:12 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > When health & performance data is requested from the controller,
> > it responds with an error log containing the requested information.
> > 
> > This patch allows the request to be issued via an IOCTL.
> 
> A better explanation would be good - this IOCTL triggers a request
> to 
> the controller to collect controller health/perf data, and the 
> controller will later respond with an error log that can be picked
> up 
> via the error log IOCTL that you've defined earlier.
> 
> 

Ok

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 16/27] powerpc/powernv/pmem: Register a character device for userspace to interact with
  2020-03-02  5:34   ` Andrew Donnellan
@ 2020-03-02  6:05     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-03-02  6:05 UTC (permalink / raw)
  To: Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Mon, 2020-03-02 at 16:34 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > This patch introduces a character device (/dev/ocxl-scmX) which
> > further
> > patches will use to interact with userspace.
> 
> As with the comments on other patches in this series, this commit 
> message is lacking in explanation. What's the purpose of this device?
> 

I'll reword this for v4.

> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   arch/powerpc/platforms/powernv/pmem/ocxl.c    | 116
> > +++++++++++++++++-
> >   .../platforms/powernv/pmem/ocxl_internal.h    |   2 +
> >   2 files changed, 116 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > index b8bd7e703b19..63109a870d2c 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > @@ -10,6 +10,7 @@
> >   #include <misc/ocxl.h>
> >   #include <linux/delay.h>
> >   #include <linux/ndctl.h>
> > +#include <linux/fs.h>
> >   #include <linux/mm_types.h>
> >   #include <linux/memory_hotplug.h>
> >   #include "ocxl_internal.h"
> > @@ -339,6 +340,9 @@ static void free_ocxlpmem(struct ocxlpmem
> > *ocxlpmem)
> >   
> >   	free_minor(ocxlpmem);
> >   
> > +	if (ocxlpmem->cdev.owner)
> > +		cdev_del(&ocxlpmem->cdev);
> > +
> >   	if (ocxlpmem->metadata_addr)
> >   		devm_memunmap(&ocxlpmem->dev, ocxlpmem->metadata_addr);
> >   
> > @@ -396,6 +400,70 @@ static int ocxlpmem_register(struct ocxlpmem
> > *ocxlpmem)
> >   	return device_register(&ocxlpmem->dev);
> >   }
> >   
> > +static void ocxlpmem_put(struct ocxlpmem *ocxlpmem)
> > +{
> > +	put_device(&ocxlpmem->dev);
> > +}
> > +
> > +static struct ocxlpmem *ocxlpmem_get(struct ocxlpmem *ocxlpmem)
> > +{
> > +	return (get_device(&ocxlpmem->dev) == NULL) ? NULL : ocxlpmem;
> > +}
> > +
> > +static struct ocxlpmem *find_and_get_ocxlpmem(dev_t devno)
> > +{
> > +	struct ocxlpmem *ocxlpmem;
> > +	int minor = MINOR(devno);
> > +	/*
> > +	 * We don't declare an RCU critical section here, as our AFU
> > +	 * is protected by a reference counter on the device. By the
> > time the
> > +	 * minor number of a device is removed from the idr, the ref
> > count of
> > +	 * the device is already at 0, so no user API will access that
> > AFU and
> > +	 * this function can't return it.
> > +	 */
> > +	ocxlpmem = idr_find(&minors_idr, minor);
> > +	if (ocxlpmem)
> > +		ocxlpmem_get(ocxlpmem);
> > +	return ocxlpmem;
> > +}
> > +
> > +static int file_open(struct inode *inode, struct file *file)
> > +{
> > +	struct ocxlpmem *ocxlpmem;
> > +
> > +	ocxlpmem = find_and_get_ocxlpmem(inode->i_rdev);
> > +	if (!ocxlpmem)
> > +		return -ENODEV;
> > +
> > +	file->private_data = ocxlpmem;
> > +	return 0;
> > +}
> > +
> > +static int file_release(struct inode *inode, struct file *file)
> > +{
> > +	struct ocxlpmem *ocxlpmem = file->private_data;
> > +
> > +	ocxlpmem_put(ocxlpmem);
> > +	return 0;
> > +}
> > +
> > +static const struct file_operations fops = {
> > +	.owner		= THIS_MODULE,
> > +	.open		= file_open,
> > +	.release	= file_release,
> > +};
> > +
> > +/**
> > + * create_cdev() - Create the chardev in /dev for the device
> > + * @ocxlpmem: the SCM metadata
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int create_cdev(struct ocxlpmem *ocxlpmem)
> > +{
> > +	cdev_init(&ocxlpmem->cdev, &fops);
> > +	return cdev_add(&ocxlpmem->cdev, ocxlpmem->dev.devt, 1);
> > +}
> > +
> >   /**
> >    * ocxlpmem_remove() - Free an OpenCAPI persistent memory device
> >    * @pdev: the PCI device information struct
> > @@ -572,6 +640,11 @@ static int probe(struct pci_dev *pdev, const
> > struct pci_device_id *ent)
> >   		goto err;
> >   	}
> >   
> > +	if (create_cdev(ocxlpmem)) {
> > +		dev_err(&pdev->dev, "Could not create character
> > device\n");
> > +		goto err;
> > +	}
> > +
> >   	elapsed = 0;
> >   	timeout = ocxlpmem->readiness_timeout + ocxlpmem-
> > >memory_available_timeout;
> >   	while (!is_usable(ocxlpmem, false)) {
> > @@ -613,20 +686,59 @@ static struct pci_driver pci_driver = {
> >   	.shutdown = ocxlpmem_remove,
> >   };
> >   
> > +static int file_init(void)
> > +{
> > +	int rc;
> > +
> > +	mutex_init(&minors_idr_lock);
> > +	idr_init(&minors_idr);
> > +
> > +	rc = alloc_chrdev_region(&ocxlpmem_dev, 0, NUM_MINORS, "ocxl-
> > pmem");
> 
> If the driver is going to be called "ocxlpmem" can we standardise on 
> that without the extra hyphen?

Ok

> > +	if (rc) {
> > +		idr_destroy(&minors_idr);
> > +		pr_err("Unable to allocate OpenCAPI persistent memory
> > major number: %d\n", rc);
> > +		return rc;
> > +	}
> > +
> > +	ocxlpmem_class = class_create(THIS_MODULE, "ocxl-pmem");
> > +	if (IS_ERR(ocxlpmem_class)) {
> > +		idr_destroy(&minors_idr);
> > +		pr_err("Unable to create ocxl-pmem class\n");
> > +		unregister_chrdev_region(ocxlpmem_dev, NUM_MINORS);
> > +		return PTR_ERR(ocxlpmem_class);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static void file_exit(void)
> > +{
> > +	class_destroy(ocxlpmem_class);
> > +	unregister_chrdev_region(ocxlpmem_dev, NUM_MINORS);
> > +	idr_destroy(&minors_idr);
> > +}
> > +
> >   static int __init ocxlpmem_init(void)
> >   {
> > -	int rc = 0;
> > +	int rc;
> >   
> > -	rc = pci_register_driver(&pci_driver);
> > +	rc = file_init();
> >   	if (rc)
> >   		return rc;
> >   
> > +	rc = pci_register_driver(&pci_driver);
> > +	if (rc) {
> > +		file_exit();
> > +		return rc;
> > +	}
> > +
> >   	return 0;
> >   }
> >   
> >   static void ocxlpmem_exit(void)
> >   {
> >   	pci_unregister_driver(&pci_driver);
> > +	file_exit();
> >   }
> >   
> >   module_init(ocxlpmem_init);
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 26/27] powerpc/powernv/pmem: Expose the firmware version in sysfs
  2020-02-21  3:27 ` [PATCH v3 26/27] powerpc/powernv/pmem: Expose the firmware version " Alastair D'Silva
@ 2020-03-02  7:35   ` Andrew Donnellan
  2020-03-04  4:11     ` Alastair D'Silva
  0 siblings, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-03-02  7:35 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> This information will be used by ndctl in userspace to help users identify
> the device.

You should include the information from the subject line in the body of 
the commit message too.

I think this patch could probably be squashed in with the last one.

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 13/27] powerpc/powernv/pmem: Read the capability registers & wait for device ready
  2020-02-21  3:27 ` [PATCH v3 13/27] powerpc/powernv/pmem: Read the capability registers & wait for device ready Alastair D'Silva
  2020-02-27  3:54   ` Andrew Donnellan
@ 2020-03-02 17:51   ` Frederic Barrat
  2020-03-04  4:15     ` Alastair D'Silva
  1 sibling, 1 reply; 130+ messages in thread
From: Frederic Barrat @ 2020-03-02 17:51 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel



Le 21/02/2020 à 04:27, Alastair D'Silva a écrit :
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> This patch reads timeouts & firmware version from the controller, and
> uses those timeouts to wait for the controller to report that it is ready
> before handing the memory over to libnvdimm.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/platforms/powernv/pmem/Makefile  |  2 +-
>   arch/powerpc/platforms/powernv/pmem/ocxl.c    | 92 +++++++++++++++++++
>   .../platforms/powernv/pmem/ocxl_internal.c    | 19 ++++
>   .../platforms/powernv/pmem/ocxl_internal.h    | 24 +++++
>   4 files changed, 136 insertions(+), 1 deletion(-)
>   create mode 100644 arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
> 
> diff --git a/arch/powerpc/platforms/powernv/pmem/Makefile b/arch/powerpc/platforms/powernv/pmem/Makefile
> index 1c55c4193175..4ceda25907d4 100644
> --- a/arch/powerpc/platforms/powernv/pmem/Makefile
> +++ b/arch/powerpc/platforms/powernv/pmem/Makefile
> @@ -4,4 +4,4 @@ ccflags-$(CONFIG_PPC_WERROR)	+= -Werror
>   
>   obj-$(CONFIG_OCXL_PMEM) += ocxlpmem.o
>   
> -ocxlpmem-y := ocxl.o
> +ocxlpmem-y := ocxl.o ocxl_internal.o
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> index 3c4eeb5dcc0f..431212c9f0cc 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -8,6 +8,7 @@
>   
>   #include <linux/module.h>
>   #include <misc/ocxl.h>
> +#include <linux/delay.h>
>   #include <linux/ndctl.h>
>   #include <linux/mm_types.h>
>   #include <linux/memory_hotplug.h>
> @@ -215,6 +216,36 @@ static int register_lpc_mem(struct ocxlpmem *ocxlpmem)
>   	return 0;
>   }
>   
> +/**
> + * is_usable() - Is a controller usable?
> + * @ocxlpmem: the device metadata
> + * @verbose: True to log errors
> + * Return: true if the controller is usable
> + */
> +static bool is_usable(const struct ocxlpmem *ocxlpmem, bool verbose)
> +{
> +	u64 chi = 0;
> +	int rc = ocxlpmem_chi(ocxlpmem, &chi);
> +
> +	if (rc < 0)
> +		return false;
> +
> +	if (!(chi & GLOBAL_MMIO_CHI_CRDY)) {
> +		if (verbose)
> +			dev_err(&ocxlpmem->dev, "controller is not ready.\n");
> +		return false;
> +	}
> +
> +	if (!(chi & GLOBAL_MMIO_CHI_MA)) {
> +		if (verbose)
> +			dev_err(&ocxlpmem->dev,
> +				"controller does not have memory available.\n");
> +		return false;
> +	}
> +
> +	return true;
> +}
> +
>   /**
>    * allocate_minor() - Allocate a minor number to use for an OpenCAPI pmem device
>    * @ocxlpmem: the device metadata
> @@ -328,6 +359,48 @@ static void ocxlpmem_remove(struct pci_dev *pdev)
>   	}
>   }
>   
> +/**
> + * read_device_metadata() - Retrieve config information from the AFU and save it for future use
> + * @ocxlpmem: the device metadata
> + * Return: 0 on success, negative on failure
> + */
> +static int read_device_metadata(struct ocxlpmem *ocxlpmem)
> +{
> +	u64 val;
> +	int rc;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CCAP0,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		return rc;
> +
> +	ocxlpmem->scm_revision = val & 0xFFFF;
> +	ocxlpmem->read_latency = (val >> 32) & 0xFF;
> +	ocxlpmem->readiness_timeout = (val >> 48) & 0x0F;
> +	ocxlpmem->memory_available_timeout = val >> 52;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CCAP1,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		return rc;
> +
> +	ocxlpmem->max_controller_dump_size = val & 0xFFFFFFFF;
> +
> +	// Extract firmware version text
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_FWVER,
> +				     OCXL_HOST_ENDIAN, (u64 *)ocxlpmem->fw_version);
> +	if (rc)
> +		return rc;
> +
> +	ocxlpmem->fw_version[8] = '\0';
> +
> +	dev_info(&ocxlpmem->dev,
> +		 "Firmware version '%s' SCM revision %d:%d\n", ocxlpmem->fw_version,
> +		 ocxlpmem->scm_revision >> 4, ocxlpmem->scm_revision & 0x0F);
> +
> +	return 0;
> +}
> +
>   /**
>    * probe_function0() - Set up function 0 for an OpenCAPI persistent memory device
>    * This is important as it enables templates higher than 0 across all other functions,
> @@ -368,6 +441,7 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>   {
>   	struct ocxlpmem *ocxlpmem;
>   	int rc;
> +	u16 elapsed, timeout;
>   
>   	if (PCI_FUNC(pdev->devfn) == 0)
>   		return probe_function0(pdev);
> @@ -422,6 +496,24 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>   		goto err;
>   	}
>   
> +	if (read_device_metadata(ocxlpmem)) {
> +		dev_err(&pdev->dev, "Could not read metadata\n");



Need to set rc
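
To illustrate the point, here is a minimal userspace sketch of the two
shapes of error path. It is not the driver code: fake_read_metadata(),
probe_rc_lost() and probe_rc_kept() are hypothetical stand-ins, and it
assumes rc still holds 0 from an earlier successful step when the
metadata read fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for read_device_metadata():
 * returns 0 on success, a negative errno on failure.
 */
int fake_read_metadata(bool fail)
{
	return fail ? -EIO : 0;
}

/* Shape used in the patch: the helper's return value is discarded. */
int probe_rc_lost(bool fail)
{
	int rc = 0;	/* assumed leftover from an earlier successful step */

	if (fake_read_metadata(fail))
		goto err;	/* rc is still 0 at this point */

	return 0;
err:
	return rc;	/* reports success although the read failed */
}

/* Suggested shape: capture the return value before jumping to err. */
int probe_rc_kept(bool fail)
{
	int rc;

	rc = fake_read_metadata(fail);
	if (rc)
		goto err;

	return 0;
err:
	return rc;	/* propagates the helper's error code */
}
```

With a failing read, probe_rc_lost() returns 0 while probe_rc_kept()
returns -EIO, so only the second shape lets the caller see the failure.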



> +		goto err;
> +	}
> +
> +	elapsed = 0;
> +	timeout = ocxlpmem->readiness_timeout + ocxlpmem->memory_available_timeout;
> +	while (!is_usable(ocxlpmem, false)) {
> +		if (elapsed++ > timeout) {
> +			dev_warn(&ocxlpmem->dev, "OpenCAPI Persistent Memory ready timeout.\n");
> +			(void)is_usable(ocxlpmem, true);


I guess that extra call to is_usable() is just to log the cause of the 
error. However, with some bad luck, the call could now succeed.
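
One possible way to avoid the second register read, sketched in
userspace C: take a single CHI snapshot and derive both the decision
and the log message from it. The bit positions and the helper names
chi_usable() and chi_unusable_reason() are assumptions for
illustration; the real GLOBAL_MMIO_CHI_* values come from
ocxl_internal.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define CHI_CRDY (UINT64_C(1) << 0)	/* controller ready */
#define CHI_MA   (UINT64_C(1) << 1)	/* memory available */

/* Decide usability from a CHI value that was already read. */
bool chi_usable(uint64_t chi)
{
	return (chi & CHI_CRDY) && (chi & CHI_MA);
}

/*
 * Explain why the same snapshot is not usable.
 * Returns NULL when the snapshot is usable.
 */
const char *chi_unusable_reason(uint64_t chi)
{
	if (!(chi & CHI_CRDY))
		return "controller is not ready";
	if (!(chi & CHI_MA))
		return "controller does not have memory available";
	return NULL;
}
```

The polling loop would then keep the last value it read and, on
timeout, log chi_unusable_reason() for that value, so the message
always matches the state that caused the timeout.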


   Fred


> +			rc = -ENXIO;
> +			goto err;
> +		}
> +
> +		msleep(1000);
> +	}
> +
>   	rc = register_lpc_mem(ocxlpmem);
>   	if (rc) {
>   		dev_err(&pdev->dev, "Could not register OpenCAPI persistent memory with libnvdimm\n");
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
> new file mode 100644
> index 000000000000..617ca943b1b8
> --- /dev/null
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
> @@ -0,0 +1,19 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +// Copyright 2019 IBM Corp.
> +
> +#include <misc/ocxl.h>
> +#include <linux/delay.h>
> +#include "ocxl_internal.h"
> +
> +int ocxlpmem_chi(const struct ocxlpmem *ocxlpmem, u64 *chi)
> +{
> +	u64 val;
> +	int rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHI,
> +					 OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		return rc;
> +
> +	*chi = val;
> +
> +	return 0;
> +}
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> index 9cf3e42750e7..ba0301533d00 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> @@ -97,4 +97,28 @@ struct ocxlpmem {
>   	void *metadata_addr;
>   	struct resource pmem_res;
>   	struct nd_region *nd_region;
> +	char fw_version[8+1];
> +
> +	u32 max_controller_dump_size;
> +	u16 scm_revision; // major/minor
> +	u8 readiness_timeout;  /* The worst case time (in seconds) that the host shall
> +				* wait for the controller to become operational following a reset (CHI.CRDY).
> +				*/
> +	u8 memory_available_timeout;   /* The worst case time (in seconds) that the host shall
> +					* wait for memory to become available following a reset (CHI.MA).
> +					*/
> +
> +	u16 read_latency; /* The nominal measure of latency (in nanoseconds)
> +			   * associated with an unassisted read of a memory block.
> +			   * This represents the capability of the raw media technology without assistance
> +			   */
>   };
> +
> +/**
> + * ocxlpmem_chi() - Get the value of the CHI register
> + * @ocxlpmem: the device metadata
> + * @chi: returns the CHI value
> + *
> + * Returns 0 on success, negative on error
> + */
> +int ocxlpmem_chi(const struct ocxlpmem *ocxlpmem, u64 *chi);
> 

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 15/27] powerpc/powernv/pmem: Add support for near storage commands
  2020-02-21  3:27 ` [PATCH v3 15/27] powerpc/powernv/pmem: Add support for near storage commands Alastair D'Silva
  2020-02-27  8:30   ` Andrew Donnellan
  2020-02-27 17:02   ` Dan Williams
@ 2020-03-02 17:58   ` Frederic Barrat
  2020-03-02 18:42     ` Dan Williams
  2 siblings, 1 reply; 130+ messages in thread
From: Frederic Barrat @ 2020-03-02 17:58 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel



Le 21/02/2020 à 04:27, Alastair D'Silva a écrit :
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> Similar to the previous patch, this adds support for near storage commands.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---


Are any of these new functions ever called?

   Fred


>   arch/powerpc/platforms/powernv/pmem/ocxl.c    |  6 +++
>   .../platforms/powernv/pmem/ocxl_internal.c    | 41 +++++++++++++++++++
>   .../platforms/powernv/pmem/ocxl_internal.h    | 37 +++++++++++++++++
>   3 files changed, 84 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> index 4e782d22605b..b8bd7e703b19 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -259,12 +259,18 @@ static int setup_command_metadata(struct ocxlpmem *ocxlpmem)
>   	int rc;
>   
>   	mutex_init(&ocxlpmem->admin_command.lock);
> +	mutex_init(&ocxlpmem->ns_command.lock);
>   
>   	rc = extract_command_metadata(ocxlpmem, GLOBAL_MMIO_ACMA_CREQO,
>   				      &ocxlpmem->admin_command);
>   	if (rc)
>   		return rc;
>   
> +	rc = extract_command_metadata(ocxlpmem, GLOBAL_MMIO_NSCMA_CREQO,
> +					  &ocxlpmem->ns_command);
> +	if (rc)
> +		return rc;
> +
>   	return 0;
>   }
>   
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
> index 583f48023025..3e0b133feddf 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
> @@ -133,6 +133,47 @@ int admin_response_handled(const struct ocxlpmem *ocxlpmem)
>   				      OCXL_LITTLE_ENDIAN, GLOBAL_MMIO_CHI_ACRA);
>   }
>   
> +int ns_command_request(struct ocxlpmem *ocxlpmem, u8 op_code)
> +{
> +	u64 val;
> +	int rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHI,
> +					 OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		return rc;
> +
> +	if (!(val & GLOBAL_MMIO_CHI_NSCRA))
> +		return -EBUSY;
> +
> +	return scm_command_request(ocxlpmem, &ocxlpmem->ns_command, op_code);
> +}
> +
> +int ns_response(const struct ocxlpmem *ocxlpmem)
> +{
> +	return command_response(ocxlpmem, &ocxlpmem->ns_command);
> +}
> +
> +int ns_command_execute(const struct ocxlpmem *ocxlpmem)
> +{
> +	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCI,
> +				      OCXL_LITTLE_ENDIAN, GLOBAL_MMIO_HCI_NSCRW);
> +}
> +
> +bool ns_command_complete(const struct ocxlpmem *ocxlpmem)
> +{
> +	u64 val = 0;
> +	int rc = ocxlpmem_chi(ocxlpmem, &val);
> +
> +	WARN_ON(rc);
> +
> +	return (val & GLOBAL_MMIO_CHI_NSCRA) != 0;
> +}
> +
> +int ns_response_handled(const struct ocxlpmem *ocxlpmem)
> +{
> +	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIC,
> +				      OCXL_LITTLE_ENDIAN, GLOBAL_MMIO_CHI_NSCRA);
> +}
> +
>   void warn_status(const struct ocxlpmem *ocxlpmem, const char *message,
>   		     u8 status)
>   {
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> index 2fef68c71271..28e2020f6355 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> @@ -107,6 +107,7 @@ struct ocxlpmem {
>   	struct ocxl_context *ocxl_context;
>   	void *metadata_addr;
>   	struct command_metadata admin_command;
> +	struct command_metadata ns_command;
>   	struct resource pmem_res;
>   	struct nd_region *nd_region;
>   	char fw_version[8+1];
> @@ -175,6 +176,42 @@ int admin_command_complete_timeout(const struct ocxlpmem *ocxlpmem,
>    */
>   int admin_response_handled(const struct ocxlpmem *ocxlpmem);
>   
> +/**
> + * ns_command_request() - Issue a near storage command request
> + * @ocxlpmem: the device metadata
> + * @op_code: The op-code for the command
> + * Returns an identifier for the command, or negative on error
> + */
> +int ns_command_request(struct ocxlpmem *ocxlpmem, u8 op_code);
> +
> +/**
> + * ns_response() - Validate a near storage response
> + * @ocxlpmem: the device metadata
> + * Returns the status code of the command, or negative on error
> + */
> +int ns_response(const struct ocxlpmem *ocxlpmem);
> +
> +/**
> + * ns_command_execute() - Notify the controller to start processing a pending near storage command
> + * @ocxlpmem: the device metadata
> + * Returns 0 on success, negative on error
> + */
> +int ns_command_execute(const struct ocxlpmem *ocxlpmem);
> +
> +/**
> + * ns_command_complete() - Is a near storage command executing
> + * @ocxlpmem: the device metadata
> + * Returns true if the previous near storage command has completed
> + */
> +bool ns_command_complete(const struct ocxlpmem *ocxlpmem);
> +
> +/**
> + * ns_response_handled() - Notify the controller that the near storage response has been handled
> + * @ocxlpmem: the device metadata
> + * Returns 0 on success, negative on failure
> + */
> +int ns_response_handled(const struct ocxlpmem *ocxlpmem);
> +
>   /**
>    * warn_status() - Emit a kernel warning showing a command status.
>    * @ocxlpmem: the device metadata
> 
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 15/27] powerpc/powernv/pmem: Add support for near storage commands
  2020-03-02 17:58   ` Frederic Barrat
@ 2020-03-02 18:42     ` Dan Williams
  2020-03-04  4:42       ` Alastair D'Silva
  0 siblings, 1 reply; 130+ messages in thread
From: Dan Williams @ 2020-03-02 18:42 UTC (permalink / raw)
  To: Frederic Barrat
  Cc: Alastair D'Silva, alastair, Aneesh Kumar K . V,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Andrew Donnellan, Arnd Bergmann, Greg Kroah-Hartman,
	Andrew Morton, Mauro Carvalho Chehab, David S. Miller,
	Rob Herring, Anton Blanchard, Krzysztof Kozlowski,
	Mahesh Salgaonkar, Madhavan Srinivasan, Cédric Le Goater,
	Anju T Sudhakar, Hari Bathini, Thomas Gleixner, Greg Kurz,
	Nicholas Piggin, Masahiro Yamada, Alexey Kardashevskiy,
	Linux Kernel Mailing List, linuxppc-dev, linux-nvdimm, Linux MM

On Mon, Mar 2, 2020 at 9:59 AM Frederic Barrat <fbarrat@linux.ibm.com> wrote:
>
>
>
> On 21/02/2020 at 04:27, Alastair D'Silva wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> >
> > Similar to the previous patch, this adds support for near storage commands.
> >
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
>
>
> Are any of these new functions ever called?

This is my concern as well. The libnvdimm command support is limited
to the commands that Linux will use. Other passthrough commands are
supported through a passthrough interface. However, that passthrough
interface is explicitly limited to publicly documented command sets so
that the kernel has an opportunity to constrain and consolidate
command implementations across vendors.

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 03/27] powerpc: Map & release OpenCAPI LPC memory
  2020-02-21  3:26 ` [PATCH v3 03/27] powerpc: Map & release OpenCAPI LPC memory Alastair D'Silva
  2020-02-24  2:51   ` Andrew Donnellan
  2020-02-25 10:02   ` Frederic Barrat
@ 2020-03-03  6:10   ` Andrew Donnellan
  2020-03-04  5:33     ` Alastair D'Silva
  2 siblings, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-03-03  6:10 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:26 pm, Alastair D'Silva wrote:
> +#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
> +u64 pnv_ocxl_platform_lpc_setup(struct pci_dev *pdev, u64 size)
> +{
> +	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> +	struct pnv_phb *phb = hose->private_data;
> +	u32 bdfn = pci_dev_id(pdev);
> +	__be64 base_addr_be64;
> +	u64 base_addr;
> +	int rc;
> +
> +	rc = opal_npu_mem_alloc(phb->opal_id, bdfn, size, &base_addr_be64);

Sparse warning:

https://openpower.xyz/job/snowpatch/job/snowpatch-linux-sparse/15776//artifact/linux/report.txt

I think in patch 1 we need to change a uint64_t to a __be64.
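
To illustrate the kind of mismatch sparse flags here, a minimal userspace
sketch follows. It assumes the firmware call hands back the base address in
big-endian form. The names be64_sketch_t, mem_alloc_sketch() and
lpc_setup_sketch() are hypothetical stand-ins, not the OPAL or kernel API;
the two helpers play the roles of cpu_to_be64() and be64_to_cpu().

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's __be64. It is a distinct type, so
 * passing it where a native integer is expected fails to compile, much as
 * sparse would complain in the kernel.
 */
typedef struct { unsigned char b[8]; } be64_sketch_t;

/* Plays the role of cpu_to_be64(): most significant byte first. */
static be64_sketch_t to_be64_sketch(uint64_t v)
{
	be64_sketch_t out;
	int i;

	for (i = 7; i >= 0; i--) {
		out.b[i] = (unsigned char)(v & 0xff);
		v >>= 8;
	}
	return out;
}

/* Plays the role of be64_to_cpu(). */
static uint64_t from_be64_sketch(be64_sketch_t v)
{
	uint64_t out = 0;
	int i;

	for (i = 0; i < 8; i++)
		out = (out << 8) | v.b[i];
	return out;
}

/* Mock firmware call: hands back the base address in big-endian form. */
static int mem_alloc_sketch(uint64_t size, be64_sketch_t *base_be)
{
	if (size == 0)
		return -1;
	*base_be = to_be64_sketch(0x0006000000000000ULL);
	return 0;
}

/* Caller: typed big-endian out-parameter, converted exactly once. */
static uint64_t lpc_setup_sketch(uint64_t size)
{
	be64_sketch_t base_be;

	if (mem_alloc_sketch(size, &base_be))
		return 0;
	return from_be64_sketch(base_be);
}
```

The point of the typed out-parameter is that the prototype and the caller
agree on the byte order, so the conversion happens in one visible place.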

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 20/27] powerpc/powernv/pmem: Forward events to userspace
  2020-02-21  3:27 ` [PATCH v3 20/27] powerpc/powernv/pmem: Forward events to userspace Alastair D'Silva
@ 2020-03-03  7:02   ` Andrew Donnellan
  2020-03-04  5:48     ` Alastair D'Silva
  2020-03-04 11:00   ` Frederic Barrat
  1 sibling, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-03-03  7:02 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> @@ -938,6 +955,51 @@ static int ioctl_controller_stats(struct ocxlpmem *ocxlpmem,
>   	return rc;
>   }
>   
> +static int ioctl_eventfd(struct ocxlpmem *ocxlpmem,
> +		 struct ioctl_ocxl_pmem_eventfd __user *uarg)
> +{
> +	struct ioctl_ocxl_pmem_eventfd args;
> +
> +	if (copy_from_user(&args, uarg, sizeof(args)))
> +		return -EFAULT;
> +
> +	if (ocxlpmem->ev_ctx)
> +		return -EINVAL;

I think EBUSY is more appropriate here.
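
A minimal userspace sketch of the suggested return code, assuming a
device that accepts a single registered event context. The struct and
function names are hypothetical and only mirror the shape of the quoted
ioctl; the context is an opaque pointer here.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical device state: at most one event context at a time. */
struct evdev_sketch {
	void *ev_ctx;
};

static int register_event_ctx_sketch(struct evdev_sketch *dev, void *ctx)
{
	if (dev->ev_ctx)
		return -EBUSY;	/* a context is already registered */

	if (!ctx)
		return -EINVAL;	/* the argument itself is bad */

	dev->ev_ctx = ctx;
	return 0;
}
```

With this split, -EINVAL is reserved for a bad argument, while -EBUSY tells
the caller that the request was well formed but the slot is taken.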

> +
> +	ocxlpmem->ev_ctx = eventfd_ctx_fdget(args.eventfd);
> +	if (!ocxlpmem->ev_ctx)
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +static int ioctl_event_check(struct ocxlpmem *ocxlpmem, u64 __user *uarg)
> +{
> +	u64 val = 0;
> +	int rc;
> +	u64 chi = 0;
> +
> +	rc = ocxlpmem_chi(ocxlpmem, &chi);
> +	if (rc < 0)
> +		return rc;
> +
> +	if (chi & GLOBAL_MMIO_CHI_ELA)
> +		val |= IOCTL_OCXL_PMEM_EVENT_ERROR_LOG_AVAILABLE;
> +
> +	if (chi & GLOBAL_MMIO_CHI_CDA)
> +		val |= IOCTL_OCXL_PMEM_EVENT_CONTROLLER_DUMP_AVAILABLE;
> +
> +	if (chi & GLOBAL_MMIO_CHI_CFFS)
> +		val |= IOCTL_OCXL_PMEM_EVENT_FIRMWARE_FATAL;
> +
> +	if (chi & GLOBAL_MMIO_CHI_CHFS)
> +		val |= IOCTL_OCXL_PMEM_EVENT_HARDWARE_FATAL;
> +
> +	rc = copy_to_user((u64 __user *) uarg, &val, sizeof(val));
> +
> +	return rc;
> +}
> +
>   static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
>   {
>   	struct ocxlpmem *ocxlpmem = file->private_data;
> @@ -966,6 +1028,15 @@ static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
>   		rc = ioctl_controller_stats(ocxlpmem,
>   					    (struct ioctl_ocxl_pmem_controller_stats __user *)args);
>   		break;
> +
> +	case IOCTL_OCXL_PMEM_EVENTFD:
> +		rc = ioctl_eventfd(ocxlpmem,
> +				   (struct ioctl_ocxl_pmem_eventfd __user *)args);
> +		break;
> +
> +	case IOCTL_OCXL_PMEM_EVENT_CHECK:
> +		rc = ioctl_event_check(ocxlpmem, (u64 __user *)args);
> +		break;
>   	}
>   
>   	return rc;
> @@ -1107,6 +1178,146 @@ static void dump_error_log(struct ocxlpmem *ocxlpmem)
>   	kfree(buf);
>   }
>   
> +static irqreturn_t imn0_handler(void *private)
> +{
> +	struct ocxlpmem *ocxlpmem = private;
> +	u64 chi = 0;
> +
> +	(void)ocxlpmem_chi(ocxlpmem, &chi);
> +
> +	if (chi & GLOBAL_MMIO_CHI_ELA) {
> +		dev_warn(&ocxlpmem->dev, "Error log is available\n");
> +
> +		if (ocxlpmem->ev_ctx)
> +			eventfd_signal(ocxlpmem->ev_ctx, 1);
> +	}
> +
> +	if (chi & GLOBAL_MMIO_CHI_CDA) {
> +		dev_warn(&ocxlpmem->dev, "Controller dump is available\n");
> +
> +		if (ocxlpmem->ev_ctx)
> +			eventfd_signal(ocxlpmem->ev_ctx, 1);
> +	}
> +
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t imn1_handler(void *private)
> +{
> +	struct ocxlpmem *ocxlpmem = private;
> +	u64 chi = 0;
> +
> +	(void)ocxlpmem_chi(ocxlpmem, &chi);
> +
> +	if (chi & (GLOBAL_MMIO_CHI_CFFS | GLOBAL_MMIO_CHI_CHFS)) {
> +		dev_err(&ocxlpmem->dev,
> +			"Controller status is fatal, chi=0x%llx, going offline\n", chi);
> +
> +		if (ocxlpmem->nvdimm_bus) {
> +			nvdimm_bus_unregister(ocxlpmem->nvdimm_bus);
> +			ocxlpmem->nvdimm_bus = NULL;
> +		}
> +
> +		if (ocxlpmem->ev_ctx)
> +			eventfd_signal(ocxlpmem->ev_ctx, 1);
> +	}
> +
> +	return IRQ_HANDLED;
> +}
> +
> +
> +/**
> + * ocxlpmem_setup_irq() - Set up the IRQs for the OpenCAPI Persistent Memory device
> + * @ocxlpmem: the device metadata
> + * Return: 0 on success, negative on failure
> + */
> +static int ocxlpmem_setup_irq(struct ocxlpmem *ocxlpmem)
> +{
> +	int rc;
> +	u64 irq_addr;
> +
> +	rc = ocxl_afu_irq_alloc(ocxlpmem->ocxl_context, &ocxlpmem->irq_id[0]);
> +	if (rc)
> +		return rc;
> +
> +	rc = ocxl_irq_set_handler(ocxlpmem->ocxl_context, ocxlpmem->irq_id[0],
> +				  imn0_handler, NULL, ocxlpmem);
> +
> +	irq_addr = ocxl_afu_irq_get_addr(ocxlpmem->ocxl_context, ocxlpmem->irq_id[0]);
> +	if (!irq_addr)
> +		return -EINVAL;
> +
> +	ocxlpmem->irq_addr[0] = ioremap(irq_addr, PAGE_SIZE);
> +	if (!ocxlpmem->irq_addr[0])
> +		return -EINVAL;

These two should return something other than EINVAL.
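
As one possible choice (an assumption on my part, not something settled in
this thread), the sketch below uses -ENXIO when no IRQ address is available
and -ENOMEM when the mapping fails. The mapper callbacks are mocks standing
in for ioremap(), and all names are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

static char fake_page_sketch[1];

/* Mock mappers standing in for ioremap(): one succeeds, one fails. */
static void *remap_ok_sketch(uint64_t addr)
{
	(void)addr;
	return fake_page_sketch;
}

static void *remap_fail_sketch(uint64_t addr)
{
	(void)addr;
	return NULL;
}

static int map_irq_sketch(uint64_t irq_addr, void *(*remap)(uint64_t),
			  void **out)
{
	void *p;

	if (!irq_addr)
		return -ENXIO;	/* no trigger address for this IRQ */

	p = remap(irq_addr);
	if (!p)
		return -ENOMEM;	/* mapping failed */

	*out = p;
	return 0;
}
```

Whatever codes are picked, using the same pair for both IRQs keeps the two
setup paths consistent.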

> +
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_IMA0_OHP,
> +				      OCXL_LITTLE_ENDIAN,
> +				      (u64)ocxlpmem->irq_addr[0]);
> +	if (rc)
> +		goto out_irq0;
> +
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_IMA0_CFP,
> +				      OCXL_LITTLE_ENDIAN, 0);
> +	if (rc)
> +		goto out_irq0;
> +
> +	rc = ocxl_afu_irq_alloc(ocxlpmem->ocxl_context, &ocxlpmem->irq_id[1]);
> +	if (rc)
> +		goto out_irq0;
> +
> +
> +	rc = ocxl_irq_set_handler(ocxlpmem->ocxl_context, ocxlpmem->irq_id[1],
> +				  imn1_handler, NULL, ocxlpmem);
> +	if (rc)
> +		goto out_irq0;
> +
> +	irq_addr = ocxl_afu_irq_get_addr(ocxlpmem->ocxl_context, ocxlpmem->irq_id[1]);
> +	if (!irq_addr) {
> +		rc = -EFAULT;
> +		goto out_irq0;
> +	}
> +
> +	ocxlpmem->irq_addr[1] = ioremap(irq_addr, PAGE_SIZE);
> +	if (!ocxlpmem->irq_addr[1]) {
> +		rc = -EINVAL;
> +		goto out_irq0;
> +	}
> +
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_IMA1_OHP,
> +				      OCXL_LITTLE_ENDIAN,
> +				      (u64)ocxlpmem->irq_addr[1]);
> +	if (rc)
> +		goto out_irq1;
> +
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_IMA1_CFP,
> +				      OCXL_LITTLE_ENDIAN, 0);
> +	if (rc)
> +		goto out_irq1;
> +
> +	// Enable doorbells
> +	rc = ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIE,
> +				    OCXL_LITTLE_ENDIAN,
> +				    GLOBAL_MMIO_CHI_ELA | GLOBAL_MMIO_CHI_CDA |
> +				    GLOBAL_MMIO_CHI_CFFS | GLOBAL_MMIO_CHI_CHFS |
> +				    GLOBAL_MMIO_CHI_NSCRA);

We don't actually do anything in the handlers with NSCRA...

> +	if (rc)
> +		goto out_irq1;
> +
> +	return 0;
> +
> +out_irq1:
> +	iounmap(ocxlpmem->irq_addr[1]);
> +	ocxlpmem->irq_addr[1] = NULL;
> +
> +out_irq0:
> +	iounmap(ocxlpmem->irq_addr[0]);
> +	ocxlpmem->irq_addr[0] = NULL;
> +
> +	return rc;
> +}
> +
>   /**
>    * probe_function0() - Set up function 0 for an OpenCAPI persistent memory device
>    * This is important as it enables templates higher than 0 across all other functions,
> @@ -1216,6 +1427,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>   		goto err;
>   	}
>   
> +	if (ocxlpmem_setup_irq(ocxlpmem)) {
> +		dev_err(&pdev->dev, "Could not set up OCXL IRQs\n");
> +		goto err;
> +	}
> +
>   	if (setup_command_metadata(ocxlpmem)) {
>   		dev_err(&pdev->dev, "Could not read OCXL command metadata\n");
>   		goto err;
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> index b953ee522ed4..927690f4888f 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> @@ -103,6 +103,10 @@ struct ocxlpmem {
>   	struct pci_dev *pdev;
>   	struct cdev cdev;
>   	struct ocxl_fn *ocxl_fn;
> +#define SCM_IRQ_COUNT 2
> +	int irq_id[SCM_IRQ_COUNT];
> +	struct dev_pagemap irq_pgmap[SCM_IRQ_COUNT];
> +	void *irq_addr[SCM_IRQ_COUNT];

I think this should be tagged __iomem

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 16/27] powerpc/powernv/pmem: Register a character device for userspace to interact with
  2020-02-21  3:27 ` [PATCH v3 16/27] powerpc/powernv/pmem: Register a character device for userspace to interact with Alastair D'Silva
  2020-03-02  5:34   ` Andrew Donnellan
@ 2020-03-03  9:28   ` Frederic Barrat
  2020-03-05  3:38     ` Alastair D'Silva
  1 sibling, 1 reply; 130+ messages in thread
From: Frederic Barrat @ 2020-03-03  9:28 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel



On 21/02/2020 at 04:27, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> This patch introduces a character device (/dev/ocxl-scmX) which further
> patches will use to interact with userspace.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/platforms/powernv/pmem/ocxl.c    | 116 +++++++++++++++++-
>   .../platforms/powernv/pmem/ocxl_internal.h    |   2 +
>   2 files changed, 116 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> index b8bd7e703b19..63109a870d2c 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -10,6 +10,7 @@
>   #include <misc/ocxl.h>
>   #include <linux/delay.h>
>   #include <linux/ndctl.h>
> +#include <linux/fs.h>
>   #include <linux/mm_types.h>
>   #include <linux/memory_hotplug.h>
>   #include "ocxl_internal.h"
> @@ -339,6 +340,9 @@ static void free_ocxlpmem(struct ocxlpmem *ocxlpmem)
>   
>   	free_minor(ocxlpmem);
>   
> +	if (ocxlpmem->cdev.owner)
> +		cdev_del(&ocxlpmem->cdev);
> +
>   	if (ocxlpmem->metadata_addr)
>   		devm_memunmap(&ocxlpmem->dev, ocxlpmem->metadata_addr);
>   
> @@ -396,6 +400,70 @@ static int ocxlpmem_register(struct ocxlpmem *ocxlpmem)
>   	return device_register(&ocxlpmem->dev);
>   }
>   
> +static void ocxlpmem_put(struct ocxlpmem *ocxlpmem)
> +{
> +	put_device(&ocxlpmem->dev);
> +}
> +
> +static struct ocxlpmem *ocxlpmem_get(struct ocxlpmem *ocxlpmem)
> +{
> +	return (get_device(&ocxlpmem->dev) == NULL) ? NULL : ocxlpmem;
> +}
> +
> +static struct ocxlpmem *find_and_get_ocxlpmem(dev_t devno)
> +{
> +	struct ocxlpmem *ocxlpmem;
> +	int minor = MINOR(devno);
> +	/*
> +	 * We don't declare an RCU critical section here, as our AFU
> +	 * is protected by a reference counter on the device. By the time the
> +	 * minor number of a device is removed from the idr, the ref count of
> +	 * the device is already at 0, so no user API will access that AFU and
> +	 * this function can't return it.
> +	 */


I fixed something related in the ocxl driver (which had enough changes 
with the introduction of the "info" device to make a similar comment 
become wrong). See commit a58d37bce0d21. The issue is handling a 
simultaneous open() and removal of the device through /sysfs as best we can.

We are on a file open path and it's not like we're going to have a 
thousand clients, so performance is not that critical. We can take the 
mutex before searching in the IDR and release it after we increment the 
reference count on the device.
But that's not enough: we could still find the device in the IDR while 
it is being removed in free_ocxlpmem(). I believe the only safe way to 
address it is to remove the user-facing APIs (the char device) before 
calling device_unregister(), so that it's not possible to find the 
device in file_open() while it's in the middle of being removed.
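
The ordering described above can be sketched in userspace as follows. This
is a single-threaded illustration only: the lock is a placeholder that just
records whether it is held (the driver would use its real mutex), the array
stands in for the IDR, and every name is hypothetical.

```c
#include <stddef.h>

#define NUM_MINORS_SKETCH 4

struct pmem_sketch {
	int refcount;
};

/* Placeholder lock: only records whether it is held. */
static int registry_locked_sketch;
static void registry_lock_sketch(void)   { registry_locked_sketch = 1; }
static void registry_unlock_sketch(void) { registry_locked_sketch = 0; }

/* Stands in for the IDR mapping minor numbers to devices. */
static struct pmem_sketch *registry_sketch[NUM_MINORS_SKETCH];

/* Open path: look up and take the reference while holding the lock. */
static struct pmem_sketch *find_and_get_sketch(int minor)
{
	struct pmem_sketch *dev = NULL;

	registry_lock_sketch();
	if (minor >= 0 && minor < NUM_MINORS_SKETCH)
		dev = registry_sketch[minor];
	if (dev)
		dev->refcount++;
	registry_unlock_sketch();

	return dev;
}

/*
 * Removal path: unpublish the device first, so a concurrent open can no
 * longer find it, and only then drop the registry's own reference.
 */
static void remove_sketch(int minor)
{
	struct pmem_sketch *dev;

	registry_lock_sketch();
	dev = registry_sketch[minor];
	registry_sketch[minor] = NULL;
	registry_unlock_sketch();

	if (dev)
		dev->refcount--;
}
```

Because lookup and reference increment happen under the same lock that
removal uses to unpublish, an open either gets a counted reference or does
not see the device at all.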

   Fred


> +	ocxlpmem = idr_find(&minors_idr, minor);
> +	if (ocxlpmem)
> +		ocxlpmem_get(ocxlpmem);
> +	return ocxlpmem;
> +}
> +
> +static int file_open(struct inode *inode, struct file *file)
> +{
> +	struct ocxlpmem *ocxlpmem;
> +
> +	ocxlpmem = find_and_get_ocxlpmem(inode->i_rdev);
> +	if (!ocxlpmem)
> +		return -ENODEV;
> +
> +	file->private_data = ocxlpmem;
> +	return 0;
> +}
> +
> +static int file_release(struct inode *inode, struct file *file)
> +{
> +	struct ocxlpmem *ocxlpmem = file->private_data;
> +
> +	ocxlpmem_put(ocxlpmem);
> +	return 0;
> +}
> +
> +static const struct file_operations fops = {
> +	.owner		= THIS_MODULE,
> +	.open		= file_open,
> +	.release	= file_release,
> +};
> +
> +/**
> + * create_cdev() - Create the chardev in /dev for the device
> + * @ocxlpmem: the SCM metadata
> + * Return: 0 on success, negative on failure
> + */
> +static int create_cdev(struct ocxlpmem *ocxlpmem)
> +{
> +	cdev_init(&ocxlpmem->cdev, &fops);
> +	return cdev_add(&ocxlpmem->cdev, ocxlpmem->dev.devt, 1);
> +}
> +
>   /**
>    * ocxlpmem_remove() - Free an OpenCAPI persistent memory device
>    * @pdev: the PCI device information struct
> @@ -572,6 +640,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>   		goto err;
>   	}
>   
> +	if (create_cdev(ocxlpmem)) {
> +		dev_err(&pdev->dev, "Could not create character device\n");
> +		goto err;
> +	}


As already mentioned in a previous patch, we branch to the err label so 
rc needs to be set to a valid error.
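
A minimal sketch of the pattern, with a mock in place of create_cdev();
the function names are hypothetical. The helper's return value is stored
in rc before branching, so the error label returns a real error code.

```c
#include <errno.h>

/* Mock standing in for create_cdev(): fails on request. */
static int create_cdev_sketch(int fail)
{
	return fail ? -ENOMEM : 0;
}

static int probe_sketch(int fail)
{
	int rc;

	rc = create_cdev_sketch(fail);
	if (rc)
		goto err;	/* rc already carries the failure reason */

	return 0;

err:
	/* cleanup would go here */
	return rc;
}
```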



> +
>   	elapsed = 0;
>   	timeout = ocxlpmem->readiness_timeout + ocxlpmem->memory_available_timeout;
>   	while (!is_usable(ocxlpmem, false)) {
> @@ -613,20 +686,59 @@ static struct pci_driver pci_driver = {
>   	.shutdown = ocxlpmem_remove,
>   };
>   
> +static int file_init(void)
> +{
> +	int rc;
> +
> +	mutex_init(&minors_idr_lock);
> +	idr_init(&minors_idr);
> +
> +	rc = alloc_chrdev_region(&ocxlpmem_dev, 0, NUM_MINORS, "ocxl-pmem");
> +	if (rc) {
> +		idr_destroy(&minors_idr);
> +		pr_err("Unable to allocate OpenCAPI persistent memory major number: %d\n", rc);
> +		return rc;
> +	}
> +
> +	ocxlpmem_class = class_create(THIS_MODULE, "ocxl-pmem");
> +	if (IS_ERR(ocxlpmem_class)) {
> +		idr_destroy(&minors_idr);
> +		pr_err("Unable to create ocxl-pmem class\n");
> +		unregister_chrdev_region(ocxlpmem_dev, NUM_MINORS);
> +		return PTR_ERR(ocxlpmem_class);
> +	}
> +
> +	return 0;
> +}
> +
> +static void file_exit(void)
> +{
> +	class_destroy(ocxlpmem_class);
> +	unregister_chrdev_region(ocxlpmem_dev, NUM_MINORS);
> +	idr_destroy(&minors_idr);
> +}
> +
>   static int __init ocxlpmem_init(void)
>   {
> -	int rc = 0;
> +	int rc;
>   
> -	rc = pci_register_driver(&pci_driver);
> +	rc = file_init();
>   	if (rc)
>   		return rc;
>   
> +	rc = pci_register_driver(&pci_driver);
> +	if (rc) {
> +		file_exit();
> +		return rc;
> +	}
> +
>   	return 0;
>   }
>   
>   static void ocxlpmem_exit(void)
>   {
>   	pci_unregister_driver(&pci_driver);
> +	file_exit();
>   }
>   
>   module_init(ocxlpmem_init);
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> index 28e2020f6355..d2d81fec7bb1 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> @@ -2,6 +2,7 @@
>   // Copyright 2019 IBM Corp.
>   
>   #include <linux/pci.h>
> +#include <linux/cdev.h>
>   #include <misc/ocxl.h>
>   #include <linux/libnvdimm.h>
>   #include <linux/mm.h>
> @@ -99,6 +100,7 @@ struct ocxlpmem_function0 {
>   struct ocxlpmem {
>   	struct device dev;
>   	struct pci_dev *pdev;
> +	struct cdev cdev;
>   	struct ocxl_fn *ocxl_fn;
>   	struct nd_interleave_set nd_set;
>   	struct nvdimm_bus_descriptor bus_desc;
> 

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 17/27] powerpc/powernv/pmem: Implement the Read Error Log command
  2020-02-21  3:27 ` [PATCH v3 17/27] powerpc/powernv/pmem: Implement the Read Error Log command Alastair D'Silva
@ 2020-03-03 10:36   ` Frederic Barrat
  2020-03-05  4:31     ` Alastair D'Silva
  2020-03-04  5:58   ` Andrew Donnellan
  1 sibling, 1 reply; 130+ messages in thread
From: Frederic Barrat @ 2020-03-03 10:36 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel



On 21/02/2020 at 04:27, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> The read error log command extracts information from the controller's
> internal error log.
> 
> This patch exposes this information in 2 ways:
> - During probe, if an error occurs & a log is available, print it to the
>    console
> - After probe, make the error log available to userspace via an IOCTL.
>    Userspace is notified of pending error logs in a later patch
>    ("powerpc/powernv/pmem: Forward events to userspace")
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/platforms/powernv/pmem/ocxl.c    | 269 ++++++++++++++++++
>   .../platforms/powernv/pmem/ocxl_internal.h    |   1 +
>   include/uapi/nvdimm/ocxl-pmem.h               |  46 +++
>   3 files changed, 316 insertions(+)
>   create mode 100644 include/uapi/nvdimm/ocxl-pmem.h
> 
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> index 63109a870d2c..2b64504f9129 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -447,10 +447,219 @@ static int file_release(struct inode *inode, struct file *file)
>   	return 0;
>   }
>   
> +/**
> + * error_log_header_parse() - Parse the first 64 bits of the error log command response
> + * @ocxlpmem: the device metadata
> + * @length: out, returns the number of bytes in the response (excluding the 64 bit header)
> + */
> +static int error_log_header_parse(struct ocxlpmem *ocxlpmem, u16 *length)
> +{
> +	int rc;
> +	u64 val;
> +


Empty line in the middle of declarations


> +	u16 data_identifier;
> +	u32 data_length;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		return rc;
> +
> +	data_identifier = val >> 48;
> +	data_length = val & 0xFFFF;
> +
> +	if (data_identifier != 0x454C) { // 'EL'
> +		dev_err(&ocxlpmem->dev,
> +			"Bad data identifier for error log data, expected 'EL', got '%2s' (%#x), data_length=%u\n",
> +			(char *)&data_identifier,
> +			(unsigned int)data_identifier, data_length);
> +		return -EINVAL;
> +	}
> +
> +	*length = data_length;
> +	return 0;
> +}
> +
> +static int error_log_offset_0x08(struct ocxlpmem *ocxlpmem,
> +				 u32 *log_identifier, u32 *program_ref_code)
> +{
> +	int rc;
> +	u64 val;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		return rc;
> +
> +	*log_identifier = val >> 32;
> +	*program_ref_code = val & 0xFFFFFFFF;
> +
> +	return 0;
> +}
> +
> +static int read_error_log(struct ocxlpmem *ocxlpmem,
> +			  struct ioctl_ocxl_pmem_error_log *log, bool buf_is_user)
> +{
> +	u64 val;
> +	u16 user_buf_length;
> +	u16 buf_length;
> +	u16 i;
> +	int rc;
> +
> +	if (log->buf_size % 8)
> +		return -EINVAL;
> +
> +	rc = ocxlpmem_chi(ocxlpmem, &val);
> +	if (rc)
> +		goto out;



"out" will unlock a mutex not yet taken.



> +
> +	if (!(val & GLOBAL_MMIO_CHI_ELA))
> +		return -EAGAIN;
> +
> +	user_buf_length = log->buf_size;
> +
> +	mutex_lock(&ocxlpmem->admin_command.lock);
> +
> +	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_ERRLOG);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_execute(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_complete_timeout(ocxlpmem, ADMIN_COMMAND_ERRLOG);
> +	if (rc < 0) {
> +		dev_warn(&ocxlpmem->dev, "Read error log timed out\n");
> +		goto out;
> +	}
> +
> +	rc = admin_response(ocxlpmem);
> +	if (rc < 0)
> +		goto out;
> +	if (rc != STATUS_SUCCESS) {
> +		warn_status(ocxlpmem, "Unexpected status from retrieve error log", rc);
> +		goto out;
> +	}
> +
> +
> +	rc = error_log_header_parse(ocxlpmem, &log->buf_size);
> +	if (rc)
> +		goto out;
> +	// log->buf_size now contains the returned buffer size, not the user size
> +
> +	rc = error_log_offset_0x08(ocxlpmem, &log->log_identifier,
> +				       &log->program_reference_code);
> +	if (rc)
> +		goto out;



Offset 0x08 gets preferential treatment compared to 0x10 below, and 
it's not clear why.
I would create a subfunction which parses all the fields linearly.
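
A rough sketch of what such a linear parser could look like, assuming the
seven 64-bit header words have already been read into an array. The shifts
and masks are taken from the quoted patch; the struct and function names
are hypothetical, and the firmware revision is copied as raw bytes.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

struct error_log_hdr_sketch {
	uint16_t data_length;
	uint32_t log_identifier;
	uint32_t program_reference_code;
	uint8_t  error_log_type;
	uint32_t action_flags;
	uint32_t power_on_seconds;
	uint64_t timestamp;
	uint64_t wwid[2];
	char     fw_revision[8 + 1];
};

/* words[i] holds the 64-bit value read at data_offset + 8 * i. */
static int parse_error_log_hdr_sketch(const uint64_t words[7],
				      struct error_log_hdr_sketch *hdr)
{
	/* 0x00: identifier and length */
	if ((words[0] >> 48) != 0x454C)	/* 'EL' */
		return -EINVAL;
	hdr->data_length = (uint16_t)(words[0] & 0xFFFF);

	/* 0x08: log identifier and program reference code */
	hdr->log_identifier = (uint32_t)(words[1] >> 32);
	hdr->program_reference_code = (uint32_t)(words[1] & 0xFFFFFFFF);

	/* 0x10: type, action flags (general logs only), power-on seconds */
	hdr->error_log_type = (uint8_t)(words[2] >> 56);
	hdr->action_flags = (hdr->error_log_type == 0x00) ?
			    (uint32_t)((words[2] >> 32) & 0xFFFFFF) : 0;
	hdr->power_on_seconds = (uint32_t)(words[2] & 0xFFFFFFFF);

	/* 0x18 to 0x28: timestamp and WWID */
	hdr->timestamp = words[3];
	hdr->wwid[0] = words[4];
	hdr->wwid[1] = words[5];

	/* 0x30: firmware revision, NUL terminated */
	memcpy(hdr->fw_revision, &words[6], 8);
	hdr->fw_revision[8] = '\0';

	return 0;
}
```

Separating the MMIO reads from the field extraction also makes the parsing
easy to test without hardware.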



> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x10,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		goto out;
> +
> +	log->error_log_type = val >> 56;
> +	log->action_flags = (log->error_log_type == OCXL_PMEM_ERROR_LOG_TYPE_GENERAL) ?
> +			    (val >> 32) & 0xFFFFFF : 0;
> +	log->power_on_seconds = val & 0xFFFFFFFF;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x18,
> +				     OCXL_LITTLE_ENDIAN, &log->timestamp);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x20,
> +				     OCXL_HOST_ENDIAN, &log->wwid[0]);



A bit of a moot point, but is there a reason why some of those MMIO ops 
use OCXL_LITTLE_ENDIAN and the others OCXL_HOST_ENDIAN?



> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x28,
> +				     OCXL_HOST_ENDIAN, &log->wwid[1]);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x30,
> +				     OCXL_HOST_ENDIAN, (u64 *)log->fw_revision);
> +	if (rc)
> +		goto out;
> +	log->fw_revision[8] = '\0';
> +
> +	buf_length = (user_buf_length < log->buf_size) ?
> +		     user_buf_length : log->buf_size;
> +	for (i = 0; i < buf_length + 0x48; i += 8) {
> +		u64 val;
> +
> +		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +					     ocxlpmem->admin_command.data_offset + i,
> +					     OCXL_HOST_ENDIAN, &val);
> +		if (rc)
> +			goto out;
> +
> +		if (buf_is_user) {
> +			if (copy_to_user(&log->buf[i], &val, sizeof(u64))) {
> +				rc = -EFAULT;
> +				goto out;
> +			}
> +		} else
> +			log->buf[i] = val;
> +	}



I think it could be a bit simplified by keeping the handling of the user 
buffer out of this function. Always call it with a kernel buffer. And 
have only one copy_to_user() call on the ioctl() path. You'd need to 
allocate a kernel buf on the ioctl path, but you're already doing it on 
the probe() path, so it should be doable to share code.
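
A userspace sketch of that split, under the assumption that the core reader
only ever fills a kernel-side buffer. copy_out_sketch() is a mock of
copy_to_user() (it returns the number of bytes not copied), calloc() stands
in for kzalloc(), and all names are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Mock of copy_to_user(): returns the number of bytes NOT copied. */
static unsigned long copy_out_sketch(void *dst, const void *src, size_t n)
{
	if (!dst)
		return n;
	memcpy(dst, src, n);
	return 0;
}

/* Core reader: only ever sees a kernel-side buffer. */
static int read_log_into_sketch(unsigned char *kbuf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		kbuf[i] = (unsigned char)i;	/* stands in for MMIO reads */
	return 0;
}

/* ioctl path: bounce buffer, then a single copy out at the end. */
static int ioctl_log_sketch(void *ubuf, size_t len)
{
	unsigned char *kbuf;
	int rc;

	if (len == 0)
		return -EINVAL;

	kbuf = calloc(1, len);
	if (!kbuf)
		return -ENOMEM;

	rc = read_log_into_sketch(kbuf, len);
	if (!rc && copy_out_sketch(ubuf, kbuf, len))
		rc = -EFAULT;

	free(kbuf);
	return rc;
}
```

The probe() path could call the core reader directly with its own buffer,
which removes the need for a flag saying whether the buffer is a user one.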



> +
> +	rc = admin_response_handled(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +out:
> +	mutex_unlock(&ocxlpmem->admin_command.lock);
> +	return rc;
> +
> +}
> +
> +static int ioctl_error_log(struct ocxlpmem *ocxlpmem,
> +		struct ioctl_ocxl_pmem_error_log __user *uarg)
> +{
> +	struct ioctl_ocxl_pmem_error_log args;
> +	int rc;
> +
> +	if (copy_from_user(&args, uarg, sizeof(args)))
> +		return -EFAULT;
> +
> +	rc = read_error_log(ocxlpmem, &args, true);
> +	if (rc)
> +		return rc;
> +
> +	if (copy_to_user(uarg, &args, sizeof(args)))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
> +{
> +	struct ocxlpmem *ocxlpmem = file->private_data;
> +	int rc = -EINVAL;
> +
> +	switch (cmd) {
> +	case IOCTL_OCXL_PMEM_ERROR_LOG:
> +		rc = ioctl_error_log(ocxlpmem,
> +				     (struct ioctl_ocxl_pmem_error_log __user *)args);
> +		break;
> +	}
> +	return rc;
> +}
> +
>   static const struct file_operations fops = {
>   	.owner		= THIS_MODULE,
>   	.open		= file_open,
>   	.release	= file_release,
> +	.unlocked_ioctl = file_ioctl,
> +	.compat_ioctl   = file_ioctl,
>   };
>   
>   /**
> @@ -527,6 +736,60 @@ static int read_device_metadata(struct ocxlpmem *ocxlpmem)
>   	return 0;
>   }
>   
> +static const char *decode_error_log_type(u8 error_log_type)
> +{
> +	switch (error_log_type) {
> +	case 0x00:
> +		return "general";
> +	case 0x01:
> +		return "predictive failure";
> +	case 0x02:
> +		return "thermal warning";
> +	case 0x03:
> +		return "data loss";
> +	case 0x04:
> +		return "health & performance";
> +	default:
> +		return "unknown";
> +	}
> +}
> +
> +static void dump_error_log(struct ocxlpmem *ocxlpmem)
> +{
> +	struct ioctl_ocxl_pmem_error_log log;
> +	u32 buf_size;
> +	u8 *buf;
> +	int rc;
> +
> +	if (ocxlpmem->admin_command.data_size == 0)
> +		return;
> +
> +	buf_size = ocxlpmem->admin_command.data_size - 0x48;
> +	buf = kzalloc(buf_size, GFP_KERNEL);
> +	if (!buf)
> +		return;
> +
> +	log.buf = buf;
> +	log.buf_size = buf_size;
> +
> +	rc = read_error_log(ocxlpmem, &log, false);
> +	if (rc < 0)
> +		goto out;
> +
> +	dev_warn(&ocxlpmem->dev,
> +		 "OCXL PMEM Error log: WWID=0x%016llx%016llx LID=0x%x PRC=%x type=0x%x %s, Uptime=%u seconds timestamp=0x%llx\n",
> +		 log.wwid[0], log.wwid[1],
> +		 log.log_identifier, log.program_reference_code,
> +		 log.error_log_type,
> +		 decode_error_log_type(log.error_log_type),
> +		 log.power_on_seconds, log.timestamp);
> +	print_hex_dump(KERN_WARNING, "buf", DUMP_PREFIX_OFFSET, 16, 1, buf,
> +		       log.buf_size, false);


dev_warn already logs a warning. Isn't KERN_DEBUG more appropriate for 
the hex dump?



> +
> +out:
> +	kfree(buf);
> +}
> +
>   /**
>    * probe_function0() - Set up function 0 for an OpenCAPI persistent memory device
>    * This is important as it enables templates higher than 0 across all other functions,
> @@ -568,6 +831,7 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>   	struct ocxlpmem *ocxlpmem;
>   	int rc;
>   	u16 elapsed, timeout;
> +	u64 chi;
>   
>   	if (PCI_FUNC(pdev->devfn) == 0)
>   		return probe_function0(pdev);
> @@ -667,6 +931,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>   	return 0;
>   
>   err:
> +	if (ocxlpmem &&
> +		    (ocxlpmem_chi(ocxlpmem, &chi) == 0) &&
> +		    (chi & GLOBAL_MMIO_CHI_ELA))
> +		dump_error_log(ocxlpmem);
> +
>   	/*
>   	 * Further cleanup is done in the release handler via free_ocxlpmem()
>   	 * This allows us to keep the character device live to handle IOCTLs to
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> index d2d81fec7bb1..b953ee522ed4 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> @@ -5,6 +5,7 @@
>   #include <linux/cdev.h>
>   #include <misc/ocxl.h>
>   #include <linux/libnvdimm.h>
> +#include <uapi/nvdimm/ocxl-pmem.h>


Can't we limit the extra include to ocxl.c?

Completely unrelated, but ocxl.c contains most of the code for this 
driver. We should consider renaming it to ocxlpmem.c or something along 
those lines, since it does a lot more than just interfacing with the 
opencapi interface. That would also avoid confusion with another, 
already existing ocxl.c file.



>   #include <linux/mm.h>
>   
>   #define LABEL_AREA_SIZE	(1UL << PA_SECTION_SHIFT)
> diff --git a/include/uapi/nvdimm/ocxl-pmem.h b/include/uapi/nvdimm/ocxl-pmem.h
> new file mode 100644
> index 000000000000..b10f8ac0c20f
> --- /dev/null
> +++ b/include/uapi/nvdimm/ocxl-pmem.h
> @@ -0,0 +1,46 @@
> +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
> +/* Copyright 2017 IBM Corp. */
> +#ifndef _UAPI_OCXL_SCM_H
> +#define _UAPI_OCXL_SCM_H
> +
> +#include <linux/types.h>
> +#include <linux/ioctl.h>
> +
> +#define OCXL_PMEM_ERROR_LOG_ACTION_RESET	(1 << (32-32))
> +#define OCXL_PMEM_ERROR_LOG_ACTION_CHKFW	(1 << (53-32))
> +#define OCXL_PMEM_ERROR_LOG_ACTION_REPLACE	(1 << (54-32))
> +#define OCXL_PMEM_ERROR_LOG_ACTION_DUMP		(1 << (55-32))
> +
> +#define OCXL_PMEM_ERROR_LOG_TYPE_GENERAL		(0x00)
> +#define OCXL_PMEM_ERROR_LOG_TYPE_PREDICTIVE_FAILURE	(0x01)
> +#define OCXL_PMEM_ERROR_LOG_TYPE_THERMAL_WARNING	(0x02)
> +#define OCXL_PMEM_ERROR_LOG_TYPE_DATA_LOSS		(0x03)
> +#define OCXL_PMEM_ERROR_LOG_TYPE_HEALTH_PERFORMANCE	(0x04)
> +
> +struct ioctl_ocxl_pmem_error_log {
> +	__u32 log_identifier; /* out */
> +	__u32 program_reference_code; /* out */
> +	__u32 action_flags; /* out, recommended course of action */
> +	__u32 power_on_seconds; /* out, Number of seconds the controller has been on when the error occurred */
> +	__u64 timestamp; /* out, relative time since the current IPL */
> +	__u64 wwid[2]; /* out, the NAA formatted WWID associated with the controller */
> +	char  fw_revision[8+1]; /* out, firmware revision as null terminated text */


The 8+1 size will make the compiler add some padding here. Are we 
confident that all the compilers, at least on powerpc, will do the same 
thing and we can guarantee a kernel ABI? I would play it safe and have a 
discussion with folks who understand compilers better.
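
The padding concern can be checked mechanically. What follows is a minimal userspace sketch, not part of the patch: it assumes <stdint.h> fixed-width types in place of __u32/__u64, and the struct names error_log_head and error_log_head_explicit are hypothetical mirrors of only the leading fields. It shows the byte the compiler inserts after the 9-byte array, and how naming that byte and asserting the size at build time pins the layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the leading fields of the struct under review. */
struct error_log_head {
	uint32_t log_identifier;
	uint32_t program_reference_code;
	uint32_t action_flags;
	uint32_t power_on_seconds;
	uint64_t timestamp;
	uint64_t wwid[2];
	char     fw_revision[8 + 1];
	uint16_t buf_size;	/* 2-byte aligned, so a pad byte precedes it */
};

/* Same fields, but every byte is named and the total size is asserted. */
struct error_log_head_explicit {
	uint32_t log_identifier;
	uint32_t program_reference_code;
	uint32_t action_flags;
	uint32_t power_on_seconds;
	uint64_t timestamp;
	uint64_t wwid[2];
	char     fw_revision[8 + 1];
	uint8_t  pad0;		/* explicit, instead of compiler-inserted */
	uint16_t buf_size;
	uint32_t reserved0;	/* rounds the size up to a multiple of 8 */
};

_Static_assert(sizeof(struct error_log_head_explicit) == 56,
	       "layout must not depend on the compiler");

/* Number of bytes the compiler silently inserted before buf_size. */
static size_t hidden_padding(void)
{
	return offsetof(struct error_log_head, buf_size) -
	       (offsetof(struct error_log_head, fw_revision) +
		sizeof(((struct error_log_head *)0)->fw_revision));
}
```

On ABIs where a 16-bit integer is 2-byte aligned, hidden_padding() reports one byte. The explicit variant costs nothing at runtime and turns any layout drift into a build failure.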



> +	__u16 buf_size; /* in/out, buffer size provided/required.
> +			 * If required is greater than provided, the buffer
> +			 * will be truncated to the amount provided. If its
> +			 * less, then only the required bytes will be populated.
> +			 * If it is 0, then there are no more error log entries.
> +			 */
> +	__u8  error_log_type;
> +	__u8  reserved1;
> +	__u32 reserved2;
> +	__u64 reserved3[2];
> +	__u8 *buf; /* pointer to output buffer */
> +};
> +
> +/* ioctl numbers */
> +#define OCXL_PMEM_MAGIC 0x5C


Randomly picked?
See (and add entry in) Documentation/userspace-api/ioctl/ioctl-number.rst


   Fred



> +/* SCM devices */
> +#define IOCTL_OCXL_PMEM_ERROR_LOG			_IOWR(OCXL_PMEM_MAGIC, 0x01, struct ioctl_ocxl_pmem_error_log)
> +
> +#endif /* _UAPI_OCXL_SCM_H */
> 
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org


* Re: [PATCH v3 18/27] powerpc/powernv/pmem: Add controller dump IOCTLs
  2020-02-21  3:27 ` [PATCH v3 18/27] powerpc/powernv/pmem: Add controller dump IOCTLs Alastair D'Silva
@ 2020-03-03 18:04   ` Frederic Barrat
  2020-03-05 23:37     ` Alastair D'Silva
  2020-03-04  6:53   ` Andrew Donnellan
  1 sibling, 1 reply; 130+ messages in thread
From: Frederic Barrat @ 2020-03-03 18:04 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel



On 21/02/2020 at 04:27, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> This patch adds IOCTLs to allow userspace to request & fetch dumps
> of the internal controller state.
> 
> This is useful during debugging or when a fatal error on the controller
> has occurred.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/platforms/powernv/pmem/ocxl.c | 132 +++++++++++++++++++++
>   include/uapi/nvdimm/ocxl-pmem.h            |  15 +++
>   2 files changed, 147 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> index 2b64504f9129..2cabafe1fc58 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -640,6 +640,124 @@ static int ioctl_error_log(struct ocxlpmem *ocxlpmem,
>   	return 0;
>   }
>   
> +static int ioctl_controller_dump_data(struct ocxlpmem *ocxlpmem,
> +		struct ioctl_ocxl_pmem_controller_dump_data __user *uarg)
> +{
> +	struct ioctl_ocxl_pmem_controller_dump_data args;
> +	u16 i;
> +	u64 val;
> +	int rc;
> +
> +	if (copy_from_user(&args, uarg, sizeof(args)))
> +		return -EFAULT;
> +
> +	if (args.buf_size % 8)
> +		return -EINVAL;
> +
> +	if (args.buf_size > ocxlpmem->admin_command.data_size)
> +		return -EINVAL;
> +
> +	mutex_lock(&ocxlpmem->admin_command.lock);
> +
> +	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_CONTROLLER_DUMP);
> +	if (rc)
> +		goto out;
> +
> +	val = ((u64)args.offset) << 32;
> +	val |= args.buf_size;
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> +				      ocxlpmem->admin_command.request_offset + 0x08,
> +				      OCXL_LITTLE_ENDIAN, val);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_execute(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_complete_timeout(ocxlpmem,
> +					    ADMIN_COMMAND_CONTROLLER_DUMP);
> +	if (rc < 0) {
> +		dev_warn(&ocxlpmem->dev, "Controller dump timed out\n");
> +		goto out;
> +	}
> +
> +	rc = admin_response(ocxlpmem);
> +	if (rc < 0)
> +		goto out;
> +	if (rc != STATUS_SUCCESS) {
> +		warn_status(ocxlpmem,
> +			    "Unexpected status from retrieve error log",
> +			    rc);
> +		goto out;
> +	}



It would help to have a comment explaining how the three ioctls are used
together. My understanding is that userland:
- requests that the controller prepare a state dump
- then issues one or more ioctls to fetch the data; the number of calls
  needed to get the full state depends on the size of the buffer passed
  by the user
- issues a final ioctl to tell the controller that we're done, presumably
  to let it free some resources.
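
The three-step flow described above can be modelled in plain userspace C. This is only a sketch under stated assumptions: struct fake_controller and the dump_request/dump_data/dump_complete helpers are hypothetical stand-ins for the device and the three ioctls, and the 40-byte dump size is arbitrary. It illustrates the calling sequence, not the driver's real behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the controller and its prepared dump. */
struct fake_controller {
	uint8_t dump[40];
	int prepared;	/* set by the "request" step */
	int collected;	/* set by the "complete" step */
};

/* Step 1: ask the controller to prepare a state dump. */
static int dump_request(struct fake_controller *c)
{
	for (size_t i = 0; i < sizeof(c->dump); i++)
		c->dump[i] = (uint8_t)i;
	c->prepared = 1;
	c->collected = 0;
	return 0;
}

/* Step 2: copy up to *size bytes from offset; *size returns the count. */
static int dump_data(const struct fake_controller *c, uint32_t offset,
		     uint8_t *buf, uint16_t *size)
{
	size_t left;

	if (!c->prepared || (*size % 8))
		return -1;

	left = offset < sizeof(c->dump) ? sizeof(c->dump) - offset : 0;
	if (*size > left)
		*size = (uint16_t)left;
	if (*size)
		memcpy(buf, c->dump + offset, *size);
	return 0;
}

/* Step 3: tell the controller the dump has been collected. */
static void dump_complete(struct fake_controller *c)
{
	c->prepared = 0;
	c->collected = 1;
}

/* Fetch the whole dump in chunk-sized pieces; returns bytes copied or -1. */
static long fetch_dump(struct fake_controller *c, uint8_t *out,
		       size_t out_len, uint16_t chunk)
{
	uint32_t offset = 0;

	if (dump_request(c))
		return -1;

	for (;;) {
		uint16_t size = chunk;

		/* Never ask for more than the caller's buffer can hold. */
		if (out_len - offset < size)
			size = (uint16_t)((out_len - offset) & ~(size_t)7);
		if (size == 0)
			break;
		if (dump_data(c, offset, out + offset, &size))
			return -1;
		if (size == 0)	/* no more dump data available */
			break;
		offset += size;
	}

	dump_complete(c);
	return (long)offset;
}
```

With a 16-byte chunk the loop in this model issues four fetch calls (16, 16, 8 and 0 bytes) before signalling completion, which is the "one or more ioctls" behaviour the comment should spell out.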


> +
> +	for (i = 0; i < args.buf_size; i += 8) {
> +		u64 val;
> +
> +		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +					     ocxlpmem->admin_command.data_offset + i,
> +					     OCXL_HOST_ENDIAN, &val);
> +		if (rc)
> +			goto out;
> +
> +		if (copy_to_user(&args.buf[i], &val, sizeof(u64))) {
> +			rc = -EFAULT;
> +			goto out;
> +		}
> +	}
> +
> +	if (copy_to_user(uarg, &args, sizeof(args))) {
> +		rc = -EFAULT;
> +		goto out;
> +	}
> +
> +	rc = admin_response_handled(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +out:
> +	mutex_unlock(&ocxlpmem->admin_command.lock);
> +	return rc;
> +}
> +
> +int request_controller_dump(struct ocxlpmem *ocxlpmem)
> +{
> +	int rc;
> +	u64 busy = 1;
> +
> +	rc = ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIC,
> +				    OCXL_LITTLE_ENDIAN,
> +				    GLOBAL_MMIO_CHI_CDA);
> +


rc is not checked here.


> +
> +	rc = ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCI,
> +				    OCXL_LITTLE_ENDIAN,
> +				    GLOBAL_MMIO_HCI_CONTROLLER_DUMP);
> +	if (rc)
> +		return rc;
> +
> +	while (busy) {
> +		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +					     GLOBAL_MMIO_HCI,
> +					     OCXL_LITTLE_ENDIAN, &busy);
> +		if (rc)
> +			return rc;
> +
> +		busy &= GLOBAL_MMIO_HCI_CONTROLLER_DUMP;


Setting 'busy' doesn't hurt, but it's not really useful, is it?

We should add some kind of timeout so that if the controller hits an 
issue, we don't spin in kernel space endlessly.
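
A bounded poll could look roughly like the sketch below. It is a userspace model only: struct fake_reg, fake_read() and DUMP_BUSY_BIT are hypothetical, and the limit is expressed as a number of reads rather than elapsed time. A kernel version would more likely compare against a deadline and sleep between reads.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DUMP_BUSY_BIT (1ull << 0)	/* hypothetical bit position */

/* Hypothetical register model: reports busy for the first busy_reads reads. */
struct fake_reg {
	unsigned int busy_reads;
};

static uint64_t fake_read(struct fake_reg *r)
{
	if (r->busy_reads) {
		r->busy_reads--;
		return DUMP_BUSY_BIT;
	}
	return 0;
}

/*
 * Poll until the busy bit clears, giving up after max_polls reads.
 * Returns 0 on success, -ETIMEDOUT if the bit never cleared.
 */
static int wait_not_busy(struct fake_reg *r, unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if (!(fake_read(r) & DUMP_BUSY_BIT))
			return 0;
		/* a kernel implementation would sleep or reschedule here */
	}
	return -ETIMEDOUT;
}
```

The loop tests the masked register value directly, so no separate 'busy' variable needs to be initialised, and a stuck controller produces an error return instead of an endless spin.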



> +		cond_resched();
> +	}
> +
> +	return 0;
> +}
> +
> +static int ioctl_controller_dump_complete(struct ocxlpmem *ocxlpmem)
> +{
> +	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCI,
> +				    OCXL_LITTLE_ENDIAN,
> +				    GLOBAL_MMIO_HCI_CONTROLLER_DUMP_COLLECTED);
> +}
> +
>   static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
>   {
>   	struct ocxlpmem *ocxlpmem = file->private_data;
> @@ -650,7 +768,21 @@ static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
>   		rc = ioctl_error_log(ocxlpmem,
>   				     (struct ioctl_ocxl_pmem_error_log __user *)args);
>   		break;
> +
> +	case IOCTL_OCXL_PMEM_CONTROLLER_DUMP:
> +		rc = request_controller_dump(ocxlpmem);
> +		break;
> +
> +	case IOCTL_OCXL_PMEM_CONTROLLER_DUMP_DATA:
> +		rc = ioctl_controller_dump_data(ocxlpmem,
> +						(struct ioctl_ocxl_pmem_controller_dump_data __user *)args);
> +		break;
> +
> +	case IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE:
> +		rc = ioctl_controller_dump_complete(ocxlpmem);
> +		break;
>   	}
> +
>   	return rc;
>   }
>   
> diff --git a/include/uapi/nvdimm/ocxl-pmem.h b/include/uapi/nvdimm/ocxl-pmem.h
> index b10f8ac0c20f..d4d8512d03f7 100644
> --- a/include/uapi/nvdimm/ocxl-pmem.h
> +++ b/include/uapi/nvdimm/ocxl-pmem.h
> @@ -38,9 +38,24 @@ struct ioctl_ocxl_pmem_error_log {
>   	__u8 *buf; /* pointer to output buffer */
>   };
>   
> +struct ioctl_ocxl_pmem_controller_dump_data {
> +	__u8 *buf; /* pointer to output buffer */


We only support 64-bit user apps on powerpc, but using a pointer type in 
a kernel ABI is unusual. We should use a known size like __u64.
(This also applies to the buf pointer in struct ioctl_ocxl_pmem_error_log
from the previous patch.)

The rest of the structure will also be padded by the compiler, which we 
should avoid.
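
One possible relayout, as a sketch only: the struct name dump_data_args, the field order and the reserved0 field are assumptions for illustration, with <stdint.h> types standing in for __u64/__u32/__u16. The user pointer travels as a 64-bit integer and the gap after buf_size is named rather than left to the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical relayout with no pointer type and no implicit padding. */
struct dump_data_args {
	uint64_t buf;		/* user address of the output buffer */
	uint32_t offset;	/* in, offset within the dump */
	uint16_t buf_size;	/* in/out, buffer size provided/required */
	uint16_t reserved0;	/* explicit, instead of compiler padding */
	uint64_t reserved[8];
};

_Static_assert(sizeof(struct dump_data_args) == 80,
	       "size must be identical for every compiler");
_Static_assert(offsetof(struct dump_data_args, reserved) == 16,
	       "no hidden padding before the reserved area");

/* Userspace side: carry a pointer in the 64-bit field and back. */
static uint64_t ptr_to_u64(void *p)
{
	return (uint64_t)(uintptr_t)p;
}

static void *u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}
```

On the kernel side the existing u64_to_user_ptr() helper performs the reverse conversion, so the driver would not need a cast of its own.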

    Fred



> +	__u16 buf_size; /* in/out, buffer size provided/required.
> +			 * If required is greater than provided, the buffer
> +			 * will be truncated to the amount provided. If its
> +			 * less, then only the required bytes will be populated.
> +			 * If it is 0, then there is no more dump data available.
> +			 */
> +	__u32 offset; /* in, Offset within the dump */
> +	__u64 reserved[8];
> +};
> +
>   /* ioctl numbers */
>   #define OCXL_PMEM_MAGIC 0x5C
>   /* SCM devices */
>   #define IOCTL_OCXL_PMEM_ERROR_LOG			_IOWR(OCXL_PMEM_MAGIC, 0x01, struct ioctl_ocxl_pmem_error_log)
> +#define IOCTL_OCXL_PMEM_CONTROLLER_DUMP			_IO(OCXL_PMEM_MAGIC, 0x02)
> +#define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_DATA		_IOWR(OCXL_PMEM_MAGIC, 0x03, struct ioctl_ocxl_pmem_controller_dump_data)
> +#define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE	_IO(OCXL_PMEM_MAGIC, 0x04)
>   
>   #endif /* _UAPI_OCXL_SCM_H */
> 

* Re: [PATCH v3 26/27] powerpc/powernv/pmem: Expose the firmware version in sysfs
  2020-03-02  7:35   ` Andrew Donnellan
@ 2020-03-04  4:11     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-03-04  4:11 UTC (permalink / raw)
  To: Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Mon, 2020-03-02 at 18:35 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > This information will be used by ndctl in userspace to help users
> > identify the device.
> 
> You should include the information from the subject line in the body
> of the commit message too.
> 
> I think this patch could probably be squashed in with the last one.
> 

Ok

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

* Re: [PATCH v3 13/27] powerpc/powernv/pmem: Read the capability registers & wait for device ready
  2020-03-02 17:51   ` Frederic Barrat
@ 2020-03-04  4:15     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-03-04  4:15 UTC (permalink / raw)
  To: Frederic Barrat
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Mon, 2020-03-02 at 18:51 +0100, Frederic Barrat wrote:
> 
> On 21/02/2020 at 04:27, Alastair D'Silva wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > This patch reads timeouts & firmware version from the controller,
> > and
> > uses those timeouts to wait for the controller to report that it is
> > ready
> > before handing the memory over to libnvdimm.
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   arch/powerpc/platforms/powernv/pmem/Makefile  |  2 +-
> >   arch/powerpc/platforms/powernv/pmem/ocxl.c    | 92
> > +++++++++++++++++++
> >   .../platforms/powernv/pmem/ocxl_internal.c    | 19 ++++
> >   .../platforms/powernv/pmem/ocxl_internal.h    | 24 +++++
> >   4 files changed, 136 insertions(+), 1 deletion(-)
> >   create mode 100644
> > arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
> > 
> > diff --git a/arch/powerpc/platforms/powernv/pmem/Makefile
> > b/arch/powerpc/platforms/powernv/pmem/Makefile
> > index 1c55c4193175..4ceda25907d4 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/Makefile
> > +++ b/arch/powerpc/platforms/powernv/pmem/Makefile
> > @@ -4,4 +4,4 @@ ccflags-$(CONFIG_PPC_WERROR)	+= -Werror
> >   
> >   obj-$(CONFIG_OCXL_PMEM) += ocxlpmem.o
> >   
> > -ocxlpmem-y := ocxl.o
> > +ocxlpmem-y := ocxl.o ocxl_internal.o
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > index 3c4eeb5dcc0f..431212c9f0cc 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > @@ -8,6 +8,7 @@
> >   
> >   #include <linux/module.h>
> >   #include <misc/ocxl.h>
> > +#include <linux/delay.h>
> >   #include <linux/ndctl.h>
> >   #include <linux/mm_types.h>
> >   #include <linux/memory_hotplug.h>
> > @@ -215,6 +216,36 @@ static int register_lpc_mem(struct ocxlpmem
> > *ocxlpmem)
> >   	return 0;
> >   }
> >   
> > +/**
> > + * is_usable() - Is a controller usable?
> > + * @ocxlpmem: the device metadata
> > + * @verbose: True to log errors
> > + * Return: true if the controller is usable
> > + */
> > +static bool is_usable(const struct ocxlpmem *ocxlpmem, bool
> > verbose)
> > +{
> > +	u64 chi = 0;
> > +	int rc = ocxlpmem_chi(ocxlpmem, &chi);
> > +
> > +	if (rc < 0)
> > +		return false;
> > +
> > +	if (!(chi & GLOBAL_MMIO_CHI_CRDY)) {
> > +		if (verbose)
> > +			dev_err(&ocxlpmem->dev, "controller is not
> > ready.\n");
> > +		return false;
> > +	}
> > +
> > +	if (!(chi & GLOBAL_MMIO_CHI_MA)) {
> > +		if (verbose)
> > +			dev_err(&ocxlpmem->dev,
> > +				"controller does not have memory
> > available.\n");
> > +		return false;
> > +	}
> > +
> > +	return true;
> > +}
> > +
> >   /**
> >    * allocate_minor() - Allocate a minor number to use for an
> > OpenCAPI pmem device
> >    * @ocxlpmem: the device metadata
> > @@ -328,6 +359,48 @@ static void ocxlpmem_remove(struct pci_dev
> > *pdev)
> >   	}
> >   }
> >   
> > +/**
> > + * read_device_metadata() - Retrieve config information from the
> > AFU and save it for future use
> > + * @ocxlpmem: the device metadata
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int read_device_metadata(struct ocxlpmem *ocxlpmem)
> > +{
> > +	u64 val;
> > +	int rc;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_CCAP0,
> > +				     OCXL_LITTLE_ENDIAN, &val);
> > +	if (rc)
> > +		return rc;
> > +
> > +	ocxlpmem->scm_revision = val & 0xFFFF;
> > +	ocxlpmem->read_latency = (val >> 32) & 0xFF;
> > +	ocxlpmem->readiness_timeout = (val >> 48) & 0x0F;
> > +	ocxlpmem->memory_available_timeout = val >> 52;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_CCAP1,
> > +				     OCXL_LITTLE_ENDIAN, &val);
> > +	if (rc)
> > +		return rc;
> > +
> > +	ocxlpmem->max_controller_dump_size = val & 0xFFFFFFFF;
> > +
> > +	// Extract firmware version text
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_FWVER,
> > +				     OCXL_HOST_ENDIAN, (u64 *)ocxlpmem-
> > >fw_version);
> > +	if (rc)
> > +		return rc;
> > +
> > +	ocxlpmem->fw_version[8] = '\0';
> > +
> > +	dev_info(&ocxlpmem->dev,
> > +		 "Firmware version '%s' SCM revision %d:%d\n",
> > ocxlpmem->fw_version,
> > +		 ocxlpmem->scm_revision >> 4, ocxlpmem->scm_revision &
> > 0x0F);
> > +
> > +	return 0;
> > +}
> > +
> >   /**
> >    * probe_function0() - Set up function 0 for an OpenCAPI
> > persistent memory device
> >    * This is important as it enables templates higher than 0 across
> > all other functions,
> > @@ -368,6 +441,7 @@ static int probe(struct pci_dev *pdev, const
> > struct pci_device_id *ent)
> >   {
> >   	struct ocxlpmem *ocxlpmem;
> >   	int rc;
> > +	u16 elapsed, timeout;
> >   
> >   	if (PCI_FUNC(pdev->devfn) == 0)
> >   		return probe_function0(pdev);
> > @@ -422,6 +496,24 @@ static int probe(struct pci_dev *pdev, const
> > struct pci_device_id *ent)
> >   		goto err;
> >   	}
> >   
> > +	if (read_device_metadata(ocxlpmem)) {
> > +		dev_err(&pdev->dev, "Could not read metadata\n");
> 
> 
> Need to set rc
> 
> 
Whoops :)

> 
> > +		goto err;
> > +	}
> > +
> > +	elapsed = 0;
> > +	timeout = ocxlpmem->readiness_timeout + ocxlpmem-
> > >memory_available_timeout;
> > +	while (!is_usable(ocxlpmem, false)) {
> > +		if (elapsed++ > timeout) {
> > +			dev_warn(&ocxlpmem->dev, "OpenCAPI Persistent
> > Memory ready timeout.\n");
> > +			(void)is_usable(ocxlpmem, true);
> 
> I guess that extra call to is_usable() is just to log the cause of
> the error. However, with some bad luck, the call could now succeed.
> 

Yeah, that's pretty ugly, I'll re-engineer it.
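
One way to avoid the second register read is to classify a single snapshot of the status value and log from that same snapshot. The sketch below is a userspace model under stated assumptions: the CHI_CRDY and CHI_MA bit positions, the enum and both helper names are hypothetical, not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define CHI_CRDY (1ull << 0)	/* controller ready */
#define CHI_MA   (1ull << 1)	/* memory available */

enum usable_state { USABLE, NOT_READY, NO_MEMORY };

/* Classify one snapshot of the status register; no further reads needed. */
static enum usable_state classify_chi(uint64_t chi)
{
	if (!(chi & CHI_CRDY))
		return NOT_READY;
	if (!(chi & CHI_MA))
		return NO_MEMORY;
	return USABLE;
}

/* Text for the log message, derived from the same snapshot. */
static const char *usable_reason(enum usable_state s)
{
	switch (s) {
	case NOT_READY:
		return "controller is not ready";
	case NO_MEMORY:
		return "controller does not have memory available";
	default:
		return "controller is usable";
	}
}
```

The wait loop would keep the last state it saw, and on timeout log usable_reason() of that state, so the reported cause always matches the value that triggered the timeout.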

> 
>    Fred
> 
> 
> > +			rc = -ENXIO;
> > +			goto err;
> > +		}
> > +
> > +		msleep(1000);
> > +	}
> > +
> >   	rc = register_lpc_mem(ocxlpmem);
> >   	if (rc) {
> >   		dev_err(&pdev->dev, "Could not register OpenCAPI
> > persistent memory with libnvdimm\n");
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
> > b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
> > new file mode 100644
> > index 000000000000..617ca943b1b8
> > --- /dev/null
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.c
> > @@ -0,0 +1,19 @@
> > +// SPDX-License-Identifier: GPL-2.0+
> > +// Copyright 2019 IBM Corp.
> > +
> > +#include <misc/ocxl.h>
> > +#include <linux/delay.h>
> > +#include "ocxl_internal.h"
> > +
> > +int ocxlpmem_chi(const struct ocxlpmem *ocxlpmem, u64 *chi)
> > +{
> > +	u64 val;
> > +	int rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_CHI,
> > +					 OCXL_LITTLE_ENDIAN, &val);
> > +	if (rc)
> > +		return rc;
> > +
> > +	*chi = val;
> > +
> > +	return 0;
> > +}
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > index 9cf3e42750e7..ba0301533d00 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > @@ -97,4 +97,28 @@ struct ocxlpmem {
> >   	void *metadata_addr;
> >   	struct resource pmem_res;
> >   	struct nd_region *nd_region;
> > +	char fw_version[8+1];
> > +
> > +	u32 max_controller_dump_size;
> > +	u16 scm_revision; // major/minor
> > +	u8 readiness_timeout;  /* The worst case time (in seconds) that
> > the host shall
> > +				* wait for the controller to become
> > operational following a reset (CHI.CRDY).
> > +				*/
> > +	u8 memory_available_timeout;   /* The worst case time (in
> > seconds) that the host shall
> > +					* wait for memory to become
> > available following a reset (CHI.MA).
> > +					*/
> > +
> > +	u16 read_latency; /* The nominal measure of latency (in
> > nanoseconds)
> > +			   * associated with an unassisted read of a
> > memory block.
> > +			   * This represents the capability of the raw
> > media technology without assistance
> > +			   */
> >   };
> > +
> > +/**
> > + * ocxlpmem_chi() - Get the value of the CHI register
> > + * @ocxlpmem: the device metadata
> > + * @chi: returns the CHI value
> > + *
> > + * Returns 0 on success, negative on error
> > + */
> > +int ocxlpmem_chi(const struct ocxlpmem *ocxlpmem, u64 *chi);
> > 
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

* RE: [PATCH v3 15/27] powerpc/powernv/pmem: Add support for near storage commands
  2020-03-02 18:42     ` Dan Williams
@ 2020-03-04  4:42       ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-03-04  4:42 UTC (permalink / raw)
  To: Dan Williams, Frederic Barrat
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, Linux Kernel Mailing List, linuxppc-dev,
	linux-nvdimm, Linux MM

On Mon, 2020-03-02 at 10:42 -0800, Dan Williams wrote:
> On Mon, Mar 2, 2020 at 9:59 AM Frederic Barrat <fbarrat@linux.ibm.com> wrote:
> > 
> > 
> > On 21/02/2020 at 04:27, Alastair D'Silva wrote:
> > > From: Alastair D'Silva <alastair@d-silva.org>
> > > 
> > > Similar to the previous patch, this adds support for near storage
> > > commands.
> > > 
> > > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > > ---
> > 
> > Is any of these new functions ever called?
> 
> This is my concern as well. The libnvdimm command support is limited
> to the commands that Linux will use. Other passthrough commands are
> supported through a passthrough interface. However, that passthrough
> interface is explicitly limited to publicly documented command sets
> so that the kernel has an opportunity to constrain and consolidate
> command implementations across vendors.


It will be in the patch that implements overwrite. I moved that patch
out of this series, as it needs more testing, so I guess I can submit
this alongside it.

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

* Re: [PATCH v3 03/27] powerpc: Map & release OpenCAPI LPC memory
  2020-03-03  6:10   ` Andrew Donnellan
@ 2020-03-04  5:33     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-03-04  5:33 UTC (permalink / raw)
  To: Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Tue, 2020-03-03 at 17:10 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:26 pm, Alastair D'Silva wrote:
> > +#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
> > +u64 pnv_ocxl_platform_lpc_setup(struct pci_dev *pdev, u64 size)
> > +{
> > +	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> > +	struct pnv_phb *phb = hose->private_data;
> > +	u32 bdfn = pci_dev_id(pdev);
> > +	__be64 base_addr_be64;
> > +	u64 base_addr;
> > +	int rc;
> > +
> > +	rc = opal_npu_mem_alloc(phb->opal_id, bdfn, size,
> > &base_addr_be64);
> 
> Sparse warning:
> 
> https://openpower.xyz/job/snowpatch/job/snowpatch-linux-sparse/15776//artifact/linux/report.txt
> 
> I think in patch 1 we need to change a uint64_t to a __be64.
> 

Ok, thanks
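
For context on why the annotation matters: a __be64 value is stored most significant byte first and must be converted before the host uses it, which is what sparse checks. The sketch below is a portable userspace illustration; be64_decode() is a hypothetical helper name, and in the kernel the existing be64_to_cpu() does this job.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decode a 64-bit big-endian value from raw bytes. Works the same on
 * little-endian and big-endian hosts because it never reinterprets memory.
 */
static uint64_t be64_decode(const uint8_t b[8])
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | b[i];	/* first byte ends up most significant */
	return v;
}
```

Declaring the output parameter as __be64 rather than uint64_t lets sparse flag any caller that uses the value without such a conversion.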

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

* Re: [PATCH v3 20/27] powerpc/powernv/pmem: Forward events to userspace
  2020-03-03  7:02   ` Andrew Donnellan
@ 2020-03-04  5:48     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-03-04  5:48 UTC (permalink / raw)
  To: Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Tue, 2020-03-03 at 18:02 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > @@ -938,6 +955,51 @@ static int ioctl_controller_stats(struct ocxlpmem *ocxlpmem,
> >   	return rc;
> >   }
> >   
> > +static int ioctl_eventfd(struct ocxlpmem *ocxlpmem,
> > +		 struct ioctl_ocxl_pmem_eventfd __user *uarg)
> > +{
> > +	struct ioctl_ocxl_pmem_eventfd args;
> > +
> > +	if (copy_from_user(&args, uarg, sizeof(args)))
> > +		return -EFAULT;
> > +
> > +	if (ocxlpmem->ev_ctx)
> > +		return -EINVAL;
> 
> I think EBUSY is more appropriate here.
> 

Ok

> > +
> > +	ocxlpmem->ev_ctx = eventfd_ctx_fdget(args.eventfd);
> > +	if (!ocxlpmem->ev_ctx)
> > +		return -EFAULT;
> > +
> > +	return 0;
> > +}
> > +
> > +static int ioctl_event_check(struct ocxlpmem *ocxlpmem, u64 __user
> > *uarg)
> > +{
> > +	u64 val = 0;
> > +	int rc;
> > +	u64 chi = 0;
> > +
> > +	rc = ocxlpmem_chi(ocxlpmem, &chi);
> > +	if (rc < 0)
> > +		return rc;
> > +
> > +	if (chi & GLOBAL_MMIO_CHI_ELA)
> > +		val |= IOCTL_OCXL_PMEM_EVENT_ERROR_LOG_AVAILABLE;
> > +
> > +	if (chi & GLOBAL_MMIO_CHI_CDA)
> > +		val |= IOCTL_OCXL_PMEM_EVENT_CONTROLLER_DUMP_AVAILABLE;
> > +
> > +	if (chi & GLOBAL_MMIO_CHI_CFFS)
> > +		val |= IOCTL_OCXL_PMEM_EVENT_FIRMWARE_FATAL;
> > +
> > +	if (chi & GLOBAL_MMIO_CHI_CHFS)
> > +		val |= IOCTL_OCXL_PMEM_EVENT_HARDWARE_FATAL;
> > +
> > +	rc = copy_to_user((u64 __user *) uarg, &val, sizeof(val));
> > +
> > +	return rc;
> > +}
> > +
> >   static long file_ioctl(struct file *file, unsigned int cmd,
> > unsigned long args)
> >   {
> >   	struct ocxlpmem *ocxlpmem = file->private_data;
> > @@ -966,6 +1028,15 @@ static long file_ioctl(struct file *file,
> > unsigned int cmd, unsigned long args)
> >   		rc = ioctl_controller_stats(ocxlpmem,
> >   					    (struct
> > ioctl_ocxl_pmem_controller_stats __user *)args);
> >   		break;
> > +
> > +	case IOCTL_OCXL_PMEM_EVENTFD:
> > +		rc = ioctl_eventfd(ocxlpmem,
> > +				   (struct ioctl_ocxl_pmem_eventfd
> > __user *)args);
> > +		break;
> > +
> > +	case IOCTL_OCXL_PMEM_EVENT_CHECK:
> > +		rc = ioctl_event_check(ocxlpmem, (u64 __user *)args);
> > +		break;
> >   	}
> >   
> >   	return rc;
> > @@ -1107,6 +1178,146 @@ static void dump_error_log(struct ocxlpmem
> > *ocxlpmem)
> >   	kfree(buf);
> >   }
> >   
> > +static irqreturn_t imn0_handler(void *private)
> > +{
> > +	struct ocxlpmem *ocxlpmem = private;
> > +	u64 chi = 0;
> > +
> > +	(void)ocxlpmem_chi(ocxlpmem, &chi);
> > +
> > +	if (chi & GLOBAL_MMIO_CHI_ELA) {
> > +		dev_warn(&ocxlpmem->dev, "Error log is available\n");
> > +
> > +		if (ocxlpmem->ev_ctx)
> > +			eventfd_signal(ocxlpmem->ev_ctx, 1);
> > +	}
> > +
> > +	if (chi & GLOBAL_MMIO_CHI_CDA) {
> > +		dev_warn(&ocxlpmem->dev, "Controller dump is
> > available\n");
> > +
> > +		if (ocxlpmem->ev_ctx)
> > +			eventfd_signal(ocxlpmem->ev_ctx, 1);
> > +	}
> > +
> > +
> > +	return IRQ_HANDLED;
> > +}
> > +
> > +static irqreturn_t imn1_handler(void *private)
> > +{
> > +	struct ocxlpmem *ocxlpmem = private;
> > +	u64 chi = 0;
> > +
> > +	(void)ocxlpmem_chi(ocxlpmem, &chi);
> > +
> > +	if (chi & (GLOBAL_MMIO_CHI_CFFS | GLOBAL_MMIO_CHI_CHFS)) {
> > +		dev_err(&ocxlpmem->dev,
> > +			"Controller status is fatal, chi=0x%llx, going
> > offline\n", chi);
> > +
> > +		if (ocxlpmem->nvdimm_bus) {
> > +			nvdimm_bus_unregister(ocxlpmem->nvdimm_bus);
> > +			ocxlpmem->nvdimm_bus = NULL;
> > +		}
> > +
> > +		if (ocxlpmem->ev_ctx)
> > +			eventfd_signal(ocxlpmem->ev_ctx, 1);
> > +	}
> > +
> > +	return IRQ_HANDLED;
> > +}
> > +
> > +
> > +/**
> > + * ocxlpmem_setup_irq() - Set up the IRQs for the OpenCAPI
> > Persistent Memory device
> > + * @ocxlpmem: the device metadata
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int ocxlpmem_setup_irq(struct ocxlpmem *ocxlpmem)
> > +{
> > +	int rc;
> > +	u64 irq_addr;
> > +
> > +	rc = ocxl_afu_irq_alloc(ocxlpmem->ocxl_context, &ocxlpmem-
> > >irq_id[0]);
> > +	if (rc)
> > +		return rc;
> > +
> > +	rc = ocxl_irq_set_handler(ocxlpmem->ocxl_context, ocxlpmem-
> > >irq_id[0],
> > +				  imn0_handler, NULL, ocxlpmem);
> > +
> > +	irq_addr = ocxl_afu_irq_get_addr(ocxlpmem->ocxl_context,
> > ocxlpmem->irq_id[0]);
> > +	if (!irq_addr)
> > +		return -EINVAL;
> > +
> > +	ocxlpmem->irq_addr[0] = ioremap(irq_addr, PAGE_SIZE);
> > +	if (!ocxlpmem->irq_addr[0])
> > +		return -EINVAL;
> 
> Something other than EINVAL for these two

Ok

> 
> > +
> > +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_IMA0_OHP,
> > +				      OCXL_LITTLE_ENDIAN,
> > +				      (u64)ocxlpmem->irq_addr[0]);
> > +	if (rc)
> > +		goto out_irq0;
> > +
> > +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_IMA0_CFP,
> > +				      OCXL_LITTLE_ENDIAN, 0);
> > +	if (rc)
> > +		goto out_irq0;
> > +
> > +	rc = ocxl_afu_irq_alloc(ocxlpmem->ocxl_context, &ocxlpmem-
> > >irq_id[1]);
> > +	if (rc)
> > +		goto out_irq0;
> > +
> > +
> > +	rc = ocxl_irq_set_handler(ocxlpmem->ocxl_context, ocxlpmem-
> > >irq_id[1],
> > +				  imn1_handler, NULL, ocxlpmem);
> > +	if (rc)
> > +		goto out_irq0;
> > +
> > +	irq_addr = ocxl_afu_irq_get_addr(ocxlpmem->ocxl_context,
> > ocxlpmem->irq_id[1]);
> > +	if (!irq_addr) {
> > +		rc = -EFAULT;
> > +		goto out_irq0;
> > +	}
> > +
> > +	ocxlpmem->irq_addr[1] = ioremap(irq_addr, PAGE_SIZE);
> > +	if (!ocxlpmem->irq_addr[1]) {
> > +		rc = -EINVAL;
> > +		goto out_irq0;
> > +	}
> > +
> > +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_IMA1_OHP,
> > +				      OCXL_LITTLE_ENDIAN,
> > +				      (u64)ocxlpmem->irq_addr[1]);
> > +	if (rc)
> > +		goto out_irq1;
> > +
> > +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_IMA1_CFP,
> > +				      OCXL_LITTLE_ENDIAN, 0);
> > +	if (rc)
> > +		goto out_irq1;
> > +
> > +	// Enable doorbells
> > +	rc = ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIE,
> > +				    OCXL_LITTLE_ENDIAN,
> > +				    GLOBAL_MMIO_CHI_ELA | GLOBAL_MMIO_CHI_CDA |
> > +				    GLOBAL_MMIO_CHI_CFFS | GLOBAL_MMIO_CHI_CHFS |
> > +				    GLOBAL_MMIO_CHI_NSCRA);
> 
> We don't actually do anything in the handlers with NSCRA...

Good catch, this belongs in the overwrite patch (which was dropped from
this series).

> 
> > +	if (rc)
> > +		goto out_irq1;
> > +
> > +	return 0;
> > +
> > +out_irq1:
> > +	iounmap(ocxlpmem->irq_addr[1]);
> > +	ocxlpmem->irq_addr[1] = NULL;
> > +
> > +out_irq0:
> > +	iounmap(ocxlpmem->irq_addr[0]);
> > +	ocxlpmem->irq_addr[0] = NULL;
> > +
> > +	return rc;
> > +}
> > +
> >   /**
> >    * probe_function0() - Set up function 0 for an OpenCAPI persistent memory device
> >    * This is important as it enables templates higher than 0 across all other functions,
> > @@ -1216,6 +1427,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> >   		goto err;
> >   	}
> >   
> > +	if (ocxlpmem_setup_irq(ocxlpmem)) {
> > +		dev_err(&pdev->dev, "Could not set up OCXL IRQs\n");
> > +		goto err;
> > +	}
> > +
> >   	if (setup_command_metadata(ocxlpmem)) {
> >   		dev_err(&pdev->dev, "Could not read OCXL command matada\n");
> >   		goto err;
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > index b953ee522ed4..927690f4888f 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > @@ -103,6 +103,10 @@ struct ocxlpmem {
> >   	struct pci_dev *pdev;
> >   	struct cdev cdev;
> >   	struct ocxl_fn *ocxl_fn;
> > +#define SCM_IRQ_COUNT 2
> > +	int irq_id[SCM_IRQ_COUNT];
> > +	struct dev_pagemap irq_pgmap[SCM_IRQ_COUNT];
> > +	void *irq_addr[SCM_IRQ_COUNT];
> 
> I think this should be tagged __iomem
> 

Ok

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 17/27] powerpc/powernv/pmem: Implement the Read Error Log command
  2020-02-21  3:27 ` [PATCH v3 17/27] powerpc/powernv/pmem: Implement the Read Error Log command Alastair D'Silva
  2020-03-03 10:36   ` Frederic Barrat
@ 2020-03-04  5:58   ` Andrew Donnellan
  1 sibling, 0 replies; 130+ messages in thread
From: Andrew Donnellan @ 2020-03-04  5:58 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> The read error log command extracts information from the controller's
> internal error log.
> 
> This patch exposes this information in 2 ways:
> - During probe, if an error occurs & a log is available, print it to the
>    console
> - After probe, make the error log available to userspace via an IOCTL.
>    Userspace is notified of pending error logs in a later patch
>    ("powerpc/powernv/pmem: Forward events to userspace")
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>

A few minor style checks at 
https://openpower.xyz/job/snowpatch/job/snowpatch-linux-checkpatch/11787//artifact/linux/checkpatch.log

We should also add some documentation for the user interfaces we're 
adding (same applies for all the remaining patches in this series that 
add more interfaces).

> ---
>   arch/powerpc/platforms/powernv/pmem/ocxl.c    | 269 ++++++++++++++++++
>   .../platforms/powernv/pmem/ocxl_internal.h    |   1 +
>   include/uapi/nvdimm/ocxl-pmem.h               |  46 +++
>   3 files changed, 316 insertions(+)
>   create mode 100644 include/uapi/nvdimm/ocxl-pmem.h
> 
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> index 63109a870d2c..2b64504f9129 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -447,10 +447,219 @@ static int file_release(struct inode *inode, struct file *file)
>   	return 0;
>   }
>   
> +/**
> + * error_log_header_parse() - Parse the first 64 bits of the error log command response
> + * @ocxlpmem: the device metadata
> + * @length: out, returns the number of bytes in the response (excluding the 64 bit header)
> + */
> +static int error_log_header_parse(struct ocxlpmem *ocxlpmem, u16 *length)
> +{
> +	int rc;
> +	u64 val;
> +
> +	u16 data_identifier;
> +	u32 data_length;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		return rc;
> +
> +	data_identifier = val >> 48;
> +	data_length = val & 0xFFFF;
> +
> +	if (data_identifier != 0x454C) { // 'EL'
> +		dev_err(&ocxlpmem->dev,
> +			"Bad data identifier for error log data, expected 'EL', got '%2s' (%#x), data_length=%u\n",
> +			(char *)&data_identifier,
> +			(unsigned int)data_identifier, data_length);
> +		return -EINVAL;

This should be something other than EINVAL I think

> +	}
> +
> +	*length = data_length;
> +	return 0;
> +}
> +
> +static int error_log_offset_0x08(struct ocxlpmem *ocxlpmem,
> +				 u32 *log_identifier, u32 *program_ref_code)
> +{
> +	int rc;
> +	u64 val;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		return rc;
> +
> +	*log_identifier = val >> 32;
> +	*program_ref_code = val & 0xFFFFFFFF;
> +
> +	return 0;
> +}
> +
> +static int read_error_log(struct ocxlpmem *ocxlpmem,
> +			  struct ioctl_ocxl_pmem_error_log *log, bool buf_is_user)
> +{
> +	u64 val;
> +	u16 user_buf_length;
> +	u16 buf_length;
> +	u16 i;
> +	int rc;
> +
> +	if (log->buf_size % 8)
> +		return -EINVAL;
> +
> +	rc = ocxlpmem_chi(ocxlpmem, &val);
> +	if (rc)
> +		goto out;
> +
> +	if (!(val & GLOBAL_MMIO_CHI_ELA))
> +		return -EAGAIN;
> +
> +	user_buf_length = log->buf_size;
> +
> +	mutex_lock(&ocxlpmem->admin_command.lock);
> +
> +	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_ERRLOG);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_execute(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_complete_timeout(ocxlpmem, ADMIN_COMMAND_ERRLOG);
> +	if (rc < 0) {
> +		dev_warn(&ocxlpmem->dev, "Read error log timed out\n");
> +		goto out;
> +	}
> +
> +	rc = admin_response(ocxlpmem);
> +	if (rc < 0)
> +		goto out;
> +	if (rc != STATUS_SUCCESS) {
> +		warn_status(ocxlpmem, "Unexpected status from retrieve error log", rc);
> +		goto out;
> +	}
> +
> +
> +	rc = error_log_header_parse(ocxlpmem, &log->buf_size);
> +	if (rc)
> +		goto out;
> +	// log->buf_size now contains the returned buffer size, not the user size

In the event that the log is truncated to fit the user buffer, we return 
the full log size. I assume this is intentional, to signal that it's 
truncated, as per the nd stuff?

> +
> +	rc = error_log_offset_0x08(ocxlpmem, &log->log_identifier,
> +				       &log->program_reference_code);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x10,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		goto out;
> +
> +	log->error_log_type = val >> 56;
> +	log->action_flags = (log->error_log_type == OCXL_PMEM_ERROR_LOG_TYPE_GENERAL) ?
> +			    (val >> 32) & 0xFFFFFF : 0;
> +	log->power_on_seconds = val & 0xFFFFFFFF;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x18,
> +				     OCXL_LITTLE_ENDIAN, &log->timestamp);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x20,
> +				     OCXL_HOST_ENDIAN, &log->wwid[0]);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x28,
> +				     OCXL_HOST_ENDIAN, &log->wwid[1]);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x30,
> +				     OCXL_HOST_ENDIAN, (u64 *)log->fw_revision);

Why the difference between HOST and LITTLE_ENDIAN between these fields?
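To make the question concrete, here is a small userspace sketch (not driver code) of what the two access modes would mean for the same 8 bytes. The helpers load_le64() and load_raw8() are hypothetical stand-ins. The assumption, which the patch does not confirm, is that a little-endian read assembles a numeric value while a host-endian read hands the bytes back in memory order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: assemble a 64-bit number from 8 little-endian bytes.
 * The result does not depend on the byte order of the CPU running this. */
static uint64_t load_le64(const unsigned char *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Hypothetical helper: copy 8 bytes unchanged and NUL-terminate them, the
 * way a text field such as a firmware revision has to be copied if its
 * characters are to stay in order. */
static void load_raw8(const unsigned char *p, char out[9])
{
	memcpy(out, p, 8);
	out[8] = '\0';
}
```

Decoding numeric fields with load_le64() and copying text fields with load_raw8() gives the same result on big- and little-endian hosts. Whether that is the reasoning behind the mixed flags in the patch is for the author to confirm.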

> +	if (rc)
> +		goto out;
> +	log->fw_revision[8] = '\0';
> +
> +	buf_length = (user_buf_length < log->buf_size) ?
> +		     user_buf_length : log->buf_size;
> +	for (i = 0; i < buf_length + 0x48; i += 8) {

+ 0x48 here doesn't look right...

> +		u64 val;
> +
> +		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +					     ocxlpmem->admin_command.data_offset + i,

...did you mean to add 0x48 here?
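A userspace sketch of the loop as it would look with the 0x48 moved from the bound to the read offset. It assumes the variable-length data starts at 0x48 (suggested by the "data_size - 0x48" in dump_error_log(), but not verified against the spec). fake_mmio_read64() and copy_payload() are hypothetical names, and a plain byte array stands in for the MMIO region.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ERRLOG_PAYLOAD_OFFSET 0x48 /* assumed start of the variable-length data */

/* Stand-in for the MMIO accessor: fetch 8 bytes at 'offset' from a buffer. */
static uint64_t fake_mmio_read64(const uint8_t *region, size_t offset)
{
	uint64_t v;

	memcpy(&v, region + offset, sizeof(v));
	return v;
}

/* Copy min(user_len, log_len) bytes of payload into dst and return the
 * number of bytes copied. Both lengths are expected to be multiples of 8. */
static size_t copy_payload(const uint8_t *region, size_t log_len,
			   uint8_t *dst, size_t user_len)
{
	size_t len = user_len < log_len ? user_len : log_len;
	size_t i;

	for (i = 0; i < len; i += 8) {
		/* The fixed header is skipped in the offset, not in the bound. */
		uint64_t val = fake_mmio_read64(region,
						ERRLOG_PAYLOAD_OFFSET + i);

		/* Store all 8 bytes; assigning val to a single byte of dst
		 * would keep only its low 8 bits. */
		memcpy(dst + i, &val, sizeof(val));
	}
	return len;
}
```

The memcpy() into dst also sidesteps a second issue in the kernel-buffer branch of the patch, where "log->buf[i] = val" assigns a u64 to one __u8 element.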

> +					     OCXL_HOST_ENDIAN, &val);
> +		if (rc)
> +			goto out;
> +
> +		if (buf_is_user) {
> +			if (copy_to_user(&log->buf[i], &val, sizeof(u64))) {
> +				rc = -EFAULT;
> +				goto out;
> +			}
> +		} else
> +			log->buf[i] = val;

Please use braces consistently on both sides of if/else.

> +	}
> +
> +	rc = admin_response_handled(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +out:
> +	mutex_unlock(&ocxlpmem->admin_command.lock);
> +	return rc;
> +
> +}
> +
> +static int ioctl_error_log(struct ocxlpmem *ocxlpmem,
> +		struct ioctl_ocxl_pmem_error_log __user *uarg)
> +{
> +	struct ioctl_ocxl_pmem_error_log args;
> +	int rc;
> +
> +	if (copy_from_user(&args, uarg, sizeof(args)))
> +		return -EFAULT;
> +
> +	rc = read_error_log(ocxlpmem, &args, true);
> +	if (rc)
> +		return rc;
> +
> +	if (copy_to_user(uarg, &args, sizeof(args)))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
> +{
> +	struct ocxlpmem *ocxlpmem = file->private_data;
> +	int rc = -EINVAL;
> +
> +	switch (cmd) {
> +	case IOCTL_OCXL_PMEM_ERROR_LOG:
> +		rc = ioctl_error_log(ocxlpmem,
> +				     (struct ioctl_ocxl_pmem_error_log __user *)args);
> +		break;
> +	}
> +	return rc;
> +}
> +
>   static const struct file_operations fops = {
>   	.owner		= THIS_MODULE,
>   	.open		= file_open,
>   	.release	= file_release,
> +	.unlocked_ioctl = file_ioctl,
> +	.compat_ioctl   = file_ioctl,
>   };
>   
>   /**
> @@ -527,6 +736,60 @@ static int read_device_metadata(struct ocxlpmem *ocxlpmem)
>   	return 0;
>   }
>   
> +static const char *decode_error_log_type(u8 error_log_type)
> +{
> +	switch (error_log_type) {
> +	case 0x00:
> +		return "general";
> +	case 0x01:
> +		return "predictive failure";
> +	case 0x02:
> +		return "thermal warning";
> +	case 0x03:
> +		return "data loss";
> +	case 0x04:
> +		return "health & performance";
> +	default:
> +		return "unknown";
> +	}
> +}
> +
> +static void dump_error_log(struct ocxlpmem *ocxlpmem)
> +{
> +	struct ioctl_ocxl_pmem_error_log log;
> +	u32 buf_size;
> +	u8 *buf;
> +	int rc;
> +
> +	if (ocxlpmem->admin_command.data_size == 0)
> +		return;
> +
> +	buf_size = ocxlpmem->admin_command.data_size - 0x48;
> +	buf = kzalloc(buf_size, GFP_KERNEL);
> +	if (!buf)
> +		return;
> +
> +	log.buf = buf;
> +	log.buf_size = buf_size;
> +
> +	rc = read_error_log(ocxlpmem, &log, false);
> +	if (rc < 0)
> +		goto out;
> +
> +	dev_warn(&ocxlpmem->dev,
> +		 "OCXL PMEM Error log: WWID=0x%016llx%016llx LID=0x%x PRC=%x type=0x%x %s, Uptime=%u seconds timestamp=0x%llx\n",
> +		 log.wwid[0], log.wwid[1],
> +		 log.log_identifier, log.program_reference_code,
> +		 log.error_log_type,
> +		 decode_error_log_type(log.error_log_type),
> +		 log.power_on_seconds, log.timestamp);
> +	print_hex_dump(KERN_WARNING, "buf", DUMP_PREFIX_OFFSET, 16, 1, buf,
> +		       log.buf_size, false);
> +
> +out:
> +	kfree(buf);
> +}
> +
>   /**
>    * probe_function0() - Set up function 0 for an OpenCAPI persistent memory device
>    * This is important as it enables templates higher than 0 across all other functions,
> @@ -568,6 +831,7 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>   	struct ocxlpmem *ocxlpmem;
>   	int rc;
>   	u16 elapsed, timeout;
> +	u64 chi;
>   
>   	if (PCI_FUNC(pdev->devfn) == 0)
>   		return probe_function0(pdev);
> @@ -667,6 +931,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>   	return 0;
>   
>   err:
> +	if (ocxlpmem &&
> +		    (ocxlpmem_chi(ocxlpmem, &chi) == 0) &&
> +		    (chi & GLOBAL_MMIO_CHI_ELA))
> +		dump_error_log(ocxlpmem);
> +
>   	/*
>   	 * Further cleanup is done in the release handler via free_ocxlpmem()
>   	 * This allows us to keep the character device live to handle IOCTLs to
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> index d2d81fec7bb1..b953ee522ed4 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> @@ -5,6 +5,7 @@
>   #include <linux/cdev.h>
>   #include <misc/ocxl.h>
>   #include <linux/libnvdimm.h>
> +#include <uapi/nvdimm/ocxl-pmem.h>
>   #include <linux/mm.h>
>   
>   #define LABEL_AREA_SIZE	(1UL << PA_SECTION_SHIFT)
> diff --git a/include/uapi/nvdimm/ocxl-pmem.h b/include/uapi/nvdimm/ocxl-pmem.h
> new file mode 100644
> index 000000000000..b10f8ac0c20f
> --- /dev/null
> +++ b/include/uapi/nvdimm/ocxl-pmem.h
> @@ -0,0 +1,46 @@
> +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
> +/* Copyright 2017 IBM Corp. */
> +#ifndef _UAPI_OCXL_SCM_H
> +#define _UAPI_OCXL_SCM_H
> +
> +#include <linux/types.h>
> +#include <linux/ioctl.h>
> +
> +#define OCXL_PMEM_ERROR_LOG_ACTION_RESET	(1 << (32-32))
> +#define OCXL_PMEM_ERROR_LOG_ACTION_CHKFW	(1 << (53-32))
> +#define OCXL_PMEM_ERROR_LOG_ACTION_REPLACE	(1 << (54-32))
> +#define OCXL_PMEM_ERROR_LOG_ACTION_DUMP		(1 << (55-32))
> +
> +#define OCXL_PMEM_ERROR_LOG_TYPE_GENERAL		(0x00)
> +#define OCXL_PMEM_ERROR_LOG_TYPE_PREDICTIVE_FAILURE	(0x01)
> +#define OCXL_PMEM_ERROR_LOG_TYPE_THERMAL_WARNING	(0x02)
> +#define OCXL_PMEM_ERROR_LOG_TYPE_DATA_LOSS		(0x03)
> +#define OCXL_PMEM_ERROR_LOG_TYPE_HEALTH_PERFORMANCE	(0x04)
> +
> +struct ioctl_ocxl_pmem_error_log {
> +	__u32 log_identifier; /* out */
> +	__u32 program_reference_code; /* out */
> +	__u32 action_flags; /* out, recommended course of action */
> +	__u32 power_on_seconds; /* out, Number of seconds the controller has been on when the error occurred */
> +	__u64 timestamp; /* out, relative time since the current IPL */
> +	__u64 wwid[2]; /* out, the NAA formatted WWID associated with the controller */
> +	char  fw_revision[8+1]; /* out, firmware revision as null terminated text */
> +	__u16 buf_size; /* in/out, buffer size provided/required.
> +			 * If required is greater than provided, the buffer
> +			 * will be truncated to the amount provided. If its
> +			 * less, then only the required bytes will be populated.
> +			 * If it is 0, then there are no more error log entries.
> +			 */
> +	__u8  error_log_type;
> +	__u8  reserved1;
> +	__u32 reserved2;
> +	__u64 reserved3[2];
> +	__u8 *buf; /* pointer to output buffer */
> +};
> +
> +/* ioctl numbers */
> +#define OCXL_PMEM_MAGIC 0x5C
> +/* SCM devices */
> +#define IOCTL_OCXL_PMEM_ERROR_LOG			_IOWR(OCXL_PMEM_MAGIC, 0x01, struct ioctl_ocxl_pmem_error_log)
> +
> +#endif /* _UAPI_OCXL_SCM_H */
> 

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 18/27] powerpc/powernv/pmem: Add controller dump IOCTLs
  2020-02-21  3:27 ` [PATCH v3 18/27] powerpc/powernv/pmem: Add controller dump IOCTLs Alastair D'Silva
  2020-03-03 18:04   ` Frederic Barrat
@ 2020-03-04  6:53   ` Andrew Donnellan
  2020-03-06  3:34     ` Alastair D'Silva
  1 sibling, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-03-04  6:53 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> +static int ioctl_controller_dump_data(struct ocxlpmem *ocxlpmem,
> +		struct ioctl_ocxl_pmem_controller_dump_data __user *uarg)
> +{
> +	struct ioctl_ocxl_pmem_controller_dump_data args;
> +	u16 i;
> +	u64 val;
> +	int rc;
> +
> +	if (copy_from_user(&args, uarg, sizeof(args)))
> +		return -EFAULT;
> +
> +	if (args.buf_size % 8)
> +		return -EINVAL;
> +
> +	if (args.buf_size > ocxlpmem->admin_command.data_size)
> +		return -EINVAL;
> +
> +	mutex_lock(&ocxlpmem->admin_command.lock);
> +
> +	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_CONTROLLER_DUMP);
> +	if (rc)
> +		goto out;
> +
> +	val = ((u64)args.offset) << 32;
> +	val |= args.buf_size;
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> +				      ocxlpmem->admin_command.request_offset + 0x08,
> +				      OCXL_LITTLE_ENDIAN, val);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_execute(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_complete_timeout(ocxlpmem,
> +					    ADMIN_COMMAND_CONTROLLER_DUMP);
> +	if (rc < 0) {
> +		dev_warn(&ocxlpmem->dev, "Controller dump timed out\n");
> +		goto out;
> +	}
> +
> +	rc = admin_response(ocxlpmem);
> +	if (rc < 0)
> +		goto out;
> +	if (rc != STATUS_SUCCESS) {
> +		warn_status(ocxlpmem,
> +			    "Unexpected status from retrieve error log",

Controller dump

> +			    rc);
> +		goto out;
> +	}
> +
> +	for (i = 0; i < args.buf_size; i += 8) {
> +		u64 val;
> +
> +		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +					     ocxlpmem->admin_command.data_offset + i,
> +					     OCXL_HOST_ENDIAN, &val);

Is a controller dump something where we want to do endian swapping?

Any reason we're not doing the usual check of the data identifier, 
additional data length etc?

> +		if (rc)
> +			goto out;
> +
> +		if (copy_to_user(&args.buf[i], &val, sizeof(u64))) {
> +			rc = -EFAULT;
> +			goto out;
> +		}
> +	}
> +
> +	if (copy_to_user(uarg, &args, sizeof(args))) {
> +		rc = -EFAULT;
> +		goto out;
> +	}
> +
> +	rc = admin_response_handled(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +out:
> +	mutex_unlock(&ocxlpmem->admin_command.lock);
> +	return rc;
> +}
> +
> +int request_controller_dump(struct ocxlpmem *ocxlpmem)
> +{
> +	int rc;
> +	u64 busy = 1;
> +
> +	rc = ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIC,
> +				    OCXL_LITTLE_ENDIAN,
> +				    GLOBAL_MMIO_CHI_CDA);

This return code is ignored

> +
> +
> +	rc = ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCI,
> +				    OCXL_LITTLE_ENDIAN,
> +				    GLOBAL_MMIO_HCI_CONTROLLER_DUMP);
> +	if (rc)
> +		return rc;
> +
> +	while (busy) {
> +		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +					     GLOBAL_MMIO_HCI,
> +					     OCXL_LITTLE_ENDIAN, &busy);
> +		if (rc)
> +			return rc;
> +
> +		busy &= GLOBAL_MMIO_HCI_CONTROLLER_DUMP;
> +		cond_resched();
> +	}
> +
> +	return 0;
> +}


-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 19/27] powerpc/powernv/pmem: Add an IOCTL to report controller statistics
  2020-02-21  3:27 ` [PATCH v3 19/27] powerpc/powernv/pmem: Add an IOCTL to report controller statistics Alastair D'Silva
@ 2020-03-04  9:25   ` Frederic Barrat
  2020-03-12  0:15     ` Alastair D'Silva
  2020-03-05  0:46   ` Andrew Donnellan
  1 sibling, 1 reply; 130+ messages in thread
From: Frederic Barrat @ 2020-03-04  9:25 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel



On 21/02/2020 at 04:27, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> The controller can report a number of statistics that are useful
> in evaluating the performance and reliability of the card.
> 
> This patch exposes this information via an IOCTL.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/platforms/powernv/pmem/ocxl.c | 185 +++++++++++++++++++++
>   include/uapi/nvdimm/ocxl-pmem.h            |  17 ++
>   2 files changed, 202 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> index 2cabafe1fc58..009d4fd29e7d 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -758,6 +758,186 @@ static int ioctl_controller_dump_complete(struct ocxlpmem *ocxlpmem)
>   				    GLOBAL_MMIO_HCI_CONTROLLER_DUMP_COLLECTED);
>   }
>   
> +/**
> + * controller_stats_header_parse() - Parse the first 64 bits of the controller stats admin command response
> + * @ocxlpmem: the device metadata
> + * @length: out, returns the number of bytes in the response (excluding the 64 bit header)
> + */
> +static int controller_stats_header_parse(struct ocxlpmem *ocxlpmem,
> +	u32 *length)
> +{
> +	int rc;
> +	u64 val;
> +


unexpected empty line


> +	u16 data_identifier;
> +	u32 data_length;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		return rc;
> +
> +	data_identifier = val >> 48;
> +	data_length = val & 0xFFFFFFFF;
> +
> +	if (data_identifier != 0x4353) { // 'CS'
> +		dev_err(&ocxlpmem->dev,
> +			"Bad data identifier for controller stats, expected 'CS', got '%-.*s'\n",
> +			2, (char *)&data_identifier);



Wow, I'm clueless what that string format looks like :-)
2 arguments? Did you check the kernel string formatter does what you want?
You may consider unifying the format though, the error log patch uses a 
simpler (better?) format for a similar message.
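For reference, a userspace sketch of the two formatting approaches. In standard C, "%.*s" takes the precision from an extra int argument, so the "2" limits the output to two characters, and the "-" flag has no effect without a field width. Pointing "%s" at the u16 itself prints its bytes in memory order, which differs between big- and little-endian hosts. The helpers ident_to_str(), format_bad_ident() and format_prefix() are hypothetical and only illustrate an endian-independent alternative.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: turn a 16-bit identifier such as 0x4353 into "CS",
 * high byte first, regardless of host byte order. */
static void ident_to_str(uint16_t ident, char out[3])
{
	out[0] = (char)(ident >> 8);
	out[1] = (char)(ident & 0xFF);
	out[2] = '\0';
}

/* Build the diagnostic text with a plain "%s" plus the raw hex value. */
static int format_bad_ident(char *buf, size_t len, uint16_t ident)
{
	char s[3];

	ident_to_str(ident, s);
	return snprintf(buf, len, "expected 'CS', got '%s' (%#x)",
			s, (unsigned int)ident);
}

/* "%.*s": print at most n characters of src; n is passed as an int. */
static int format_prefix(char *buf, size_t len, const char *src, int n)
{
	return snprintf(buf, len, "%.*s", n, src);
}
```

Printing the hex value next to the text keeps the message useful when the two bytes are not printable characters.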



> +		return -EINVAL;
> +	}
> +
> +	*length = data_length;
> +	return 0;
> +}
> +
> +static int ioctl_controller_stats(struct ocxlpmem *ocxlpmem,
> +				  struct ioctl_ocxl_pmem_controller_stats __user *uarg)
> +{
> +	struct ioctl_ocxl_pmem_controller_stats args;
> +	u32 length;
> +	int rc;
> +	u64 val;
> +
> +	memset(&args, '\0', sizeof(args));
> +
> +	mutex_lock(&ocxlpmem->admin_command.lock);
> +
> +	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_CONTROLLER_STATS);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> +				      ocxlpmem->admin_command.request_offset + 0x08,
> +				      OCXL_LITTLE_ENDIAN, 0);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_execute(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +
> +	rc = admin_command_complete_timeout(ocxlpmem,
> +					    ADMIN_COMMAND_CONTROLLER_STATS);
> +	if (rc < 0) {
> +		dev_warn(&ocxlpmem->dev, "Controller stats timed out\n");
> +		goto out;
> +	}
> +
> +	rc = admin_response(ocxlpmem);
> +	if (rc < 0)
> +		goto out;
> +	if (rc != STATUS_SUCCESS) {
> +		warn_status(ocxlpmem,
> +			    "Unexpected status from controller stats", rc);
> +		goto out;
> +	}


All these ioctl commands follow the same pattern:
1. admin_command_request()
2. optionally, set some mmio registers specific to the command
3. admin_command_execute()
4. admin_command_complete_timeout()
5. admin_response()

By swapping 1 and 2, we could then factor steps 1, 3, 4 and 5 into a 
function and simplify/shorten the code each time a command is called.

Regarding step 2 (and that's true for all similar patches), a comment 
about what the mmio tuning does would help and avoid looking up the 
spec. Looking up the spec during the review is expected, but it will 
ease reading the code 6 months from now.
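A userspace sketch of the factoring described above, assuming the device tolerates doing the command-specific setup before the request. Everything here is hypothetical: struct fake_dev, the fake_*() stubs and admin_command_run() only mimic the call order, and the value given to STATUS_SUCCESS is a placeholder.

```c
#include <stddef.h>
#include <string.h>

#define STATUS_SUCCESS 0	/* placeholder value, for the sketch only */

struct fake_dev {
	char trace[16];		/* records the order in which steps ran */
	int fail_at;		/* step letter that should fail, 0 for none */
};

/* Append one letter to the trace; fail if this step was asked to fail. */
static int step(struct fake_dev *d, char name)
{
	size_t n = strlen(d->trace);

	if (n + 1 < sizeof(d->trace))
		d->trace[n] = name;
	return d->fail_at == name ? -1 : 0;
}

/* Stubs standing in for the driver's admin_command_*() helpers. */
static int fake_setup(struct fake_dev *d) { return step(d, 'P'); }
static int fake_request(struct fake_dev *d, int op) { (void)op; return step(d, 'R'); }
static int fake_execute(struct fake_dev *d) { return step(d, 'X'); }
static int fake_complete_timeout(struct fake_dev *d, int op) { (void)op; return step(d, 'C'); }
static int fake_response(struct fake_dev *d) { return step(d, 'S') ? -1 : STATUS_SUCCESS; }

/* Run the optional per-command setup, then the common request, execute,
 * wait and response sequence. Returns a negative value on failure,
 * otherwise the status reported by the response step. */
static int admin_command_run(struct fake_dev *d, int op,
			     int (*setup)(struct fake_dev *))
{
	int rc;

	if (setup) {
		rc = setup(d);
		if (rc)
			return rc;
	}
	rc = fake_request(d, op);
	if (rc)
		return rc;
	rc = fake_execute(d);
	if (rc)
		return rc;
	rc = fake_complete_timeout(d, op);
	if (rc < 0)
		return rc;
	return fake_response(d);
}
```

Each ioctl would then take the admin command lock, call the helper once, read its command-specific response data, and unlock.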



> +
> +	rc = controller_stats_header_parse(ocxlpmem, &length);
> +	if (rc)
> +		goto out;
> +
> +	if (length != 0x140)
> +		warn_status(ocxlpmem,
> +			    "Unexpected length for controller stats data, expected 0x140, got 0x%x",
> +			    length);
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x08,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		goto out;
> +
> +	args.reset_count = val >> 32;
> +	args.reset_uptime = val & 0xFFFFFFFF;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x10,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		goto out;
> +
> +	args.power_on_uptime = val >> 32;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x08,
> +				     OCXL_LITTLE_ENDIAN, &args.host_load_count);


Those offsets are hard to understand, even with the spec next to me. And 
it seems that we could harden things a bit:
each block has a "statistics parameter ID" and the length of the data for 
that block. We should check both and make sure we're reading what we expect.
For example, from the spec I'm looking at (110d), I would expect the host 
load count to be at offset 0x10. It's entirely possible I'm misreading 
it though.
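A sketch of both ideas: named offsets and a per-block check. The offset macros reproduce the arithmetic used in the patch (0x08 + 0x40 + 0x08 and so on) under hypothetical names. The block header layout in stats_block_check() (ID in the top 16 bits, length in the low 32 bits) is a placeholder that mirrors how the patch parses the response header. It is not taken from the specification.

```c
#include <stdint.h>

/* Offsets as used by the patch, given (hypothetical) names. */
#define STATS_RESPONSE_HEADER_SIZE	0x08
#define STATS_BLOCK0_OFFSET		(STATS_RESPONSE_HEADER_SIZE)
#define STATS_BLOCK1_OFFSET		(STATS_RESPONSE_HEADER_SIZE + 0x40)
#define STATS_BLOCK1_HOST_LOAD_COUNT	(STATS_BLOCK1_OFFSET + 0x08)
#define STATS_BLOCK1_HOST_STORE_COUNT	(STATS_BLOCK1_OFFSET + 0x10)

/*
 * Validate one block header against the ID and length the caller expects.
 * Placeholder layout: bits 63..48 hold the parameter ID, bits 31..0 hold
 * the data length.
 * Returns 0 on a match, -1 on an ID mismatch, -2 on a length mismatch.
 */
static int stats_block_check(uint64_t block_header, uint16_t expected_id,
			     uint32_t expected_len)
{
	uint16_t id = (uint16_t)(block_header >> 48);
	uint32_t len = (uint32_t)(block_header & 0xFFFFFFFF);

	if (id != expected_id)
		return -1;
	if (len != expected_len)
		return -2;
	return 0;
}
```

Checking each block before reading its fields would turn a layout mismatch, such as the 0x10 versus 0x50 question above, into an explicit error instead of silently wrong statistics.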



> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x10,
> +				     OCXL_LITTLE_ENDIAN, &args.host_store_count);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x18,
> +				     OCXL_LITTLE_ENDIAN, &args.media_read_count);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x20,
> +				     OCXL_LITTLE_ENDIAN, &args.media_write_count);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x28,
> +				     OCXL_LITTLE_ENDIAN, &args.cache_hit_count);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x30,
> +				     OCXL_LITTLE_ENDIAN, &args.cache_miss_count);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x38,
> +				     OCXL_LITTLE_ENDIAN, &args.media_read_latency);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x40,
> +				     OCXL_LITTLE_ENDIAN, &args.media_write_latency);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x48,
> +				     OCXL_LITTLE_ENDIAN, &args.cache_read_latency);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x50,
> +				     OCXL_LITTLE_ENDIAN, &args.cache_write_latency);
> +	if (rc)
> +		goto out;
> +
> +	if (copy_to_user(uarg, &args, sizeof(args))) {
> +		rc = -EFAULT;
> +		goto out;
> +	}
> +
> +	rc = admin_response_handled(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = 0;
> +	goto out;


That may be more of a personal habit, but that final goto disrupts the 
"good case" flow. And I think it's pretty unusual within the kernel.


> +
> +out:
> +	mutex_unlock(&ocxlpmem->admin_command.lock);
> +	return rc;
> +}
> +
>   static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
>   {
>   	struct ocxlpmem *ocxlpmem = file->private_data;
> @@ -781,6 +961,11 @@ static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
>   	case IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE:
>   		rc = ioctl_controller_dump_complete(ocxlpmem);
>   		break;
> +
> +	case IOCTL_OCXL_PMEM_CONTROLLER_STATS:
> +		rc = ioctl_controller_stats(ocxlpmem,
> +					    (struct ioctl_ocxl_pmem_controller_stats __user *)args);
> +		break;
>   	}
>   
>   	return rc;
> diff --git a/include/uapi/nvdimm/ocxl-pmem.h b/include/uapi/nvdimm/ocxl-pmem.h
> index d4d8512d03f7..add223aa2fdb 100644
> --- a/include/uapi/nvdimm/ocxl-pmem.h
> +++ b/include/uapi/nvdimm/ocxl-pmem.h
> @@ -50,6 +50,22 @@ struct ioctl_ocxl_pmem_controller_dump_data {
>   	__u64 reserved[8];
>   };
>   
> +struct ioctl_ocxl_pmem_controller_stats {
> +	__u32 reset_count;
> +	__u32 reset_uptime; /* seconds */
> +	__u32 power_on_uptime; /* seconds */


Same as before, we're going to have some padding here.

   Fred


> +	__u64 host_load_count;
> +	__u64 host_store_count;
> +	__u64 media_read_count;
> +	__u64 media_write_count;
> +	__u64 cache_hit_count;
> +	__u64 cache_miss_count;
> +	__u64 media_read_latency; /* nanoseconds */
> +	__u64 media_write_latency; /* nanoseconds */
> +	__u64 cache_read_latency; /* nanoseconds */
> +	__u64 cache_write_latency; /* nanoseconds */
> +};
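To illustrate the padding remark, a userspace sketch with the first four members of the structure. On ABIs that align 64-bit integers to 8 bytes, the compiler inserts 4 hidden bytes after power_on_uptime; on ABIs that align them to 4 bytes it does not, so the layout differs between the two. The struct names and the "reserved" member are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Leading members as posted: three 32-bit fields, then a 64-bit field. */
struct stats_as_posted {
	uint32_t reset_count;
	uint32_t reset_uptime;
	uint32_t power_on_uptime;
	uint64_t host_load_count;
};

/* Variant with the hole made explicit, so the layout no longer depends on
 * the alignment the ABI gives to 64-bit integers. */
struct stats_explicit {
	uint32_t reset_count;
	uint32_t reset_uptime;
	uint32_t power_on_uptime;
	uint32_t reserved;
	uint64_t host_load_count;
};

/* Byte offset of the first 64-bit member in each variant. */
static size_t posted_offset(void)
{
	return offsetof(struct stats_as_posted, host_load_count);
}

static size_t explicit_offset(void)
{
	return offsetof(struct stats_explicit, host_load_count);
}
```

With the explicit member, host_load_count sits at offset 16 everywhere, which matters for a uapi structure shared between 32-bit and 64-bit userspace.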
> +
>   /* ioctl numbers */
>   #define OCXL_PMEM_MAGIC 0x5C
>   /* SCM devices */
> @@ -57,5 +73,6 @@ struct ioctl_ocxl_pmem_controller_dump_data {
>   #define IOCTL_OCXL_PMEM_CONTROLLER_DUMP			_IO(OCXL_PMEM_MAGIC, 0x02)
>   #define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_DATA		_IOWR(OCXL_PMEM_MAGIC, 0x03, struct ioctl_ocxl_pmem_controller_dump_data)
>   #define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE	_IO(OCXL_PMEM_MAGIC, 0x04)
> +#define IOCTL_OCXL_PMEM_CONTROLLER_STATS		_IO(OCXL_PMEM_MAGIC, 0x05)
>   
>   #endif /* _UAPI_OCXL_SCM_H */
> 

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 20/27] powerpc/powernv/pmem: Forward events to userspace
  2020-02-21  3:27 ` [PATCH v3 20/27] powerpc/powernv/pmem: Forward events to userspace Alastair D'Silva
  2020-03-03  7:02   ` Andrew Donnellan
@ 2020-03-04 11:00   ` Frederic Barrat
  2020-03-11  3:32     ` Alastair D'Silva
  1 sibling, 1 reply; 130+ messages in thread
From: Frederic Barrat @ 2020-03-04 11:00 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel



On 21/02/2020 at 04:27, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> Some of the interrupts that the card generates are better handled
> by the userspace daemon, in particular:
> Controller Hardware/Firmware Fatal
> Controller Dump Available
> Error Log available
> 
> This patch allows a userspace application to register an eventfd with
> the driver via SCM_IOCTL_EVENTFD to receive notifications of these
> interrupts.
> 
> Userspace can then identify what events have occurred by calling
> SCM_IOCTL_EVENT_CHECK and checking against the SCM_IOCTL_EVENT_FOO
> masks.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/platforms/powernv/pmem/ocxl.c    | 216 ++++++++++++++++++
>   .../platforms/powernv/pmem/ocxl_internal.h    |   5 +
>   include/uapi/nvdimm/ocxl-pmem.h               |  16 ++
>   3 files changed, 237 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> index 009d4fd29e7d..e46696d3cc36 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -10,6 +10,7 @@
>   #include <misc/ocxl.h>
>   #include <linux/delay.h>
>   #include <linux/ndctl.h>
> +#include <linux/eventfd.h>
>   #include <linux/fs.h>
>   #include <linux/mm_types.h>
>   #include <linux/memory_hotplug.h>
> @@ -335,11 +336,22 @@ static void free_ocxlpmem(struct ocxlpmem *ocxlpmem)
>   {
>   	int rc;
>   
> +	// Disable doorbells
> +	(void)ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIEC,
> +				     OCXL_LITTLE_ENDIAN,
> +				     GLOBAL_MMIO_CHI_ALL);
> +
>   	if (ocxlpmem->nvdimm_bus)
>   		nvdimm_bus_unregister(ocxlpmem->nvdimm_bus);
>   
>   	free_minor(ocxlpmem);
>   
> +	if (ocxlpmem->irq_addr[1])
> +		iounmap(ocxlpmem->irq_addr[1]);
> +
> +	if (ocxlpmem->irq_addr[0])
> +		iounmap(ocxlpmem->irq_addr[0]);
> +
>   	if (ocxlpmem->cdev.owner)
>   		cdev_del(&ocxlpmem->cdev);
>   
> @@ -443,6 +455,11 @@ static int file_release(struct inode *inode, struct file *file)
>   {
>   	struct ocxlpmem *ocxlpmem = file->private_data;
>   
> +	if (ocxlpmem->ev_ctx) {
> +		eventfd_ctx_put(ocxlpmem->ev_ctx);
> +		ocxlpmem->ev_ctx = NULL;
> +	}
> +
>   	ocxlpmem_put(ocxlpmem);
>   	return 0;
>   }
> @@ -938,6 +955,51 @@ static int ioctl_controller_stats(struct ocxlpmem *ocxlpmem,
>   	return rc;
>   }
>   
> +static int ioctl_eventfd(struct ocxlpmem *ocxlpmem,
> +		 struct ioctl_ocxl_pmem_eventfd __user *uarg)
> +{
> +	struct ioctl_ocxl_pmem_eventfd args;
> +
> +	if (copy_from_user(&args, uarg, sizeof(args)))
> +		return -EFAULT;
> +
> +	if (ocxlpmem->ev_ctx)
> +		return -EINVAL;


EBUSY?


> +
> +	ocxlpmem->ev_ctx = eventfd_ctx_fdget(args.eventfd);
> +	if (!ocxlpmem->ev_ctx)
> +		return -EFAULT;


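Why not propagate what eventfd_ctx_fdget() returned? It returns an 
ERR_PTR() on failure rather than NULL, so the check above never fires; 
IS_ERR() and PTR_ERR() would give the caller the real error code.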
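
To illustrate both remarks, here is a userspace sketch only, not driver 
code. ERR_PTR(), IS_ERR() and PTR_ERR() are re-implemented as stand-ins 
for the helpers in <linux/err.h>, and struct fake_ocxlpmem, 
fake_eventfd_ctx_fdget() and register_eventfd() are hypothetical names 
used purely for the example:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the helpers in <linux/err.h>. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical model of the driver state: only the field we need. */
struct fake_ocxlpmem {
	void *ev_ctx;
};

/* Hypothetical stand-in for eventfd_ctx_fdget(): -EBADF on a negative fd. */
static int fake_ctx_storage;
static void *fake_eventfd_ctx_fdget(int fd)
{
	if (fd < 0)
		return ERR_PTR(-EBADF);
	return &fake_ctx_storage;
}

static int register_eventfd(struct fake_ocxlpmem *p, int fd)
{
	void *ctx;

	if (p->ev_ctx)
		return -EBUSY;	/* an eventfd is already registered */

	ctx = fake_eventfd_ctx_fdget(fd);
	if (IS_ERR(ctx))
		return (int)PTR_ERR(ctx);	/* propagate the real error */

	p->ev_ctx = ctx;	/* only ever store a valid context */
	return 0;
}
```

Going through a local variable also avoids leaving an error pointer in 
ev_ctx, which the release path would otherwise hand to eventfd_ctx_put().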


> +
> +	return 0;
> +}
> +
> +static int ioctl_event_check(struct ocxlpmem *ocxlpmem, u64 __user *uarg)
> +{
> +	u64 val = 0;
> +	int rc;
> +	u64 chi = 0;
> +
> +	rc = ocxlpmem_chi(ocxlpmem, &chi);
> +	if (rc < 0)
> +		return rc;
> +
> +	if (chi & GLOBAL_MMIO_CHI_ELA)
> +		val |= IOCTL_OCXL_PMEM_EVENT_ERROR_LOG_AVAILABLE;
> +
> +	if (chi & GLOBAL_MMIO_CHI_CDA)
> +		val |= IOCTL_OCXL_PMEM_EVENT_CONTROLLER_DUMP_AVAILABLE;
> +
> +	if (chi & GLOBAL_MMIO_CHI_CFFS)
> +		val |= IOCTL_OCXL_PMEM_EVENT_FIRMWARE_FATAL;
> +
> +	if (chi & GLOBAL_MMIO_CHI_CHFS)
> +		val |= IOCTL_OCXL_PMEM_EVENT_HARDWARE_FATAL;
> +
> +	rc = copy_to_user((u64 __user *) uarg, &val, sizeof(val));
> +


copy_to_user() doesn't return an errno; it returns the number of bytes 
that could not be copied. Should be:

if (copy_to_user((u64 __user *) uarg, &val, sizeof(val)))
	return -EFAULT;
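
For what it's worth, the difference can be seen in a small userspace 
model. This is a sketch under assumptions: fake_copy_to_user() is a 
hypothetical stand-in that mimics the kernel helper's return convention 
(bytes NOT copied, 0 on success), with a NULL destination playing the 
role of a faulting user pointer.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical userspace model of copy_to_user(): like the kernel helper,
 * it returns the number of bytes that could NOT be copied (0 on success).
 * A NULL destination models a faulting user pointer.
 */
static unsigned long fake_copy_to_user(void *to, const void *from,
				       unsigned long n)
{
	if (!to)
		return n;
	memcpy(to, from, n);
	return 0;
}

/* Pattern from the patch: a positive byte count leaks out as the result. */
static int event_check_buggy(unsigned long long *uarg, unsigned long long val)
{
	return (int)fake_copy_to_user(uarg, &val, sizeof(val));
}

/* Suggested pattern: any short copy becomes -EFAULT. */
static int event_check_fixed(unsigned long long *uarg, unsigned long long val)
{
	if (fake_copy_to_user(uarg, &val, sizeof(val)))
		return -EFAULT;
	return 0;
}
```

With the first form, userspace would see a positive return value from 
the ioctl on a fault instead of an error.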


> +	return rc;
> +}
> +
>   static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
>   {
>   	struct ocxlpmem *ocxlpmem = file->private_data;
> @@ -966,6 +1028,15 @@ static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
>   		rc = ioctl_controller_stats(ocxlpmem,
>   					    (struct ioctl_ocxl_pmem_controller_stats __user *)args);
>   		break;
> +
> +	case IOCTL_OCXL_PMEM_EVENTFD:
> +		rc = ioctl_eventfd(ocxlpmem,
> +				   (struct ioctl_ocxl_pmem_eventfd __user *)args);
> +		break;
> +
> +	case IOCTL_OCXL_PMEM_EVENT_CHECK:
> +		rc = ioctl_event_check(ocxlpmem, (u64 __user *)args);
> +		break;
>   	}
>   
>   	return rc;
> @@ -1107,6 +1178,146 @@ static void dump_error_log(struct ocxlpmem *ocxlpmem)
>   	kfree(buf);
>   }
>   
> +static irqreturn_t imn0_handler(void *private)
> +{
> +	struct ocxlpmem *ocxlpmem = private;
> +	u64 chi = 0;
> +
> +	(void)ocxlpmem_chi(ocxlpmem, &chi);
> +
> +	if (chi & GLOBAL_MMIO_CHI_ELA) {
> +		dev_warn(&ocxlpmem->dev, "Error log is available\n");
> +
> +		if (ocxlpmem->ev_ctx)
> +			eventfd_signal(ocxlpmem->ev_ctx, 1);
> +	}
> +
> +	if (chi & GLOBAL_MMIO_CHI_CDA) {
> +		dev_warn(&ocxlpmem->dev, "Controller dump is available\n");
> +
> +		if (ocxlpmem->ev_ctx)
> +			eventfd_signal(ocxlpmem->ev_ctx, 1);
> +	}
> +
> +


(at least) one empty line too many.


> +	return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t imn1_handler(void *private)
> +{
> +	struct ocxlpmem *ocxlpmem = private;
> +	u64 chi = 0;
> +
> +	(void)ocxlpmem_chi(ocxlpmem, &chi);
> +
> +	if (chi & (GLOBAL_MMIO_CHI_CFFS | GLOBAL_MMIO_CHI_CHFS)) {
> +		dev_err(&ocxlpmem->dev,
> +			"Controller status is fatal, chi=0x%llx, going offline\n", chi);
> +
> +		if (ocxlpmem->nvdimm_bus) {
> +			nvdimm_bus_unregister(ocxlpmem->nvdimm_bus);
> +			ocxlpmem->nvdimm_bus = NULL;
> +		}
> +
> +		if (ocxlpmem->ev_ctx)
> +			eventfd_signal(ocxlpmem->ev_ctx, 1);
> +	}
> +
> +	return IRQ_HANDLED;
> +}
> +
> +
> +/**
> + * ocxlpmem_setup_irq() - Set up the IRQs for the OpenCAPI Persistent Memory device
> + * @ocxlpmem: the device metadata
> + * Return: 0 on success, negative on failure
> + */
> +static int ocxlpmem_setup_irq(struct ocxlpmem *ocxlpmem)
> +{
> +	int rc;
> +	u64 irq_addr;
> +
> +	rc = ocxl_afu_irq_alloc(ocxlpmem->ocxl_context, &ocxlpmem->irq_id[0]);
> +	if (rc)
> +		return rc;
> +
> +	rc = ocxl_irq_set_handler(ocxlpmem->ocxl_context, ocxlpmem->irq_id[0],
> +				  imn0_handler, NULL, ocxlpmem);
> +
> +	irq_addr = ocxl_afu_irq_get_addr(ocxlpmem->ocxl_context, ocxlpmem->irq_id[0]);
> +	if (!irq_addr)
> +		return -EINVAL;
> +
> +	ocxlpmem->irq_addr[0] = ioremap(irq_addr, PAGE_SIZE);
> +	if (!ocxlpmem->irq_addr[0])
> +		return -EINVAL;
> +
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_IMA0_OHP,
> +				      OCXL_LITTLE_ENDIAN,
> +				      (u64)ocxlpmem->irq_addr[0]);
> +	if (rc)
> +		goto out_irq0;
> +
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_IMA0_CFP,
> +				      OCXL_LITTLE_ENDIAN, 0);
> +	if (rc)
> +		goto out_irq0;


That's a few lines of duplicate code. On the other hand, there are enough 
varying parameters between the 2 interrupts that factoring them out into a 
helper function would be slightly less readable. So duplicating is probably ok.



> +	rc = ocxl_afu_irq_alloc(ocxlpmem->ocxl_context, &ocxlpmem->irq_id[1]);
> +	if (rc)
> +		goto out_irq0;
> +
> +
> +	rc = ocxl_irq_set_handler(ocxlpmem->ocxl_context, ocxlpmem->irq_id[1],
> +				  imn1_handler, NULL, ocxlpmem);
> +	if (rc)
> +		goto out_irq0;
> +
> +	irq_addr = ocxl_afu_irq_get_addr(ocxlpmem->ocxl_context, ocxlpmem->irq_id[1]);
> +	if (!irq_addr) {
> +		rc = -EFAULT;
> +		goto out_irq0;
> +	}
> +
> +	ocxlpmem->irq_addr[1] = ioremap(irq_addr, PAGE_SIZE);
> +	if (!ocxlpmem->irq_addr[1]) {
> +		rc = -EINVAL;
> +		goto out_irq0;
> +	}
> +
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_IMA1_OHP,
> +				      OCXL_LITTLE_ENDIAN,
> +				      (u64)ocxlpmem->irq_addr[1]);
> +	if (rc)
> +		goto out_irq1;
> +
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_IMA1_CFP,
> +				      OCXL_LITTLE_ENDIAN, 0);
> +	if (rc)
> +		goto out_irq1;
> +
> +	// Enable doorbells
> +	rc = ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIE,
> +				    OCXL_LITTLE_ENDIAN,
> +				    GLOBAL_MMIO_CHI_ELA | GLOBAL_MMIO_CHI_CDA |
> +				    GLOBAL_MMIO_CHI_CFFS | GLOBAL_MMIO_CHI_CHFS |
> +				    GLOBAL_MMIO_CHI_NSCRA);


GLOBAL_MMIO_CHI_NSCRA doesn't seem to be handled in the handlers.



> +	if (rc)
> +		goto out_irq1;
> +
> +	return 0;
> +
> +out_irq1:
> +	iounmap(ocxlpmem->irq_addr[1]);
> +	ocxlpmem->irq_addr[1] = NULL;
> +
> +out_irq0:
> +	iounmap(ocxlpmem->irq_addr[0]);
> +	ocxlpmem->irq_addr[0] = NULL;
> +
> +	return rc;
> +}
> +
>   /**
>    * probe_function0() - Set up function 0 for an OpenCAPI persistent memory device
>    * This is important as it enables templates higher than 0 across all other functions,
> @@ -1216,6 +1427,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>   		goto err;
>   	}
>   
> +	if (ocxlpmem_setup_irq(ocxlpmem)) {
> +		dev_err(&pdev->dev, "Could not set up OCXL IRQs\n");


As with the other patches, rc needs to be set here.
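
A minimal userspace sketch of the pattern I mean, assuming rc happens to 
hold 0 when the call fails (I have not checked what it holds at this 
point in probe()). fake_setup_irq(), probe_buggy() and probe_fixed() are 
hypothetical names for the example:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for ocxlpmem_setup_irq(); fails when asked to. */
static int fake_setup_irq(int should_fail)
{
	return should_fail ? -EINVAL : 0;
}

/* Pattern from the patch: rc keeps whatever value it had before the call. */
static int probe_buggy(int should_fail)
{
	int rc = 0;

	if (fake_setup_irq(should_fail))
		goto err;	/* rc is still 0 here */

	return 0;
err:
	return rc;
}

/* Capture the return value so the error path reports the real failure. */
static int probe_fixed(int should_fail)
{
	int rc;

	rc = fake_setup_irq(should_fail);
	if (rc)
		goto err;

	return 0;
err:
	return rc;
}
```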


> +		goto err;
> +	}
> +
>   	if (setup_command_metadata(ocxlpmem)) {
>   		dev_err(&pdev->dev, "Could not read OCXL command matada\n");
>   		goto err;
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> index b953ee522ed4..927690f4888f 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> @@ -103,6 +103,10 @@ struct ocxlpmem {
>   	struct pci_dev *pdev;
>   	struct cdev cdev;
>   	struct ocxl_fn *ocxl_fn;
> +#define SCM_IRQ_COUNT 2
> +	int irq_id[SCM_IRQ_COUNT];
> +	struct dev_pagemap irq_pgmap[SCM_IRQ_COUNT];


irq_pgmap is not used.


> +	void *irq_addr[SCM_IRQ_COUNT];
>   	struct nd_interleave_set nd_set;
>   	struct nvdimm_bus_descriptor bus_desc;
>   	struct nvdimm_bus *nvdimm_bus;
> @@ -113,6 +117,7 @@ struct ocxlpmem {
>   	struct command_metadata ns_command;
>   	struct resource pmem_res;
>   	struct nd_region *nd_region;
> +	struct eventfd_ctx *ev_ctx;
>   	char fw_version[8+1];
>   	u32 timeouts[ADMIN_COMMAND_MAX+1];
>   
> diff --git a/include/uapi/nvdimm/ocxl-pmem.h b/include/uapi/nvdimm/ocxl-pmem.h
> index add223aa2fdb..988eb0bc413d 100644
> --- a/include/uapi/nvdimm/ocxl-pmem.h
> +++ b/include/uapi/nvdimm/ocxl-pmem.h
> @@ -66,6 +66,20 @@ struct ioctl_ocxl_pmem_controller_stats {
>   	__u64 cache_write_latency; /* nanoseconds */
>   };
>   
> +struct ioctl_ocxl_pmem_eventfd {
> +	__s32 eventfd;
> +	__u32 reserved;
> +};
> +
> +#ifndef BIT_ULL
> +#define BIT_ULL(nr)	(1ULL << (nr))
> +#endif
> +
> +#define IOCTL_OCXL_PMEM_EVENT_CONTROLLER_DUMP_AVAILABLE	BIT_ULL(0)
> +#define IOCTL_OCXL_PMEM_EVENT_ERROR_LOG_AVAILABLE	BIT_ULL(1)
> +#define IOCTL_OCXL_PMEM_EVENT_HARDWARE_FATAL		BIT_ULL(2)
> +#define IOCTL_OCXL_PMEM_EVENT_FIRMWARE_FATAL		BIT_ULL(3)
> +


I'm not fond of adding a macro with such a generic name as BIT_ULL() in 
a user header file. What's wrong with:

#define IOCTL_OCXL_PMEM_EVENT_CONTROLLER_DUMP_AVAILABLE	0x1
#define IOCTL_OCXL_PMEM_EVENT_ERROR_LOG_AVAILABLE	0x2
#define IOCTL_OCXL_PMEM_EVENT_HARDWARE_FATAL		0x4
#define IOCTL_OCXL_PMEM_EVENT_FIRMWARE_FATAL		0x8


   Fred


>   /* ioctl numbers */
>   #define OCXL_PMEM_MAGIC 0x5C
>   /* SCM devices */
> @@ -74,5 +88,7 @@ struct ioctl_ocxl_pmem_controller_stats {
>   #define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_DATA		_IOWR(OCXL_PMEM_MAGIC, 0x03, struct ioctl_ocxl_pmem_controller_dump_data)
>   #define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE	_IO(OCXL_PMEM_MAGIC, 0x04)
>   #define IOCTL_OCXL_PMEM_CONTROLLER_STATS		_IO(OCXL_PMEM_MAGIC, 0x05)
> +#define IOCTL_OCXL_PMEM_EVENTFD				_IOW(OCXL_PMEM_MAGIC, 0x06, struct ioctl_ocxl_pmem_eventfd)
> +#define IOCTL_OCXL_PMEM_EVENT_CHECK			_IOR(OCXL_PMEM_MAGIC, 0x07, __u64)
>   
>   #endif /* _UAPI_OCXL_SCM_H */
> 

* Re: [PATCH v3 21/27] powerpc/powernv/pmem: Add an IOCTL to request controller health & perf data
  2020-02-28  6:12   ` Andrew Donnellan
  2020-03-02  5:40     ` Alastair D'Silva
@ 2020-03-04 11:06     ` Frederic Barrat
  2020-03-11  3:38       ` Alastair D'Silva
  1 sibling, 1 reply; 130+ messages in thread
From: Frederic Barrat @ 2020-03-04 11:06 UTC (permalink / raw)
  To: Andrew Donnellan, Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Arnd Bergmann, Greg Kroah-Hartman,
	Andrew Morton, Mauro Carvalho Chehab, David S. Miller,
	Rob Herring, Anton Blanchard, Krzysztof Kozlowski,
	Mahesh Salgaonkar, Madhavan Srinivasan, Cédric Le Goater,
	Anju T Sudhakar, Hari Bathini, Thomas Gleixner, Greg Kurz,
	Nicholas Piggin, Masahiro Yamada, Alexey Kardashevskiy,
	linux-kernel, linuxppc-dev, linux-nvdimm, linux-mm



On 28/02/2020 at 07:12, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
>> From: Alastair D'Silva <alastair@d-silva.org>
>>
>> When health & performance data is requested from the controller,
>> it responds with an error log containing the requested information.
>>
>> This patch allows the request to be issued via an IOCTL.
> 
> A better explanation would be good - this IOCTL triggers a request to 
> the controller to collect controller health/perf data, and the 
> controller will later respond with an error log that can be picked up 
> via the error log IOCTL that you've defined earlier.

And even more precisely (to also check my understanding):

 > this IOCTL triggers a request to
 > the controller to collect controller health/perf data, and the
 > controller will later respond

by raising an interrupt to let the user app know that

 > an error log that can be picked up
 > via the error log IOCTL that you've defined earlier.


The rest of the patch looks ok to me.

   Fred

* Re: [PATCH v3 22/27] powerpc/powernv/pmem: Implement the heartbeat command
  2020-02-21  3:27 ` [PATCH v3 22/27] powerpc/powernv/pmem: Implement the heartbeat command Alastair D'Silva
  2020-02-28  6:20   ` Andrew Donnellan
@ 2020-03-04 14:25   ` Frederic Barrat
  1 sibling, 0 replies; 130+ messages in thread
From: Frederic Barrat @ 2020-03-04 14:25 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel



On 21/02/2020 at 04:27, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> The heartbeat admin command is a simple admin command that exercises
> the communication mechanisms within the controller.
> 
> This patch issues a heartbeat command to the card during init to ensure
> we can communicate with the card's controller.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---


Nothing to add compared to what has already been commented on previous 
patches (rc not set in probe(), higher level function to execute admin 
command in one call).

   Fred



>   arch/powerpc/platforms/powernv/pmem/ocxl.c | 43 ++++++++++++++++++++++
>   1 file changed, 43 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> index 081883a8247a..e01f6f9fc180 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -306,6 +306,44 @@ static bool is_usable(const struct ocxlpmem *ocxlpmem, bool verbose)
>   	return true;
>   }
>   
> +/**
> + * heartbeat() - Issue a heartbeat command to the controller
> + * @ocxlpmem: the device metadata
> + * Return: 0 if the controller responded correctly, negative on error
> + */
> +static int heartbeat(struct ocxlpmem *ocxlpmem)
> +{
> +	int rc;
> +
> +	mutex_lock(&ocxlpmem->admin_command.lock);
> +
> +	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_HEARTBEAT);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_execute(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_complete_timeout(ocxlpmem, ADMIN_COMMAND_HEARTBEAT);
> +	if (rc < 0) {
> +		dev_err(&ocxlpmem->dev, "Heartbeat timeout\n");
> +		goto out;
> +	}
> +
> +	rc = admin_response(ocxlpmem);
> +	if (rc < 0)
> +		goto out;
> +	if (rc != STATUS_SUCCESS)
> +		warn_status(ocxlpmem, "Unexpected status from heartbeat", rc);
> +
> +	(void)admin_response_handled(ocxlpmem);
> +
> +out:
> +	mutex_unlock(&ocxlpmem->admin_command.lock);
> +	return rc;
> +}
> +
>   /**
>    * allocate_minor() - Allocate a minor number to use for an OpenCAPI pmem device
>    * @ocxlpmem: the device metadata
> @@ -1458,6 +1496,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>   		goto err;
>   	}
>   
> +	if (heartbeat(ocxlpmem)) {
> +		dev_err(&pdev->dev, "Heartbeat failed\n");
> +		goto err;
> +	}
> +
>   	elapsed = 0;
>   	timeout = ocxlpmem->readiness_timeout + ocxlpmem->memory_available_timeout;
>   	while (!is_usable(ocxlpmem, false)) {
> 

* Re: [PATCH v3 23/27] powerpc/powernv/pmem: Add debug IOCTLs
  2020-02-21  3:27 ` [PATCH v3 23/27] powerpc/powernv/pmem: Add debug IOCTLs Alastair D'Silva
@ 2020-03-04 15:21   ` Frederic Barrat
  2020-03-12  4:24     ` Alastair D'Silva
  2020-03-05  3:11   ` Andrew Donnellan
  1 sibling, 1 reply; 130+ messages in thread
From: Frederic Barrat @ 2020-03-04 15:21 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel



On 21/02/2020 at 04:27, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> These IOCTLs provide low level access to the card to aid in debugging
> controller/FPGA firmware.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/platforms/powernv/pmem/Kconfig |   6 +
>   arch/powerpc/platforms/powernv/pmem/ocxl.c  | 249 ++++++++++++++++++++
>   include/uapi/nvdimm/ocxl-pmem.h             |  32 +++
>   3 files changed, 287 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/pmem/Kconfig b/arch/powerpc/platforms/powernv/pmem/Kconfig
> index c5d927520920..3f44429d70c9 100644
> --- a/arch/powerpc/platforms/powernv/pmem/Kconfig
> +++ b/arch/powerpc/platforms/powernv/pmem/Kconfig
> @@ -12,4 +12,10 @@ config OCXL_PMEM
>   
>   	  Select N if unsure.
>   
> +config OCXL_PMEM_DEBUG
> +	bool "OpenCAPI Persistent Memory debugging"
> +	depends on OCXL_PMEM
> +	help
> +	  Enables low level IOCTLs for OpenCAPI Persistent Memory firmware development
> +
>   endif
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> index e01f6f9fc180..d4ce5e9e0521 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -1050,6 +1050,235 @@ int req_controller_health_perf(struct ocxlpmem *ocxlpmem)
>   				      GLOBAL_MMIO_HCI_REQ_HEALTH_PERF);
>   }
>   
> +#ifdef CONFIG_OCXL_PMEM_DEBUG
> +/**
> + * enable_fwdebug() - Enable FW debug on the controller
> + * @ocxlpmem: the device metadata
> + * Return: 0 on success, negative on failure
> + */
> +static int enable_fwdebug(const struct ocxlpmem *ocxlpmem)
> +{
> +	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCI,
> +				      OCXL_LITTLE_ENDIAN,
> +				      GLOBAL_MMIO_HCI_FW_DEBUG);
> +}
> +
> +/**
> + * disable_fwdebug() - Disable FW debug on the controller
> + * @ocxlpmem: the device metadata
> + * Return: 0 on success, negative on failure
> + */
> +static int disable_fwdebug(const struct ocxlpmem *ocxlpmem)
> +{
> +	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCIC,
> +				      OCXL_LITTLE_ENDIAN,
> +				      GLOBAL_MMIO_HCI_FW_DEBUG);
> +}
> +
> +static int ioctl_fwdebug(struct ocxlpmem *ocxlpmem,
> +			     struct ioctl_ocxl_pmem_fwdebug __user *uarg)
> +{
> +	struct ioctl_ocxl_pmem_fwdebug args;
> +	u64 val;
> +	int i;
> +	int rc;
> +
> +	if (copy_from_user(&args, uarg, sizeof(args)))
> +		return -EFAULT;
> +
> +	// Buffer size must be a multiple of 8
> +	if ((args.buf_size & 0x07))
> +		return -EINVAL;
> +
> +	if (args.buf_size > ocxlpmem->admin_command.data_size)
> +		return -EINVAL;
> +
> +	mutex_lock(&ocxlpmem->admin_command.lock);
> +
> +	rc = enable_fwdebug(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_FW_DEBUG);
> +	if (rc)
> +		goto out;
> +
> +	// Write DebugAction & FunctionCode
> +	val = ((u64)args.debug_action << 56) | ((u64)args.function_code << 40);
> +
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> +				      ocxlpmem->admin_command.request_offset + 0x08,
> +				      OCXL_LITTLE_ENDIAN, val);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> +				      ocxlpmem->admin_command.request_offset + 0x10,
> +				      OCXL_LITTLE_ENDIAN, args.debug_parameter_1);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> +				      ocxlpmem->admin_command.request_offset + 0x18,
> +				      OCXL_LITTLE_ENDIAN, args.debug_parameter_2);
> +	if (rc)
> +		goto out;
> +
> +	for (i = 0x20; i < 0x38; i += 0x08)
> +		rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> +					      ocxlpmem->admin_command.request_offset + i,
> +					      OCXL_LITTLE_ENDIAN, 0);
> +	if (rc)
> +		goto out;


The assignment to rc is the for loop body; the rc test is not, so only 
the result of the last write is checked.
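
A small userspace sketch of the effect, with fake_mmio_write64() as a 
hypothetical stand-in for ocxl_global_mmio_write64() that fails at one 
chosen offset:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for ocxl_global_mmio_write64(): fails at one offset. */
static int fake_mmio_write64(int offset, int failing_offset)
{
	return offset == failing_offset ? -EIO : 0;
}

/* As posted: only the result of the last iteration reaches the test. */
static int zero_fill_buggy(int failing_offset)
{
	int rc = 0;
	int i;

	for (i = 0x20; i < 0x38; i += 0x08)
		rc = fake_mmio_write64(i, failing_offset);
	if (rc)
		return rc;
	return 0;
}

/* With braces, the test runs after every write. */
static int zero_fill_fixed(int failing_offset)
{
	int rc;
	int i;

	for (i = 0x20; i < 0x38; i += 0x08) {
		rc = fake_mmio_write64(i, failing_offset);
		if (rc)
			return rc;
	}
	return 0;
}
```

A failure on the first or second write is silently overwritten in the 
first version; in the driver the early return would be a goto out.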


> +
> +
> +	// Populate admin command buffer
> +	if (args.buf_size) {
> +		for (i = 0; i < args.buf_size; i += sizeof(u64)) {
> +			u64 val;
> +
> +			if (copy_from_user(&val, &args.buf[i], sizeof(u64)))
> +				return -EFAULT;


Need to set rc and goto out here, because the mutex is held at this point.
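
Sketched in userspace, with a plain counter standing in for the mutex 
(fake_lock(), fake_unlock(), fake_copy_fails() and the two fill_buffer 
functions are hypothetical, for illustration only):

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a mutex: counts how many times it is held. */
static int lock_depth;
static void fake_lock(void) { lock_depth++; }
static void fake_unlock(void) { lock_depth--; }

/* Hypothetical stand-in for copy_from_user(): nonzero means a fault. */
static int fake_copy_fails(int fault) { return fault; }

/* As posted: the early return leaves the lock held. */
static int fill_buffer_buggy(int fault)
{
	int rc = 0;

	fake_lock();
	if (fake_copy_fails(fault))
		return -EFAULT;

	fake_unlock();
	return rc;
}

/* Every exit goes through the unlock at the out label. */
static int fill_buffer_fixed(int fault)
{
	int rc = 0;

	fake_lock();
	if (fake_copy_fails(fault)) {
		rc = -EFAULT;
		goto out;
	}
out:
	fake_unlock();
	return rc;
}
```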


> +
> +			rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> +						      ocxlpmem->admin_command.data_offset + i,
> +						      OCXL_HOST_ENDIAN, val);
> +			if (rc)
> +				goto out;
> +		}
> +	}
> +
> +	rc = admin_command_execute(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_complete_timeout(ocxlpmem,
> +					    ocxlpmem->timeouts[ADMIN_COMMAND_FW_DEBUG]);
> +	if (rc < 0)
> +		goto out;
> +
> +	rc = admin_response(ocxlpmem);
> +	if (rc < 0)
> +		goto out;
> +	if (rc != STATUS_SUCCESS) {
> +		warn_status(ocxlpmem, "Unexpected status from FW Debug", rc);
> +		goto out;
> +	}
> +
> +	if (args.buf_size) {
> +		for (i = 0; i < args.buf_size; i += sizeof(u64)) {
> +			u64 val;
> +
> +			rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +						     ocxlpmem->admin_command.data_offset + i,
> +						     OCXL_HOST_ENDIAN, &val);
> +			if (rc)
> +				goto out;
> +
> +			if (copy_to_user(&args.buf[i], &val, sizeof(u64))) {
> +				rc = -EFAULT;
> +				goto out;
> +			}
> +		}
> +	}
> +
> +	rc = admin_response_handled(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = disable_fwdebug(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +out:
> +	mutex_unlock(&ocxlpmem->admin_command.lock);
> +	return rc;
> +}
> +
> +static int ioctl_shutdown(struct ocxlpmem *ocxlpmem)
> +{
> +	int rc;
> +
> +	mutex_lock(&ocxlpmem->admin_command.lock);
> +
> +	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_SHUTDOWN);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_execute(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_complete_timeout(ocxlpmem, ADMIN_COMMAND_SHUTDOWN);
> +	if (rc < 0) {
> +		dev_warn(&ocxlpmem->dev, "Shutdown timed out\n");
> +		goto out;
> +	}
> +
> +	rc = 0;
> +	goto out;


We can remove that goto.

No admin_response_handled()? Is that shutting down the full adapter and 
we have nobody to talk to? What happens next?


> +
> +out:
> +	mutex_unlock(&ocxlpmem->admin_command.lock);
> +	return rc;
> +}
> +
> +static int ioctl_mmio_write(struct ocxlpmem *ocxlpmem,
> +				struct ioctl_ocxl_pmem_mmio __user *uarg)
> +{
> +	struct scm_ioctl_mmio args;
> +
> +	if (copy_from_user(&args, uarg, sizeof(args)))
> +		return -EFAULT;
> +
> +	return ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, args.address,
> +					OCXL_LITTLE_ENDIAN, args.val);
> +}
> +
> +static int ioctl_mmio_read(struct ocxlpmem *ocxlpmem,
> +				     struct ioctl_ocxl_pmem_mmio __user *uarg)
> +{
> +	struct ioctl_ocxl_pmem_mmio args;
> +	int rc;
> +
> +	if (copy_from_user(&args, uarg, sizeof(args)))
> +		return -EFAULT;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, args.address,
> +				     OCXL_LITTLE_ENDIAN, &args.val);
> +	if (rc)
> +		return rc;
> +
> +	if (copy_to_user(uarg, &args, sizeof(args)))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +#else /* CONFIG_OCXL_PMEM_DEBUG */
> +static int ioctl_fwdebug(struct ocxlpmem *ocxlpmem,
> +			     struct ioctl_ocxl_pmem_fwdebug __user *uarg)
> +{
> +	return -EPERM;
> +}
> +
> +static int ioctl_shutdown(struct ocxlpmem *ocxlpmem)
> +{
> +	return -EPERM;
> +}
> +
> +static int ioctl_mmio_write(struct ocxlpmem *ocxlpmem,
> +				struct ioctl_ocxl_pmem_mmio __user *uarg)
> +{
> +	return -EPERM;
> +}
> +
> +static int ioctl_mmio_read(struct ocxlpmem *ocxlpmem,
> +			       struct ioctl_ocxl_pmem_mmio __user *uarg)
> +{
> +	return -EPERM;
> +}


The 'else' clause could be dropped, the ioctls will return EINVAL, which 
is fine, I think.



> +#endif /* CONFIG_OCXL_PMEM_DEBUG */
> +
>   static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
>   {
>   	struct ocxlpmem *ocxlpmem = file->private_data;
> @@ -1091,6 +1320,26 @@ static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
>   	case IOCTL_OCXL_PMEM_REQUEST_HEALTH:
>   		rc = req_controller_health_perf(ocxlpmem);
>   		break;
> +
> +	case IOCTL_OCXL_PMEM_FWDEBUG:
> +		rc = ioctl_fwdebug(ocxlpmem,
> +				   (struct ioctl_ocxl_pmem_fwdebug __user *)args);
> +		break;
> +
> +	case IOCTL_OCXL_PMEM_SHUTDOWN:
> +		rc = ioctl_shutdown(ocxlpmem);
> +		break;
> +
> +	case IOCTL_OCXL_PMEM_MMIO_WRITE:
> +		rc = ioctl_mmio_write(ocxlpmem,
> +				      (struct ioctl_ocxl_pmem_mmio __user *)args);
> +		break;
> +
> +	case IOCTL_OCXL_PMEM_MMIO_READ:
> +		rc = ioctl_mmio_read(ocxlpmem,
> +				     (struct ioctl_ocxl_pmem_mmio __user *)args);
> +		break;
> +
>   	}
>   
>   	return rc;
> diff --git a/include/uapi/nvdimm/ocxl-pmem.h b/include/uapi/nvdimm/ocxl-pmem.h
> index 0d03abb44001..e20a4f8be82a 100644
> --- a/include/uapi/nvdimm/ocxl-pmem.h
> +++ b/include/uapi/nvdimm/ocxl-pmem.h
> @@ -6,6 +6,28 @@
>   #include <linux/types.h>
>   #include <linux/ioctl.h>
>   
> +enum ocxlpmem_fwdebug_action {
> +	OCXL_PMEM_FWDEBUG_READ_CONTROLLER_MEMORY = 0x01,
> +	OCXL_PMEM_FWDEBUG_WRITE_CONTROLLER_MEMORY = 0x02,
> +	OCXL_PMEM_FWDEBUG_ENABLE_FUNCTION = 0x03,
> +	OCXL_PMEM_FWDEBUG_DISABLE_FUNCTION = 0x04,
> +	OCXL_PMEM_FWDEBUG_GET_PEL = 0x05, // Retrieve Persistent Error Log
> +};
> +
> +struct ioctl_ocxl_pmem_buffer_info {
> +	__u32	admin_command_buffer_size; // out
> +	__u32	near_storage_buffer_size; // out
> +};
> +
> +struct ioctl_ocxl_pmem_fwdebug { // All args are inputs
> +	enum ocxlpmem_fwdebug_action debug_action;


More kernel ABI problems. My interpretation of the "enumeration 
specifiers" section of C99 is that we can't rely on the size of the enum.
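
One possible shape, as a sketch rather than a proposal for the final 
layout: every field gets a fixed width and padding is explicit. The 
struct name, the reserved field and carrying the buffer pointer as a 
64-bit integer are my assumptions, not something taken from the patch; 
stdint types stand in for the __u16/__u64 kernel types so the example 
builds in userspace. pack_request_word() just mirrors the shifts used 
in ioctl_fwdebug().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-width layout: no enum, no implicit padding. */
struct fwdebug_args_fixed {
	uint16_t debug_action;	/* one of the OCXL_PMEM_FWDEBUG_* values */
	uint16_t function_code;
	uint16_t buf_size;	/* size of optional data buffer */
	uint16_t reserved;	/* explicit padding, must be zero */
	uint64_t debug_parameter_1;
	uint64_t debug_parameter_2;
	uint64_t buf;		/* user pointer carried as a 64-bit integer */
};

/* The layout is the same regardless of compiler enum or pointer size. */
_Static_assert(sizeof(struct fwdebug_args_fixed) == 32,
	       "unexpected struct size");
_Static_assert(offsetof(struct fwdebug_args_fixed, buf) == 24,
	       "unexpected buf offset");

/* Same packing as in ioctl_fwdebug(): action in bits 63:56, code from bit 40. */
static uint64_t pack_request_word(const struct fwdebug_args_fixed *a)
{
	return ((uint64_t)(a->debug_action & 0xff) << 56) |
	       ((uint64_t)a->function_code << 40);
}
```

The enum can stay in the header as documentation of the valid values; it 
just should not appear as a member type.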


> +	__u16 function_code;
> +	__u16 buf_size; // Size of optional data buffer
> +	__u64 debug_parameter_1;
> +	__u64 debug_parameter_2;
> +	__u8 *buf; // Pointer to optional in/out data buffer
> +};
> +
>   #define OCXL_PMEM_ERROR_LOG_ACTION_RESET	(1 << (32-32))
>   #define OCXL_PMEM_ERROR_LOG_ACTION_CHKFW	(1 << (53-32))
>   #define OCXL_PMEM_ERROR_LOG_ACTION_REPLACE	(1 << (54-32))
> @@ -66,6 +88,11 @@ struct ioctl_ocxl_pmem_controller_stats {
>   	__u64 cache_write_latency; /* nanoseconds */
>   };
>   
> +struct ioctl_ocxl_pmem_mmio {
> +	__u64 address; /* Offset in global MMIO space */
> +	__u64 val; /* value to write/was read */
> +};


Can we group all the debug data structures together in the header file, 
with a comment indicating that they may not be available in the kernel, 
depending on the config?

   Fred


> +
>   struct ioctl_ocxl_pmem_eventfd {
>   	__s32 eventfd;
>   	__u32 reserved;
> @@ -92,4 +119,9 @@ struct ioctl_ocxl_pmem_eventfd {
>   #define IOCTL_OCXL_PMEM_EVENT_CHECK			_IOR(OCXL_PMEM_MAGIC, 0x07, __u64)
>   #define IOCTL_OCXL_PMEM_REQUEST_HEALTH			_IO(OCXL_PMEM_MAGIC, 0x08)
>   
> +#define IOCTL_OCXL_PMEM_FWDEBUG		_IOWR(OCXL_PMEM_MAGIC, 0xf0, struct ioctl_ocxl_pmem_fwdebug)
> +#define IOCTL_OCXL_PMEM_MMIO_WRITE	_IOW(OCXL_PMEM_MAGIC, 0xf1, struct ioctl_ocxl_pmem_mmio)
> +#define IOCTL_OCXL_PMEM_MMIO_READ	_IOWR(OCXL_PMEM_MAGIC, 0xf2, struct ioctl_ocxl_pmem_mmio)
> +#define IOCTL_OCXL_PMEM_SHUTDOWN	_IO(OCXL_PMEM_MAGIC, 0xf3)
> +
>   #endif /* _UAPI_OCXL_SCM_H */
> 

* Re: [PATCH v3 24/27] powerpc/powernv/pmem: Expose SMART data via ndctl
  2020-02-21  3:27 ` [PATCH v3 24/27] powerpc/powernv/pmem: Expose SMART data via ndctl Alastair D'Silva
@ 2020-03-04 15:40   ` Frederic Barrat
  2020-03-05  3:36   ` Andrew Donnellan
  1 sibling, 0 replies; 130+ messages in thread
From: Frederic Barrat @ 2020-03-04 15:40 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel



On 21/02/2020 at 04:27, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> This patch retrieves proprietary formatted SMART data and makes it
> available via ndctl. A later contribution will be made to ndctl to
> parse this data.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---


Nothing new to add compared to previous patches with similarities.

   Fred



>   arch/powerpc/platforms/powernv/pmem/ocxl.c    | 128 ++++++++++++++++++
>   .../platforms/powernv/pmem/ocxl_internal.h    |  18 +++
>   include/uapi/linux/ndctl.h                    |   1 +
>   3 files changed, 147 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> index d4ce5e9e0521..5cd1b6d78dd6 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -81,6 +81,129 @@ static int ndctl_config_size(struct nd_cmd_get_config_size *command)
>   	return 0;
>   }
>   
> +/**
> + * smart_header_parse() - Parse the first 64 bits of the SMART admin command response
> + * @ocxlpmem: the device metadata
> + * @length: out, returns the number of bytes in the response (excluding the 64 bit header)
> + */
> +static int smart_header_parse(struct ocxlpmem *ocxlpmem, u32 *length)
> +{
> +	int rc;
> +	u64 val;
> +
> +	u16 data_identifier;
> +	u32 data_length;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		return rc;
> +
> +	data_identifier = val >> 48;
> +	data_length = val & 0xFFFFFFFF;
> +
> +	if (data_identifier != 0x534D) { // 'SM'
> +		dev_err(&ocxlpmem->dev,
> +			"Bad data identifier for smart data, expected 'SM', got '%-.*s'\n",
> +			2, (char *)&data_identifier);
> +		return -EINVAL;
> +	}
> +
> +	*length = data_length;
> +	return 0;
> +}
> +
> +static int ndctl_smart(struct ocxlpmem *ocxlpmem, struct nd_cmd_pkg *pkg)
> +{
> +	u32 length, i;
> +	struct nd_ocxl_smart *out;
> +	int rc;
> +
> +	mutex_lock(&ocxlpmem->admin_command.lock);
> +
> +	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_SMART);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_execute(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_complete_timeout(ocxlpmem, ADMIN_COMMAND_SMART);
> +	if (rc < 0) {
> +		dev_err(&ocxlpmem->dev, "SMART timeout\n");
> +		goto out;
> +	}
> +
> +	rc = admin_response(ocxlpmem);
> +	if (rc < 0)
> +		goto out;
> +	if (rc != STATUS_SUCCESS) {
> +		warn_status(ocxlpmem, "Unexpected status from SMART", rc);
> +		goto out;
> +	}
> +
> +	rc = smart_header_parse(ocxlpmem, &length);
> +	if (rc)
> +		goto out;
> +
> +	pkg->nd_fw_size = length;
> +
> +	length = min(length, pkg->nd_size_out); // bytes
> +	out = (struct nd_ocxl_smart *)pkg->nd_payload;
> +	// Each SMART attribute is 2 * 64 bits
> +	out->count = length / (2 * sizeof(u64)); // attributes
> +
> +	for (i = 0; i < length; i += sizeof(u64)) {
> +		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +					     ocxlpmem->admin_command.data_offset + sizeof(u64) + i,
> +					     OCXL_LITTLE_ENDIAN,
> +					     &out->attribs[i/sizeof(u64)]);
> +		if (rc)
> +			goto out;
> +	}
> +
> +	rc = admin_response_handled(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = 0;
> +	goto out;
> +
> +out:
> +	mutex_unlock(&ocxlpmem->admin_command.lock);
> +	return rc;
> +}
> +
> +static int ndctl_call(struct ocxlpmem *ocxlpmem, void *buf, unsigned int buf_len)
> +{
> +	struct nd_cmd_pkg *pkg = buf;
> +
> +	if (buf_len < sizeof(struct nd_cmd_pkg)) {
> +		dev_err(&ocxlpmem->dev, "Invalid ND_CALL size=%u\n", buf_len);
> +		return -EINVAL;
> +	}
> +
> +	if (pkg->nd_family != NVDIMM_FAMILY_OCXL) {
> +		dev_err(&ocxlpmem->dev, "Invalid ND_CALL family=0x%llx\n", pkg->nd_family);
> +		return -EINVAL;
> +	}
> +
> +	switch (pkg->nd_command) {
> +	case ND_CMD_OCXL_SMART:
> +		ndctl_smart(ocxlpmem, pkg);
> +		break;
> +
> +	default:
> +		dev_err(&ocxlpmem->dev, "Invalid ND_CALL command=0x%llx\n", pkg->nd_command);
> +		return -EINVAL;
> +	}
> +
> +
> +	return 0;
> +}
> +
>   static int ndctl(struct nvdimm_bus_descriptor *nd_desc,
>   		 struct nvdimm *nvdimm,
>   		 unsigned int cmd, void *buf, unsigned int buf_len, int *cmd_rc)
> @@ -88,6 +211,10 @@ static int ndctl(struct nvdimm_bus_descriptor *nd_desc,
>   	struct ocxlpmem *ocxlpmem = container_of(nd_desc, struct ocxlpmem, bus_desc);
>   
>   	switch (cmd) {
> +	case ND_CMD_CALL:
> +		*cmd_rc = ndctl_call(ocxlpmem, buf, buf_len);
> +		return 0;
> +
>   	case ND_CMD_GET_CONFIG_SIZE:
>   		*cmd_rc = ndctl_config_size(buf);
>   		return 0;
> @@ -171,6 +298,7 @@ static int register_lpc_mem(struct ocxlpmem *ocxlpmem)
>   	set_bit(ND_CMD_GET_CONFIG_SIZE, &nvdimm_cmd_mask);
>   	set_bit(ND_CMD_GET_CONFIG_DATA, &nvdimm_cmd_mask);
>   	set_bit(ND_CMD_SET_CONFIG_DATA, &nvdimm_cmd_mask);
> +	set_bit(ND_CMD_CALL, &nvdimm_cmd_mask);
>   
>   	set_bit(NDD_ALIASING, &nvdimm_flags);
>   
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> index 927690f4888f..0eb7a35d24ae 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> @@ -7,6 +7,7 @@
>   #include <linux/libnvdimm.h>
>   #include <uapi/nvdimm/ocxl-pmem.h>
>   #include <linux/mm.h>
> +#include <linux/ndctl.h>
>   
>   #define LABEL_AREA_SIZE	(1UL << PA_SECTION_SHIFT)
>   #define DEFAULT_TIMEOUT 100
> @@ -98,6 +99,23 @@ struct ocxlpmem_function0 {
>   	struct ocxl_fn *ocxl_fn;
>   };
>   
> +struct nd_ocxl_smart {
> +	__u8 count;
> +	__u8 reserved[7];
> +	__u64 attribs[0];
> +} __packed;
> +
> +struct nd_pkg_ocxl {
> +	struct nd_cmd_pkg gen;
> +	union {
> +		struct nd_ocxl_smart smart;
> +	};
> +};
> +
> +enum nd_cmd_ocxl {
> +	ND_CMD_OCXL_SMART = 1,
> +};
> +
>   struct ocxlpmem {
>   	struct device dev;
>   	struct pci_dev *pdev;
> diff --git a/include/uapi/linux/ndctl.h b/include/uapi/linux/ndctl.h
> index de5d90212409..2885052e7f40 100644
> --- a/include/uapi/linux/ndctl.h
> +++ b/include/uapi/linux/ndctl.h
> @@ -244,6 +244,7 @@ struct nd_cmd_pkg {
>   #define NVDIMM_FAMILY_HPE2 2
>   #define NVDIMM_FAMILY_MSFT 3
>   #define NVDIMM_FAMILY_HYPERV 4
> +#define NVDIMM_FAMILY_OCXL 6
>   
>   #define ND_IOCTL_CALL			_IOWR(ND_IOCTL, ND_CMD_CALL,\
>   					struct nd_cmd_pkg)
> 
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org


* Re: [PATCH v3 19/27] powerpc/powernv/pmem: Add an IOCTL to report controller statistics
  2020-02-21  3:27 ` [PATCH v3 19/27] powerpc/powernv/pmem: Add an IOCTL to report controller statistics Alastair D'Silva
  2020-03-04  9:25   ` Frederic Barrat
@ 2020-03-05  0:46   ` Andrew Donnellan
  2020-03-12  4:47     ` Alastair D'Silva
  1 sibling, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-03-05  0:46 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> The controller can report a number of statistics that are useful
> in evaluating the performance and reliability of the card.
> 
> This patch exposes this information via an IOCTL.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/platforms/powernv/pmem/ocxl.c | 185 +++++++++++++++++++++
>   include/uapi/nvdimm/ocxl-pmem.h            |  17 ++
>   2 files changed, 202 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> index 2cabafe1fc58..009d4fd29e7d 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -758,6 +758,186 @@ static int ioctl_controller_dump_complete(struct ocxlpmem *ocxlpmem)
>   				    GLOBAL_MMIO_HCI_CONTROLLER_DUMP_COLLECTED);
>   }
>   
> +/**
> + * controller_stats_header_parse() - Parse the first 64 bits of the controller stats admin command response
> + * @ocxlpmem: the device metadata
> + * @length: out, returns the number of bytes in the response (excluding the 64 bit header)
> + */
> +static int controller_stats_header_parse(struct ocxlpmem *ocxlpmem,
> +	u32 *length)
> +{
> +	int rc;
> +	u64 val;
> +
> +	u16 data_identifier;
> +	u32 data_length;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		return rc;
> +
> +	data_identifier = val >> 48;
> +	data_length = val & 0xFFFFFFFF;
> +
> +	if (data_identifier != 0x4353) { // 'CS'
> +		dev_err(&ocxlpmem->dev,
> +			"Bad data identifier for controller stats, expected 'CS', got '%-.*s'\n",
> +			2, (char *)&data_identifier);
> +		return -EINVAL;

Same comment as earlier patches re EINVAL

> +	}
> +
> +	*length = data_length;
> +	return 0;
> +}
> +
> +static int ioctl_controller_stats(struct ocxlpmem *ocxlpmem,
> +				  struct ioctl_ocxl_pmem_controller_stats __user *uarg)
> +{
> +	struct ioctl_ocxl_pmem_controller_stats args;
> +	u32 length;
> +	int rc;
> +	u64 val;
> +
> +	memset(&args, '\0', sizeof(args));
> +
> +	mutex_lock(&ocxlpmem->admin_command.lock);
> +
> +	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_CONTROLLER_STATS);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> +				      ocxlpmem->admin_command.request_offset + 0x08,
> +				      OCXL_LITTLE_ENDIAN, 0);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_execute(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +
> +	rc = admin_command_complete_timeout(ocxlpmem,
> +					    ADMIN_COMMAND_CONTROLLER_STATS);
> +	if (rc < 0) {
> +		dev_warn(&ocxlpmem->dev, "Controller stats timed out\n");
> +		goto out;
> +	}
> +
> +	rc = admin_response(ocxlpmem);
> +	if (rc < 0)
> +		goto out;
> +	if (rc != STATUS_SUCCESS) {
> +		warn_status(ocxlpmem,
> +			    "Unexpected status from controller stats", rc);
> +		goto out;
> +	}
> +
> +	rc = controller_stats_header_parse(ocxlpmem, &length);
> +	if (rc)
> +		goto out;
> +
> +	if (length != 0x140)
> +		warn_status(ocxlpmem,
> +			    "Unexpected length for controller stats data, expected 0x140, got 0x%x",
> +			    length);

Might be worth a comment to explain where 0x140 comes from (it looks 
correct from my reading of the spec)

> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x08,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		goto out;
> +
> +	args.reset_count = val >> 32;
> +	args.reset_uptime = val & 0xFFFFFFFF;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x10,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		goto out;
> +
> +	args.power_on_uptime = val >> 32;

We're not collecting life remaining?

> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x08,
> +				     OCXL_LITTLE_ENDIAN, &args.host_load_count);

My reading of the spec says HLC is at +0x10

> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x10,
> +				     OCXL_LITTLE_ENDIAN, &args.host_store_count);

HSC at +0x18

> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x18,
> +				     OCXL_LITTLE_ENDIAN, &args.media_read_count);

MRC is at +0x50

And you're missing CRU, HLD, HSD

> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x20,
> +				     OCXL_LITTLE_ENDIAN, &args.media_write_count);

MWC at +0x58

> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x28,
> +				     OCXL_LITTLE_ENDIAN, &args.cache_hit_count);

CRHC at +0x90

> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x30,
> +				     OCXL_LITTLE_ENDIAN, &args.cache_miss_count);

This field doesn't seem to exist at all in my copy of the spec

> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x38,
> +				     OCXL_LITTLE_ENDIAN, &args.media_read_latency);

Nor this one

> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x40,
> +				     OCXL_LITTLE_ENDIAN, &args.media_write_latency);

Nor this one

> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x48,
> +				     OCXL_LITTLE_ENDIAN, &args.cache_read_latency);

Nor this one

> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x50,
> +				     OCXL_LITTLE_ENDIAN, &args.cache_write_latency);

Nor this one

> +	if (rc)
> +		goto out;
> +
> +	if (copy_to_user(uarg, &args, sizeof(args))) {
> +		rc = -EFAULT;
> +		goto out;
> +	}
> +
> +	rc = admin_response_handled(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = 0;
> +	goto out;

Per Fred this pattern isn't common in the kernel, but perhaps this is 
just personal taste
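
For reference, a minimal userspace sketch of the more common form, where the
success path simply falls through to the label. The names do_step() and
run_steps() are hypothetical stand-ins, not functions from the driver, and
unlock_count only models the mutex_unlock() call:

```c
#include <errno.h>

/* Hypothetical stand-in for one admin command step: fails when asked to. */
static int do_step(int should_fail)
{
	return should_fail ? -EIO : 0;
}

/*
 * unlock_count stands in for mutex_unlock(); it must be bumped exactly
 * once on every path, success or failure.
 */
static int run_steps(int fail_first, int fail_second, int *unlock_count)
{
	int rc;

	rc = do_step(fail_first);
	if (rc)
		goto out;

	rc = do_step(fail_second);
	/*
	 * No "rc = 0; goto out;" here: rc already holds the result of the
	 * last step and control falls through to the label.
	 */
out:
	(*unlock_count)++;
	return rc;
}
```

Both forms behave the same; the fall-through version just avoids a jump to
the very next line.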

> +
> +out:
> +	mutex_unlock(&ocxlpmem->admin_command.lock);
> +	return rc;
> +}
> +
>   static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
>   {
>   	struct ocxlpmem *ocxlpmem = file->private_data;
> @@ -781,6 +961,11 @@ static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
>   	case IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE:
>   		rc = ioctl_controller_dump_complete(ocxlpmem);
>   		break;
> +
> +	case IOCTL_OCXL_PMEM_CONTROLLER_STATS:
> +		rc = ioctl_controller_stats(ocxlpmem,
> +					    (struct ioctl_ocxl_pmem_controller_stats __user *)args);
> +		break;
>   	}
>   
>   	return rc;
> diff --git a/include/uapi/nvdimm/ocxl-pmem.h b/include/uapi/nvdimm/ocxl-pmem.h
> index d4d8512d03f7..add223aa2fdb 100644
> --- a/include/uapi/nvdimm/ocxl-pmem.h
> +++ b/include/uapi/nvdimm/ocxl-pmem.h
> @@ -50,6 +50,22 @@ struct ioctl_ocxl_pmem_controller_dump_data {
>   	__u64 reserved[8];
>   };
>   
> +struct ioctl_ocxl_pmem_controller_stats {
> +	__u32 reset_count;
> +	__u32 reset_uptime; /* seconds */
> +	__u32 power_on_uptime; /* seconds */
> +	__u64 host_load_count;
> +	__u64 host_store_count;
> +	__u64 media_read_count;
> +	__u64 media_write_count;
> +	__u64 cache_hit_count;
> +	__u64 cache_miss_count;
> +	__u64 media_read_latency; /* nanoseconds */
> +	__u64 media_write_latency; /* nanoseconds */
> +	__u64 cache_read_latency; /* nanoseconds */
> +	__u64 cache_write_latency; /* nanoseconds */
> +};
> +
>   /* ioctl numbers */
>   #define OCXL_PMEM_MAGIC 0x5C
>   /* SCM devices */
> @@ -57,5 +73,6 @@ struct ioctl_ocxl_pmem_controller_dump_data {
>   #define IOCTL_OCXL_PMEM_CONTROLLER_DUMP			_IO(OCXL_PMEM_MAGIC, 0x02)
>   #define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_DATA		_IOWR(OCXL_PMEM_MAGIC, 0x03, struct ioctl_ocxl_pmem_controller_dump_data)
>   #define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE	_IO(OCXL_PMEM_MAGIC, 0x04)
> +#define IOCTL_OCXL_PMEM_CONTROLLER_STATS		_IO(OCXL_PMEM_MAGIC, 0x05)
>   
>   #endif /* _UAPI_OCXL_SCM_H */
> 

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited


* Re: [PATCH v3 23/27] powerpc/powernv/pmem: Add debug IOCTLs
  2020-02-21  3:27 ` [PATCH v3 23/27] powerpc/powernv/pmem: Add debug IOCTLs Alastair D'Silva
  2020-03-04 15:21   ` Frederic Barrat
@ 2020-03-05  3:11   ` Andrew Donnellan
  2020-03-12  4:58     ` Alastair D'Silva
  1 sibling, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-03-05  3:11 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> These IOCTLs provide low level access to the card to aid in debugging
> controller/FPGA firmware.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>   arch/powerpc/platforms/powernv/pmem/Kconfig |   6 +
>   arch/powerpc/platforms/powernv/pmem/ocxl.c  | 249 ++++++++++++++++++++
>   include/uapi/nvdimm/ocxl-pmem.h             |  32 +++
>   3 files changed, 287 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/pmem/Kconfig b/arch/powerpc/platforms/powernv/pmem/Kconfig
> index c5d927520920..3f44429d70c9 100644
> --- a/arch/powerpc/platforms/powernv/pmem/Kconfig
> +++ b/arch/powerpc/platforms/powernv/pmem/Kconfig
> @@ -12,4 +12,10 @@ config OCXL_PMEM
>   
>   	  Select N if unsure.
>   
> +config OCXL_PMEM_DEBUG
> +	bool "OpenCAPI Persistent Memory debugging"
> +	depends on OCXL_PMEM
> +	help
> +	  Enables low level IOCTLs for OpenCAPI Persistent Memory firmware development
> +

How dangerous are these ioctls and does that need to be pointed out in 
this description?

>   endif
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> index e01f6f9fc180..d4ce5e9e0521 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -1050,6 +1050,235 @@ int req_controller_health_perf(struct ocxlpmem *ocxlpmem)
>   				      GLOBAL_MMIO_HCI_REQ_HEALTH_PERF);
>   }
>   
> +#ifdef CONFIG_OCXL_PMEM_DEBUG
> +/**
> + * enable_fwdebug() - Enable FW debug on the controller
> + * @ocxlpmem: the device metadata
> + * Return: 0 on success, negative on failure
> + */
> +static int enable_fwdebug(const struct ocxlpmem *ocxlpmem)
> +{
> +	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCI,
> +				      OCXL_LITTLE_ENDIAN,
> +				      GLOBAL_MMIO_HCI_FW_DEBUG);
> +}
> +
> +/**
> + * disable_fwdebug() - Disable FW debug on the controller
> + * @ocxlpmem: the device metadata
> + * Return: 0 on success, negative on failure
> + */
> +static int disable_fwdebug(const struct ocxlpmem *ocxlpmem)
> +{
> +	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCIC,
> +				      OCXL_LITTLE_ENDIAN,
> +				      GLOBAL_MMIO_HCI_FW_DEBUG);
> +}
> +
> +static int ioctl_fwdebug(struct ocxlpmem *ocxlpmem,
> +			     struct ioctl_ocxl_pmem_fwdebug __user *uarg)
> +{
> +	struct ioctl_ocxl_pmem_fwdebug args;
> +	u64 val;
> +	int i;
> +	int rc;
> +
> +	if (copy_from_user(&args, uarg, sizeof(args)))
> +		return -EFAULT;
> +
> +	// Buffer size must be a multiple of 8
> +	if ((args.buf_size & 0x07))
> +		return -EINVAL;
> +
> +	if (args.buf_size > ocxlpmem->admin_command.data_size)
> +		return -EINVAL;
> +
> +	mutex_lock(&ocxlpmem->admin_command.lock);
> +
> +	rc = enable_fwdebug(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_FW_DEBUG);
> +	if (rc)
> +		goto out;
> +
> +	// Write DebugAction & FunctionCode
> +	val = ((u64)args.debug_action << 56) | ((u64)args.function_code << 40);
> +
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> +				      ocxlpmem->admin_command.request_offset + 0x08,
> +				      OCXL_LITTLE_ENDIAN, val);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> +				      ocxlpmem->admin_command.request_offset + 0x10,
> +				      OCXL_LITTLE_ENDIAN, args.debug_parameter_1);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> +				      ocxlpmem->admin_command.request_offset + 0x18,
> +				      OCXL_LITTLE_ENDIAN, args.debug_parameter_2);
> +	if (rc)
> +		goto out;
> +
> +	for (i = 0x20; i < 0x38; i += 0x08)

Comparison should be <=, the request block ends at 0x40.

But in any case, scm_command_request() should I think already handle the 
clearing of the request block?
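
To make the bound concrete, here is a small userspace sketch that counts
which offsets each comparison visits. It assumes, as stated above, that the
request block ends at 0x40 and is written in 8-byte words; count_cleared()
is a hypothetical helper, not driver code:

```c
#include <stddef.h>

/*
 * Walk 8-byte words starting at 0x20, using either "<" or "<=" against
 * limit, and report how many words were visited and the last offset seen.
 */
static size_t count_cleared(unsigned int limit, int inclusive,
			    unsigned int *last)
{
	size_t n = 0;
	unsigned int i;

	for (i = 0x20; inclusive ? i <= limit : i < limit; i += 0x08) {
		*last = i;
		n++;
	}

	return n;
}
```

With "i < 0x38" the walk stops after 0x30, so the word at 0x38 is never
cleared; with "i <= 0x38" it is.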

> +		rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> +					      ocxlpmem->admin_command.request_offset + i,
> +					      OCXL_LITTLE_ENDIAN, 0);
> +	if (rc)
> +		goto out;
> +
> +
> +	// Populate admin command buffer
> +	if (args.buf_size) {
> +		for (i = 0; i < args.buf_size; i += sizeof(u64)) {
> +			u64 val;
> +
> +			if (copy_from_user(&val, &args.buf[i], sizeof(u64)))
> +				return -EFAULT;
> +
> +			rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> +						      ocxlpmem->admin_command.data_offset + i,
> +						      OCXL_HOST_ENDIAN, val);
> +			if (rc)
> +				goto out;
> +		}
> +	}
> +
> +	rc = admin_command_execute(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_complete_timeout(ocxlpmem,
> +					    ocxlpmem->timeouts[ADMIN_COMMAND_FW_DEBUG]);
> +	if (rc < 0)
> +		goto out;
> +
> +	rc = admin_response(ocxlpmem);
> +	if (rc < 0)
> +		goto out;
> +	if (rc != STATUS_SUCCESS) {
> +		warn_status(ocxlpmem, "Unexpected status from FW Debug", rc);
> +		goto out;
> +	}
> +
> +	if (args.buf_size) {
> +		for (i = 0; i < args.buf_size; i += sizeof(u64)) {
> +			u64 val;
> +
> +			rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +						     ocxlpmem->admin_command.data_offset + i,
> +						     OCXL_HOST_ENDIAN, &val);

No check of the data identifier?

It seems to me that there's no definition in the spec whatsoever for the 
format of the data, so just copying as much as fits in the buffer seems 
correct.

> +			if (rc)
> +				goto out;
> +
> +			if (copy_to_user(&args.buf[i], &val, sizeof(u64))) {
> +				rc = -EFAULT;
> +				goto out;
> +			}
> +		}
> +	}
> +
> +	rc = admin_response_handled(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = disable_fwdebug(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +out:
> +	mutex_unlock(&ocxlpmem->admin_command.lock);
> +	return rc;
> +}
> +
> +static int ioctl_shutdown(struct ocxlpmem *ocxlpmem)
> +{
> +	int rc;
> +
> +	mutex_lock(&ocxlpmem->admin_command.lock);
> +
> +	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_SHUTDOWN);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_execute(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_complete_timeout(ocxlpmem, ADMIN_COMMAND_SHUTDOWN);
> +	if (rc < 0) {
> +		dev_warn(&ocxlpmem->dev, "Shutdown timed out\n");
> +		goto out;
> +	}
> +
> +	rc = 0;
> +	goto out;
> +
> +out:
> +	mutex_unlock(&ocxlpmem->admin_command.lock);
> +	return rc;
> +}
> +
> +static int ioctl_mmio_write(struct ocxlpmem *ocxlpmem,
> +				struct ioctl_ocxl_pmem_mmio __user *uarg)
> +{
> +	struct scm_ioctl_mmio args;
> +
> +	if (copy_from_user(&args, uarg, sizeof(args)))
> +		return -EFAULT;
> +
> +	return ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, args.address,
> +					OCXL_LITTLE_ENDIAN, args.val);
> +}
> +
> +static int ioctl_mmio_read(struct ocxlpmem *ocxlpmem,
> +				     struct ioctl_ocxl_pmem_mmio __user *uarg)
> +{
> +	struct ioctl_ocxl_pmem_mmio args;
> +	int rc;
> +
> +	if (copy_from_user(&args, uarg, sizeof(args)))
> +		return -EFAULT;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, args.address,
> +				     OCXL_LITTLE_ENDIAN, &args.val);
> +	if (rc)
> +		return rc;
> +
> +	if (copy_to_user(uarg, &args, sizeof(args)))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +#else /* CONFIG_OCXL_PMEM_DEBUG */
> +static int ioctl_fwdebug(struct ocxlpmem *ocxlpmem,
> +			     struct ioctl_ocxl_pmem_fwdebug __user *uarg)
> +{
> +	return -EPERM;
> +}
> +
> +static int ioctl_shutdown(struct ocxlpmem *ocxlpmem)
> +{
> +	return -EPERM;
> +}
> +
> +static int ioctl_mmio_write(struct ocxlpmem *ocxlpmem,
> +				struct ioctl_ocxl_pmem_mmio __user *uarg)
> +{
> +	return -EPERM;
> +}
> +
> +static int ioctl_mmio_read(struct ocxlpmem *ocxlpmem,
> +			       struct ioctl_ocxl_pmem_mmio __user *uarg)
> +{
> +	return -EPERM;
> +}
> +#endif /* CONFIG_OCXL_PMEM_DEBUG */
> +
>   static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
>   {
>   	struct ocxlpmem *ocxlpmem = file->private_data;
> @@ -1091,6 +1320,26 @@ static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
>   	case IOCTL_OCXL_PMEM_REQUEST_HEALTH:
>   		rc = req_controller_health_perf(ocxlpmem);
>   		break;
> +
> +	case IOCTL_OCXL_PMEM_FWDEBUG:
> +		rc = ioctl_fwdebug(ocxlpmem,
> +				   (struct ioctl_ocxl_pmem_fwdebug __user *)args);
> +		break;
> +
> +	case IOCTL_OCXL_PMEM_SHUTDOWN:
> +		rc = ioctl_shutdown(ocxlpmem);
> +		break;
> +
> +	case IOCTL_OCXL_PMEM_MMIO_WRITE:
> +		rc = ioctl_mmio_write(ocxlpmem,
> +				      (struct ioctl_ocxl_pmem_mmio __user *)args);
> +		break;
> +
> +	case IOCTL_OCXL_PMEM_MMIO_READ:
> +		rc = ioctl_mmio_read(ocxlpmem,
> +				     (struct ioctl_ocxl_pmem_mmio __user *)args);
> +		break;
> +
>   	}
>   
>   	return rc;
> diff --git a/include/uapi/nvdimm/ocxl-pmem.h b/include/uapi/nvdimm/ocxl-pmem.h
> index 0d03abb44001..e20a4f8be82a 100644
> --- a/include/uapi/nvdimm/ocxl-pmem.h
> +++ b/include/uapi/nvdimm/ocxl-pmem.h
> @@ -6,6 +6,28 @@
>   #include <linux/types.h>
>   #include <linux/ioctl.h>
>   
> +enum ocxlpmem_fwdebug_action {
> +	OCXL_PMEM_FWDEBUG_READ_CONTROLLER_MEMORY = 0x01,
> +	OCXL_PMEM_FWDEBUG_WRITE_CONTROLLER_MEMORY = 0x02,
> +	OCXL_PMEM_FWDEBUG_ENABLE_FUNCTION = 0x03,
> +	OCXL_PMEM_FWDEBUG_DISABLE_FUNCTION = 0x04,
> +	OCXL_PMEM_FWDEBUG_GET_PEL = 0x05, // Retrieve Persistent Error Log
> +};
> +
> +struct ioctl_ocxl_pmem_buffer_info {
> +	__u32	admin_command_buffer_size; // out
> +	__u32	near_storage_buffer_size; // out
> +};

This struct seems unused.

> +
> +struct ioctl_ocxl_pmem_fwdebug { // All args are inputs
> +	enum ocxlpmem_fwdebug_action debug_action;
> +	__u16 function_code;
> +	__u16 buf_size; // Size of optional data buffer
> +	__u64 debug_parameter_1;
> +	__u64 debug_parameter_2;
> +	__u8 *buf; // Pointer to optional in/out data buffer
> +};
> +
>   #define OCXL_PMEM_ERROR_LOG_ACTION_RESET	(1 << (32-32))
>   #define OCXL_PMEM_ERROR_LOG_ACTION_CHKFW	(1 << (53-32))
>   #define OCXL_PMEM_ERROR_LOG_ACTION_REPLACE	(1 << (54-32))
> @@ -66,6 +88,11 @@ struct ioctl_ocxl_pmem_controller_stats {
>   	__u64 cache_write_latency; /* nanoseconds */
>   };
>   
> +struct ioctl_ocxl_pmem_mmio {
> +	__u64 address; /* Offset in global MMIO space */
> +	__u64 val; /* value to write/was read */
> +};
> +
>   struct ioctl_ocxl_pmem_eventfd {
>   	__s32 eventfd;
>   	__u32 reserved;
> @@ -92,4 +119,9 @@ struct ioctl_ocxl_pmem_eventfd {
>   #define IOCTL_OCXL_PMEM_EVENT_CHECK			_IOR(OCXL_PMEM_MAGIC, 0x07, __u64)
>   #define IOCTL_OCXL_PMEM_REQUEST_HEALTH			_IO(OCXL_PMEM_MAGIC, 0x08)
>   
> +#define IOCTL_OCXL_PMEM_FWDEBUG		_IOWR(OCXL_PMEM_MAGIC, 0xf0, struct ioctl_ocxl_pmem_fwdebug)
> +#define IOCTL_OCXL_PMEM_MMIO_WRITE	_IOW(OCXL_PMEM_MAGIC, 0xf1, struct ioctl_ocxl_pmem_mmio)
> +#define IOCTL_OCXL_PMEM_MMIO_READ	_IOWR(OCXL_PMEM_MAGIC, 0xf2, struct ioctl_ocxl_pmem_mmio)
> +#define IOCTL_OCXL_PMEM_SHUTDOWN	_IO(OCXL_PMEM_MAGIC, 0xf3)
> +
>   #endif /* _UAPI_OCXL_SCM_H */
> 

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited


* Re: [PATCH v3 24/27] powerpc/powernv/pmem: Expose SMART data via ndctl
  2020-02-21  3:27 ` [PATCH v3 24/27] powerpc/powernv/pmem: Expose SMART data via ndctl Alastair D'Silva
  2020-03-04 15:40   ` Frederic Barrat
@ 2020-03-05  3:36   ` Andrew Donnellan
  2020-03-12 23:14     ` Alastair D'Silva
  1 sibling, 1 reply; 130+ messages in thread
From: Andrew Donnellan @ 2020-03-05  3:36 UTC (permalink / raw)
  To: Alastair D'Silva, alastair
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> +static int ndctl_smart(struct ocxlpmem *ocxlpmem, struct nd_cmd_pkg *pkg)
> +{
> +	u32 length, i;
> +	struct nd_ocxl_smart *out;
> +	int rc;
> +
> +	mutex_lock(&ocxlpmem->admin_command.lock);
> +
> +	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_SMART);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_execute(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_complete_timeout(ocxlpmem, ADMIN_COMMAND_SMART);
> +	if (rc < 0) {
> +		dev_err(&ocxlpmem->dev, "SMART timeout\n");
> +		goto out;
> +	}
> +
> +	rc = admin_response(ocxlpmem);
> +	if (rc < 0)
> +		goto out;
> +	if (rc != STATUS_SUCCESS) {
> +		warn_status(ocxlpmem, "Unexpected status from SMART", rc);
> +		goto out;
> +	}
> +
> +	rc = smart_header_parse(ocxlpmem, &length);
> +	if (rc)
> +		goto out;
> +
> +	pkg->nd_fw_size = length;
> +
> +	length = min(length, pkg->nd_size_out); // bytes
> +	out = (struct nd_ocxl_smart *)pkg->nd_payload;
> +	// Each SMART attribute is 2 * 64 bits
> +	out->count = length / (2 * sizeof(u64)); // attributes

From what I can tell, 8 bytes of nd_ocxl_smart are taken up by the count
and reserved bytes, so this could overrun the user buffer.
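
A rough userspace sketch of the clamp I have in mind, assuming nd_size_out
covers the whole nd_ocxl_smart payload including its 8 header bytes. The
struct and smart_attrib_bytes() are illustrative names only, modelled on the
layout in the patch:

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace model of the patch's layout: 8 header bytes, then attributes. */
struct smart_payload {
	uint8_t count;
	uint8_t reserved[7];
	uint64_t attribs[];
};

/*
 * Number of attribute bytes that can be copied out: the firmware length,
 * limited to what is left of the output buffer after the header.
 * Returns 0 if the buffer cannot hold anything beyond the header.
 */
static uint32_t smart_attrib_bytes(uint32_t fw_length, uint32_t size_out)
{
	uint32_t room;

	if (size_out < offsetof(struct smart_payload, attribs))
		return 0;

	room = size_out - (uint32_t)offsetof(struct smart_payload, attribs);

	return fw_length < room ? fw_length : room;
}
```

The sketch does not round down to whole attributes (2 * 64 bits each); that
would be a separate decision.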

> +
> +	for (i = 0; i < length; i += sizeof(u64)) {

It might be neater to make i count up by 1 and then multiply by 
sizeof(u64) later.
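
Something along these lines, as a userspace sketch only. read64() is a
hypothetical stand-in for ocxl_global_mmio_read64() that reads from a plain
buffer, and the 0x08 skips the 64-bit header as in the patch:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the MMIO read: fetch one 64-bit word. */
static int read64(const uint8_t *base, size_t offset, uint64_t *val)
{
	memcpy(val, base + offset, sizeof(*val));
	return 0;
}

/* Copy length bytes of 64-bit words that follow the 8-byte header. */
static int copy_words(const uint8_t *base, size_t data_offset,
		      uint64_t *dst, size_t length)
{
	size_t i;
	int rc;

	/* i counts words; the byte offset is derived from it. */
	for (i = 0; i < length / sizeof(uint64_t); i++) {
		rc = read64(base, data_offset + 0x08 + i * sizeof(uint64_t),
			    &dst[i]);
		if (rc)
			return rc;
	}

	return 0;
}
```

This way the destination index is just i, with no division in the loop body.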

> +		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +					     ocxlpmem->admin_command.data_offset + sizeof(u64) + i,

+ 0x08 rather than + sizeof(u64) for consistency.

> +					     OCXL_LITTLE_ENDIAN,
> +					     &out->attribs[i/sizeof(u64)]);
> +		if (rc)
> +			goto out;
> +	}
> +
> +	rc = admin_response_handled(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = 0;
> +	goto out;
> +
> +out:
> +	mutex_unlock(&ocxlpmem->admin_command.lock);
> +	return rc;
> +}
> +
> +static int ndctl_call(struct ocxlpmem *ocxlpmem, void *buf, unsigned int buf_len)
> +{
> +	struct nd_cmd_pkg *pkg = buf;
> +
> +	if (buf_len < sizeof(struct nd_cmd_pkg)) {
> +		dev_err(&ocxlpmem->dev, "Invalid ND_CALL size=%u\n", buf_len);
> +		return -EINVAL;
> +	}
> +
> +	if (pkg->nd_family != NVDIMM_FAMILY_OCXL) {
> +		dev_err(&ocxlpmem->dev, "Invalid ND_CALL family=0x%llx\n", pkg->nd_family);
> +		return -EINVAL;
> +	}
> +
> +	switch (pkg->nd_command) {
> +	case ND_CMD_OCXL_SMART:
> +		ndctl_smart(ocxlpmem, pkg);

Did you intend to dispose of the return code here?
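
If not, the dispatch could return the handler's result directly. A minimal
userspace sketch, where fake_smart() is a hypothetical stand-in for
ndctl_smart() and CMD_SMART mirrors ND_CMD_OCXL_SMART from the patch:

```c
#include <errno.h>

enum { CMD_SMART = 1 };	/* mirrors ND_CMD_OCXL_SMART = 1 */

/* Hypothetical stand-in for ndctl_smart(): returns the injected code. */
static int fake_smart(int injected_rc)
{
	return injected_rc;
}

/* Dispatch that propagates the handler's return code to the caller. */
static int dispatch(unsigned long long command, int injected_rc)
{
	switch (command) {
	case CMD_SMART:
		return fake_smart(injected_rc);

	default:
		return -EINVAL;
	}
}
```

With this shape a failed SMART request is visible to the caller instead of
being reported as success.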

> +		break;
> +
> +	default:
> +		dev_err(&ocxlpmem->dev, "Invalid ND_CALL command=0x%llx\n", pkg->nd_command);
> +		return -EINVAL;
> +	}
> +
> +
> +	return 0;
> +}
> +
>   static int ndctl(struct nvdimm_bus_descriptor *nd_desc,
>   		 struct nvdimm *nvdimm,
>   		 unsigned int cmd, void *buf, unsigned int buf_len, int *cmd_rc)
> @@ -88,6 +211,10 @@ static int ndctl(struct nvdimm_bus_descriptor *nd_desc,
>   	struct ocxlpmem *ocxlpmem = container_of(nd_desc, struct ocxlpmem, bus_desc);
>   
>   	switch (cmd) {
> +	case ND_CMD_CALL:
> +		*cmd_rc = ndctl_call(ocxlpmem, buf, buf_len);
> +		return 0;
> +
>   	case ND_CMD_GET_CONFIG_SIZE:
>   		*cmd_rc = ndctl_config_size(buf);
>   		return 0;
> @@ -171,6 +298,7 @@ static int register_lpc_mem(struct ocxlpmem *ocxlpmem)
>   	set_bit(ND_CMD_GET_CONFIG_SIZE, &nvdimm_cmd_mask);
>   	set_bit(ND_CMD_GET_CONFIG_DATA, &nvdimm_cmd_mask);
>   	set_bit(ND_CMD_SET_CONFIG_DATA, &nvdimm_cmd_mask);
> +	set_bit(ND_CMD_CALL, &nvdimm_cmd_mask);
>   
>   	set_bit(NDD_ALIASING, &nvdimm_flags);
>   
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> index 927690f4888f..0eb7a35d24ae 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> @@ -7,6 +7,7 @@
>   #include <linux/libnvdimm.h>
>   #include <uapi/nvdimm/ocxl-pmem.h>
>   #include <linux/mm.h>
> +#include <linux/ndctl.h>
>   
>   #define LABEL_AREA_SIZE	(1UL << PA_SECTION_SHIFT)
>   #define DEFAULT_TIMEOUT 100
> @@ -98,6 +99,23 @@ struct ocxlpmem_function0 {
>   	struct ocxl_fn *ocxl_fn;
>   };
>   
> +struct nd_ocxl_smart {
> +	__u8 count;
> +	__u8 reserved[7];
> +	__u64 attribs[0];
> +} __packed;
> +
> +struct nd_pkg_ocxl {
> +	struct nd_cmd_pkg gen;
> +	union {
> +		struct nd_ocxl_smart smart;
> +	};
> +};
> +
> +enum nd_cmd_ocxl {
> +	ND_CMD_OCXL_SMART = 1,
> +};
> +
>   struct ocxlpmem {
>   	struct device dev;
>   	struct pci_dev *pdev;
> diff --git a/include/uapi/linux/ndctl.h b/include/uapi/linux/ndctl.h
> index de5d90212409..2885052e7f40 100644
> --- a/include/uapi/linux/ndctl.h
> +++ b/include/uapi/linux/ndctl.h
> @@ -244,6 +244,7 @@ struct nd_cmd_pkg {
>   #define NVDIMM_FAMILY_HPE2 2
>   #define NVDIMM_FAMILY_MSFT 3
>   #define NVDIMM_FAMILY_HYPERV 4
> +#define NVDIMM_FAMILY_OCXL 6
>   
>   #define ND_IOCTL_CALL			_IOWR(ND_IOCTL, ND_CMD_CALL,\
>   					struct nd_cmd_pkg)
> 

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

* Re: [PATCH v3 16/27] powerpc/powernv/pmem: Register a character device for userspace to interact with
  2020-03-03  9:28   ` Frederic Barrat
@ 2020-03-05  3:38     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-03-05  3:38 UTC (permalink / raw)
  To: Frederic Barrat
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Tue, 2020-03-03 at 10:28 +0100, Frederic Barrat wrote:
> 
> Le 21/02/2020 à 04:27, Alastair D'Silva a écrit :
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > This patch introduces a character device (/dev/ocxl-scmX) which
> > further
> > patches will use to interact with userspace.
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   arch/powerpc/platforms/powernv/pmem/ocxl.c    | 116
> > +++++++++++++++++-
> >   .../platforms/powernv/pmem/ocxl_internal.h    |   2 +
> >   2 files changed, 116 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > index b8bd7e703b19..63109a870d2c 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > @@ -10,6 +10,7 @@
> >   #include <misc/ocxl.h>
> >   #include <linux/delay.h>
> >   #include <linux/ndctl.h>
> > +#include <linux/fs.h>
> >   #include <linux/mm_types.h>
> >   #include <linux/memory_hotplug.h>
> >   #include "ocxl_internal.h"
> > @@ -339,6 +340,9 @@ static void free_ocxlpmem(struct ocxlpmem
> > *ocxlpmem)
> >   
> >   	free_minor(ocxlpmem);
> >   
> > +	if (ocxlpmem->cdev.owner)
> > +		cdev_del(&ocxlpmem->cdev);
> > +
> >   	if (ocxlpmem->metadata_addr)
> >   		devm_memunmap(&ocxlpmem->dev, ocxlpmem->metadata_addr);
> >   
> > @@ -396,6 +400,70 @@ static int ocxlpmem_register(struct ocxlpmem
> > *ocxlpmem)
> >   	return device_register(&ocxlpmem->dev);
> >   }
> >   
> > +static void ocxlpmem_put(struct ocxlpmem *ocxlpmem)
> > +{
> > +	put_device(&ocxlpmem->dev);
> > +}
> > +
> > +static struct ocxlpmem *ocxlpmem_get(struct ocxlpmem *ocxlpmem)
> > +{
> > +	return (get_device(&ocxlpmem->dev) == NULL) ? NULL : ocxlpmem;
> > +}
> > +
> > +static struct ocxlpmem *find_and_get_ocxlpmem(dev_t devno)
> > +{
> > +	struct ocxlpmem *ocxlpmem;
> > +	int minor = MINOR(devno);
> > +	/*
> > +	 * We don't declare an RCU critical section here, as our AFU
> > +	 * is protected by a reference counter on the device. By the
> > time the
> > +	 * minor number of a device is removed from the idr, the ref
> > count of
> > +	 * the device is already at 0, so no user API will access that
> > AFU and
> > +	 * this function can't return it.
> > +	 */
> 
> I fixed something related in the ocxl driver (which had enough
> changes 
> with the introduction of the "info" device to make a similar comment 
> become wrong). See commit a58d37bce0d21. The issue is handling a 
> simultaneous open() and removal of the device through /sysfs as best
> we can.
> 
> We are on a file open path and it's not like we're going to have a 
> thousand clients, so performance is not that critical. We can take
> the 
> mutex before searching in the IDR and release it after we increment
> the 
> reference count on the device.
> But that's not enough: we could still find the device in the IDR
> while 
> it is being removed in free_ocxlpmem(). I believe the only safe way
> to 
> address it is by removing the user-facing APIs (the char device)
> before 
> calling device_unregister(). So that it's not possible to find the 
> device in file_open() if it's in the middle of being removed.
> 
>    Fred
> 
> 

Ok, I'll replicate that patch & follow your advice.
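
A rough userspace sketch of the ordering Fred describes, assuming a fixed-size table in place of the minors IDR and a plain integer in place of the device reference count (both hypothetical simplifications; the driver would use `idr_find()` and `get_device()`). The open path looks up and takes its reference under one lock hold, and the removal path hides the entry from lookups before dropping the table's reference.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_SLOTS 4	/* hypothetical; stands in for the minors IDR */

struct dev_obj {
	int refcount;	/* protected by table_lock in this sketch */
};

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct dev_obj *table[NUM_SLOTS];

/* Publish a device under a minor number; the table holds one reference. */
int publish(int minor, struct dev_obj *d)
{
	if (minor < 0 || minor >= NUM_SLOTS)
		return -1;
	pthread_mutex_lock(&table_lock);
	d->refcount = 1;
	table[minor] = d;
	pthread_mutex_unlock(&table_lock);
	return 0;
}

/* Open path: look up and take a reference without releasing the lock in between. */
struct dev_obj *find_and_get(int minor)
{
	struct dev_obj *d;

	if (minor < 0 || minor >= NUM_SLOTS)
		return NULL;
	pthread_mutex_lock(&table_lock);
	d = table[minor];
	if (d)
		d->refcount++;
	pthread_mutex_unlock(&table_lock);
	return d;
}

/*
 * Removal path: make the device unreachable first, then drop the
 * table's reference. Returns the remaining count, or -1 if the slot
 * was empty or out of range.
 */
int unpublish(int minor)
{
	struct dev_obj *d;
	int remaining = -1;

	if (minor < 0 || minor >= NUM_SLOTS)
		return -1;
	pthread_mutex_lock(&table_lock);
	d = table[minor];
	table[minor] = NULL;
	if (d)
		remaining = --d->refcount;
	pthread_mutex_unlock(&table_lock);
	return remaining;
}
```

Once `unpublish()` has run, a concurrent open either already holds its own reference or gets NULL; it can no longer find a device that is partway through teardown.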

> > +	ocxlpmem = idr_find(&minors_idr, minor);
> > +	if (ocxlpmem)
> > +		ocxlpmem_get(ocxlpmem);
> > +	return ocxlpmem;
> > +}
> > +
> > +static int file_open(struct inode *inode, struct file *file)
> > +{
> > +	struct ocxlpmem *ocxlpmem;
> > +
> > +	ocxlpmem = find_and_get_ocxlpmem(inode->i_rdev);
> > +	if (!ocxlpmem)
> > +		return -ENODEV;
> > +
> > +	file->private_data = ocxlpmem;
> > +	return 0;
> > +}
> > +
> > +static int file_release(struct inode *inode, struct file *file)
> > +{
> > +	struct ocxlpmem *ocxlpmem = file->private_data;
> > +
> > +	ocxlpmem_put(ocxlpmem);
> > +	return 0;
> > +}
> > +
> > +static const struct file_operations fops = {
> > +	.owner		= THIS_MODULE,
> > +	.open		= file_open,
> > +	.release	= file_release,
> > +};
> > +
> > +/**
> > + * create_cdev() - Create the chardev in /dev for the device
> > + * @ocxlpmem: the SCM metadata
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int create_cdev(struct ocxlpmem *ocxlpmem)
> > +{
> > +	cdev_init(&ocxlpmem->cdev, &fops);
> > +	return cdev_add(&ocxlpmem->cdev, ocxlpmem->dev.devt, 1);
> > +}
> > +
> >   /**
> >    * ocxlpmem_remove() - Free an OpenCAPI persistent memory device
> >    * @pdev: the PCI device information struct
> > @@ -572,6 +640,11 @@ static int probe(struct pci_dev *pdev, const
> > struct pci_device_id *ent)
> >   		goto err;
> >   	}
> >   
> > +	if (create_cdev(ocxlpmem)) {
> > +		dev_err(&pdev->dev, "Could not create character device\n");
> > +		goto err;
> > +	}
> 
> As already mentioned in a previous patch, we branch to the err label
> so 
> rc needs to be set to a valid error.
> 

Ok

> 
> 
> > +
> >   	elapsed = 0;
> >   	timeout = ocxlpmem->readiness_timeout + ocxlpmem->memory_available_timeout;
> >   	while (!is_usable(ocxlpmem, false)) {
> > @@ -613,20 +686,59 @@ static struct pci_driver pci_driver = {
> >   	.shutdown = ocxlpmem_remove,
> >   };
> >   
> > +static int file_init(void)
> > +{
> > +	int rc;
> > +
> > +	mutex_init(&minors_idr_lock);
> > +	idr_init(&minors_idr);
> > +
> > +	rc = alloc_chrdev_region(&ocxlpmem_dev, 0, NUM_MINORS, "ocxl-pmem");
> > +	if (rc) {
> > +		idr_destroy(&minors_idr);
> > +		pr_err("Unable to allocate OpenCAPI persistent memory major number: %d\n", rc);
> > +		return rc;
> > +	}
> > +
> > +	ocxlpmem_class = class_create(THIS_MODULE, "ocxl-pmem");
> > +	if (IS_ERR(ocxlpmem_class)) {
> > +		idr_destroy(&minors_idr);
> > +		pr_err("Unable to create ocxl-pmem class\n");
> > +		unregister_chrdev_region(ocxlpmem_dev, NUM_MINORS);
> > +		return PTR_ERR(ocxlpmem_class);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static void file_exit(void)
> > +{
> > +	class_destroy(ocxlpmem_class);
> > +	unregister_chrdev_region(ocxlpmem_dev, NUM_MINORS);
> > +	idr_destroy(&minors_idr);
> > +}
> > +
> >   static int __init ocxlpmem_init(void)
> >   {
> > -	int rc = 0;
> > +	int rc;
> >   
> > -	rc = pci_register_driver(&pci_driver);
> > +	rc = file_init();
> >   	if (rc)
> >   		return rc;
> >   
> > +	rc = pci_register_driver(&pci_driver);
> > +	if (rc) {
> > +		file_exit();
> > +		return rc;
> > +	}
> > +
> >   	return 0;
> >   }
> >   
> >   static void ocxlpmem_exit(void)
> >   {
> >   	pci_unregister_driver(&pci_driver);
> > +	file_exit();
> >   }
> >   
> >   module_init(ocxlpmem_init);
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > index 28e2020f6355..d2d81fec7bb1 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > @@ -2,6 +2,7 @@
> >   // Copyright 2019 IBM Corp.
> >   
> >   #include <linux/pci.h>
> > +#include <linux/cdev.h>
> >   #include <misc/ocxl.h>
> >   #include <linux/libnvdimm.h>
> >   #include <linux/mm.h>
> > @@ -99,6 +100,7 @@ struct ocxlpmem_function0 {
> >   struct ocxlpmem {
> >   	struct device dev;
> >   	struct pci_dev *pdev;
> > +	struct cdev cdev;
> >   	struct ocxl_fn *ocxl_fn;
> >   	struct nd_interleave_set nd_set;
> >   	struct nvdimm_bus_descriptor bus_desc;
> > 
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

* Re: [PATCH v3 17/27] powerpc/powernv/pmem: Implement the Read Error Log command
  2020-03-03 10:36   ` Frederic Barrat
@ 2020-03-05  4:31     ` Alastair D'Silva
  2020-03-05  9:33       ` Frederic Barrat
  0 siblings, 1 reply; 130+ messages in thread
From: Alastair D'Silva @ 2020-03-05  4:31 UTC (permalink / raw)
  To: Frederic Barrat
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Tue, 2020-03-03 at 11:36 +0100, Frederic Barrat wrote:
> 
> Le 21/02/2020 à 04:27, Alastair D'Silva a écrit :
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > The read error log command extracts information from the
> > controller's
> > internal error log.
> > 
> > This patch exposes this information in 2 ways:
> > - During probe, if an error occurs & a log is available, print it
> > to the
> >    console
> > - After probe, make the error log available to userspace via an
> > IOCTL.
> >    Userspace is notified of pending error logs in a later patch
> >    ("powerpc/powernv/pmem: Forward events to userspace")
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   arch/powerpc/platforms/powernv/pmem/ocxl.c    | 269
> > ++++++++++++++++++
> >   .../platforms/powernv/pmem/ocxl_internal.h    |   1 +
> >   include/uapi/nvdimm/ocxl-pmem.h               |  46 +++
> >   3 files changed, 316 insertions(+)
> >   create mode 100644 include/uapi/nvdimm/ocxl-pmem.h
> > 
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > index 63109a870d2c..2b64504f9129 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > @@ -447,10 +447,219 @@ static int file_release(struct inode *inode,
> > struct file *file)
> >   	return 0;
> >   }
> >   
> > +/**
> > + * error_log_header_parse() - Parse the first 64 bits of the error
> > log command response
> > + * @ocxlpmem: the device metadata
> > + * @length: out, returns the number of bytes in the response
> > (excluding the 64 bit header)
> > + */
> > +static int error_log_header_parse(struct ocxlpmem *ocxlpmem, u16
> > *length)
> > +{
> > +	int rc;
> > +	u64 val;
> > +
> 
> Empty line in the middle of declarations
> 

Ok

> 
> > +	u16 data_identifier;
> > +	u32 data_length;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset,
> > +				     OCXL_LITTLE_ENDIAN, &val);
> > +	if (rc)
> > +		return rc;
> > +
> > +	data_identifier = val >> 48;
> > +	data_length = val & 0xFFFF;
> > +
> > +	if (data_identifier != 0x454C) { // 'EL'
> > +		dev_err(&ocxlpmem->dev,
> > +			"Bad data identifier for error log data,
> > expected 'EL', got '%2s' (%#x), data_length=%u\n",
> > +			(char *)&data_identifier,
> > +			(unsigned int)data_identifier, data_length);
> > +		return -EINVAL;
> > +	}
> > +
> > +	*length = data_length;
> > +	return 0;
> > +}
> > +
> > +static int error_log_offset_0x08(struct ocxlpmem *ocxlpmem,
> > +				 u32 *log_identifier, u32
> > *program_ref_code)
> > +{
> > +	int rc;
> > +	u64 val;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset + 0x08,
> > +				     OCXL_LITTLE_ENDIAN, &val);
> > +	if (rc)
> > +		return rc;
> > +
> > +	*log_identifier = val >> 32;
> > +	*program_ref_code = val & 0xFFFFFFFF;
> > +
> > +	return 0;
> > +}
> > +
> > +static int read_error_log(struct ocxlpmem *ocxlpmem,
> > +			  struct ioctl_ocxl_pmem_error_log *log, bool
> > buf_is_user)
> > +{
> > +	u64 val;
> > +	u16 user_buf_length;
> > +	u16 buf_length;
> > +	u16 i;
> > +	int rc;
> > +
> > +	if (log->buf_size % 8)
> > +		return -EINVAL;
> > +
> > +	rc = ocxlpmem_chi(ocxlpmem, &val);
> > +	if (rc)
> > +		goto out;
> 
> 
> "out" will unlock a mutex not yet taken.
> 

Thanks, that should have been a return.
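
The corrected control flow can be sketched in standalone C as follows. `toy_lock` is a hypothetical depth counter used only so that an unbalanced unlock is observable, and the two `_rc` parameters model the results of the pre-lock status read and of the command sequence run under the lock.

```c
#include <errno.h>

/* Toy lock that only counts depth, so an unbalanced unlock shows up. */
struct toy_lock {
	int depth;
};

static void toy_lock_acquire(struct toy_lock *l) { l->depth++; }
static void toy_lock_release(struct toy_lock *l) { l->depth--; }

/*
 * precheck_rc models the status read done before the lock is taken;
 * command_rc models the command sequence run while holding it.
 */
int read_log(struct toy_lock *l, int precheck_rc, int command_rc)
{
	int rc;

	rc = precheck_rc;
	if (rc)
		return rc;	/* lock not held yet: plain return, no unlock */

	toy_lock_acquire(l);

	rc = command_rc;
	if (rc)
		goto out;	/* lock held: leave through the unlock path */

	/* ... the remaining work under the lock would go here ... */

out:
	toy_lock_release(l);
	return rc;
}
```

Every exit taken before the acquire is a plain return, and every exit after it goes through `out`, so the lock depth is back at zero on all paths.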

> 
> 
> > +
> > +	if (!(val & GLOBAL_MMIO_CHI_ELA))
> > +		return -EAGAIN;
> > +
> > +	user_buf_length = log->buf_size;
> > +
> > +	mutex_lock(&ocxlpmem->admin_command.lock);
> > +
> > +	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_ERRLOG);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = admin_command_execute(ocxlpmem);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = admin_command_complete_timeout(ocxlpmem,
> > ADMIN_COMMAND_ERRLOG);
> > +	if (rc < 0) {
> > +		dev_warn(&ocxlpmem->dev, "Read error log timed out\n");
> > +		goto out;
> > +	}
> > +
> > +	rc = admin_response(ocxlpmem);
> > +	if (rc < 0)
> > +		goto out;
> > +	if (rc != STATUS_SUCCESS) {
> > +		warn_status(ocxlpmem, "Unexpected status from retrieve
> > error log", rc);
> > +		goto out;
> > +	}
> > +
> > +
> > +	rc = error_log_header_parse(ocxlpmem, &log->buf_size);
> > +	if (rc)
> > +		goto out;
> > +	// log->buf_size now contains the returned buffer size, not the
> > user size
> > +
> > +	rc = error_log_offset_0x08(ocxlpmem, &log->log_identifier,
> > +				       &log->program_reference_code);
> > +	if (rc)
> > +		goto out;
> 
> 
> Offset 0x08 gets a preferential treatment compared to 0x10 below and 
> it's not clear why.
> I would create a subfunction which parses all the fields linearly.
> 

I'll inline the contents of error_log_offset_0x08() - I can't see a big
benefit to factoring out the guts of that function.
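
For comparison, the linear parser suggested in the review might look roughly like the standalone sketch below. The shifts and masks are taken from the quoted patch; `struct errlog_fields`, `errlog_parse` and the `ERRLOG_*` names are hypothetical, and the three input words are assumed to have been read already from offsets 0x00, 0x08 and 0x10 and converted to host order.

```c
#include <errno.h>
#include <stdint.h>

#define ERRLOG_ID_EL		0x454C	/* 'EL', as checked in the patch */
#define ERRLOG_TYPE_GENERAL	0x00

/* Hypothetical container for the decoded fields. */
struct errlog_fields {
	uint16_t data_length;
	uint32_t log_identifier;
	uint32_t program_ref_code;
	uint8_t  error_log_type;
	uint32_t action_flags;
	uint32_t power_on_seconds;
};

/*
 * words[0..2] hold the 64-bit values from offsets 0x00, 0x08 and 0x10
 * of the response, in host byte order.
 */
int errlog_parse(const uint64_t words[3], struct errlog_fields *out)
{
	/* Offset 0x00: identifier in the top 16 bits, length in the low 16. */
	if ((words[0] >> 48) != ERRLOG_ID_EL)
		return -EINVAL;
	out->data_length = words[0] & 0xFFFF;

	/* Offset 0x08: log identifier and program reference code. */
	out->log_identifier = words[1] >> 32;
	out->program_ref_code = words[1] & 0xFFFFFFFF;

	/* Offset 0x10: type, action flags (general logs only), uptime. */
	out->error_log_type = words[2] >> 56;
	out->action_flags = (out->error_log_type == ERRLOG_TYPE_GENERAL) ?
			    (words[2] >> 32) & 0xFFFFFF : 0;
	out->power_on_seconds = words[2] & 0xFFFFFFFF;

	return 0;
}
```

Treating every offset the same way keeps the decode in one place, whether it ends up inlined in `read_error_log()` or split out as above.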

> 
> 
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset + 0x10,
> > +				     OCXL_LITTLE_ENDIAN, &val);
> > +	if (rc)
> > +		goto out;
> > +
> > +	log->error_log_type = val >> 56;
> > +	log->action_flags = (log->error_log_type ==
> > OCXL_PMEM_ERROR_LOG_TYPE_GENERAL) ?
> > +			    (val >> 32) & 0xFFFFFF : 0;
> > +	log->power_on_seconds = val & 0xFFFFFFFF;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset + 0x18,
> > +				     OCXL_LITTLE_ENDIAN, &log->timestamp);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset + 0x20,
> > +				     OCXL_HOST_ENDIAN, &log->wwid[0]);
> 
> 
> A bit of a moot point, but is there a reason why some of those MMIO
> ops 
> use OCXL_LITTLE_ENDIAN and the others OCXL_HOST_ENDIAN?
> 

Some are little endian values, and some are binary data. WWIDs should
be LE though.

> 
> 
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset + 0x28,
> > +				     OCXL_HOST_ENDIAN, &log->wwid[1]);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset + 0x30,
> > +				     OCXL_HOST_ENDIAN, (u64 *)log->fw_revision);
> > +	if (rc)
> > +		goto out;
> > +	log->fw_revision[8] = '\0';
> > +
> > +	buf_length = (user_buf_length < log->buf_size) ?
> > +		     user_buf_length : log->buf_size;
> > +	for (i = 0; i < buf_length + 0x48; i += 8) {
> > +		u64 val;
> > +
> > +		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +					     ocxlpmem->admin_command.data_offset + i,
> > +					     OCXL_HOST_ENDIAN, &val);
> > +		if (rc)
> > +			goto out;
> > +
> > +		if (buf_is_user) {
> > +			if (copy_to_user(&log->buf[i], &val,
> > sizeof(u64))) {
> > +				rc = -EFAULT;
> > +				goto out;
> > +			}
> > +		} else
> > +			log->buf[i] = val;
> > +	}
> 
> 
> I think it could be a bit simplified by keeping the handling of the
> user 
> buffer out of this function. Always call it with a kernel buffer.
> And 
> have only one copy_to_user() call on the ioctl() path. You'd need to 
> allocate a kernel buf on the ioctl path, but you're already doing it
> on 
> the probe() path, so it should be doable to share code.

Hmm, the problem then is that on the IOCTL side, I'll have to save,
modify, then restore the buf member of struct
ioctl_ocxl_pmem_error_log, which would be uglier.

> 
> 
> > +
> > +	rc = admin_response_handled(ocxlpmem);
> > +	if (rc)
> > +		goto out;
> > +
> > +out:
> > +	mutex_unlock(&ocxlpmem->admin_command.lock);
> > +	return rc;
> > +
> > +}
> > +
> > +static int ioctl_error_log(struct ocxlpmem *ocxlpmem,
> > +		struct ioctl_ocxl_pmem_error_log __user *uarg)
> > +{
> > +	struct ioctl_ocxl_pmem_error_log args;
> > +	int rc;
> > +
> > +	if (copy_from_user(&args, uarg, sizeof(args)))
> > +		return -EFAULT;
> > +
> > +	rc = read_error_log(ocxlpmem, &args, true);
> > +	if (rc)
> > +		return rc;
> > +
> > +	if (copy_to_user(uarg, &args, sizeof(args)))
> > +		return -EFAULT;
> > +
> > +	return 0;
> > +}
> > +
> > +static long file_ioctl(struct file *file, unsigned int cmd,
> > unsigned long args)
> > +{
> > +	struct ocxlpmem *ocxlpmem = file->private_data;
> > +	int rc = -EINVAL;
> > +
> > +	switch (cmd) {
> > +	case IOCTL_OCXL_PMEM_ERROR_LOG:
> > +		rc = ioctl_error_log(ocxlpmem,
> > +				     (struct ioctl_ocxl_pmem_error_log
> > __user *)args);
> > +		break;
> > +	}
> > +	return rc;
> > +}
> > +
> >   static const struct file_operations fops = {
> >   	.owner		= THIS_MODULE,
> >   	.open		= file_open,
> >   	.release	= file_release,
> > +	.unlocked_ioctl = file_ioctl,
> > +	.compat_ioctl   = file_ioctl,
> >   };
> >   
> >   /**
> > @@ -527,6 +736,60 @@ static int read_device_metadata(struct
> > ocxlpmem *ocxlpmem)
> >   	return 0;
> >   }
> >   
> > +static const char *decode_error_log_type(u8 error_log_type)
> > +{
> > +	switch (error_log_type) {
> > +	case 0x00:
> > +		return "general";
> > +	case 0x01:
> > +		return "predictive failure";
> > +	case 0x02:
> > +		return "thermal warning";
> > +	case 0x03:
> > +		return "data loss";
> > +	case 0x04:
> > +		return "health & performance";
> > +	default:
> > +		return "unknown";
> > +	}
> > +}
> > +
> > +static void dump_error_log(struct ocxlpmem *ocxlpmem)
> > +{
> > +	struct ioctl_ocxl_pmem_error_log log;
> > +	u32 buf_size;
> > +	u8 *buf;
> > +	int rc;
> > +
> > +	if (ocxlpmem->admin_command.data_size == 0)
> > +		return;
> > +
> > +	buf_size = ocxlpmem->admin_command.data_size - 0x48;
> > +	buf = kzalloc(buf_size, GFP_KERNEL);
> > +	if (!buf)
> > +		return;
> > +
> > +	log.buf = buf;
> > +	log.buf_size = buf_size;
> > +
> > +	rc = read_error_log(ocxlpmem, &log, false);
> > +	if (rc < 0)
> > +		goto out;
> > +
> > +	dev_warn(&ocxlpmem->dev,
> > +		 "OCXL PMEM Error log: WWID=0x%016llx%016llx LID=0x%x
> > PRC=%x type=0x%x %s, Uptime=%u seconds timestamp=0x%llx\n",
> > +		 log.wwid[0], log.wwid[1],
> > +		 log.log_identifier, log.program_reference_code,
> > +		 log.error_log_type,
> > +		 decode_error_log_type(log.error_log_type),
> > +		 log.power_on_seconds, log.timestamp);
> > +	print_hex_dump(KERN_WARNING, "buf", DUMP_PREFIX_OFFSET, 16, 1,
> > buf,
> > +		       log.buf_size, false);
> 
> dev_warn already logs a warning. Isn't KERN_DEBUG more appropriate
> for 
> the hex dump?
> 
> 

The hex dump is associated binary data for the warning, it doesn't
replicate the contents of the message.

> 
> > +
> > +out:
> > +	kfree(buf);
> > +}
> > +
> >   /**
> >    * probe_function0() - Set up function 0 for an OpenCAPI
> > persistent memory device
> >    * This is important as it enables templates higher than 0 across
> > all other functions,
> > @@ -568,6 +831,7 @@ static int probe(struct pci_dev *pdev, const
> > struct pci_device_id *ent)
> >   	struct ocxlpmem *ocxlpmem;
> >   	int rc;
> >   	u16 elapsed, timeout;
> > +	u64 chi;
> >   
> >   	if (PCI_FUNC(pdev->devfn) == 0)
> >   		return probe_function0(pdev);
> > @@ -667,6 +931,11 @@ static int probe(struct pci_dev *pdev, const
> > struct pci_device_id *ent)
> >   	return 0;
> >   
> >   err:
> > +	if (ocxlpmem &&
> > +		    (ocxlpmem_chi(ocxlpmem, &chi) == 0) &&
> > +		    (chi & GLOBAL_MMIO_CHI_ELA))
> > +		dump_error_log(ocxlpmem);
> > +
> >   	/*
> >   	 * Further cleanup is done in the release handler via
> > free_ocxlpmem()
> >   	 * This allows us to keep the character device live to handle
> > IOCTLs to
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > index d2d81fec7bb1..b953ee522ed4 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > @@ -5,6 +5,7 @@
> >   #include <linux/cdev.h>
> >   #include <misc/ocxl.h>
> >   #include <linux/libnvdimm.h>
> > +#include <uapi/nvdimm/ocxl-pmem.h>
> 
> Can't we limit the extra include to ocxl.c?
> 

Yes, there are no consumers referred to in ocxl_internal.[hc]

> Completely unrelated, but ocxl.c contains most of the code for this 
> driver. We should consider renaming it to ocxlpmem.c or something
> along 
> those lines, since it does a lot more than just interfacing with the 
> opencapi interface. It would also avoid confusion with another already 
> existing ocxl.c file.
> 

Ok, my thinking was that it's already in a pmem directory, but I can
see arguments both ways.

> 
> >   #include <linux/mm.h>
> >   
> >   #define LABEL_AREA_SIZE	(1UL << PA_SECTION_SHIFT)
> > diff --git a/include/uapi/nvdimm/ocxl-pmem.h
> > b/include/uapi/nvdimm/ocxl-pmem.h
> > new file mode 100644
> > index 000000000000..b10f8ac0c20f
> > --- /dev/null
> > +++ b/include/uapi/nvdimm/ocxl-pmem.h
> > @@ -0,0 +1,46 @@
> > +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
> > +/* Copyright 2017 IBM Corp. */
> > +#ifndef _UAPI_OCXL_SCM_H
> > +#define _UAPI_OCXL_SCM_H
> > +
> > +#include <linux/types.h>
> > +#include <linux/ioctl.h>
> > +
> > +#define OCXL_PMEM_ERROR_LOG_ACTION_RESET	(1 << (32-32))
> > +#define OCXL_PMEM_ERROR_LOG_ACTION_CHKFW	(1 << (53-32))
> > +#define OCXL_PMEM_ERROR_LOG_ACTION_REPLACE	(1 << (54-32))
> > +#define OCXL_PMEM_ERROR_LOG_ACTION_DUMP		(1 << (55-32))
> > +
> > +#define OCXL_PMEM_ERROR_LOG_TYPE_GENERAL		(0x00)
> > +#define OCXL_PMEM_ERROR_LOG_TYPE_PREDICTIVE_FAILURE	(0x01)
> > +#define OCXL_PMEM_ERROR_LOG_TYPE_THERMAL_WARNING	(0x02)
> > +#define OCXL_PMEM_ERROR_LOG_TYPE_DATA_LOSS		(0x03)
> > +#define OCXL_PMEM_ERROR_LOG_TYPE_HEALTH_PERFORMANCE	(0x04)
> > +
> > +struct ioctl_ocxl_pmem_error_log {
> > +	__u32 log_identifier; /* out */
> > +	__u32 program_reference_code; /* out */
> > +	__u32 action_flags; /* out, recommended course of action */
> > +	__u32 power_on_seconds; /* out, Number of seconds the
> > controller has been on when the error occurred */
> > +	__u64 timestamp; /* out, relative time since the current IPL */
> > +	__u64 wwid[2]; /* out, the NAA formatted WWID associated with
> > the controller */
> > +	char  fw_revision[8+1]; /* out, firmware revision as null
> > terminated text */
> 
> The 8+1 size will make the compiler add some padding here. Are we 
> confident that all the compilers, at least on powerpc, will do the
> same 
> thing and we can guarantee a kernel ABI? I would play it safe and
> have a 
> discussion with folks who understand compilers better.
> 

I'll add some explicit padding.
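
As an illustration of what explicit padding buys, here is a cut-down, hypothetical pair of structs (not the real `ioctl_ocxl_pmem_error_log` layout) in standalone C. The first relies on the compiler to align the 16-bit member after a 9-byte array; the second spells the filler byte out and pins the layout with compile-time checks. The sketch assumes a C11 compiler and the usual 2-byte alignment for `uint16_t`.

```c
#include <stddef.h>
#include <stdint.h>

/* Cut-down, hypothetical version of the layout under discussion. */
struct log_implicit {
	char     fw_revision[8 + 1];
	uint16_t buf_size;	/* compiler inserts padding before this */
};

struct log_explicit {
	char     fw_revision[8 + 1];
	uint8_t  pad;		/* padding spelled out as part of the ABI */
	uint16_t buf_size;
};

/* Bytes of compiler-inserted padding in each layout. */
size_t implicit_padding(void)
{
	return sizeof(struct log_implicit) -
	       (sizeof(char[9]) + sizeof(uint16_t));
}

size_t explicit_padding(void)
{
	return sizeof(struct log_explicit) -
	       (sizeof(char[9]) + sizeof(uint8_t) + sizeof(uint16_t));
}

/* Fail the build if the explicit layout ever acquires hidden padding. */
_Static_assert(offsetof(struct log_explicit, buf_size) == 10,
	       "buf_size must sit at offset 10");
_Static_assert(sizeof(struct log_explicit) == 12,
	       "struct must be exactly 12 bytes");
```

The same technique scales to the full structure: name every gap as a reserved member and assert the offsets that userspace depends on.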

> 
> 
> > +	__u16 buf_size; /* in/out, buffer size provided/required.
> > +			 * If required is greater than provided, the
> > buffer
> > +			 * will be truncated to the amount provided. If
> > its
> > +			 * less, then only the required bytes will be
> > populated.
> > +			 * If it is 0, then there are no more error log
> > entries.
> > +			 */
> > +	__u8  error_log_type;
> > +	__u8  reserved1;
> > +	__u32 reserved2;
> > +	__u64 reserved3[2];
> > +	__u8 *buf; /* pointer to output buffer */
> > +};
> > +
> > +/* ioctl numbers */
> > +#define OCXL_PMEM_MAGIC 0x5C
> 
> Randomly picked?
> See (and add entry in) Documentation/userspace-api/ioctl/ioctl-
> number.rst
> 
Ok

> 
>    Fred
> 
> 
> 
> > +/* SCM devices */
> > +#define IOCTL_OCXL_PMEM_ERROR_LOG			_IOWR(OCXL_PMEM_MAGIC, 0x01, struct ioctl_ocxl_pmem_error_log)
> > +
> > +#endif /* _UAPI_OCXL_SCM_H */
> > 
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

* Re: [PATCH v3 17/27] powerpc/powernv/pmem: Implement the Read Error Log command
  2020-03-05  4:31     ` Alastair D'Silva
@ 2020-03-05  9:33       ` Frederic Barrat
  0 siblings, 0 replies; 130+ messages in thread
From: Frederic Barrat @ 2020-03-05  9:33 UTC (permalink / raw)
  To: Alastair D'Silva
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel


>>> +	if (rc)
>>> +		goto out;
>>> +
>>> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
>>> +				     ocxlpmem->admin_command.data_offset + 0x28,
>>> +				     OCXL_HOST_ENDIAN, &log->wwid[1]);
>>> +	if (rc)
>>> +		goto out;
>>> +
>>> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
>>> +				     ocxlpmem->admin_command.data_offset + 0x30,
>>> +				     OCXL_HOST_ENDIAN, (u64 *)log->fw_revision);
>>> +	if (rc)
>>> +		goto out;
>>> +	log->fw_revision[8] = '\0';
>>> +
>>> +	buf_length = (user_buf_length < log->buf_size) ?
>>> +		     user_buf_length : log->buf_size;
>>> +	for (i = 0; i < buf_length + 0x48; i += 8) {
>>> +		u64 val;
>>> +
>>> +		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
>>> +					     ocxlpmem->admin_command.data_offset + i,
>>> +					     OCXL_HOST_ENDIAN, &val);
>>> +		if (rc)
>>> +			goto out;
>>> +
>>> +		if (buf_is_user) {
>>> +			if (copy_to_user(&log->buf[i], &val,
>>> sizeof(u64))) {
>>> +				rc = -EFAULT;
>>> +				goto out;
>>> +			}
>>> +		} else
>>> +			log->buf[i] = val;
>>> +	}
>>
>>
>> I think it could be a bit simplified by keeping the handling of the
>> user
>> buffer out of this function. Always call it with a kernel buffer.
>> And
>> have only one copy_to_user() call on the ioctl() path. You'd need to
>> allocate a kernel buf on the ioctl path, but you're already doing it
>> on
>> the probe() path, so it should be doable to share code.
> 
> Hmm, the problem then is that on the IOCTL side, I'll have to save,
> modify, then restore the buf member of struct
> ioctl_ocxl_pmem_error_log, which would be uglier.


buf is just an output buffer. All you'd need to do is allocate a kernel 
buf, like it's already done for the "probe" case in dump_error_log(). 
And add a global copy_to_user() of the buf at the end of the ioctl path, 
instead of having multiple smaller copy_to_user() in the loop here.
copy_to_user() is a bit expensive so it's usually better to regroup 
them. I think it's easy here and makes sense since that function is also 
trying to handle both kernel and user space buffers.
But we're not in a critical path, and after this patch, there are others 
copying out mmio content to user buffers and those don't have a kernel 
buffer to handle, so the copy_to_user() in a loop makes things easier.
So I guess the conclusion is whatever you think is the easiest...



>>
>>
>>> +
>>> +	rc = admin_response_handled(ocxlpmem);
>>> +	if (rc)
>>> +		goto out;
>>> +
>>> +out:
>>> +	mutex_unlock(&ocxlpmem->admin_command.lock);
>>> +	return rc;
>>> +
>>> +}
>>> +
>>> +static int ioctl_error_log(struct ocxlpmem *ocxlpmem,
>>> +		struct ioctl_ocxl_pmem_error_log __user *uarg)
>>> +{
>>> +	struct ioctl_ocxl_pmem_error_log args;
>>> +	int rc;
>>> +
>>> +	if (copy_from_user(&args, uarg, sizeof(args)))
>>> +		return -EFAULT;
>>> +
>>> +	rc = read_error_log(ocxlpmem, &args, true);
>>> +	if (rc)
>>> +		return rc;
>>> +
>>> +	if (copy_to_user(uarg, &args, sizeof(args)))
>>> +		return -EFAULT;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static long file_ioctl(struct file *file, unsigned int cmd,
>>> unsigned long args)
>>> +{
>>> +	struct ocxlpmem *ocxlpmem = file->private_data;
>>> +	int rc = -EINVAL;
>>> +
>>> +	switch (cmd) {
>>> +	case IOCTL_OCXL_PMEM_ERROR_LOG:
>>> +		rc = ioctl_error_log(ocxlpmem,
>>> +				     (struct ioctl_ocxl_pmem_error_log
>>> __user *)args);
>>> +		break;
>>> +	}
>>> +	return rc;
>>> +}
>>> +
>>>    static const struct file_operations fops = {
>>>    	.owner		= THIS_MODULE,
>>>    	.open		= file_open,
>>>    	.release	= file_release,
>>> +	.unlocked_ioctl = file_ioctl,
>>> +	.compat_ioctl   = file_ioctl,
>>>    };
>>>    
>>>    /**
>>> @@ -527,6 +736,60 @@ static int read_device_metadata(struct
>>> ocxlpmem *ocxlpmem)
>>>    	return 0;
>>>    }
>>>    
>>> +static const char *decode_error_log_type(u8 error_log_type)
>>> +{
>>> +	switch (error_log_type) {
>>> +	case 0x00:
>>> +		return "general";
>>> +	case 0x01:
>>> +		return "predictive failure";
>>> +	case 0x02:
>>> +		return "thermal warning";
>>> +	case 0x03:
>>> +		return "data loss";
>>> +	case 0x04:
>>> +		return "health & performance";
>>> +	default:
>>> +		return "unknown";
>>> +	}
>>> +}
>>> +
>>> +static void dump_error_log(struct ocxlpmem *ocxlpmem)
>>> +{
>>> +	struct ioctl_ocxl_pmem_error_log log;
>>> +	u32 buf_size;
>>> +	u8 *buf;
>>> +	int rc;
>>> +
>>> +	if (ocxlpmem->admin_command.data_size == 0)
>>> +		return;
>>> +
>>> +	buf_size = ocxlpmem->admin_command.data_size - 0x48;
>>> +	buf = kzalloc(buf_size, GFP_KERNEL);
>>> +	if (!buf)
>>> +		return;
>>> +
>>> +	log.buf = buf;
>>> +	log.buf_size = buf_size;
>>> +
>>> +	rc = read_error_log(ocxlpmem, &log, false);
>>> +	if (rc < 0)
>>> +		goto out;
>>> +
>>> +	dev_warn(&ocxlpmem->dev,
>>> +		 "OCXL PMEM Error log: WWID=0x%016llx%016llx LID=0x%x
>>> PRC=%x type=0x%x %s, Uptime=%u seconds timestamp=0x%llx\n",
>>> +		 log.wwid[0], log.wwid[1],
>>> +		 log.log_identifier, log.program_reference_code,
>>> +		 log.error_log_type,
>>> +		 decode_error_log_type(log.error_log_type),
>>> +		 log.power_on_seconds, log.timestamp);
>>> +	print_hex_dump(KERN_WARNING, "buf", DUMP_PREFIX_OFFSET, 16, 1,
>>> buf,
>>> +		       log.buf_size, false);
>>
>> dev_warn already logs a warning. Isn't KERN_DEBUG more appropriate
>> for
>> the hex dump?
>>
>>
> 
> The hex dump is associated binary data for the warning, it doesn't
> replicate the contents of the message.


My point is not about duplicating, it's about exposing a hexadecimal
dump where it makes sense. Those DEBUG and WARNING tags are used for 
filtering content. For example to know what to display on the console. A 
warning to mention that a device hits a serious error is perfectly fine. 
A hexadecimal dump which is going to be meaningless to most everybody is 
not. The system is not crashing, so it's not like the console is our 
last hope. I think the dump is debug data and should be tagged as such.
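
The filtering argument above can be illustrated with a small userspace sketch. It assumes the kernel's usual rule that a message reaches the console only when its level is numerically lower than the console loglevel; the numeric values 4 and 7 are those of KERN_WARNING and KERN_DEBUG, while the enum and helper names are hypothetical and exist only for this example.

```c
#include <stdbool.h>

/* Numeric values of the kernel's KERN_WARNING and KERN_DEBUG levels.
 * The enum names are hypothetical, chosen for this sketch only. */
enum {
	SKETCH_LEVEL_WARNING = 4,
	SKETCH_LEVEL_DEBUG = 7,
};

/* Hypothetical helper: a message is shown on the console only when its
 * level is numerically lower than the console loglevel. */
static bool reaches_console(int msg_level, int console_loglevel)
{
	return msg_level < console_loglevel;
}
```

With a console loglevel of 7 (a common default), the dev_warn() line would still be shown while a hex dump tagged KERN_DEBUG would stay in the log buffer only.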

   Fred



>>
>>> +
>>> +out:
>>> +	kfree(buf);
>>> +}
>>> +
>>>    /**
>>>     * probe_function0() - Set up function 0 for an OpenCAPI
>>> persistent memory device
>>>     * This is important as it enables templates higher than 0 across
>>> all other functions,
>>> @@ -568,6 +831,7 @@ static int probe(struct pci_dev *pdev, const
>>> struct pci_device_id *ent)
>>>    	struct ocxlpmem *ocxlpmem;
>>>    	int rc;
>>>    	u16 elapsed, timeout;
>>> +	u64 chi;
>>>    
>>>    	if (PCI_FUNC(pdev->devfn) == 0)
>>>    		return probe_function0(pdev);
>>> @@ -667,6 +931,11 @@ static int probe(struct pci_dev *pdev, const
>>> struct pci_device_id *ent)
>>>    	return 0;
>>>    
>>>    err:
>>> +	if (ocxlpmem &&
>>> +		    (ocxlpmem_chi(ocxlpmem, &chi) == 0) &&
>>> +		    (chi & GLOBAL_MMIO_CHI_ELA))
>>> +		dump_error_log(ocxlpmem);
>>> +
>>>    	/*
>>>    	 * Further cleanup is done in the release handler via
>>> free_ocxlpmem()
>>>    	 * This allows us to keep the character device live to handle
>>> IOCTLs to
>>> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
>>> b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
>>> index d2d81fec7bb1..b953ee522ed4 100644
>>> --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
>>> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
>>> @@ -5,6 +5,7 @@
>>>    #include <linux/cdev.h>
>>>    #include <misc/ocxl.h>
>>>    #include <linux/libnvdimm.h>
>>> +#include <uapi/nvdimm/ocxl-pmem.h>
>>
>> Can't we limit the extra include to ocxl.c?
>>
> 
> Yes, there are no consumers referred to in ocxl_internal.[hc]
> 
>> Completely unrelated, but ocxl.c contains most of the code for this
>> driver. We should consider renaming it to ocxlpmem.c or something
>> along those lines, since it does a lot more than just interfacing
>> with the opencapi interface. That would also avoid confusion with
>> another already existing ocxl.c file.
>>
> 
> Ok, my thinking was that it's already in a pmem directory, but I can
> see arguments both ways.
> 
>>
>>>    #include <linux/mm.h>
>>>    
>>>    #define LABEL_AREA_SIZE	(1UL << PA_SECTION_SHIFT)
>>> diff --git a/include/uapi/nvdimm/ocxl-pmem.h
>>> b/include/uapi/nvdimm/ocxl-pmem.h
>>> new file mode 100644
>>> index 000000000000..b10f8ac0c20f
>>> --- /dev/null
>>> +++ b/include/uapi/nvdimm/ocxl-pmem.h
>>> @@ -0,0 +1,46 @@
>>> +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
>>> +/* Copyright 2017 IBM Corp. */
>>> +#ifndef _UAPI_OCXL_SCM_H
>>> +#define _UAPI_OCXL_SCM_H
>>> +
>>> +#include <linux/types.h>
>>> +#include <linux/ioctl.h>
>>> +
>>> +#define OCXL_PMEM_ERROR_LOG_ACTION_RESET	(1 << (32-32))
>>> +#define OCXL_PMEM_ERROR_LOG_ACTION_CHKFW	(1 << (53-32))
>>> +#define OCXL_PMEM_ERROR_LOG_ACTION_REPLACE	(1 << (54-32))
>>> +#define OCXL_PMEM_ERROR_LOG_ACTION_DUMP		(1 << (55-32))
>>> +
>>> +#define OCXL_PMEM_ERROR_LOG_TYPE_GENERAL		(0x00)
>>> +#define OCXL_PMEM_ERROR_LOG_TYPE_PREDICTIVE_FAILURE	(0x01)
>>> +#define OCXL_PMEM_ERROR_LOG_TYPE_THERMAL_WARNING	(0x02)
>>> +#define OCXL_PMEM_ERROR_LOG_TYPE_DATA_LOSS		(0x03)
>>> +#define OCXL_PMEM_ERROR_LOG_TYPE_HEALTH_PERFORMANCE	(0x04)
>>> +
>>> +struct ioctl_ocxl_pmem_error_log {
>>> +	__u32 log_identifier; /* out */
>>> +	__u32 program_reference_code; /* out */
>>> +	__u32 action_flags; /* out, recommended course of action */
>>> +	__u32 power_on_seconds; /* out, Number of seconds the
>>> controller has been on when the error occurred */
>>> +	__u64 timestamp; /* out, relative time since the current IPL */
>>> +	__u64 wwid[2]; /* out, the NAA formatted WWID associated with
>>> the controller */
>>> +	char  fw_revision[8+1]; /* out, firmware revision as null
>>> terminated text */
>>
>> The 8+1 size will make the compiler add some padding here. Are we
>> confident that all the compilers, at least on powerpc, will do the
>> same thing and we can guarantee a kernel ABI? I would play it safe
>> and have a discussion with folks who understand compilers better.
>>
> 
> I'll add some explicit padding.
> 
>>
>>
>>> +	__u16 buf_size; /* in/out, buffer size provided/required.
>>> +			 * If required is greater than provided, the
>>> buffer
>>> +			 * will be truncated to the amount provided. If
>>> its
>>> +			 * less, then only the required bytes will be
>>> populated.
>>> +			 * If it is 0, then there are no more error log
>>> entries.
>>> +			 */
>>> +	__u8  error_log_type;
>>> +	__u8  reserved1;
>>> +	__u32 reserved2;
>>> +	__u64 reserved3[2];
>>> +	__u8 *buf; /* pointer to output buffer */
>>> +};
>>> +
>>> +/* ioctl numbers */
>>> +#define OCXL_PMEM_MAGIC 0x5C
>>
>> Randomly picked?
>> See (and add an entry in)
>> Documentation/userspace-api/ioctl/ioctl-number.rst
>>
> Ok
> 
>>
>>     Fred
>>
>>
>>
>>> +/* SCM devices */
>>> +#define IOCTL_OCXL_PMEM_ERROR_LOG			_IOWR(OCXL_PMEM_MAGIC, 0x01, struct ioctl_ocxl_pmem_error_log)
>>> +
>>> +#endif /* _UAPI_OCXL_SCM_H */
>>>
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 18/27] powerpc/powernv/pmem: Add controller dump IOCTLs
  2020-03-03 18:04   ` Frederic Barrat
@ 2020-03-05 23:37     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-03-05 23:37 UTC (permalink / raw)
  To: Frederic Barrat
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Tue, 2020-03-03 at 19:04 +0100, Frederic Barrat wrote:
> 
> Le 21/02/2020 à 04:27, Alastair D'Silva a écrit :
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > This patch adds IOCTLs to allow userspace to request & fetch dumps
> > of the internal controller state.
> > 
> > This is useful during debugging or when a fatal error on the
> > controller
> > has occurred.
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   arch/powerpc/platforms/powernv/pmem/ocxl.c | 132
> > +++++++++++++++++++++
> >   include/uapi/nvdimm/ocxl-pmem.h            |  15 +++
> >   2 files changed, 147 insertions(+)
> > 
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > index 2b64504f9129..2cabafe1fc58 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > @@ -640,6 +640,124 @@ static int ioctl_error_log(struct ocxlpmem
> > *ocxlpmem,
> >   	return 0;
> >   }
> >   
> > +static int ioctl_controller_dump_data(struct ocxlpmem *ocxlpmem,
> > +		struct ioctl_ocxl_pmem_controller_dump_data __user
> > *uarg)
> > +{
> > +	struct ioctl_ocxl_pmem_controller_dump_data args;
> > +	u16 i;
> > +	u64 val;
> > +	int rc;
> > +
> > +	if (copy_from_user(&args, uarg, sizeof(args)))
> > +		return -EFAULT;
> > +
> > +	if (args.buf_size % 8)
> > +		return -EINVAL;
> > +
> > +	if (args.buf_size > ocxlpmem->admin_command.data_size)
> > +		return -EINVAL;
> > +
> > +	mutex_lock(&ocxlpmem->admin_command.lock);
> > +
> > +	rc = admin_command_request(ocxlpmem,
> > ADMIN_COMMAND_CONTROLLER_DUMP);
> > +	if (rc)
> > +		goto out;
> > +
> > +	val = ((u64)args.offset) << 32;
> > +	val |= args.buf_size;
> > +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > +				      ocxlpmem->admin_command.request_offset + 0x08,
> > +				      OCXL_LITTLE_ENDIAN, val);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = admin_command_execute(ocxlpmem);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = admin_command_complete_timeout(ocxlpmem,
> > +					    ADMIN_COMMAND_CONTROLLER_DUMP);
> > +	if (rc < 0) {
> > +		dev_warn(&ocxlpmem->dev, "Controller dump timed
> > out\n");
> > +		goto out;
> > +	}
> > +
> > +	rc = admin_response(ocxlpmem);
> > +	if (rc < 0)
> > +		goto out;
> > +	if (rc != STATUS_SUCCESS) {
> > +		warn_status(ocxlpmem,
> > +			    "Unexpected status from retrieve error
> > log",
> > +			    rc);
> > +		goto out;
> > +	}
> 
> 
> It would help if there was a comment indicating how the 3 ioctls are
> used. My understanding is that userland:
> - requests that the controller prepare a state dump
> - then issues one or more ioctls to fetch the data. The number of calls
>   required to get the full state depends on the size of the buffer
>   passed by the user
> - issues a last ioctl to tell the controller that we're done,
>   presumably to let it free some resources.
> 

Ok, will add it to the blurb.
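
The fetch step of that sequence can be sketched in userspace C as below. This is only an illustration under stated assumptions: fetch_chunk() is a hypothetical stand-in for IOCTL_OCXL_PMEM_CONTROLLER_DUMP_DATA, struct dump_source stands in for the controller, and the "0 bytes means no more dump data" convention is taken from the comment on buf_size in the uapi header. The request and completion ioctls would bracket the loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the controller's dump contents. */
struct dump_source {
	const uint8_t *data;
	size_t len;
};

/* Hypothetical stand-in for the "dump data" ioctl: copies up to
 * buf_size bytes starting at offset, returns the bytes provided.
 * A return of 0 means there is no more dump data. */
static size_t fetch_chunk(const struct dump_source *src, uint32_t offset,
			  uint8_t *buf, uint16_t buf_size)
{
	size_t n;

	if (offset >= src->len)
		return 0;

	n = src->len - offset;
	if (n > buf_size)
		n = buf_size;

	memcpy(buf, src->data + offset, n);
	return n;
}

/* Fetch the dump in 8-byte chunks until the source reports it is
 * exhausted or the output buffer is full. Returns the bytes collected. */
static size_t collect_dump(const struct dump_source *src,
			   uint8_t *out, size_t out_cap)
{
	uint8_t chunk[8];
	size_t total = 0;

	for (;;) {
		size_t n = fetch_chunk(src, (uint32_t)total,
				       chunk, sizeof(chunk));

		if (n == 0 || total + n > out_cap)
			break;

		memcpy(out + total, chunk, n);
		total += n;
	}

	return total;
}
```

The number of iterations depends only on the chunk size chosen by the caller, which matches the reading that a larger user buffer means fewer ioctl calls.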
> 
> > +
> > +	for (i = 0; i < args.buf_size; i += 8) {
> > +		u64 val;
> > +
> > +		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +					     ocxlpmem->admin_command.data_offset + i,
> > +					     OCXL_HOST_ENDIAN, &val);
> > +		if (rc)
> > +			goto out;
> > +
> > +		if (copy_to_user(&args.buf[i], &val, sizeof(u64))) {
> > +			rc = -EFAULT;
> > +			goto out;
> > +		}
> > +	}
> > +
> > +	if (copy_to_user(uarg, &args, sizeof(args))) {
> > +		rc = -EFAULT;
> > +		goto out;
> > +	}
> > +
> > +	rc = admin_response_handled(ocxlpmem);
> > +	if (rc)
> > +		goto out;
> > +
> > +out:
> > +	mutex_unlock(&ocxlpmem->admin_command.lock);
> > +	return rc;
> > +}
> > +
> > +int request_controller_dump(struct ocxlpmem *ocxlpmem)
> > +{
> > +	int rc;
> > +	u64 busy = 1;
> > +
> > +	rc = ocxl_global_mmio_set64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_CHIC,
> > +				    OCXL_LITTLE_ENDIAN,
> > +				    GLOBAL_MMIO_CHI_CDA);
> > +
> 
> rc is not checked here.

Whoops

> 
> 
> > +
> > +	rc = ocxl_global_mmio_set64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_HCI,
> > +				    OCXL_LITTLE_ENDIAN,
> > +				    GLOBAL_MMIO_HCI_CONTROLLER_DUMP);
> > +	if (rc)
> > +		return rc;
> > +
> > +	while (busy) {
> > +		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +					     GLOBAL_MMIO_HCI,
> > +					     OCXL_LITTLE_ENDIAN,
> > &busy);
> > +		if (rc)
> > +			return rc;
> > +
> > +		busy &= GLOBAL_MMIO_HCI_CONTROLLER_DUMP;
> 
> Setting 'busy' doesn't hurt, but it's not really useful, is it?
> 
> We should add some kind of timeout so that if the controller hits an 
> issue, we don't spin in kernel space endlessly.
> 
> 

Here we are polling the controller dump bit of the HCI register until
the controller clears it - that line is masking off the bits we don't
care about.

I'll talk to the firmware team about adding a timeout for that to the
spec so we know how long to wait before giving up.
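
A bounded version of that polling loop could look roughly like the userspace sketch below. Everything here is an assumption made for illustration: the bit position of the dump flag, the read_reg_fn callback, the fake controller, and the use of a poll budget in place of a real time limit. In the driver, the bound would presumably come from a timeout defined by the spec and be checked against elapsed time rather than a counter.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical bit position for the "controller dump" request flag. */
#define SKETCH_HCI_CONTROLLER_DUMP (1ULL << 0)

/* Hypothetical register accessor: returns 0 on success and stores the
 * register contents in *val. */
typedef int (*read_reg_fn)(void *ctx, uint64_t *val);

/* Poll until the controller clears the dump bit, giving up after
 * max_polls reads instead of spinning forever. */
static int wait_for_dump_bit_clear(read_reg_fn read_reg, void *ctx,
				   unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		uint64_t hci;
		int rc = read_reg(ctx, &hci);

		if (rc)
			return rc;

		/* Mask off the bits we don't care about. */
		if (!(hci & SKETCH_HCI_CONTROLLER_DUMP))
			return 0;
	}

	return -ETIMEDOUT;
}

/* Fake controller that keeps the bit set for a fixed number of reads. */
struct fake_controller {
	unsigned int polls_until_done;
};

static int fake_read_hci(void *ctx, uint64_t *val)
{
	struct fake_controller *c = ctx;

	*val = 0xF0; /* unrelated bits that must be ignored */
	if (c->polls_until_done > 0) {
		c->polls_until_done--;
		*val |= SKETCH_HCI_CONTROLLER_DUMP;
	}
	return 0;
}
```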

> 
> > +		cond_resched();
> > +	}
> > +
> > +	return 0;
> > +}

> > +
> > +static int ioctl_controller_dump_complete(struct ocxlpmem
> > *ocxlpmem)
> > +{
> > +	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_HCI,
> > +				    OCXL_LITTLE_ENDIAN,
> > +				    GLOBAL_MMIO_HCI_CONTROLLER_DUMP_COLLECTED);
> > +}
> > +
> >   static long file_ioctl(struct file *file, unsigned int cmd,
> > unsigned long args)
> >   {
> >   	struct ocxlpmem *ocxlpmem = file->private_data;
> > @@ -650,7 +768,21 @@ static long file_ioctl(struct file *file,
> > unsigned int cmd, unsigned long args)
> >   		rc = ioctl_error_log(ocxlpmem,
> >   				     (struct ioctl_ocxl_pmem_error_log
> > __user *)args);
> >   		break;
> > +
> > +	case IOCTL_OCXL_PMEM_CONTROLLER_DUMP:
> > +		rc = request_controller_dump(ocxlpmem);
> > +		break;
> > +
> > +	case IOCTL_OCXL_PMEM_CONTROLLER_DUMP_DATA:
> > +		rc = ioctl_controller_dump_data(ocxlpmem,
> > +						(struct
> > ioctl_ocxl_pmem_controller_dump_data __user *)args);
> > +		break;
> > +
> > +	case IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE:
> > +		rc = ioctl_controller_dump_complete(ocxlpmem);
> > +		break;
> >   	}
> > +
> >   	return rc;
> >   }
> >   
> > diff --git a/include/uapi/nvdimm/ocxl-pmem.h
> > b/include/uapi/nvdimm/ocxl-pmem.h
> > index b10f8ac0c20f..d4d8512d03f7 100644
> > --- a/include/uapi/nvdimm/ocxl-pmem.h
> > +++ b/include/uapi/nvdimm/ocxl-pmem.h
> > @@ -38,9 +38,24 @@ struct ioctl_ocxl_pmem_error_log {
> >   	__u8 *buf; /* pointer to output buffer */
> >   };
> >   
> > +struct ioctl_ocxl_pmem_controller_dump_data {
> > +	__u8 *buf; /* pointer to output buffer */
> 
> We only support 64-bit user apps on powerpc, but using a pointer type
> in a kernel ABI is unusual. We should use a known size like __u64.
> (This also applies to the buf pointer in struct
> ioctl_ocxl_pmem_error_log from the previous patch.)
> 
> The rest of the structure will also be padded by the compiler, which
> we should avoid.
> 
>     Fred
> 

Ok, I'll coerce the pointers into a __u64.
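
One possible shape for such a structure is sketched below in userspace C. It is not the layout from any later revision of the patch: the struct name, the field order, and the reserved0 field are assumptions chosen so that every member is naturally aligned and the compiler has no implicit padding to insert. The static assertions pin the layout down so that a change would fail at compile time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: fixed-width members only, ordered so that no
 * implicit padding is needed. */
struct dump_data_args_sketch {
	uint64_t buf;        /* user pointer carried as a 64-bit integer */
	uint32_t offset;     /* in, offset within the dump */
	uint16_t buf_size;   /* in/out, buffer size provided/required */
	uint16_t reserved0;  /* explicit padding */
	uint64_t reserved[8];
};

_Static_assert(offsetof(struct dump_data_args_sketch, offset) == 8,
	       "offset must follow buf directly");
_Static_assert(offsetof(struct dump_data_args_sketch, reserved) == 16,
	       "no hidden padding before reserved[]");
_Static_assert(sizeof(struct dump_data_args_sketch) == 80,
	       "structure size is part of the ABI");

/* Convert between a pointer and its fixed-width representation. */
static inline uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

static inline void *u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}
```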

> 
> 
> > +	__u16 buf_size; /* in/out, buffer size provided/required.
> > +			 * If required is greater than provided, the
> > buffer
> > +			 * will be truncated to the amount provided. If
> > its
> > +			 * less, then only the required bytes will be
> > populated.
> > +			 * If it is 0, then there is no more dump data
> > available.
> > +			 */
> > +	__u32 offset; /* in, Offset within the dump */
> > +	__u64 reserved[8];
> > +};
> > +
> >   /* ioctl numbers */
> >   #define OCXL_PMEM_MAGIC 0x5C
> >   /* SCM devices */
> >   #define IOCTL_OCXL_PMEM_ERROR_LOG			_IOWR(OCXL_PMEM_MAGIC, 0x01, struct ioctl_ocxl_pmem_error_log)
> > +#define IOCTL_OCXL_PMEM_CONTROLLER_DUMP			_IO(OCXL_PMEM_MAGIC, 0x02)
> > +#define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_DATA		_IOWR(OCXL_PMEM_MAGIC, 0x03, struct ioctl_ocxl_pmem_controller_dump_data)
> > +#define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE	_IO(OCXL_PMEM_MAGIC, 0x04)
> >   
> >   #endif /* _UAPI_OCXL_SCM_H */
> > 
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 18/27] powerpc/powernv/pmem: Add controller dump IOCTLs
  2020-03-04  6:53   ` Andrew Donnellan
@ 2020-03-06  3:34     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-03-06  3:34 UTC (permalink / raw)
  To: Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Wed, 2020-03-04 at 17:53 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > +static int ioctl_controller_dump_data(struct ocxlpmem *ocxlpmem,
> > +		struct ioctl_ocxl_pmem_controller_dump_data __user
> > *uarg)
> > +{
> > +	struct ioctl_ocxl_pmem_controller_dump_data args;
> > +	u16 i;
> > +	u64 val;
> > +	int rc;
> > +
> > +	if (copy_from_user(&args, uarg, sizeof(args)))
> > +		return -EFAULT;
> > +
> > +	if (args.buf_size % 8)
> > +		return -EINVAL;
> > +
> > +	if (args.buf_size > ocxlpmem->admin_command.data_size)
> > +		return -EINVAL;
> > +
> > +	mutex_lock(&ocxlpmem->admin_command.lock);
> > +
> > +	rc = admin_command_request(ocxlpmem,
> > ADMIN_COMMAND_CONTROLLER_DUMP);
> > +	if (rc)
> > +		goto out;
> > +
> > +	val = ((u64)args.offset) << 32;
> > +	val |= args.buf_size;
> > +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > +				      ocxlpmem->admin_command.request_offset + 0x08,
> > +				      OCXL_LITTLE_ENDIAN, val);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = admin_command_execute(ocxlpmem);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = admin_command_complete_timeout(ocxlpmem,
> > +					    ADMIN_COMMAND_CONTROLLER_DUMP);
> > +	if (rc < 0) {
> > +		dev_warn(&ocxlpmem->dev, "Controller dump timed
> > out\n");
> > +		goto out;
> > +	}
> > +
> > +	rc = admin_response(ocxlpmem);
> > +	if (rc < 0)
> > +		goto out;
> > +	if (rc != STATUS_SUCCESS) {
> > +		warn_status(ocxlpmem,
> > +			    "Unexpected status from retrieve error
> > log",
> 
> Controller dump
> 

Ok

> > +			    rc);
> > +		goto out;
> > +	}
> > +
> > +	for (i = 0; i < args.buf_size; i += 8) {
> > +		u64 val;
> > +
> > +		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +					     ocxlpmem->admin_command.data_offset + i,
> > +					     OCXL_HOST_ENDIAN, &val);
> 
> Is a controller dump something where we want to do endian swapping?
> 

No, we just have raw binary data that we want to pass through.
OCXL_HOST_ENDIAN does no swapping.
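
The difference between the two access modes can be shown with a small userspace sketch. The helper names are hypothetical: load_le64() interprets eight bytes as a little-endian value (what a register read wants), while copy_raw64() moves eight bytes through a 64-bit word without interpreting them (what a raw dump wants), so the byte order seen by the consumer is unchanged on any host.

```c
#include <stdint.h>
#include <string.h>

/* Interpret 8 bytes as a little-endian 64-bit value, independent of
 * the host byte order. */
static uint64_t load_le64(const uint8_t b[8])
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | b[i];

	return v;
}

/* Move 8 bytes through a 64-bit word with no byte swapping: the
 * destination bytes match the source bytes exactly. */
static void copy_raw64(uint8_t dst[8], const uint8_t src[8])
{
	uint64_t word;

	memcpy(&word, src, sizeof(word));
	memcpy(dst, &word, sizeof(word));
}
```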

> Any reason we're not doing the usual check of the data identifier, 
> additional data length etc?
> 

I'll add that

> > +		if (rc)
> > +			goto out;
> > +
> > +		if (copy_to_user(&args.buf[i], &val, sizeof(u64))) {
> > +			rc = -EFAULT;
> > +			goto out;
> > +		}
> > +	}
> > +
> > +	if (copy_to_user(uarg, &args, sizeof(args))) {
> > +		rc = -EFAULT;
> > +		goto out;
> > +	}
> > +
> > +	rc = admin_response_handled(ocxlpmem);
> > +	if (rc)
> > +		goto out;
> > +
> > +out:
> > +	mutex_unlock(&ocxlpmem->admin_command.lock);
> > +	return rc;
> > +}
> > +
> > +int request_controller_dump(struct ocxlpmem *ocxlpmem)
> > +{
> > +	int rc;
> > +	u64 busy = 1;
> > +
> > +	rc = ocxl_global_mmio_set64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_CHIC,
> > +				    OCXL_LITTLE_ENDIAN,
> > +				    GLOBAL_MMIO_CHI_CDA);
> 
> This return code is ignored
> 
> > +
> > +
> > +	rc = ocxl_global_mmio_set64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_HCI,
> > +				    OCXL_LITTLE_ENDIAN,
> > +				    GLOBAL_MMIO_HCI_CONTROLLER_DUMP);
> > +	if (rc)
> > +		return rc;
> > +
> > +	while (busy) {
> > +		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +					     GLOBAL_MMIO_HCI,
> > +					     OCXL_LITTLE_ENDIAN,
> > &busy);
> > +		if (rc)
> > +			return rc;
> > +
> > +		busy &= GLOBAL_MMIO_HCI_CONTROLLER_DUMP;
> > +		cond_resched();
> > +	}
> > +
> > +	return 0;
> > +}
> 
> 
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

^ permalink raw reply	[flat|nested] 130+ messages in thread

* Re: [PATCH v3 20/27] powerpc/powernv/pmem: Forward events to userspace
  2020-03-04 11:00   ` Frederic Barrat
@ 2020-03-11  3:32     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-03-11  3:32 UTC (permalink / raw)
  To: Frederic Barrat
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Wed, 2020-03-04 at 12:00 +0100, Frederic Barrat wrote:
> 
> Le 21/02/2020 à 04:27, Alastair D'Silva a écrit :
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > Some of the interrupts that the card generates are better handled
> > by the userspace daemon, in particular:
> > Controller Hardware/Firmware Fatal
> > Controller Dump Available
> > Error Log available
> > 
> > This patch allows a userspace application to register an eventfd
> > with
> > the driver via SCM_IOCTL_EVENTFD to receive notifications of these
> > interrupts.
> > 
> > Userspace can then identify what events have occurred by calling
> > SCM_IOCTL_EVENT_CHECK and checking against the SCM_IOCTL_EVENT_FOO
> > masks.
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   arch/powerpc/platforms/powernv/pmem/ocxl.c    | 216
> > ++++++++++++++++++
> >   .../platforms/powernv/pmem/ocxl_internal.h    |   5 +
> >   include/uapi/nvdimm/ocxl-pmem.h               |  16 ++
> >   3 files changed, 237 insertions(+)
> > 
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > index 009d4fd29e7d..e46696d3cc36 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > @@ -10,6 +10,7 @@
> >   #include <misc/ocxl.h>
> >   #include <linux/delay.h>
> >   #include <linux/ndctl.h>
> > +#include <linux/eventfd.h>
> >   #include <linux/fs.h>
> >   #include <linux/mm_types.h>
> >   #include <linux/memory_hotplug.h>
> > @@ -335,11 +336,22 @@ static void free_ocxlpmem(struct ocxlpmem
> > *ocxlpmem)
> >   {
> >   	int rc;
> >   
> > +	// Disable doorbells
> > +	(void)ocxl_global_mmio_set64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_CHIEC,
> > +				     OCXL_LITTLE_ENDIAN,
> > +				     GLOBAL_MMIO_CHI_ALL);
> > +
> >   	if (ocxlpmem->nvdimm_bus)
> >   		nvdimm_bus_unregister(ocxlpmem->nvdimm_bus);
> >   
> >   	free_minor(ocxlpmem);
> >   
> > +	if (ocxlpmem->irq_addr[1])
> > +		iounmap(ocxlpmem->irq_addr[1]);
> > +
> > +	if (ocxlpmem->irq_addr[0])
> > +		iounmap(ocxlpmem->irq_addr[0]);
> > +
> >   	if (ocxlpmem->cdev.owner)
> >   		cdev_del(&ocxlpmem->cdev);
> >   
> > @@ -443,6 +455,11 @@ static int file_release(struct inode *inode,
> > struct file *file)
> >   {
> >   	struct ocxlpmem *ocxlpmem = file->private_data;
> >   
> > +	if (ocxlpmem->ev_ctx) {
> > +		eventfd_ctx_put(ocxlpmem->ev_ctx);
> > +		ocxlpmem->ev_ctx = NULL;
> > +	}
> > +
> >   	ocxlpmem_put(ocxlpmem);
> >   	return 0;
> >   }
> > @@ -938,6 +955,51 @@ static int ioctl_controller_stats(struct
> > ocxlpmem *ocxlpmem,
> >   	return rc;
> >   }
> >   
> > +static int ioctl_eventfd(struct ocxlpmem *ocxlpmem,
> > +		 struct ioctl_ocxl_pmem_eventfd __user *uarg)
> > +{
> > +	struct ioctl_ocxl_pmem_eventfd args;
> > +
> > +	if (copy_from_user(&args, uarg, sizeof(args)))
> > +		return -EFAULT;
> > +
> > +	if (ocxlpmem->ev_ctx)
> > +		return -EINVAL;
> 
> EBUSY?
> 
Ok

> 
> > +
> > +	ocxlpmem->ev_ctx = eventfd_ctx_fdget(args.eventfd);
> > +	if (!ocxlpmem->ev_ctx)
> > +		return -EFAULT;
> 
> Why not use what eventfd_ctx_fdget() returned? (through some
> IS_ERR() and PTR_ERR() convolution)
> 

Ok
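
The two review points on ioctl_eventfd() (return -EBUSY when a context is already registered, and propagate the error encoded in the returned pointer) can be sketched in userspace as follows. The sketch_* helpers only imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() idiom, and lookup_ctx() is a hypothetical stand-in for eventfd_ctx_fdget(); none of this is the driver's code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace imitation of the kernel's error-pointer idiom: small
 * negative errno values are encoded at the very top of the address
 * space, where no valid object can live. */
#define SKETCH_MAX_ERRNO 4095

static inline void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline bool sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

static inline long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Hypothetical lookup: fails with an encoded -EBADF for a bad fd. */
static void *lookup_ctx(int fd, void *valid_ctx)
{
	return fd < 0 ? sketch_err_ptr(-EBADF) : valid_ctx;
}

/* Register a context in *slot, reporting -EBUSY if one is already
 * present and otherwise passing through the lookup's own error. */
static int register_ctx(int fd, void *valid_ctx, void **slot)
{
	void *ctx;

	if (*slot)
		return -EBUSY;

	ctx = lookup_ctx(fd, valid_ctx);
	if (sketch_is_err(ctx))
		return (int)sketch_ptr_err(ctx);

	*slot = ctx;
	return 0;
}
```

Storing the result in a local first also avoids leaving an error-encoded pointer in the long-lived slot when the lookup fails.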
> 
> > +
> > +	return 0;
> > +}
> > +
> > +static int ioctl_event_check(struct ocxlpmem *ocxlpmem, u64 __user
> > *uarg)
> > +{
> > +	u64 val = 0;
> > +	int rc;
> > +	u64 chi = 0;
> > +
> > +	rc = ocxlpmem_chi(ocxlpmem, &chi);
> > +	if (rc < 0)
> > +		return rc;
> > +
> > +	if (chi & GLOBAL_MMIO_CHI_ELA)
> > +		val |= IOCTL_OCXL_PMEM_EVENT_ERROR_LOG_AVAILABLE;
> > +
> > +	if (chi & GLOBAL_MMIO_CHI_CDA)
> > +		val |= IOCTL_OCXL_PMEM_EVENT_CONTROLLER_DUMP_AVAILABLE;
> > +
> > +	if (chi & GLOBAL_MMIO_CHI_CFFS)
> > +		val |= IOCTL_OCXL_PMEM_EVENT_FIRMWARE_FATAL;
> > +
> > +	if (chi & GLOBAL_MMIO_CHI_CHFS)
> > +		val |= IOCTL_OCXL_PMEM_EVENT_HARDWARE_FATAL;
> > +
> > +	rc = copy_to_user((u64 __user *) uarg, &val, sizeof(val));
> > +
> 
> copy_to_user doesn't return an errno. Should be:
> 
> if (copy_to_user((u64 __user *) uarg, &val, sizeof(val)))
> 	return -EFAULT;
> 
Ok
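
The point about copy_to_user() is that it returns the number of bytes that could not be copied, not an errno. A userspace sketch of the conversion is below; fake_copy_out() is a hypothetical stand-in that mimics only that return convention, and report_events() is an illustrative wrapper, not the driver's function.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for copy_to_user(): copies at most `writable`
 * bytes and returns the number of bytes that could NOT be copied. */
static size_t fake_copy_out(void *dst, size_t writable,
			    const void *src, size_t len)
{
	size_t n = len < writable ? len : writable;

	memcpy(dst, src, n);
	return len - n;
}

/* Any non-zero remainder is turned into -EFAULT rather than being
 * returned to the caller as if it were an error code. */
static int report_events(void *dst, size_t writable,
			 unsigned long long events)
{
	if (fake_copy_out(dst, writable, &events, sizeof(events)))
		return -EFAULT;

	return 0;
}
```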

> 
> > +	return rc;
> > +}
> > +
> >   static long file_ioctl(struct file *file, unsigned int cmd,
> > unsigned long args)
> >   {
> >   	struct ocxlpmem *ocxlpmem = file->private_data;
> > @@ -966,6 +1028,15 @@ static long file_ioctl(struct file *file,
> > unsigned int cmd, unsigned long args)
> >   		rc = ioctl_controller_stats(ocxlpmem,
> >   					    (struct
> > ioctl_ocxl_pmem_controller_stats __user *)args);
> >   		break;
> > +
> > +	case IOCTL_OCXL_PMEM_EVENTFD:
> > +		rc = ioctl_eventfd(ocxlpmem,
> > +				   (struct ioctl_ocxl_pmem_eventfd
> > __user *)args);
> > +		break;
> > +
> > +	case IOCTL_OCXL_PMEM_EVENT_CHECK:
> > +		rc = ioctl_event_check(ocxlpmem, (u64 __user *)args);
> > +		break;
> >   	}
> >   
> >   	return rc;
> > @@ -1107,6 +1178,146 @@ static void dump_error_log(struct ocxlpmem
> > *ocxlpmem)
> >   	kfree(buf);
> >   }
> >   
> > +static irqreturn_t imn0_handler(void *private)
> > +{
> > +	struct ocxlpmem *ocxlpmem = private;
> > +	u64 chi = 0;
> > +
> > +	(void)ocxlpmem_chi(ocxlpmem, &chi);
> > +
> > +	if (chi & GLOBAL_MMIO_CHI_ELA) {
> > +		dev_warn(&ocxlpmem->dev, "Error log is available\n");
> > +
> > +		if (ocxlpmem->ev_ctx)
> > +			eventfd_signal(ocxlpmem->ev_ctx, 1);
> > +	}
> > +
> > +	if (chi & GLOBAL_MMIO_CHI_CDA) {
> > +		dev_warn(&ocxlpmem->dev, "Controller dump is
> > available\n");
> > +
> > +		if (ocxlpmem->ev_ctx)
> > +			eventfd_signal(ocxlpmem->ev_ctx, 1);
> > +	}
> > +
> > +
> 
> (at least) one empty line too many.
> 

Ok

> 
> > +	return IRQ_HANDLED;
> > +}
> > +
> > +static irqreturn_t imn1_handler(void *private)
> > +{
> > +	struct ocxlpmem *ocxlpmem = private;
> > +	u64 chi = 0;
> > +
> > +	(void)ocxlpmem_chi(ocxlpmem, &chi);
> > +
> > +	if (chi & (GLOBAL_MMIO_CHI_CFFS | GLOBAL_MMIO_CHI_CHFS)) {
> > +		dev_err(&ocxlpmem->dev,
> > +			"Controller status is fatal, chi=0x%llx, going
> > offline\n", chi);
> > +
> > +		if (ocxlpmem->nvdimm_bus) {
> > +			nvdimm_bus_unregister(ocxlpmem->nvdimm_bus);
> > +			ocxlpmem->nvdimm_bus = NULL;
> > +		}
> > +
> > +		if (ocxlpmem->ev_ctx)
> > +			eventfd_signal(ocxlpmem->ev_ctx, 1);
> > +	}
> > +
> > +	return IRQ_HANDLED;
> > +}
> > +
> > +
> > +/**
> > + * ocxlpmem_setup_irq() - Set up the IRQs for the OpenCAPI
> > Persistent Memory device
> > + * @ocxlpmem: the device metadata
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int ocxlpmem_setup_irq(struct ocxlpmem *ocxlpmem)
> > +{
> > +	int rc;
> > +	u64 irq_addr;
> > +
> > +	rc = ocxl_afu_irq_alloc(ocxlpmem->ocxl_context,
> > +				&ocxlpmem->irq_id[0]);
> > +	if (rc)
> > +		return rc;
> > +
> > +	rc = ocxl_irq_set_handler(ocxlpmem->ocxl_context,
> > +				  ocxlpmem->irq_id[0],
> > +				  imn0_handler, NULL, ocxlpmem);
> > +
> > +	irq_addr = ocxl_afu_irq_get_addr(ocxlpmem->ocxl_context,
> > ocxlpmem->irq_id[0]);
> > +	if (!irq_addr)
> > +		return -EINVAL;
> > +
> > +	ocxlpmem->irq_addr[0] = ioremap(irq_addr, PAGE_SIZE);
> > +	if (!ocxlpmem->irq_addr[0])
> > +		return -EINVAL;
> > +
> > +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_IMA0_OHP,
> > +				      OCXL_LITTLE_ENDIAN,
> > +				      (u64)ocxlpmem->irq_addr[0]);
> > +	if (rc)
> > +		goto out_irq0;
> > +
> > +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_IMA0_CFP,
> > +				      OCXL_LITTLE_ENDIAN, 0);
> > +	if (rc)
> > +		goto out_irq0;
> 
> That's a few lines of duplicate code. On the other hand, there are
> enough varying parameters between the 2 interrupts that factoring
> them into a subfunction would be slightly less readable. So
> duplicating is probably ok.
> 
> 
> 
> > +	rc = ocxl_afu_irq_alloc(ocxlpmem->ocxl_context,
> > +				&ocxlpmem->irq_id[1]);
> > +	if (rc)
> > +		goto out_irq0;
> > +
> > +
> > +	rc = ocxl_irq_set_handler(ocxlpmem->ocxl_context,
> > +				  ocxlpmem->irq_id[1],
> > +				  imn1_handler, NULL, ocxlpmem);
> > +	if (rc)
> > +		goto out_irq0;
> > +
> > +	irq_addr = ocxl_afu_irq_get_addr(ocxlpmem->ocxl_context,
> > ocxlpmem->irq_id[1]);
> > +	if (!irq_addr) {
> > +		rc = -EFAULT;
> > +		goto out_irq0;
> > +	}
> > +
> > +	ocxlpmem->irq_addr[1] = ioremap(irq_addr, PAGE_SIZE);
> > +	if (!ocxlpmem->irq_addr[1]) {
> > +		rc = -EINVAL;
> > +		goto out_irq0;
> > +	}
> > +
> > +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_IMA1_OHP,
> > +				      OCXL_LITTLE_ENDIAN,
> > +				      (u64)ocxlpmem->irq_addr[1]);
> > +	if (rc)
> > +		goto out_irq1;
> > +
> > +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_IMA1_CFP,
> > +				      OCXL_LITTLE_ENDIAN, 0);
> > +	if (rc)
> > +		goto out_irq1;
> > +
> > +	// Enable doorbells
> > +	rc = ocxl_global_mmio_set64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_CHIE,
> > +				    OCXL_LITTLE_ENDIAN,
> > +				    GLOBAL_MMIO_CHI_ELA |
> > GLOBAL_MMIO_CHI_CDA |
> > +				    GLOBAL_MMIO_CHI_CFFS |
> > GLOBAL_MMIO_CHI_CHFS |
> > +				    GLOBAL_MMIO_CHI_NSCRA);
> 
> GLOBAL_MMIO_CHI_NSCRA doesn't seem to be handled in the handlers.
> 

This will be moved to the overwrite patch.

> 
> 
> > +	if (rc)
> > +		goto out_irq1;
> > +
> > +	return 0;
> > +
> > +out_irq1:
> > +	iounmap(ocxlpmem->irq_addr[1]);
> > +	ocxlpmem->irq_addr[1] = NULL;
> > +
> > +out_irq0:
> > +	iounmap(ocxlpmem->irq_addr[0]);
> > +	ocxlpmem->irq_addr[0] = NULL;
> > +
> > +	return rc;
> > +}
> > +
> >   /**
> >    * probe_function0() - Set up function 0 for an OpenCAPI
> > persistent memory device
> >    * This is important as it enables templates higher than 0 across
> > all other functions,
> > @@ -1216,6 +1427,11 @@ static int probe(struct pci_dev *pdev, const
> > struct pci_device_id *ent)
> >   		goto err;
> >   	}
> >   
> > +	if (ocxlpmem_setup_irq(ocxlpmem)) {
> > +		dev_err(&pdev->dev, "Could not set up OCXL IRQs\n");
> 
> Like with other patches, rc needs to be set.
> 
ok

> 
> > +		goto err;
> > +	}
> > +
> >   	if (setup_command_metadata(ocxlpmem)) {
> >   		dev_err(&pdev->dev, "Could not read OCXL command
> > matada\n");
> >   		goto err;
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > index b953ee522ed4..927690f4888f 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> > @@ -103,6 +103,10 @@ struct ocxlpmem {
> >   	struct pci_dev *pdev;
> >   	struct cdev cdev;
> >   	struct ocxl_fn *ocxl_fn;
> > +#define SCM_IRQ_COUNT 2
> > +	int irq_id[SCM_IRQ_COUNT];
> > +	struct dev_pagemap irq_pgmap[SCM_IRQ_COUNT];
> 
> irq_pgmap is not used.

Ok
> 
> 
> > +	void *irq_addr[SCM_IRQ_COUNT];
> >   	struct nd_interleave_set nd_set;
> >   	struct nvdimm_bus_descriptor bus_desc;
> >   	struct nvdimm_bus *nvdimm_bus;
> > @@ -113,6 +117,7 @@ struct ocxlpmem {
> >   	struct command_metadata ns_command;
> >   	struct resource pmem_res;
> >   	struct nd_region *nd_region;
> > +	struct eventfd_ctx *ev_ctx;
> >   	char fw_version[8+1];
> >   	u32 timeouts[ADMIN_COMMAND_MAX+1];
> >   
> > diff --git a/include/uapi/nvdimm/ocxl-pmem.h
> > b/include/uapi/nvdimm/ocxl-pmem.h
> > index add223aa2fdb..988eb0bc413d 100644
> > --- a/include/uapi/nvdimm/ocxl-pmem.h
> > +++ b/include/uapi/nvdimm/ocxl-pmem.h
> > @@ -66,6 +66,20 @@ struct ioctl_ocxl_pmem_controller_stats {
> >   	__u64 cache_write_latency; /* nanoseconds */
> >   };
> >   
> > +struct ioctl_ocxl_pmem_eventfd {
> > +	__s32 eventfd;
> > +	__u32 reserved;
> > +};
> > +
> > +#ifndef BIT_ULL
> > +#define BIT_ULL(nr)	(1ULL << (nr))
> > +#endif
> > +
> > +#define IOCTL_OCXL_PMEM_EVENT_CONTROLLER_DUMP_AVAILABLE	BIT_ULL(0)
> > +#define IOCTL_OCXL_PMEM_EVENT_ERROR_LOG_AVAILABLE	BIT_ULL(1)
> > +#define IOCTL_OCXL_PMEM_EVENT_HARDWARE_FATAL		BIT_ULL(2)
> > +#define IOCTL_OCXL_PMEM_EVENT_FIRMWARE_FATAL		BIT_ULL(3)
> > +
> 
> I'm not fond of adding a macro with such a generic name as BIT_ULL()
> in 
> a user header file. What's wrong with:
> 
> #define IOCTL_OCXL_PMEM_EVENT_CONTROLLER_DUMP_AVAILABLE	0x1
> #define IOCTL_OCXL_PMEM_EVENT_ERROR_LOG_AVAILABLE	0x2
> #define IOCTL_OCXL_PMEM_EVENT_HARDWARE_FATAL		0x4
> #define IOCTL_OCXL_PMEM_EVENT_FIRMWARE_FATAL		0x8
> 
> 

Nothing, I'll change it.

>    Fred
> 
> 
> >   /* ioctl numbers */
> >   #define OCXL_PMEM_MAGIC 0x5C
> >   /* SCM devices */
> > @@ -74,5 +88,7 @@ struct ioctl_ocxl_pmem_controller_stats {
> >   #define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_DATA		_IOWR(O
> > CXL_PMEM_MAGIC, 0x03, struct ioctl_ocxl_pmem_controller_dump_data)
> >   #define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE	_IO(OCXL_PMEM_M
> > AGIC, 0x04)
> >   #define IOCTL_OCXL_PMEM_CONTROLLER_STATS		_IO(OCXL_PMEM_M
> > AGIC, 0x05)
> > +#define IOCTL_OCXL_PMEM_EVENTFD				_IOW(OC
> > XL_PMEM_MAGIC, 0x06, struct ioctl_ocxl_pmem_eventfd)
> > +#define IOCTL_OCXL_PMEM_EVENT_CHECK			_IOR(OC
> > XL_PMEM_MAGIC, 0x07, __u64)
> >   
> >   #endif /* _UAPI_OCXL_SCM_H */
> > 
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org


* Re: [PATCH v3 21/27] powerpc/powernv/pmem: Add an IOCTL to request controller health & perf data
  2020-03-04 11:06     ` Frederic Barrat
@ 2020-03-11  3:38       ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-03-11  3:38 UTC (permalink / raw)
  To: Frederic Barrat, Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Arnd Bergmann, Greg Kroah-Hartman,
	Andrew Morton, Mauro Carvalho Chehab, David S. Miller,
	Rob Herring, Anton Blanchard, Krzysztof Kozlowski,
	Mahesh Salgaonkar, Madhavan Srinivasan, Cédric Le Goater,
	Anju T Sudhakar, Hari Bathini, Thomas Gleixner, Greg Kurz,
	Nicholas Piggin, Masahiro Yamada, Alexey Kardashevskiy,
	linux-kernel, linuxppc-dev, linux-nvdimm

On Wed, 2020-03-04 at 12:06 +0100, Frederic Barrat wrote:
> 
> Le 28/02/2020 à 07:12, Andrew Donnellan a écrit :
> > On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > > From: Alastair D'Silva <alastair@d-silva.org>
> > > 
> > > When health & performance data is requested from the controller,
> > > it responds with an error log containing the requested
> > > information.
> > > 
> > > This patch allows the request to be issued via an IOCTL.
> > 
> > A better explanation would be good - this IOCTL triggers a request
> > to 
> > the controller to collect controller health/perf data, and the 
> > controller will later respond with an error log that can be picked
> > up 
> > via the error log IOCTL that you've defined earlier.
> 
> And even more precisely (to also check my understanding):
> 
>  > this IOCTL triggers a request to
>  > the controller to collect controller health/perf data, and the
>  > controller will later respond
> 
> by raising an interrupt to let the user app know that
> 
>  > an error log that can be picked up
>  > via the error log IOCTL that you've defined earlier.
> 
> 
> The rest of the patch looks ok to me.
> 
>    Fred

Ok

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

* Re: [PATCH v3 19/27] powerpc/powernv/pmem: Add an IOCTL to report controller statistics
  2020-03-04  9:25   ` Frederic Barrat
@ 2020-03-12  0:15     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-03-12  0:15 UTC (permalink / raw)
  To: Frederic Barrat
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Wed, 2020-03-04 at 10:25 +0100, Frederic Barrat wrote:
> 
> Le 21/02/2020 à 04:27, Alastair D'Silva a écrit :
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > The controller can report a number of statistics that are useful
> > in evaluating the performance and reliability of the card.
> > 
> > This patch exposes this information via an IOCTL.
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   arch/powerpc/platforms/powernv/pmem/ocxl.c | 185
> > +++++++++++++++++++++
> >   include/uapi/nvdimm/ocxl-pmem.h            |  17 ++
> >   2 files changed, 202 insertions(+)
> > 
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > index 2cabafe1fc58..009d4fd29e7d 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > @@ -758,6 +758,186 @@ static int
> > ioctl_controller_dump_complete(struct ocxlpmem *ocxlpmem)
> >   				    GLOBAL_MMIO_HCI_CONTROLLER_DUMP_COL
> > LECTED);
> >   }
> >   
> > +/**
> > + * controller_stats_header_parse() - Parse the first 64 bits of
> > the controller stats admin command response
> > + * @ocxlpmem: the device metadata
> > + * @length: out, returns the number of bytes in the response
> > (excluding the 64 bit header)
> > + */
> > +static int controller_stats_header_parse(struct ocxlpmem
> > *ocxlpmem,
> > +	u32 *length)
> > +{
> > +	int rc;
> > +	u64 val;
> > +
> 
> unexpected empty line
> 

Ok

> 
> > +	u16 data_identifier;
> > +	u32 data_length;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem-
> > >admin_command.data_offset,
> > +				     OCXL_LITTLE_ENDIAN, &val);
> > +	if (rc)
> > +		return rc;
> > +
> > +	data_identifier = val >> 48;
> > +	data_length = val & 0xFFFFFFFF;
> > +
> > +	if (data_identifier != 0x4353) { // 'CS'
> > +		dev_err(&ocxlpmem->dev,
> > +			"Bad data identifier for controller stats,
> > expected 'CS', got '%-.*s'\n",
> > +			2, (char *)&data_identifier);
> 
> 
> Wow, I'm clueless what that string format looks like :-)
> 2 arguments? Did you check the kernel string formatter does what you
> want?
> You may consider unifying the format though, the error log patch uses
> a 
> simpler (better?) format for a similar message.
> 

Sorry, force of habit from my old job where we dealt with a lot of
variable length, non-NULL terminated buffers. FYI - it takes the string
length from the first argument.

I'll change it to a fixed length string like the others :)
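
For readers unfamiliar with the format discussed above, a minimal userspace
sketch (standard C printf semantics, not the driver code): the '*' in "%.*s"
takes the precision from the int argument that precedes the string pointer, so
at most that many bytes are read from a buffer that is not NUL-terminated.
The function name format_identifier() is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format up to len bytes from buf, which need not be NUL-terminated.
 * The '*' in "%.*s" takes the precision from the int argument that
 * precedes the string pointer.
 */
static int format_identifier(char *out, size_t out_size,
			     const char *buf, int len)
{
	assert(len >= 0);
	return snprintf(out, out_size, "got '%.*s'", len, buf);
}
```

The '-' flag in the original "%-.*s" only requests left justification, which
has no visible effect when no field width is given.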

> 
> 
> > +		return -EINVAL;
> > +	}
> > +
> > +	*length = data_length;
> > +	return 0;
> > +}
> > +
> > +static int ioctl_controller_stats(struct ocxlpmem *ocxlpmem,
> > +				  struct
> > ioctl_ocxl_pmem_controller_stats __user *uarg)
> > +{
> > +	struct ioctl_ocxl_pmem_controller_stats args;
> > +	u32 length;
> > +	int rc;
> > +	u64 val;
> > +
> > +	memset(&args, '\0', sizeof(args));
> > +
> > +	mutex_lock(&ocxlpmem->admin_command.lock);
> > +
> > +	rc = admin_command_request(ocxlpmem,
> > ADMIN_COMMAND_CONTROLLER_STATS);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > +				      ocxlpmem-
> > >admin_command.request_offset + 0x08,
> > +				      OCXL_LITTLE_ENDIAN, 0);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = admin_command_execute(ocxlpmem);
> > +	if (rc)
> > +		goto out;
> > +
> > +
> > +	rc = admin_command_complete_timeout(ocxlpmem,
> > +					    ADMIN_COMMAND_CONTROLLER_ST
> > ATS);
> > +	if (rc < 0) {
> > +		dev_warn(&ocxlpmem->dev, "Controller stats timed
> > out\n");
> > +		goto out;
> > +	}
> > +
> > +	rc = admin_response(ocxlpmem);
> > +	if (rc < 0)
> > +		goto out;
> > +	if (rc != STATUS_SUCCESS) {
> > +		warn_status(ocxlpmem,
> > +			    "Unexpected status from controller stats",
> > rc);
> > +		goto out;
> > +	}
> 
> All those ioctls commands follow the same pattern:
> 1. admin_command_request()
> 2. optionally, set some mmio registers specific to the command
> 3. admin_command_execute()
> 4. admin_command_complete_timeout()
> 5. admin_response()
> 
> By swapping 1 and 2, we could then factorize steps 1, 3, 4 and 5 in
> a 
> function and simplify/shorten the code each time a command is called.
> 
> Regarding step 2 (and that's true for all similar patches), a
> comment 
> about what the mmio tuning does would help and avoid looking up the 
> spec. Looking up the spec during the review is expected, but it will 
> ease reading the code 6 months from now.
> 
> 

I'll rework this and add a wrapper in the Admin Commands patch.
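
A rough sketch of what such a wrapper could look like, using mock steps rather
than the real driver helpers. Everything named mock_* is hypothetical, and the
sketch assumes (as the review proposes) that command-specific MMIO setup can
be done by the caller before the request step.

```c
#include <errno.h>

/* Mock device: records how many steps ran and can fail at one step. */
struct mock_dev {
	int fail_step;	/* 1..4 selects the step that fails, 0 for none */
	int steps_run;
};

static int mock_step(struct mock_dev *dev, int step)
{
	dev->steps_run++;
	return dev->fail_step == step ? -EIO : 0;
}

/*
 * Hypothetical wrapper: request, execute, wait for completion, read the
 * response. Command-specific MMIO setup is assumed to have been done by
 * the caller before this is called.
 */
static int mock_admin_command(struct mock_dev *dev)
{
	int rc;

	rc = mock_step(dev, 1);	/* stands in for admin_command_request() */
	if (rc)
		return rc;

	rc = mock_step(dev, 2);	/* admin_command_execute() */
	if (rc)
		return rc;

	rc = mock_step(dev, 3);	/* admin_command_complete_timeout() */
	if (rc < 0)
		return rc;

	return mock_step(dev, 4);	/* admin_response() */
}
```

Each ioctl would then shrink to its own register setup, one call to the
wrapper, and the parsing of the response.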

> 
> > +
> > +	rc = controller_stats_header_parse(ocxlpmem, &length);
> > +	if (rc)
> > +		goto out;
> > +
> > +	if (length != 0x140)
> > +		warn_status(ocxlpmem,
> > +			    "Unexpected length for controller stats
> > data, expected 0x140, got 0x%x",
> > +			    length);
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem-
> > >admin_command.data_offset + 0x08 + 0x08,
> > +				     OCXL_LITTLE_ENDIAN, &val);
> > +	if (rc)
> > +		goto out;
> > +
> > +	args.reset_count = val >> 32;
> > +	args.reset_uptime = val & 0xFFFFFFFF;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem-
> > >admin_command.data_offset + 0x08 + 0x10,
> > +				     OCXL_LITTLE_ENDIAN, &val);
> > +	if (rc)
> > +		goto out;
> > +
> > +	args.power_on_uptime = val >> 32;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem-
> > >admin_command.data_offset + 0x08 + 0x40 + 0x08,
> > +				     OCXL_LITTLE_ENDIAN,
> &args.host_load_count);
> 
> 
> Those offsets are hard to understand, even with the spec next to me.
> And 
> it seems that we could harden things a bit:
> each block has a "statistics parameter ID" and the length of the data
> for 
> that block. We should check that and make sure we're reading what we
> expect.
> For example, from the spec I'm looking at (110d), I would expect the
> host 
> load count to be at offset 0x10. It's entirely possible I'm
> misreading 
> it though.
> 

I'll rework this too.
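
One possible shape for the hardening asked for above, as a sketch only. The
per-block header layout is an assumption: it mirrors the top-level header
parse in the patch (identifier in bits 63..48, length in bits 31..0) and has
not been checked against the spec. check_block_header() is a hypothetical
name.

```c
#include <stdint.h>

/*
 * Check a 64-bit block header against the expected identifier and length.
 * Assumed layout, mirroring the top-level header parse in the patch:
 * identifier in bits 63..48, length in bits 31..0.
 * Returns 0 on match, -1 otherwise (the errno choice is left open).
 */
static int check_block_header(uint64_t header, uint16_t expected_id,
			      uint32_t expected_len)
{
	uint16_t id = header >> 48;
	uint32_t len = header & 0xFFFFFFFF;

	if (id != expected_id)
		return -1;
	if (len != expected_len)
		return -1;
	return 0;
}
```

Calling a check like this before reading each block would turn a layout
mismatch into an error instead of silently reporting the wrong counters.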

> 
> 
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem-
> > >admin_command.data_offset + 0x08 + 0x40 + 0x10,
> > +				     OCXL_LITTLE_ENDIAN,
> > &args.host_store_count);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem-
> > >admin_command.data_offset + 0x08 + 0x40 + 0x18,
> > +				     OCXL_LITTLE_ENDIAN,
> > &args.media_read_count);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem-
> > >admin_command.data_offset + 0x08 + 0x40 + 0x20,
> > +				     OCXL_LITTLE_ENDIAN,
> > &args.media_write_count);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem-
> > >admin_command.data_offset + 0x08 + 0x40 + 0x28,
> > +				     OCXL_LITTLE_ENDIAN,
> > &args.cache_hit_count);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem-
> > >admin_command.data_offset + 0x08 + 0x40 + 0x30,
> > +				     OCXL_LITTLE_ENDIAN,
> > &args.cache_miss_count);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem-
> > >admin_command.data_offset + 0x08 + 0x40 + 0x38,
> > +				     OCXL_LITTLE_ENDIAN,
> > &args.media_read_latency);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem-
> > >admin_command.data_offset + 0x08 + 0x40 + 0x40,
> > +				     OCXL_LITTLE_ENDIAN,
> > &args.media_write_latency);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem-
> > >admin_command.data_offset + 0x08 + 0x40 + 0x48,
> > +				     OCXL_LITTLE_ENDIAN,
> > &args.cache_read_latency);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem-
> > >admin_command.data_offset + 0x08 + 0x40 + 0x50,
> > +				     OCXL_LITTLE_ENDIAN,
> > &args.cache_write_latency);
> > +	if (rc)
> > +		goto out;
> > +
> > +	if (copy_to_user(uarg, &args, sizeof(args))) {
> > +		rc = -EFAULT;
> > +		goto out;
> > +	}
> > +
> > +	rc = admin_response_handled(ocxlpmem);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = 0;
> > +	goto out;
> 
> That may be more of a personal habit, but that final goto disrupts
> the 
> "good case" flow. And I think it's pretty unusual within the kernel.
> 

Ok

> 
> > +
> > +out:
> > +	mutex_unlock(&ocxlpmem->admin_command.lock);
> > +	return rc;
> > +}
> > +
> >   static long file_ioctl(struct file *file, unsigned int cmd,
> > unsigned long args)
> >   {
> >   	struct ocxlpmem *ocxlpmem = file->private_data;
> > @@ -781,6 +961,11 @@ static long file_ioctl(struct file *file,
> > unsigned int cmd, unsigned long args)
> >   	case IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE:
> >   		rc = ioctl_controller_dump_complete(ocxlpmem);
> >   		break;
> > +
> > +	case IOCTL_OCXL_PMEM_CONTROLLER_STATS:
> > +		rc = ioctl_controller_stats(ocxlpmem,
> > +					    (struct
> > ioctl_ocxl_pmem_controller_stats __user *)args);
> > +		break;
> >   	}
> >   
> >   	return rc;
> > diff --git a/include/uapi/nvdimm/ocxl-pmem.h
> > b/include/uapi/nvdimm/ocxl-pmem.h
> > index d4d8512d03f7..add223aa2fdb 100644
> > --- a/include/uapi/nvdimm/ocxl-pmem.h
> > +++ b/include/uapi/nvdimm/ocxl-pmem.h
> > @@ -50,6 +50,22 @@ struct ioctl_ocxl_pmem_controller_dump_data {
> >   	__u64 reserved[8];
> >   };
> >   
> > +struct ioctl_ocxl_pmem_controller_stats {
> > +	__u32 reset_count;
> > +	__u32 reset_uptime; /* seconds */
> > +	__u32 power_on_uptime; /* seconds */
> 
> Same as before, we're going to have some padding here.
> 
>    Fred
> 
Ok
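
To make the padding remark concrete, a small sketch using stdint types in
place of the __u32/__u64 kernel types. Three 32-bit fields followed by a
64-bit field leave a 4-byte hole on ABIs that align 64-bit integers to 8
bytes; an explicit reserved field makes the layout the same everywhere. The
struct name and the reserved field are assumptions for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout with the hole filled by an explicit reserved field. */
struct stats_explicit {
	uint32_t reset_count;
	uint32_t reset_uptime;		/* seconds */
	uint32_t power_on_uptime;	/* seconds */
	uint32_t reserved;		/* explicit padding, must be zero */
	uint64_t host_load_count;
};

/* The 64-bit field lands at offset 16 whatever the alignment rules are. */
_Static_assert(offsetof(struct stats_explicit, host_load_count) == 16,
	       "host_load_count must start at offset 16");
_Static_assert(sizeof(struct stats_explicit) == 24,
	       "struct must have no implicit padding");
```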

> 
> > +	__u64 host_load_count;
> > +	__u64 host_store_count;
> > +	__u64 media_read_count;
> > +	__u64 media_write_count;
> > +	__u64 cache_hit_count;
> > +	__u64 cache_miss_count;
> > +	__u64 media_read_latency; /* nanoseconds */
> > +	__u64 media_write_latency; /* nanoseconds */
> > +	__u64 cache_read_latency; /* nanoseconds */
> > +	__u64 cache_write_latency; /* nanoseconds */
> > +};
> > +
> >   /* ioctl numbers */
> >   #define OCXL_PMEM_MAGIC 0x5C
> >   /* SCM devices */
> > @@ -57,5 +73,6 @@ struct ioctl_ocxl_pmem_controller_dump_data {
> >   #define IOCTL_OCXL_PMEM_CONTROLLER_DUMP			_IO(OCX
> > L_PMEM_MAGIC, 0x02)
> >   #define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_DATA		_IOWR(O
> > CXL_PMEM_MAGIC, 0x03, struct ioctl_ocxl_pmem_controller_dump_data)
> >   #define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE	_IO(OCXL_PMEM_M
> > AGIC, 0x04)
> > +#define IOCTL_OCXL_PMEM_CONTROLLER_STATS		_IO(OCXL_PMEM_M
> > AGIC, 0x05)
> >   
> >   #endif /* _UAPI_OCXL_SCM_H */
> > 
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

* Re: [PATCH v3 23/27] powerpc/powernv/pmem: Add debug IOCTLs
  2020-03-04 15:21   ` Frederic Barrat
@ 2020-03-12  4:24     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-03-12  4:24 UTC (permalink / raw)
  To: Frederic Barrat
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Wed, 2020-03-04 at 16:21 +0100, Frederic Barrat wrote:
> 
> Le 21/02/2020 à 04:27, Alastair D'Silva a écrit :
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > These IOCTLs provide low level access to the card to aid in
> > debugging
> > controller/FPGA firmware.
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   arch/powerpc/platforms/powernv/pmem/Kconfig |   6 +
> >   arch/powerpc/platforms/powernv/pmem/ocxl.c  | 249
> > ++++++++++++++++++++
> >   include/uapi/nvdimm/ocxl-pmem.h             |  32 +++
> >   3 files changed, 287 insertions(+)
> > 
> > diff --git a/arch/powerpc/platforms/powernv/pmem/Kconfig
> > b/arch/powerpc/platforms/powernv/pmem/Kconfig
> > index c5d927520920..3f44429d70c9 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/Kconfig
> > +++ b/arch/powerpc/platforms/powernv/pmem/Kconfig
> > @@ -12,4 +12,10 @@ config OCXL_PMEM
> >   
> >   	  Select N if unsure.
> >   
> > +config OCXL_PMEM_DEBUG
> > +	bool "OpenCAPI Persistent Memory debugging"
> > +	depends on OCXL_PMEM
> > +	help
> > +	  Enables low level IOCTLs for OpenCAPI Persistent Memory
> > firmware development
> > +
> >   endif
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > index e01f6f9fc180..d4ce5e9e0521 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > @@ -1050,6 +1050,235 @@ int req_controller_health_perf(struct
> > ocxlpmem *ocxlpmem)
> >   				      GLOBAL_MMIO_HCI_REQ_HEALTH_PERF);
> >   }
> >   
> > +#ifdef CONFIG_OCXL_PMEM_DEBUG
> > +/**
> > + * enable_fwdebug() - Enable FW debug on the controller
> > + * @ocxlpmem: the device metadata
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int enable_fwdebug(const struct ocxlpmem *ocxlpmem)
> > +{
> > +	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_HCI,
> > +				      OCXL_LITTLE_ENDIAN,
> > +				      GLOBAL_MMIO_HCI_FW_DEBUG);
> > +}
> > +
> > +/**
> > + * disable_fwdebug() - Disable FW debug on the controller
> > + * @ocxlpmem: the device metadata
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int disable_fwdebug(const struct ocxlpmem *ocxlpmem)
> > +{
> > +	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu,
> > GLOBAL_MMIO_HCIC,
> > +				      OCXL_LITTLE_ENDIAN,
> > +				      GLOBAL_MMIO_HCI_FW_DEBUG);
> > +}
> > +
> > +static int ioctl_fwdebug(struct ocxlpmem *ocxlpmem,
> > +			     struct ioctl_ocxl_pmem_fwdebug __user
> > *uarg)
> > +{
> > +	struct ioctl_ocxl_pmem_fwdebug args;
> > +	u64 val;
> > +	int i;
> > +	int rc;
> > +
> > +	if (copy_from_user(&args, uarg, sizeof(args)))
> > +		return -EFAULT;
> > +
> > +	// Buffer size must be a multiple of 8
> > +	if ((args.buf_size & 0x07))
> > +		return -EINVAL;
> > +
> > +	if (args.buf_size > ocxlpmem->admin_command.data_size)
> > +		return -EINVAL;
> > +
> > +	mutex_lock(&ocxlpmem->admin_command.lock);
> > +
> > +	rc = enable_fwdebug(ocxlpmem);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_FW_DEBUG);
> > +	if (rc)
> > +		goto out;
> > +
> > +	// Write DebugAction & FunctionCode
> > +	val = ((u64)args.debug_action << 56) | ((u64)args.function_code
> > << 40);
> > +
> > +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > +				      ocxlpmem-
> > >admin_command.request_offset + 0x08,
> > +				      OCXL_LITTLE_ENDIAN, val);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > +				      ocxlpmem-
> > >admin_command.request_offset + 0x10,
> > +				      OCXL_LITTLE_ENDIAN,
> > args.debug_parameter_1);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > +				      ocxlpmem-
> > >admin_command.request_offset + 0x18,
> > +				      OCXL_LITTLE_ENDIAN,
> > args.debug_parameter_2);
> > +	if (rc)
> > +		goto out;
> > +
> > +	for (i = 0x20; i < 0x38; i += 0x08)
> > +		rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > +					      ocxlpmem-
> > >admin_command.request_offset + i,
> > +					      OCXL_LITTLE_ENDIAN, 0);
> > +	if (rc)
> > +		goto out;
> 
> The rc assignment is in the for loop body; the rc test is not.
> 
Whoops :)
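
A sketch of the braces fix, with a mock register write standing in for
ocxl_global_mmio_write64(). The mock_* names and zero_request_words() are
hypothetical; the loop bounds are the ones from the patch.

```c
#include <stdint.h>

struct mock_regs {
	unsigned int fail_at;	/* offset whose write fails, 0 for none */
	int writes;		/* number of writes attempted */
};

static int mock_write64(struct mock_regs *regs, unsigned int offset,
			uint64_t val)
{
	(void)val;
	regs->writes++;
	return offset == regs->fail_at ? -1 : 0;
}

/* Zero request words 0x20..0x30, stopping at the first failed write. */
static int zero_request_words(struct mock_regs *regs)
{
	unsigned int i;
	int rc;

	for (i = 0x20; i < 0x38; i += 0x08) {
		rc = mock_write64(regs, i, 0);
		if (rc)
			return rc;
	}

	return 0;
}
```

Without the braces only the result of the last write is tested, so a failure
on an earlier iteration goes unnoticed.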

> 
> > +
> > +
> > +	// Populate admin command buffer
> > +	if (args.buf_size) {
> > +		for (i = 0; i < args.buf_size; i += sizeof(u64)) {
> > +			u64 val;
> > +
> > +			if (copy_from_user(&val, &args.buf[i],
> > sizeof(u64)))
> > +				return -EFAULT;
> 
> need to get rc and goto out because of the mutex
> 
Ok
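
The same point as a self-contained sketch, with a mock lock in place of the
kernel mutex and a flag in place of a failing copy_from_user(). All names
here are hypothetical.

```c
#include <errno.h>

struct mock_mutex {
	int held;
};

static void mock_mutex_lock(struct mock_mutex *m)
{
	m->held = 1;
}

static void mock_mutex_unlock(struct mock_mutex *m)
{
	m->held = 0;
}

/* copy_fails simulates copy_from_user() returning nonzero. */
static int populate_buffer(struct mock_mutex *lock, int copy_fails)
{
	int rc = 0;

	mock_mutex_lock(lock);

	if (copy_fails) {
		rc = -EFAULT;
		goto out;	/* not "return": the lock is still held */
	}

	/* ... MMIO writes would go here ... */

out:
	mock_mutex_unlock(lock);
	return rc;
}
```

The success path falls through into the out label, so no trailing goto is
needed either.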

> 
> > +
> > +			rc = ocxl_global_mmio_write64(ocxlpmem-
> > >ocxl_afu,
> > +						      ocxlpmem-
> > >admin_command.data_offset + i,
> > +						      OCXL_HOST_ENDIAN,
> > val);
> > +			if (rc)
> > +				goto out;
> > +		}
> > +	}
> > +
> > +	rc = admin_command_execute(ocxlpmem);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = admin_command_complete_timeout(ocxlpmem,
> > +					    ocxlpmem-
> > >timeouts[ADMIN_COMMAND_FW_DEBUG]);
> > +	if (rc < 0)
> > +		goto out;
> > +
> > +	rc = admin_response(ocxlpmem);
> > +	if (rc < 0)
> > +		goto out;
> > +	if (rc != STATUS_SUCCESS) {
> > +		warn_status(ocxlpmem, "Unexpected status from FW
> > Debug", rc);
> > +		goto out;
> > +	}
> > +
> > +	if (args.buf_size) {
> > +		for (i = 0; i < args.buf_size; i += sizeof(u64)) {
> > +			u64 val;
> > +
> > +			rc = ocxl_global_mmio_read64(ocxlpmem-
> > >ocxl_afu,
> > +						     ocxlpmem-
> > >admin_command.data_offset + i,
> > +						     OCXL_HOST_ENDIAN,
> > &val);
> > +			if (rc)
> > +				goto out;
> > +
> > +			if (copy_to_user(&args.buf[i], &val,
> > sizeof(u64))) {
> > +				rc = -EFAULT;
> > +				goto out;
> > +			}
> > +		}
> > +	}
> > +
> > +	rc = admin_response_handled(ocxlpmem);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = disable_fwdebug(ocxlpmem);
> > +	if (rc)
> > +		goto out;
> > +
> > +out:
> > +	mutex_unlock(&ocxlpmem->admin_command.lock);
> > +	return rc;
> > +}
> > +
> > +static int ioctl_shutdown(struct ocxlpmem *ocxlpmem)
> > +{
> > +	int rc;
> > +
> > +	mutex_lock(&ocxlpmem->admin_command.lock);
> > +
> > +	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_SHUTDOWN);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = admin_command_execute(ocxlpmem);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = admin_command_complete_timeout(ocxlpmem,
> > ADMIN_COMMAND_SHUTDOWN);
> > +	if (rc < 0) {
> > +		dev_warn(&ocxlpmem->dev, "Shutdown timed out\n");
> > +		goto out;
> > +	}
> > +
> > +	rc = 0;
> > +	goto out;
> 
> We can remove that goto.

Ok

> 
> No admin_response_handled()? Is that shutting down the full adapter
> and 
> we have nobody to talk to? What happens next?
> 

That's an oversight, we should call admin_response_handled().

> 
> > +
> > +out:
> > +	mutex_unlock(&ocxlpmem->admin_command.lock);
> > +	return rc;
> > +}
> > +
> > +static int ioctl_mmio_write(struct ocxlpmem *ocxlpmem,
> > +				struct ioctl_ocxl_pmem_mmio __user
> > *uarg)
> > +{
> > +	struct scm_ioctl_mmio args;
> > +
> > +	if (copy_from_user(&args, uarg, sizeof(args)))
> > +		return -EFAULT;
> > +
> > +	return ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > args.address,
> > +					OCXL_LITTLE_ENDIAN, args.val);
> > +}
> > +
> > +static int ioctl_mmio_read(struct ocxlpmem *ocxlpmem,
> > +				     struct ioctl_ocxl_pmem_mmio __user
> > *uarg)
> > +{
> > +	struct ioctl_ocxl_pmem_mmio args;
> > +	int rc;
> > +
> > +	if (copy_from_user(&args, uarg, sizeof(args)))
> > +		return -EFAULT;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu, args.address,
> > +				     OCXL_LITTLE_ENDIAN, &args.val);
> > +	if (rc)
> > +		return rc;
> > +
> > +	if (copy_to_user(uarg, &args, sizeof(args)))
> > +		return -EFAULT;
> > +
> > +	return 0;
> > +}
> > +#else /* CONFIG_OCXL_PMEM_DEBUG */
> > +static int ioctl_fwdebug(struct ocxlpmem *ocxlpmem,
> > +			     struct ioctl_ocxl_pmem_fwdebug __user
> > *uarg)
> > +{
> > +	return -EPERM;
> > +}
> > +
> > +static int ioctl_shutdown(struct ocxlpmem *ocxlpmem)
> > +{
> > +	return -EPERM;
> > +}
> > +
> > +static int ioctl_mmio_write(struct ocxlpmem *ocxlpmem,
> > +				struct ioctl_ocxl_pmem_mmio __user
> > *uarg)
> > +{
> > +	return -EPERM;
> > +}
> > +
> > +static int ioctl_mmio_read(struct ocxlpmem *ocxlpmem,
> > +			       struct ioctl_ocxl_pmem_mmio __user
> > *uarg)
> > +{
> > +	return -EPERM;
> > +}
> 
> The 'else' clause could be dropped, the ioctls will return EINVAL,
> which 
> is fine, I think.
> 
> 

Ok

> 
> > +#endif /* CONFIG_OCXL_PMEM_DEBUG */
> > +
> >   static long file_ioctl(struct file *file, unsigned int cmd,
> > unsigned long args)
> >   {
> >   	struct ocxlpmem *ocxlpmem = file->private_data;
> > @@ -1091,6 +1320,26 @@ static long file_ioctl(struct file *file,
> > unsigned int cmd, unsigned long args)
> >   	case IOCTL_OCXL_PMEM_REQUEST_HEALTH:
> >   		rc = req_controller_health_perf(ocxlpmem);
> >   		break;
> > +
> > +	case IOCTL_OCXL_PMEM_FWDEBUG:
> > +		rc = ioctl_fwdebug(ocxlpmem,
> > +				   (struct ioctl_ocxl_pmem_fwdebug
> > __user *)args);
> > +		break;
> > +
> > +	case IOCTL_OCXL_PMEM_SHUTDOWN:
> > +		rc = ioctl_shutdown(ocxlpmem);
> > +		break;
> > +
> > +	case IOCTL_OCXL_PMEM_MMIO_WRITE:
> > +		rc = ioctl_mmio_write(ocxlpmem,
> > +				      (struct ioctl_ocxl_pmem_mmio
> > __user *)args);
> > +		break;
> > +
> > +	case IOCTL_OCXL_PMEM_MMIO_READ:
> > +		rc = ioctl_mmio_read(ocxlpmem,
> > +				     (struct ioctl_ocxl_pmem_mmio
> > __user *)args);
> > +		break;
> > +
> >   	}
> >   
> >   	return rc;
> > diff --git a/include/uapi/nvdimm/ocxl-pmem.h
> > b/include/uapi/nvdimm/ocxl-pmem.h
> > index 0d03abb44001..e20a4f8be82a 100644
> > --- a/include/uapi/nvdimm/ocxl-pmem.h
> > +++ b/include/uapi/nvdimm/ocxl-pmem.h
> > @@ -6,6 +6,28 @@
> >   #include <linux/types.h>
> >   #include <linux/ioctl.h>
> >   
> > +enum ocxlpmem_fwdebug_action {
> > +	OCXL_PMEM_FWDEBUG_READ_CONTROLLER_MEMORY = 0x01,
> > +	OCXL_PMEM_FWDEBUG_WRITE_CONTROLLER_MEMORY = 0x02,
> > +	OCXL_PMEM_FWDEBUG_ENABLE_FUNCTION = 0x03,
> > +	OCXL_PMEM_FWDEBUG_DISABLE_FUNCTION = 0x04,
> > +	OCXL_PMEM_FWDEBUG_GET_PEL = 0x05, // Retrieve Persistent Error
> > Log
> > +};
> > +
> > +struct ioctl_ocxl_pmem_buffer_info {
> > +	__u32	admin_command_buffer_size; // out
> > +	__u32	near_storage_buffer_size; // out
> > +};
> > +
> > +struct ioctl_ocxl_pmem_fwdebug { // All args are inputs
> > +	enum ocxlpmem_fwdebug_action debug_action;
> 
> More kernel ABI problems. My interpretation of the "enumeration 
> specifiers" section of C99 is that we can't rely on the size of the
> enum.
> 

Ok
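
One way to avoid depending on the size of an enum, sketched with stdint types
in place of the kernel __u types. The struct name, the choice of a 32-bit
field and the validation helper are assumptions; the action range 0x01..0x05
comes from the enum in the patch. The buf pointer is left out of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
	FWDEBUG_ACTION_MIN = 0x01,	/* read controller memory */
	FWDEBUG_ACTION_MAX = 0x05,	/* retrieve persistent error log */
};

struct fwdebug_args_sketch {
	uint32_t debug_action;	/* fixed width instead of an enum */
	uint16_t function_code;
	uint16_t buf_size;
	uint64_t debug_parameter_1;
	uint64_t debug_parameter_2;
};

/* With a plain integer field, the range check has to be explicit. */
static bool fwdebug_action_valid(uint32_t action)
{
	return action >= FWDEBUG_ACTION_MIN && action <= FWDEBUG_ACTION_MAX;
}
```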

> 
> > +	__u16 function_code;
> > +	__u16 buf_size; // Size of optional data buffer
> > +	__u64 debug_parameter_1;
> > +	__u64 debug_parameter_2;
> > +	__u8 *buf; // Pointer to optional in/out data buffer
> > +};
> > +
> >   #define OCXL_PMEM_ERROR_LOG_ACTION_RESET	(1 << (32-32))
> >   #define OCXL_PMEM_ERROR_LOG_ACTION_CHKFW	(1 << (53-32))
> >   #define OCXL_PMEM_ERROR_LOG_ACTION_REPLACE	(1 << (54-32))
> > @@ -66,6 +88,11 @@ struct ioctl_ocxl_pmem_controller_stats {
> >   	__u64 cache_write_latency; /* nanoseconds */
> >   };
> >   
> > +struct ioctl_ocxl_pmem_mmio {
> > +	__u64 address; /* Offset in global MMIO space */
> > +	__u64 val; /* value to write/was read */
> > +};
> 
> Can we group all the debug data structures together in the header
> file, 
> with a comment indicating that they may not be available in the
> kernel, 
> depending on the config?
> 

Ok

>    Fred
> 
> 
> > +
> >   struct ioctl_ocxl_pmem_eventfd {
> >   	__s32 eventfd;
> >   	__u32 reserved;
> > @@ -92,4 +119,9 @@ struct ioctl_ocxl_pmem_eventfd {
> >   #define IOCTL_OCXL_PMEM_EVENT_CHECK			_IOR(OC
> > XL_PMEM_MAGIC, 0x07, __u64)
> >   #define IOCTL_OCXL_PMEM_REQUEST_HEALTH			_IO(OCX
> > L_PMEM_MAGIC, 0x08)
> >   
> > +#define IOCTL_OCXL_PMEM_FWDEBUG		_IOWR(OCXL_PMEM_MAGIC,
> > 0xf0, struct ioctl_ocxl_pmem_fwdebug)
> > +#define IOCTL_OCXL_PMEM_MMIO_WRITE	_IOW(OCXL_PMEM_MAGIC, 0xf1,
> > struct ioctl_ocxl_pmem_mmio)
> > +#define IOCTL_OCXL_PMEM_MMIO_READ	_IOWR(OCXL_PMEM_MAGIC, 0xf2,
> > struct ioctl_ocxl_pmem_mmio)
> > +#define IOCTL_OCXL_PMEM_SHUTDOWN	_IO(OCXL_PMEM_MAGIC, 0xf3)
> > +
> >   #endif /* _UAPI_OCXL_SCM_H */
> > 
-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819

* Re: [PATCH v3 19/27] powerpc/powernv/pmem: Add an IOCTL to report controller statistics
  2020-03-05  0:46   ` Andrew Donnellan
@ 2020-03-12  4:47     ` Alastair D'Silva
  0 siblings, 0 replies; 130+ messages in thread
From: Alastair D'Silva @ 2020-03-12  4:47 UTC (permalink / raw)
  To: Andrew Donnellan
  Cc: Aneesh Kumar K . V, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Frederic Barrat, Arnd Bergmann,
	Greg Kroah-Hartman, Andrew Morton, Mauro Carvalho Chehab,
	David S. Miller, Rob Herring, Anton Blanchard,
	Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan,
	Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
	Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
	Alexey Kardashevskiy, linux-kernel

On Thu, 2020-03-05 at 11:46 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > The controller can report a number of statistics that are useful
> > in evaluating the performance and reliability of the card.
> > 
> > This patch exposes this information via an IOCTL.
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >   arch/powerpc/platforms/powernv/pmem/ocxl.c | 185
> > +++++++++++++++++++++
> >   include/uapi/nvdimm/ocxl-pmem.h            |  17 ++
> >   2 files changed, 202 insertions(+)
> > 
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > index 2cabafe1fc58..009d4fd29e7d 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > @@ -758,6 +758,186 @@ static int
> > ioctl_controller_dump_complete(struct ocxlpmem *ocxlpmem)
> >   				    GLOBAL_MMIO_HCI_CONTROLLER_DUMP_COL
> > LECTED);
> >   }
> >   
> > +/**
> > + * controller_stats_header_parse() - Parse the first 64 bits of
> > the controller stats admin command response
> > + * @ocxlpmem: the device metadata
> > + * @length: out, returns the number of bytes in the response
> > (excluding the 64 bit header)
> > + */
> > +static int controller_stats_header_parse(struct ocxlpmem
> > *ocxlpmem,
> > +	u32 *length)
> > +{
> > +	int rc;
> > +	u64 val;
> > +
> > +	u16 data_identifier;
> > +	u32 data_length;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem-
> > >admin_command.data_offset,
> > +				     OCXL_LITTLE_ENDIAN, &val);
> > +	if (rc)
> > +		return rc;
> > +
> > +	data_identifier = val >> 48;
> > +	data_length = val & 0xFFFFFFFF;
> > +
> > +	if (data_identifier != 0x4353) { // 'CS'
> > +		dev_err(&ocxlpmem->dev,
> > +			"Bad data identifier for controller stats,
> > expected 'CS', got '%-.*s'\n",
> > +			2, (char *)&data_identifier);
> > +		return -EINVAL;
> 
> Same comment as earlier patches re EINVAL
> 

I don't think I've seen a comment yet on these particular blocks. Can
you suggest a better return value?

> > +	}
> > +
> > +	*length = data_length;
> > +	return 0;
> > +}
> > +
> > +static int ioctl_controller_stats(struct ocxlpmem *ocxlpmem,
> > +				  struct
> > ioctl_ocxl_pmem_controller_stats __user *uarg)
> > +{
> > +	struct ioctl_ocxl_pmem_controller_stats args;
> > +	u32 length;
> > +	int rc;
> > +	u64 val;
> > +
> > +	memset(&args, '\0', sizeof(args));
> > +
> > +	mutex_lock(&ocxlpmem->admin_command.lock);
> > +
> > +	rc = admin_command_request(ocxlpmem,
> > ADMIN_COMMAND_CONTROLLER_STATS);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > +				      ocxlpmem-
> > >admin_command.request_offset + 0x08,
> > +				      OCXL_LITTLE_ENDIAN, 0);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = admin_command_execute(ocxlpmem);
> > +	if (rc)
> > +		goto out;
> > +
> > +
> > +	rc = admin_command_complete_timeout(ocxlpmem,
> > +					    ADMIN_COMMAND_CONTROLLER_STATS);
> > +	if (rc < 0) {
> > +		dev_warn(&ocxlpmem->dev, "Controller stats timed out\n");
> > +		goto out;
> > +	}
> > +
> > +	rc = admin_response(ocxlpmem);
> > +	if (rc < 0)
> > +		goto out;
> > +	if (rc != STATUS_SUCCESS) {
> > +		warn_status(ocxlpmem,
> > +			    "Unexpected status from controller stats", rc);
> > +		goto out;
> > +	}
> > +
> > +	rc = controller_stats_header_parse(ocxlpmem, &length);
> > +	if (rc)
> > +		goto out;
> > +
> > +	if (length != 0x140)
> > +		warn_status(ocxlpmem,
> > +			    "Unexpected length for controller stats data, expected 0x140, got 0x%x",
> > +			    length);
> 
> Might be worth a comment to explain where 0x140 comes from (it looks 
> correct from my reading of the spec)

Ok

> 
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x08,
> > +				     OCXL_LITTLE_ENDIAN, &val);
> > +	if (rc)
> > +		goto out;
> > +
> > +	args.reset_count = val >> 32;
> > +	args.reset_uptime = val & 0xFFFFFFFF;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x10,
> > +				     OCXL_LITTLE_ENDIAN, &val);
> > +	if (rc)
> > +		goto out;
> > +
> > +	args.power_on_uptime = val >> 32;
> 
> We're not collecting life remaining?
> 

It looks like my implementation is out of date. I'll bring it in line
with the spec.

> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x08,
> > +				     OCXL_LITTLE_ENDIAN, &args.host_load_count);
> 
> My reading of the spec says HLC is at +0x10
> 
Ditto

> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x10,
> > +				     OCXL_LITTLE_ENDIAN, &args.host_store_count);
> 
> HSC at +0x18
> 
Ditto

> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x18,
> > +				     OCXL_LITTLE_ENDIAN, &args.media_read_count);
> 
> MRC is at +0x50
> 
> And you're missing CRU, HLD, HSD
> 
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x20,
> > +				     OCXL_LITTLE_ENDIAN, &args.media_write_count);
> 
> MWC at +0x58
> 
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x28,
> > +				     OCXL_LITTLE_ENDIAN, &args.cache_hit_count);
> 
> CRHC at +0x90
> 
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x30,
> > +				     OCXL_LITTLE_ENDIAN, &args.cache_miss_count);
> 
> This field doesn't seem to exist at all in my copy of the spec
> 
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x38,
> > +				     OCXL_LITTLE_ENDIAN, &args.media_read_latency);
> 
> Nor this one
> 
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x40,
> > +				     OCXL_LITTLE_ENDIAN, &args.media_write_latency);
> 
> Nor this one
> 
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x48,
> > +				     OCXL_LITTLE_ENDIAN, &args.cache_read_latency);
> 
> Nor this one
> 
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +				     ocxlpmem->admin_command.data_offset + 0x08 + 0x40 + 0x50,
> > +				     OCXL_LITTLE_ENDIAN, &args.cache_write_latency);
> 
> Nor this one
> 
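
The offsets called out in this review (HLC +0x10, HSC +0x18, MRC +0x50,
MWC +0x58, CRHC +0x90) could live in one table instead of being repeated per
read. Below is a rough userspace sketch of that idea, not the driver code.
The struct, the table and the read64_fn callback (standing in for the 64-bit
MMIO read) are hypothetical, and what base the offsets are relative to is
left to the caller, since that still needs checking against the spec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the stats; field names follow the patch. */
struct stats {
	uint64_t host_load_count;
	uint64_t host_store_count;
	uint64_t media_read_count;
	uint64_t media_write_count;
	uint64_t cache_hit_count;
};

struct stat_field {
	uint32_t offset;	/* offset as quoted in the review (base assumed) */
	size_t member;		/* offsetof() the destination in struct stats */
};

static const struct stat_field stat_fields[] = {
	{ 0x10, offsetof(struct stats, host_load_count) },
	{ 0x18, offsetof(struct stats, host_store_count) },
	{ 0x50, offsetof(struct stats, media_read_count) },
	{ 0x58, offsetof(struct stats, media_write_count) },
	{ 0x90, offsetof(struct stats, cache_hit_count) },
};

/* Stand-in for the MMIO read; a real driver would call its own helper. */
typedef int (*read64_fn)(void *ctx, uint32_t offset, uint64_t *val);

static int stats_read_all(read64_fn read64, void *ctx, struct stats *out)
{
	size_t i;
	int rc;

	assert(read64 != NULL && out != NULL);

	for (i = 0; i < sizeof(stat_fields) / sizeof(stat_fields[0]); i++) {
		uint64_t *dst = (uint64_t *)((char *)out + stat_fields[i].member);

		rc = read64(ctx, stat_fields[i].offset, dst);
		if (rc)
			return rc;	/* stop at the first failed read */
	}
	return 0;
}

/* Fake backend for illustration: returns the offset itself as the value. */
static int fake_read64(void *ctx, uint32_t offset, uint64_t *val)
{
	(void)ctx;
	*val = offset;
	return 0;
}
```

With a table like this, a spec revision that moves a field becomes a
one-line change.
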
> > +	if (rc)
> > +		goto out;
> > +
> > +	if (copy_to_user(uarg, &args, sizeof(args))) {
> > +		rc = -EFAULT;
> > +		goto out;
> > +	}
> > +
> > +	rc = admin_response_handled(ocxlpmem);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = 0;
> > +	goto out;
> 
> Per Fred this pattern isn't common in the kernel, but perhaps this is
> just personal taste
> 

Ok
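
To make sure we mean the same thing, here is a small userspace sketch of the
single-exit shape without the trailing "rc = 0; goto out;". The fake lock
counter and the step() helper are placeholders, not driver code. After the
last call rc already holds the result, and control falls through to the
label on its own.

```c
#include <assert.h>

static int lock_depth;	/* stand-in for the mutex */

static void fake_lock(void)
{
	lock_depth++;
}

static void fake_unlock(void)
{
	assert(lock_depth > 0);	/* must only unlock what was locked */
	lock_depth--;
}

/* Placeholder for one step of the command sequence. */
static int step(int rc)
{
	return rc;
}

static int do_command(int first_rc, int second_rc)
{
	int rc;

	fake_lock();

	rc = step(first_rc);
	if (rc)
		goto out;

	/* Last step: rc carries the result and we fall through to 'out'. */
	rc = step(second_rc);

out:
	fake_unlock();
	return rc;
}
```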

> > +
> > +out:
> > +	mutex_unlock(&ocxlpmem->admin_command.lock);
> > +	return rc;
> > +}
> > +
> >   static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
> >   {
> >   	struct ocxlpmem *ocxlpmem = file->private_data;
> > @@ -781,6 +961,11 @@ static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
> >   	case IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE:
> >   		rc = ioctl_controller_dump_complete(ocxlpmem);
> >   		break;
> > +
> > +	case IOCTL_OCXL_PMEM_CONTROLLER_STATS:
> > +		rc = ioctl_controller_stats(ocxlpmem,
> > +					    (struct ioctl_ocxl_pmem_controller_stats __user *)args);
> > +		break;
> >   	}
> >   
> >   	return rc;
> > diff --git a/include/uapi/nvdimm/ocxl-pmem.h b/include/uapi/nvdimm/ocxl-pmem.h
> > index d4d8512d03f7..add223aa2fdb 100644
> > --- a/include/uapi/nvdimm/ocxl-pmem.h
> > +++ b/include/uapi/nvdimm/ocxl-pmem.h
> > @@ -50,6 +50,22 @@ struct ioctl_ocxl_pmem_controller_dump_data {
> >   	__u64 reserved[8];
> >   };
> >   
> > +struct ioctl_ocxl_pmem_controller_stats {
> > +	__u32 reset_count;
> > +	__u32 reset_uptime; /* seconds */
> > +	__u32 power_on_uptime; /* seconds */
> > +	__u64 host_load_count;
> > +	__u64 host_store_count;
> > +	__u64 media_read_count;
> > +	__u64 media_write_count;
> > +	__u64 cache_hit_count;
> > +	__u64 cache_miss_count;
> > +	__u64 media_read_latency; /* nanoseconds */
> > +	__u64 media_write_latency; /* nanoseconds */
> > +	__u64 cache_read_latency; /* nanoseconds */
> > +	__u64 cache_write_latency; /* nanoseconds */
> > +};
> > +
> >   /* ioctl numbers */
> >   #define OCXL_PMEM_MAGIC 0x5C
> >   /* SCM devices */
> > @@ -57