linux-cxl.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 0/4] Add DPA->HPA translation to dram & general_media events
@ 2024-04-30 17:28 alison.schofield
  2024-04-30 17:28 ` [PATCH v6 1/4] cxl/trace: Correct DPA field masks for general_media & dram events alison.schofield
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: alison.schofield @ 2024-04-30 17:28 UTC (permalink / raw)
  To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams
  Cc: linux-cxl, Steven Rostedt, Shiyang Ruan

From: Alison Schofield <alison.schofield@intel.com>


Changes in v6:
- No code changes
- Patch 1: Remove wrong comment about rasdaemon not being available (Jonathan)
- Patch 1: Add Fixes tag (Jonathan)
- Patch 3: Remove blank line in tag block (Jonathan)
- Patch 1,4: Add Reviewed-by tags (Jonathan, Ira)
- Link to v5: https://lore.kernel.org/cover.1714435815.git.alison.schofield@intel.com/

Begin Cover Letter:
Add HPA translations to CXL events: cxl_dram and cxl_general_media

Patch 1: 
The DPA mask fix up patch must precede or be part of this set.
It fixes the issue where upper 32 bits of DPA are truncated.

Patches 2 & 3:
Before adding the new support, do some housekeeping and move related
helpers to the region driver because there is no looking up region
related info without CONFIG_CXL_REGION.

Patch 4:
The new functionality is introduced - cxl_dram & cxl_general_media
events. Lookup and log the DPA->HPA translation along with the
region name and region uuid.

Alison Schofield (4):
  cxl/trace: Correct DPA field masks for general_media & dram events
  cxl/region: Move cxl_dpa_to_region() work to the region driver
  cxl/region: Move cxl_trace_hpa() work to the region driver
  cxl/core: Add region info to cxl_general_media and cxl_dram events

 drivers/cxl/core/core.h   |  14 ++++
 drivers/cxl/core/mbox.c   |  36 ++++++++--
 drivers/cxl/core/memdev.c |  44 -------------
 drivers/cxl/core/region.c | 135 ++++++++++++++++++++++++++++++++++++++
 drivers/cxl/core/trace.c  |  91 -------------------------
 drivers/cxl/core/trace.h  |  50 ++++++++++----
 include/linux/cxl-event.h |  10 +++
 7 files changed, 226 insertions(+), 154 deletions(-)


base-commit: 4cece764965020c22cff7665b18a012006359095
-- 
2.37.3


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v6 1/4] cxl/trace: Correct DPA field masks for general_media & dram events
  2024-04-30 17:28 [PATCH v6 0/4] Add DPA->HPA translation to dram & general_media events alison.schofield
@ 2024-04-30 17:28 ` alison.schofield
  2024-04-30 17:28 ` [PATCH v6 2/4] cxl/region: Move cxl_dpa_to_region() work to the region driver alison.schofield
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: alison.schofield @ 2024-04-30 17:28 UTC (permalink / raw)
  To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams
  Cc: linux-cxl, Steven Rostedt, Shiyang Ruan, Jonathan Cameron

From: Alison Schofield <alison.schofield@intel.com>

The length of Physical Address in General Media and DRAM event
records is 64-bit, so the field mask for extracting the DPA should
be 64-bit also, otherwise the trace event reports DPA's with the
upper 32 bits of a DPA address masked off. If users do DPA-to-HPA
translations this could lead to incorrect page retirement decisions.

Use GENMASK_ULL() for CXL_DPA_MASK to get all the DPA address bits.

Tidy up CXL_DPA_FLAGS_MASK by using GENMASK() to only mask the exact
flag bits.

These bits are defined as part of the event record physical address
descriptions of General Media and DRAM events in CXL Spec 3.1
Section 8.2.9.2 Events.

Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
Co-developed-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.CAmeron@huawei.com>
---
 drivers/cxl/core/trace.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index e5f13260fc52..7c5cd069f10c 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -253,8 +253,8 @@ TRACE_EVENT(cxl_generic_event,
  * DRAM Event Record
  * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
  */
-#define CXL_DPA_FLAGS_MASK			0x3F
-#define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
+#define CXL_DPA_FLAGS_MASK			GENMASK(1, 0)
+#define CXL_DPA_MASK				GENMASK_ULL(63, 6)
 
 #define CXL_DPA_VOLATILE			BIT(0)
 #define CXL_DPA_NOT_REPAIRABLE			BIT(1)
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v6 2/4] cxl/region: Move cxl_dpa_to_region() work to the region driver
  2024-04-30 17:28 [PATCH v6 0/4] Add DPA->HPA translation to dram & general_media events alison.schofield
  2024-04-30 17:28 ` [PATCH v6 1/4] cxl/trace: Correct DPA field masks for general_media & dram events alison.schofield
@ 2024-04-30 17:28 ` alison.schofield
  2024-04-30 17:28 ` [PATCH v6 3/4] cxl/region: Move cxl_trace_hpa() " alison.schofield
  2024-04-30 17:28 ` [PATCH v6 4/4] cxl/core: Add region info to cxl_general_media and cxl_dram events alison.schofield
  3 siblings, 0 replies; 8+ messages in thread
From: alison.schofield @ 2024-04-30 17:28 UTC (permalink / raw)
  To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams
  Cc: linux-cxl, Steven Rostedt, Shiyang Ruan, Jonathan Cameron

From: Alison Schofield <alison.schofield@intel.com>

This helper belongs in the region driver as it is only useful
with CONFIG_CXL_REGION. Add a stub in core.h for when the region
driver is not built.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
---
 drivers/cxl/core/core.h   |  7 +++++++
 drivers/cxl/core/memdev.c | 44 ---------------------------------------
 drivers/cxl/core/region.c | 44 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 51 insertions(+), 44 deletions(-)

diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index bc5a95665aa0..87008505f8a9 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -27,7 +27,14 @@ void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled);
 int cxl_region_init(void);
 void cxl_region_exit(void);
 int cxl_get_poison_by_endpoint(struct cxl_port *port);
+struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa);
+
 #else
+static inline
+struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa)
+{
+	return NULL;
+}
 static inline int cxl_get_poison_by_endpoint(struct cxl_port *port)
 {
 	return 0;
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index d4e259f3a7e9..0277726afd04 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -251,50 +251,6 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_trigger_poison_list, CXL);
 
-struct cxl_dpa_to_region_context {
-	struct cxl_region *cxlr;
-	u64 dpa;
-};
-
-static int __cxl_dpa_to_region(struct device *dev, void *arg)
-{
-	struct cxl_dpa_to_region_context *ctx = arg;
-	struct cxl_endpoint_decoder *cxled;
-	u64 dpa = ctx->dpa;
-
-	if (!is_endpoint_decoder(dev))
-		return 0;
-
-	cxled = to_cxl_endpoint_decoder(dev);
-	if (!cxled->dpa_res || !resource_size(cxled->dpa_res))
-		return 0;
-
-	if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start)
-		return 0;
-
-	dev_dbg(dev, "dpa:0x%llx mapped in region:%s\n", dpa,
-		dev_name(&cxled->cxld.region->dev));
-
-	ctx->cxlr = cxled->cxld.region;
-
-	return 1;
-}
-
-static struct cxl_region *cxl_dpa_to_region(struct cxl_memdev *cxlmd, u64 dpa)
-{
-	struct cxl_dpa_to_region_context ctx;
-	struct cxl_port *port;
-
-	ctx = (struct cxl_dpa_to_region_context) {
-		.dpa = dpa,
-	};
-	port = cxlmd->endpoint;
-	if (port && is_cxl_endpoint(port) && cxl_num_decoders_committed(port))
-		device_for_each_child(&port->dev, &ctx, __cxl_dpa_to_region);
-
-	return ctx.cxlr;
-}
-
 static int cxl_validate_poison_dpa(struct cxl_memdev *cxlmd, u64 dpa)
 {
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 5c186e0a39b9..4b227659e3f8 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2679,6 +2679,50 @@ int cxl_get_poison_by_endpoint(struct cxl_port *port)
 	return rc;
 }
 
+struct cxl_dpa_to_region_context {
+	struct cxl_region *cxlr;
+	u64 dpa;
+};
+
+static int __cxl_dpa_to_region(struct device *dev, void *arg)
+{
+	struct cxl_dpa_to_region_context *ctx = arg;
+	struct cxl_endpoint_decoder *cxled;
+	u64 dpa = ctx->dpa;
+
+	if (!is_endpoint_decoder(dev))
+		return 0;
+
+	cxled = to_cxl_endpoint_decoder(dev);
+	if (!cxled->dpa_res || !resource_size(cxled->dpa_res))
+		return 0;
+
+	if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start)
+		return 0;
+
+	dev_dbg(dev, "dpa:0x%llx mapped in region:%s\n", dpa,
+		dev_name(&cxled->cxld.region->dev));
+
+	ctx->cxlr = cxled->cxld.region;
+
+	return 1;
+}
+
+struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa)
+{
+	struct cxl_dpa_to_region_context ctx;
+	struct cxl_port *port;
+
+	ctx = (struct cxl_dpa_to_region_context) {
+		.dpa = dpa,
+	};
+	port = cxlmd->endpoint;
+	if (port && is_cxl_endpoint(port) && cxl_num_decoders_committed(port))
+		device_for_each_child(&port->dev, &ctx, __cxl_dpa_to_region);
+
+	return ctx.cxlr;
+}
+
 static struct lock_class_key cxl_pmem_region_key;
 
 static struct cxl_pmem_region *cxl_pmem_region_alloc(struct cxl_region *cxlr)
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v6 3/4] cxl/region: Move cxl_trace_hpa() work to the region driver
  2024-04-30 17:28 [PATCH v6 0/4] Add DPA->HPA translation to dram & general_media events alison.schofield
  2024-04-30 17:28 ` [PATCH v6 1/4] cxl/trace: Correct DPA field masks for general_media & dram events alison.schofield
  2024-04-30 17:28 ` [PATCH v6 2/4] cxl/region: Move cxl_dpa_to_region() work to the region driver alison.schofield
@ 2024-04-30 17:28 ` alison.schofield
  2024-04-30 17:28 ` [PATCH v6 4/4] cxl/core: Add region info to cxl_general_media and cxl_dram events alison.schofield
  3 siblings, 0 replies; 8+ messages in thread
From: alison.schofield @ 2024-04-30 17:28 UTC (permalink / raw)
  To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams
  Cc: linux-cxl, Steven Rostedt, Shiyang Ruan, Jonathan Cameron

From: Alison Schofield <alison.schofield@intel.com>

This work belongs in the region driver as it is only useful with
CONFIG_CXL_REGION. Add a stub in core.h for when the region driver
is not built.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
---
 drivers/cxl/core/core.h   |  7 +++
 drivers/cxl/core/region.c | 91 +++++++++++++++++++++++++++++++++++++++
 drivers/cxl/core/trace.c  | 91 ---------------------------------------
 drivers/cxl/core/trace.h  |  2 -
 4 files changed, 98 insertions(+), 93 deletions(-)

diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 87008505f8a9..625394486459 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -28,8 +28,15 @@ int cxl_region_init(void);
 void cxl_region_exit(void);
 int cxl_get_poison_by_endpoint(struct cxl_port *port);
 struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa);
+u64 cxl_trace_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
+		  u64 dpa);
 
 #else
+static inline u64
+cxl_trace_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd, u64 dpa)
+{
+	return ULLONG_MAX;
+}
 static inline
 struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa)
 {
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 4b227659e3f8..45eb9c560fd6 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2723,6 +2723,97 @@ struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa)
 	return ctx.cxlr;
 }
 
+static bool cxl_is_hpa_in_range(u64 hpa, struct cxl_region *cxlr, int pos)
+{
+	struct cxl_region_params *p = &cxlr->params;
+	int gran = p->interleave_granularity;
+	int ways = p->interleave_ways;
+	u64 offset;
+
+	/* Is the hpa within this region at all */
+	if (hpa < p->res->start || hpa > p->res->end) {
+		dev_dbg(&cxlr->dev,
+			"Addr trans fail: hpa 0x%llx not in region\n", hpa);
+		return false;
+	}
+
+	/* Is the hpa in an expected chunk for its pos(-ition) */
+	offset = hpa - p->res->start;
+	offset = do_div(offset, gran * ways);
+	if ((offset >= pos * gran) && (offset < (pos + 1) * gran))
+		return true;
+
+	dev_dbg(&cxlr->dev,
+		"Addr trans fail: hpa 0x%llx not in expected chunk\n", hpa);
+
+	return false;
+}
+
+static u64 cxl_dpa_to_hpa(u64 dpa,  struct cxl_region *cxlr,
+			  struct cxl_endpoint_decoder *cxled)
+{
+	u64 dpa_offset, hpa_offset, bits_upper, mask_upper, hpa;
+	struct cxl_region_params *p = &cxlr->params;
+	int pos = cxled->pos;
+	u16 eig = 0;
+	u8 eiw = 0;
+
+	ways_to_eiw(p->interleave_ways, &eiw);
+	granularity_to_eig(p->interleave_granularity, &eig);
+
+	/*
+	 * The device position in the region interleave set was removed
+	 * from the offset at HPA->DPA translation. To reconstruct the
+	 * HPA, place the 'pos' in the offset.
+	 *
+	 * The placement of 'pos' in the HPA is determined by interleave
+	 * ways and granularity and is defined in the CXL Spec 3.0 Section
+	 * 8.2.4.19.13 Implementation Note: Device Decode Logic
+	 */
+
+	/* Remove the dpa base */
+	dpa_offset = dpa - cxl_dpa_resource_start(cxled);
+
+	mask_upper = GENMASK_ULL(51, eig + 8);
+
+	if (eiw < 8) {
+		hpa_offset = (dpa_offset & mask_upper) << eiw;
+		hpa_offset |= pos << (eig + 8);
+	} else {
+		bits_upper = (dpa_offset & mask_upper) >> (eig + 8);
+		bits_upper = bits_upper * 3;
+		hpa_offset = ((bits_upper << (eiw - 8)) + pos) << (eig + 8);
+	}
+
+	/* The lower bits remain unchanged */
+	hpa_offset |= dpa_offset & GENMASK_ULL(eig + 7, 0);
+
+	/* Apply the hpa_offset to the region base address */
+	hpa = hpa_offset + p->res->start;
+
+	if (!cxl_is_hpa_in_range(hpa, cxlr, cxled->pos))
+		return ULLONG_MAX;
+
+	return hpa;
+}
+
+u64 cxl_trace_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
+		  u64 dpa)
+{
+	struct cxl_region_params *p = &cxlr->params;
+	struct cxl_endpoint_decoder *cxled = NULL;
+
+	for (int i = 0; i <  p->nr_targets; i++) {
+		cxled = p->targets[i];
+		if (cxlmd == cxled_to_memdev(cxled))
+			break;
+	}
+	if (!cxled || cxlmd != cxled_to_memdev(cxled))
+		return ULLONG_MAX;
+
+	return cxl_dpa_to_hpa(dpa, cxlr, cxled);
+}
+
 static struct lock_class_key cxl_pmem_region_key;
 
 static struct cxl_pmem_region *cxl_pmem_region_alloc(struct cxl_region *cxlr)
diff --git a/drivers/cxl/core/trace.c b/drivers/cxl/core/trace.c
index d0403dc3c8ab..7f2a9dd0d0e3 100644
--- a/drivers/cxl/core/trace.c
+++ b/drivers/cxl/core/trace.c
@@ -6,94 +6,3 @@
 
 #define CREATE_TRACE_POINTS
 #include "trace.h"
-
-static bool cxl_is_hpa_in_range(u64 hpa, struct cxl_region *cxlr, int pos)
-{
-	struct cxl_region_params *p = &cxlr->params;
-	int gran = p->interleave_granularity;
-	int ways = p->interleave_ways;
-	u64 offset;
-
-	/* Is the hpa within this region at all */
-	if (hpa < p->res->start || hpa > p->res->end) {
-		dev_dbg(&cxlr->dev,
-			"Addr trans fail: hpa 0x%llx not in region\n", hpa);
-		return false;
-	}
-
-	/* Is the hpa in an expected chunk for its pos(-ition) */
-	offset = hpa - p->res->start;
-	offset = do_div(offset, gran * ways);
-	if ((offset >= pos * gran) && (offset < (pos + 1) * gran))
-		return true;
-
-	dev_dbg(&cxlr->dev,
-		"Addr trans fail: hpa 0x%llx not in expected chunk\n", hpa);
-
-	return false;
-}
-
-static u64 cxl_dpa_to_hpa(u64 dpa,  struct cxl_region *cxlr,
-			  struct cxl_endpoint_decoder *cxled)
-{
-	u64 dpa_offset, hpa_offset, bits_upper, mask_upper, hpa;
-	struct cxl_region_params *p = &cxlr->params;
-	int pos = cxled->pos;
-	u16 eig = 0;
-	u8 eiw = 0;
-
-	ways_to_eiw(p->interleave_ways, &eiw);
-	granularity_to_eig(p->interleave_granularity, &eig);
-
-	/*
-	 * The device position in the region interleave set was removed
-	 * from the offset at HPA->DPA translation. To reconstruct the
-	 * HPA, place the 'pos' in the offset.
-	 *
-	 * The placement of 'pos' in the HPA is determined by interleave
-	 * ways and granularity and is defined in the CXL Spec 3.0 Section
-	 * 8.2.4.19.13 Implementation Note: Device Decode Logic
-	 */
-
-	/* Remove the dpa base */
-	dpa_offset = dpa - cxl_dpa_resource_start(cxled);
-
-	mask_upper = GENMASK_ULL(51, eig + 8);
-
-	if (eiw < 8) {
-		hpa_offset = (dpa_offset & mask_upper) << eiw;
-		hpa_offset |= pos << (eig + 8);
-	} else {
-		bits_upper = (dpa_offset & mask_upper) >> (eig + 8);
-		bits_upper = bits_upper * 3;
-		hpa_offset = ((bits_upper << (eiw - 8)) + pos) << (eig + 8);
-	}
-
-	/* The lower bits remain unchanged */
-	hpa_offset |= dpa_offset & GENMASK_ULL(eig + 7, 0);
-
-	/* Apply the hpa_offset to the region base address */
-	hpa = hpa_offset + p->res->start;
-
-	if (!cxl_is_hpa_in_range(hpa, cxlr, cxled->pos))
-		return ULLONG_MAX;
-
-	return hpa;
-}
-
-u64 cxl_trace_hpa(struct cxl_region *cxlr, struct cxl_memdev *cxlmd,
-		  u64 dpa)
-{
-	struct cxl_region_params *p = &cxlr->params;
-	struct cxl_endpoint_decoder *cxled = NULL;
-
-	for (int i = 0; i <  p->nr_targets; i++) {
-		cxled = p->targets[i];
-		if (cxlmd == cxled_to_memdev(cxled))
-			break;
-	}
-	if (!cxled || cxlmd != cxled_to_memdev(cxled))
-		return ULLONG_MAX;
-
-	return cxl_dpa_to_hpa(dpa, cxlr, cxled);
-}
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index 7c5cd069f10c..e303e618aa05 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -642,8 +642,6 @@ TRACE_EVENT(cxl_memory_module,
 #define cxl_poison_overflow(flags, time)				\
 	(flags & CXL_POISON_FLAG_OVERFLOW ? le64_to_cpu(time) : 0)
 
-u64 cxl_trace_hpa(struct cxl_region *cxlr, struct cxl_memdev *memdev, u64 dpa);
-
 TRACE_EVENT(cxl_poison,
 
 	TP_PROTO(struct cxl_memdev *cxlmd, struct cxl_region *cxlr,
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v6 4/4] cxl/core: Add region info to cxl_general_media and cxl_dram events
  2024-04-30 17:28 [PATCH v6 0/4] Add DPA->HPA translation to dram & general_media events alison.schofield
                   ` (2 preceding siblings ...)
  2024-04-30 17:28 ` [PATCH v6 3/4] cxl/region: Move cxl_trace_hpa() " alison.schofield
@ 2024-04-30 17:28 ` alison.schofield
  2024-05-03 18:05   ` Dan Williams
  3 siblings, 1 reply; 8+ messages in thread
From: alison.schofield @ 2024-04-30 17:28 UTC (permalink / raw)
  To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams
  Cc: linux-cxl, Steven Rostedt, Shiyang Ruan, Jonathan Cameron

From: Alison Schofield <alison.schofield@intel.com>

User space may need to know which region, if any, maps the DPAs
(device physical addresses) reported in a cxl_general_media or
cxl_dram event. Since the mapping can change, the kernel provides
this information at the time the event occurs. This informs user
space that at event <timestamp> this <region> mapped this <DPA>
to this <HPA>.

Add the same region info that is included in the cxl_poison trace
event: the DPA->HPA translation, region name, and region uuid.

The new fields are inserted in the trace event and no existing
fields are modified. If the DPA is not mapped, user will see:
hpa=ULLONG_MAX, region="", and uuid=0

This work must be protected by dpa_rwsem & region_rwsem since
it is looking up region mappings.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huwei.com>
---
 drivers/cxl/core/mbox.c   | 36 ++++++++++++++++++++++++++------
 drivers/cxl/core/trace.h  | 44 +++++++++++++++++++++++++++++++--------
 include/linux/cxl-event.h | 10 +++++++++
 3 files changed, 75 insertions(+), 15 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 9adda4795eb7..df0fc2a4570f 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -842,14 +842,38 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
 			    enum cxl_event_type event_type,
 			    const uuid_t *uuid, union cxl_event *evt)
 {
-	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
-		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
-	else if (event_type == CXL_CPER_EVENT_DRAM)
-		trace_cxl_dram(cxlmd, type, &evt->dram);
-	else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
+	if (event_type == CXL_CPER_EVENT_MEM_MODULE) {
 		trace_cxl_memory_module(cxlmd, type, &evt->mem_module);
-	else
+		return;
+	}
+	if (event_type == CXL_CPER_EVENT_GENERIC) {
 		trace_cxl_generic_event(cxlmd, type, uuid, &evt->generic);
+		return;
+	}
+
+	if (trace_cxl_general_media_enabled() || trace_cxl_dram_enabled()) {
+		u64 dpa, hpa = ULLONG_MAX;
+		struct cxl_region *cxlr;
+
+		/*
+		 * These trace points are annotated with HPA and region
+		 * translations. Take topology mutation locks and lookup
+		 * { HPA, REGION } from { DPA, MEMDEV } in the event record.
+		 */
+		guard(rwsem_read)(&cxl_region_rwsem);
+		guard(rwsem_read)(&cxl_dpa_rwsem);
+
+		dpa = le64_to_cpu(evt->common.phys_addr) & CXL_DPA_MASK;
+		cxlr = cxl_dpa_to_region(cxlmd, dpa);
+		if (cxlr)
+			hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
+
+		if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
+			trace_cxl_general_media(cxlmd, type, cxlr, hpa,
+						&evt->gen_media);
+		else if (event_type == CXL_CPER_EVENT_DRAM)
+			trace_cxl_dram(cxlmd, type, cxlr, hpa, &evt->dram);
+	}
 }
 EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, CXL);
 
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index e303e618aa05..07a0394b1d99 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -316,9 +316,9 @@ TRACE_EVENT(cxl_generic_event,
 TRACE_EVENT(cxl_general_media,
 
 	TP_PROTO(const struct cxl_memdev *cxlmd, enum cxl_event_log_type log,
-		 struct cxl_event_gen_media *rec),
+		 struct cxl_region *cxlr, u64 hpa, struct cxl_event_gen_media *rec),
 
-	TP_ARGS(cxlmd, log, rec),
+	TP_ARGS(cxlmd, log, cxlr, hpa, rec),
 
 	TP_STRUCT__entry(
 		CXL_EVT_TP_entry
@@ -330,10 +330,13 @@ TRACE_EVENT(cxl_general_media,
 		__field(u8, channel)
 		__field(u32, device)
 		__array(u8, comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE)
-		__field(u16, validity_flags)
 		/* Following are out of order to pack trace record */
+		__field(u64, hpa)
+		__field_struct(uuid_t, region_uuid)
+		__field(u16, validity_flags)
 		__field(u8, rank)
 		__field(u8, dpa_flags)
+		__string(region_name, cxlr ? dev_name(&cxlr->dev) : "")
 	),
 
 	TP_fast_assign(
@@ -354,18 +357,28 @@ TRACE_EVENT(cxl_general_media,
 		memcpy(__entry->comp_id, &rec->component_id,
 			CXL_EVENT_GEN_MED_COMP_ID_SIZE);
 		__entry->validity_flags = get_unaligned_le16(&rec->validity_flags);
+		__entry->hpa = hpa;
+		if (cxlr) {
+			__assign_str(region_name, dev_name(&cxlr->dev));
+			uuid_copy(&__entry->region_uuid, &cxlr->params.uuid);
+		} else {
+			__assign_str(region_name, "");
+			uuid_copy(&__entry->region_uuid, &uuid_null);
+		}
 	),
 
 	CXL_EVT_TP_printk("dpa=%llx dpa_flags='%s' " \
 		"descriptor='%s' type='%s' transaction_type='%s' channel=%u rank=%u " \
-		"device=%x comp_id=%s validity_flags='%s'",
+		"device=%x comp_id=%s validity_flags='%s' " \
+		"hpa=%llx region=%s region_uuid=%pUb",
 		__entry->dpa, show_dpa_flags(__entry->dpa_flags),
 		show_event_desc_flags(__entry->descriptor),
 		show_mem_event_type(__entry->type),
 		show_trans_type(__entry->transaction_type),
 		__entry->channel, __entry->rank, __entry->device,
 		__print_hex(__entry->comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE),
-		show_valid_flags(__entry->validity_flags)
+		show_valid_flags(__entry->validity_flags),
+		__entry->hpa, __get_str(region_name), &__entry->region_uuid
 	)
 );
 
@@ -400,9 +413,9 @@ TRACE_EVENT(cxl_general_media,
 TRACE_EVENT(cxl_dram,
 
 	TP_PROTO(const struct cxl_memdev *cxlmd, enum cxl_event_log_type log,
-		 struct cxl_event_dram *rec),
+		 struct cxl_region *cxlr, u64 hpa, struct cxl_event_dram *rec),
 
-	TP_ARGS(cxlmd, log, rec),
+	TP_ARGS(cxlmd, log, cxlr, hpa, rec),
 
 	TP_STRUCT__entry(
 		CXL_EVT_TP_entry
@@ -417,10 +430,13 @@ TRACE_EVENT(cxl_dram,
 		__field(u32, nibble_mask)
 		__field(u32, row)
 		__array(u8, cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE)
+		__field(u64, hpa)
+		__field_struct(uuid_t, region_uuid)
 		__field(u8, rank)	/* Out of order to pack trace record */
 		__field(u8, bank_group)	/* Out of order to pack trace record */
 		__field(u8, bank)	/* Out of order to pack trace record */
 		__field(u8, dpa_flags)	/* Out of order to pack trace record */
+		__string(region_name, cxlr ? dev_name(&cxlr->dev) : "")
 	),
 
 	TP_fast_assign(
@@ -444,12 +460,21 @@ TRACE_EVENT(cxl_dram,
 		__entry->column = get_unaligned_le16(rec->column);
 		memcpy(__entry->cor_mask, &rec->correction_mask,
 			CXL_EVENT_DER_CORRECTION_MASK_SIZE);
+		__entry->hpa = hpa;
+		if (cxlr) {
+			__assign_str(region_name, dev_name(&cxlr->dev));
+			uuid_copy(&__entry->region_uuid, &cxlr->params.uuid);
+		} else {
+			__assign_str(region_name, "");
+			uuid_copy(&__entry->region_uuid, &uuid_null);
+		}
 	),
 
 	CXL_EVT_TP_printk("dpa=%llx dpa_flags='%s' descriptor='%s' type='%s' " \
 		"transaction_type='%s' channel=%u rank=%u nibble_mask=%x " \
 		"bank_group=%u bank=%u row=%u column=%u cor_mask=%s " \
-		"validity_flags='%s'",
+		"validity_flags='%s' " \
+		"hpa=%llx region=%s region_uuid=%pUb",
 		__entry->dpa, show_dpa_flags(__entry->dpa_flags),
 		show_event_desc_flags(__entry->descriptor),
 		show_mem_event_type(__entry->type),
@@ -458,7 +483,8 @@ TRACE_EVENT(cxl_dram,
 		__entry->bank_group, __entry->bank,
 		__entry->row, __entry->column,
 		__print_hex(__entry->cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE),
-		show_dram_valid_flags(__entry->validity_flags)
+		show_dram_valid_flags(__entry->validity_flags),
+		__entry->hpa, __get_str(region_name), &__entry->region_uuid
 	)
 );
 
diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
index 03fa6d50d46f..5342755777cc 100644
--- a/include/linux/cxl-event.h
+++ b/include/linux/cxl-event.h
@@ -91,11 +91,21 @@ struct cxl_event_mem_module {
 	u8 reserved[0x3d];
 } __packed;
 
+/*
+ * General Media or DRAM Event Common Fields
+ * - provides common access to phys_addr
+ */
+struct cxl_event_common {
+	struct cxl_event_record_hdr hdr;
+	__le64 phys_addr;
+} __packed;
+
 union cxl_event {
 	struct cxl_event_generic generic;
 	struct cxl_event_gen_media gen_media;
 	struct cxl_event_dram dram;
 	struct cxl_event_mem_module mem_module;
+	struct cxl_event_common common;
 } __packed;
 
 /*
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v6 4/4] cxl/core: Add region info to cxl_general_media and cxl_dram events
  2024-04-30 17:28 ` [PATCH v6 4/4] cxl/core: Add region info to cxl_general_media and cxl_dram events alison.schofield
@ 2024-05-03 18:05   ` Dan Williams
  2024-05-03 18:08     ` Dan Williams
  2024-05-08 16:54     ` Alison Schofield
  0 siblings, 2 replies; 8+ messages in thread
From: Dan Williams @ 2024-05-03 18:05 UTC (permalink / raw)
  To: alison.schofield, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
	Vishal Verma, Ira Weiny, Dan Williams
  Cc: linux-cxl, Steven Rostedt, Shiyang Ruan, Jonathan Cameron

alison.schofield@ wrote:
> From: Alison Schofield <alison.schofield@intel.com>
> 
> User space may need to know which region, if any, maps the DPAs
> (device physical addresses) reported in a cxl_general_media or
> cxl_dram event. Since the mapping can change, the kernel provides
> this information at the time the event occurs. This informs user
> space that at event <timestamp> this <region> mapped this <DPA>
> to this <HPA>.
> 
> Add the same region info that is included in the cxl_poison trace
> event: the DPA->HPA translation, region name, and region uuid.
> 
> The new fields are inserted in the trace event and no existing
> fields are modified. If the DPA is not mapped, user will see:
> hpa=ULLONG_MAX, region="", and uuid=0
> 
> This work must be protected by dpa_rwsem & region_rwsem since
> it is looking up region mappings.
> 
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huwei.com>
[..]
> diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
> index 03fa6d50d46f..5342755777cc 100644
> --- a/include/linux/cxl-event.h
> +++ b/include/linux/cxl-event.h
> @@ -91,11 +91,21 @@ struct cxl_event_mem_module {
>  	u8 reserved[0x3d];
>  } __packed;
>  
> +/*
> + * General Media or DRAM Event Common Fields
> + * - provides common access to phys_addr
> + */
> +struct cxl_event_common {
> +	struct cxl_event_record_hdr hdr;
> +	__le64 phys_addr;
> +} __packed;
> +
>  union cxl_event {
>  	struct cxl_event_generic generic;
>  	struct cxl_event_gen_media gen_media;
>  	struct cxl_event_dram dram;
>  	struct cxl_event_mem_module mem_module;
> +	struct cxl_event_common common;

I think we should rename this. As I was doing one last once-over my
brain went "wait, there's no physical address in the Common Event
Record".

So at a minimum call this "media_hdr" to match the style of
cxl_event_record_hdr.

Now I say "minimum" because 'struct cxl_event_gen_media' and 'struct
cxl_event_dram' should be including that struct in their definition to
make it impossible that they have different definitions. I.e. I am not a
fan of creating a new 'struct cxl_event_media_hdr' that is only consumed
by 'union cxl_event'.

The "maximum" would be to combine these definitions into one common
flexible format to maximize shared definitions, something like the
below, but the thrash it would cause is probably on the same order as
just adding a 'struct cxl_event_media_hdr media_hdr' member to existing
'struct cxl_event_gen_media' and 'struct cxl_event_dram', fixup all the
users and call it a day,

struct cxl_event_media {
        struct cxl_event_record_hdr hdr;
        struct_group_tagged(cxl_event_media_hdr, media_hdr,
                __le64 phys_addr,
                u8 descriptor,
                u8 type,
                u8 transaction_type,
                u8 validity_flags[2],
                u8 channel,
                u8 rank,
        };
        union { 
                struct_group(general,
                        u8 device[3],
                        u8 component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE],
                        u8 gen_reserved[46],
                );
                struct_group(media,
                        u8 nibble_mask[3],
                        u8 bank_group,
                        u8 bank,
                        u8 row[3],
                        u8 column[2],
                        u8 correction_mask[CXL_EVENT_DER_CORRECTION_MASK_SIZE],
                        u8 dram_reserved[0x17],
                );
        };
} __packed;

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v6 4/4] cxl/core: Add region info to cxl_general_media and cxl_dram events
  2024-05-03 18:05   ` Dan Williams
@ 2024-05-03 18:08     ` Dan Williams
  2024-05-08 16:54     ` Alison Schofield
  1 sibling, 0 replies; 8+ messages in thread
From: Dan Williams @ 2024-05-03 18:08 UTC (permalink / raw)
  To: Dan Williams, alison.schofield, Davidlohr Bueso,
	Jonathan Cameron, Dave Jiang, Vishal Verma, Ira Weiny
  Cc: linux-cxl, Steven Rostedt, Shiyang Ruan, Jonathan Cameron

Dan Williams wrote:
[..]
> The "maximum" would be to combine these definitions into one common
> flexible format to maximize shared definitions, something like the
> below, but the thrash it would cause is probably on the same order as
> just adding a 'struct cxl_event_media_hdr media_hdr' member to existing
> 'struct cxl_event_gen_media' and 'struct cxl_event_dram', fixup all the
> users and call it a day,
> 
> struct cxl_event_media {
>         struct cxl_event_record_hdr hdr;
>         struct_group_tagged(cxl_event_media_hdr, media_hdr,
>                 __le64 phys_addr,
>                 u8 descriptor,
>                 u8 type,
>                 u8 transaction_type,
>                 u8 validity_flags[2],
>                 u8 channel,
>                 u8 rank,
>         };
>         union { 
>                 struct_group(general,
>                         u8 device[3],
>                         u8 component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE],
>                         u8 gen_reserved[46],
>                 );
>                 struct_group(media,

...should be "dram", but you get the idea.

>                         u8 nibble_mask[3],
>                         u8 bank_group,
>                         u8 bank,
>                         u8 row[3],
>                         u8 column[2],
>                         u8 correction_mask[CXL_EVENT_DER_CORRECTION_MASK_SIZE],
>                         u8 dram_reserved[0x17],
>                 );
>         };
> } __packed;



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v6 4/4] cxl/core: Add region info to cxl_general_media and cxl_dram events
  2024-05-03 18:05   ` Dan Williams
  2024-05-03 18:08     ` Dan Williams
@ 2024-05-08 16:54     ` Alison Schofield
  1 sibling, 0 replies; 8+ messages in thread
From: Alison Schofield @ 2024-05-08 16:54 UTC (permalink / raw)
  To: Dan Williams
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
	Ira Weiny, linux-cxl, Steven Rostedt, Shiyang Ruan,
	Jonathan Cameron

On Fri, May 03, 2024 at 11:05:29AM -0700, Dan Williams wrote:
> alison.schofield@ wrote:
> > From: Alison Schofield <alison.schofield@intel.com>
> > 
> > User space may need to know which region, if any, maps the DPAs
> > (device physical addresses) reported in a cxl_general_media or
> > cxl_dram event. Since the mapping can change, the kernel provides
> > this information at the time the event occurs. This informs user
> > space that at event <timestamp> this <region> mapped this <DPA>
> > to this <HPA>.
> > 
> > Add the same region info that is included in the cxl_poison trace
> > event: the DPA->HPA translation, region name, and region uuid.
> > 
> > The new fields are inserted in the trace event and no existing
> > fields are modified. If the DPA is not mapped, user will see:
> > hpa=ULLONG_MAX, region="", and uuid=0
> > 
> > This work must be protected by dpa_rwsem & region_rwsem since
> > it is looking up region mappings.
> > 
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> > Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huwei.com>
> [..]
> > diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
> > index 03fa6d50d46f..5342755777cc 100644
> > --- a/include/linux/cxl-event.h
> > +++ b/include/linux/cxl-event.h
> > @@ -91,11 +91,21 @@ struct cxl_event_mem_module {
> >  	u8 reserved[0x3d];
> >  } __packed;
> >  
> > +/*
> > + * General Media or DRAM Event Common Fields
> > + * - provides common access to phys_addr
> > + */
> > +struct cxl_event_common {
> > +	struct cxl_event_record_hdr hdr;
> > +	__le64 phys_addr;
> > +} __packed;
> > +
> >  union cxl_event {
> >  	struct cxl_event_generic generic;
> >  	struct cxl_event_gen_media gen_media;
> >  	struct cxl_event_dram dram;
> >  	struct cxl_event_mem_module mem_module;
> > +	struct cxl_event_common common;
> 
> I think we should rename this. As I was doing one last once-over my
> brain went "wait, there's no physical address in the Common Event
> Record".
> 
> So at a minimum call this "media_hdr" to match the style of
> cxl_event_record_hdr.
> 
> Now I say "minimum" because 'struct cxl_event_gen_media' and 'struct
> cxl_event_dram' should be including that struct in their definition to
> make it impossible that they have different definitions. I.e. I am not a
> fan of creating a new 'struct cxl_event_media_hdr' that is only consumed
> by 'union cxl_event'.
> 
> The "maximum" would be to combine these definitions into one common
> flexible format to maximize shared definitions, something like the
> below, but the thrash it would cause is probably on the same order as
> just adding a 'struct cxl_event_media_hdr media_hdr' member to existing
> 'struct cxl_event_gen_media' and 'struct cxl_event_dram', fixup all the
> users and call it a day,

I've added this to my queue to fixup. 
Thanks for the review.

-- Alison

> 
> struct cxl_event_media {
>         struct cxl_event_record_hdr hdr;
>         struct_group_tagged(cxl_event_media_hdr, media_hdr,
>                 __le64 phys_addr,
>                 u8 descriptor,
>                 u8 type,
>                 u8 transaction_type,
>                 u8 validity_flags[2],
>                 u8 channel,
>                 u8 rank,
>         };
>         union { 
>                 struct_group(general,
>                         u8 device[3],
>                         u8 component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE],
>                         u8 gen_reserved[46],
>                 );
>                 struct_group(media,
>                         u8 nibble_mask[3],
>                         u8 bank_group,
>                         u8 bank,
>                         u8 row[3],
>                         u8 column[2],
>                         u8 correction_mask[CXL_EVENT_DER_CORRECTION_MASK_SIZE],
>                         u8 dram_reserved[0x17],
>                 );
>         };
> } __packed;

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-05-08 16:55 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-30 17:28 [PATCH v6 0/4] Add DPA->HPA translation to dram & general_media events alison.schofield
2024-04-30 17:28 ` [PATCH v6 1/4] cxl/trace: Correct DPA field masks for general_media & dram events alison.schofield
2024-04-30 17:28 ` [PATCH v6 2/4] cxl/region: Move cxl_dpa_to_region() work to the region driver alison.schofield
2024-04-30 17:28 ` [PATCH v6 3/4] cxl/region: Move cxl_trace_hpa() " alison.schofield
2024-04-30 17:28 ` [PATCH v6 4/4] cxl/core: Add region info to cxl_general_media and cxl_dram events alison.schofield
2024-05-03 18:05   ` Dan Williams
2024-05-03 18:08     ` Dan Williams
2024-05-08 16:54     ` Alison Schofield

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).