All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] cxl/core: Hold region_rwsem during poison ops
@ 2023-11-27  0:09 alison.schofield
  2023-11-27  0:09 ` [PATCH 1/2] cxl/core: Always hold region_rwsem while reading poison lists alison.schofield
  2023-11-27  0:09 ` [PATCH 2/2] cxl/memdev: Hold region_rwsem during inject and clear poison ops alison.schofield
  0 siblings, 2 replies; 8+ messages in thread
From: alison.schofield @ 2023-11-27  0:09 UTC (permalink / raw)
  To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams
  Cc: linux-cxl

From: Alison Schofield <alison.schofield@intel.com>

This work was previously posted as a single patch [1] and is reposted
here split into 2 patches. The first patch fixes up the poison list
and the second patch the poison inject and clear.

They both are cleaning up locking issues exposed by:
Commit 458ba8189cb4 ("cxl: Add cxl_decoders_committed() helper")
added a lockdep_assert_held() to make sure all callers hold the
region state stable while doing work that depends on the number
of committed decoders.

Changes since that single patch posting:
- Put back the _interruptible() on down_reads (DavidLohr, Dan)
- Split into 2 patches (Dan)
- Add impact statements in commit logs (Dan)

[1] https://lore.kernel.org/linux-cxl/20231114025342.1123681-1-alison.schofield@intel.com/

Alison Schofield (2):
  cxl/core: Always hold region_rwsem while reading poison lists
  cxl/memdev: Hold region_rwsem during inject and clear poison ops

 drivers/cxl/core/memdev.c | 27 ++++++++++++++++++++++++---
 drivers/cxl/core/region.c |  5 -----
 2 files changed, 24 insertions(+), 8 deletions(-)


base-commit: b85ea95d086471afb4ad062012a4d73cd328fa86
-- 
2.37.3


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/2] cxl/core: Always hold region_rwsem while reading poison lists
  2023-11-27  0:09 [PATCH 0/2] cxl/core: Hold region_rwsem during poison ops alison.schofield
@ 2023-11-27  0:09 ` alison.schofield
  2023-11-27 17:40   ` Dave Jiang
                     ` (2 more replies)
  2023-11-27  0:09 ` [PATCH 2/2] cxl/memdev: Hold region_rwsem during inject and clear poison ops alison.schofield
  1 sibling, 3 replies; 8+ messages in thread
From: alison.schofield @ 2023-11-27  0:09 UTC (permalink / raw)
  To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams
  Cc: linux-cxl

From: Alison Schofield <alison.schofield@intel.com>

A read of a device poison list is triggered via a sysfs attribute
and the results are logged as kernel trace events of type cxl_poison.
The work is managed by either: a) the region driver when one of more
regions map the device, or by b) the memdev driver when no regions
map the device.

In the case of a) the region driver holds the region_rwsem while
reading the poison by committed endpoint decoder mappings and for
any unmapped resources. This makes sure that the cxl_poison trace
event trace reports valid region info. (Region name, HPA, and UUID).

In the case of b) the memdev driver holds the dpa_rwsem preventing
new DPA resources from being attached to a region. However, it leaves
a gap between region attach and decoder commit actions. If a DPA in
the gap is in the poison list, the cxl_poison trace event will omit
the region info.

Close the gap by holding the region_rwsem and the dpa_rwsem when
reading poison per memdev. Since both methods now hold both locks,
down_read both from the caller. Doing so also addresses the lockdep
assert that found this issue:
Commit 458ba8189cb4 ("cxl: Add cxl_decoders_committed() helper")

Fixes: 7ff6ad107588 ("cxl/memdev: Add trigger_poison_list sysfs attribute")
Fixes: f0832a586396 ("cxl/region: Provide region info to the cxl_poison trace event")
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
---
 drivers/cxl/core/memdev.c | 9 ++++++++-
 drivers/cxl/core/region.c | 5 -----
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index fc5c2b414793..5ad1b13e780a 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -227,10 +227,16 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd)
 	if (!port || !is_cxl_endpoint(port))
 		return -EINVAL;
 
-	rc = down_read_interruptible(&cxl_dpa_rwsem);
+	rc = down_read_interruptible(&cxl_region_rwsem);
 	if (rc)
 		return rc;
 
+	rc = down_read_interruptible(&cxl_dpa_rwsem);
+	if (rc) {
+		up_read(&cxl_region_rwsem);
+		return rc;
+	}
+
 	if (cxl_num_decoders_committed(port) == 0) {
 		/* No regions mapped to this memdev */
 		rc = cxl_get_poison_by_memdev(cxlmd);
@@ -239,6 +245,7 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd)
 		rc =  cxl_get_poison_by_endpoint(port);
 	}
 	up_read(&cxl_dpa_rwsem);
+	up_read(&cxl_region_rwsem);
 
 	return rc;
 }
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 56e575c79bb4..3e817a6f94c6 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2467,10 +2467,6 @@ int cxl_get_poison_by_endpoint(struct cxl_port *port)
 	struct cxl_poison_context ctx;
 	int rc = 0;
 
-	rc = down_read_interruptible(&cxl_region_rwsem);
-	if (rc)
-		return rc;
-
 	ctx = (struct cxl_poison_context) {
 		.port = port
 	};
@@ -2480,7 +2476,6 @@ int cxl_get_poison_by_endpoint(struct cxl_port *port)
 		rc = cxl_get_poison_unmapped(to_cxl_memdev(port->uport_dev),
 					     &ctx);
 
-	up_read(&cxl_region_rwsem);
 	return rc;
 }
 
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] cxl/memdev: Hold region_rwsem during inject and clear poison ops
  2023-11-27  0:09 [PATCH 0/2] cxl/core: Hold region_rwsem during poison ops alison.schofield
  2023-11-27  0:09 ` [PATCH 1/2] cxl/core: Always hold region_rwsem while reading poison lists alison.schofield
@ 2023-11-27  0:09 ` alison.schofield
  2023-11-27 17:58   ` Dave Jiang
  2023-11-28  0:58   ` Davidlohr Bueso
  1 sibling, 2 replies; 8+ messages in thread
From: alison.schofield @ 2023-11-27  0:09 UTC (permalink / raw)
  To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams
  Cc: linux-cxl

From: Alison Schofield <alison.schofield@intel.com>

Poison inject and clear are supported via debugfs where a privileged
user can inject and clear poison to a device physical address.

Commit 458ba8189cb4 ("cxl: Add cxl_decoders_committed() helper")
added a lockdep assert that highlighted a gap in poison inject and
clear functions where holding the dpa_rwsem does not assure that a
a DPA is not added to a region.

The impact for inject and clear is that if the DPA address being
injected or cleared has been attached to a region, but not yet
committed, the dev_dbg() message intended to alert the debug user
that they are acting on a mapped address is not emitted. Also, the
cxl_poison trace event that serves as a log of the inject and clear
activity will not include region info.

Close this gap by snapshotting an unchangeable region state during
poison inject and clear operations. That means holding both the
region_rwsem and the dpa_rwsem during the inject and clear ops.

Fixes: d2fbc4865802 ("cxl/memdev: Add support for the Inject Poison mailbox command")
Fixes: 9690b07748d1 ("cxl/memdev: Add support for the Clear Poison mailbox command")
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
---
 drivers/cxl/core/memdev.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 5ad1b13e780a..2f43d368ba07 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -331,10 +331,16 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
 	if (!IS_ENABLED(CONFIG_DEBUG_FS))
 		return 0;
 
-	rc = down_read_interruptible(&cxl_dpa_rwsem);
+	rc = down_read_interruptible(&cxl_region_rwsem);
 	if (rc)
 		return rc;
 
+	rc = down_read_interruptible(&cxl_dpa_rwsem);
+	if (rc) {
+		up_read(&cxl_region_rwsem);
+		return rc;
+	}
+
 	rc = cxl_validate_poison_dpa(cxlmd, dpa);
 	if (rc)
 		goto out;
@@ -362,6 +368,7 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
 	trace_cxl_poison(cxlmd, cxlr, &record, 0, 0, CXL_POISON_TRACE_INJECT);
 out:
 	up_read(&cxl_dpa_rwsem);
+	up_read(&cxl_region_rwsem);
 
 	return rc;
 }
@@ -379,10 +386,16 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
 	if (!IS_ENABLED(CONFIG_DEBUG_FS))
 		return 0;
 
-	rc = down_read_interruptible(&cxl_dpa_rwsem);
+	rc = down_read_interruptible(&cxl_region_rwsem);
 	if (rc)
 		return rc;
 
+	rc = down_read_interruptible(&cxl_dpa_rwsem);
+	if (rc) {
+		up_read(&cxl_region_rwsem);
+		return rc;
+	}
+
 	rc = cxl_validate_poison_dpa(cxlmd, dpa);
 	if (rc)
 		goto out;
@@ -419,6 +432,7 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
 	trace_cxl_poison(cxlmd, cxlr, &record, 0, 0, CXL_POISON_TRACE_CLEAR);
 out:
 	up_read(&cxl_dpa_rwsem);
+	up_read(&cxl_region_rwsem);
 
 	return rc;
 }
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] cxl/core: Always hold region_rwsem while reading poison lists
  2023-11-27  0:09 ` [PATCH 1/2] cxl/core: Always hold region_rwsem while reading poison lists alison.schofield
@ 2023-11-27 17:40   ` Dave Jiang
  2023-11-27 21:20   ` Davidlohr Bueso
  2023-11-30  1:21   ` Dan Williams
  2 siblings, 0 replies; 8+ messages in thread
From: Dave Jiang @ 2023-11-27 17:40 UTC (permalink / raw)
  To: alison.schofield, Davidlohr Bueso, Jonathan Cameron,
	Vishal Verma, Ira Weiny, Dan Williams
  Cc: linux-cxl



On 11/26/23 17:09, alison.schofield@intel.com wrote:
> From: Alison Schofield <alison.schofield@intel.com>
> 
> A read of a device poison list is triggered via a sysfs attribute
> and the results are logged as kernel trace events of type cxl_poison.
> The work is managed by either: a) the region driver when one of more
> regions map the device, or by b) the memdev driver when no regions
> map the device.
> 
> In the case of a) the region driver holds the region_rwsem while
> reading the poison by committed endpoint decoder mappings and for
> any unmapped resources. This makes sure that the cxl_poison trace
> event trace reports valid region info. (Region name, HPA, and UUID).
> 
> In the case of b) the memdev driver holds the dpa_rwsem preventing
> new DPA resources from being attached to a region. However, it leaves
> a gap between region attach and decoder commit actions. If a DPA in
> the gap is in the poison list, the cxl_poison trace event will omit
> the region info.
> 
> Close the gap by holding the region_rwsem and the dpa_rwsem when
> reading poison per memdev. Since both methods now hold both locks,
> down_read both from the caller. Doing so also addresses the lockdep
> assert that found this issue:
> Commit 458ba8189cb4 ("cxl: Add cxl_decoders_committed() helper")
> 
> Fixes: 7ff6ad107588 ("cxl/memdev: Add trigger_poison_list sysfs attribute")
> Fixes: f0832a586396 ("cxl/region: Provide region info to the cxl_poison trace event")
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  drivers/cxl/core/memdev.c | 9 ++++++++-
>  drivers/cxl/core/region.c | 5 -----
>  2 files changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index fc5c2b414793..5ad1b13e780a 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -227,10 +227,16 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd)
>  	if (!port || !is_cxl_endpoint(port))
>  		return -EINVAL;
>  
> -	rc = down_read_interruptible(&cxl_dpa_rwsem);
> +	rc = down_read_interruptible(&cxl_region_rwsem);
>  	if (rc)
>  		return rc;
>  
> +	rc = down_read_interruptible(&cxl_dpa_rwsem);
> +	if (rc) {
> +		up_read(&cxl_region_rwsem);
> +		return rc;
> +	}
> +
>  	if (cxl_num_decoders_committed(port) == 0) {
>  		/* No regions mapped to this memdev */
>  		rc = cxl_get_poison_by_memdev(cxlmd);
> @@ -239,6 +245,7 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd)
>  		rc =  cxl_get_poison_by_endpoint(port);
>  	}
>  	up_read(&cxl_dpa_rwsem);
> +	up_read(&cxl_region_rwsem);
>  
>  	return rc;
>  }
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 56e575c79bb4..3e817a6f94c6 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2467,10 +2467,6 @@ int cxl_get_poison_by_endpoint(struct cxl_port *port)
>  	struct cxl_poison_context ctx;
>  	int rc = 0;
>  
> -	rc = down_read_interruptible(&cxl_region_rwsem);
> -	if (rc)
> -		return rc;
> -
>  	ctx = (struct cxl_poison_context) {
>  		.port = port
>  	};
> @@ -2480,7 +2476,6 @@ int cxl_get_poison_by_endpoint(struct cxl_port *port)
>  		rc = cxl_get_poison_unmapped(to_cxl_memdev(port->uport_dev),
>  					     &ctx);
>  
> -	up_read(&cxl_region_rwsem);
>  	return rc;
>  }
>  

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] cxl/memdev: Hold region_rwsem during inject and clear poison ops
  2023-11-27  0:09 ` [PATCH 2/2] cxl/memdev: Hold region_rwsem during inject and clear poison ops alison.schofield
@ 2023-11-27 17:58   ` Dave Jiang
  2023-11-28  0:58   ` Davidlohr Bueso
  1 sibling, 0 replies; 8+ messages in thread
From: Dave Jiang @ 2023-11-27 17:58 UTC (permalink / raw)
  To: alison.schofield, Davidlohr Bueso, Jonathan Cameron,
	Vishal Verma, Ira Weiny, Dan Williams
  Cc: linux-cxl



On 11/26/23 17:09, alison.schofield@intel.com wrote:
> From: Alison Schofield <alison.schofield@intel.com>
> 
> Poison inject and clear are supported via debugfs where a privileged
> user can inject and clear poison to a device physical address.
> 
> Commit 458ba8189cb4 ("cxl: Add cxl_decoders_committed() helper")
> added a lockdep assert that highlighted a gap in poison inject and
> clear functions where holding the dpa_rwsem does not assure that a
> a DPA is not added to a region.
> 
> The impact for inject and clear is that if the DPA address being
> injected or cleared has been attached to a region, but not yet
> committed, the dev_dbg() message intended to alert the debug user
> that they are acting on a mapped address is not emitted. Also, the
> cxl_poison trace event that serves as a log of the inject and clear
> activity will not include region info.
> 
> Close this gap by snapshotting an unchangeable region state during
> poison inject and clear operations. That means holding both the
> region_rwsem and the dpa_rwsem during the inject and clear ops.
> 
> Fixes: d2fbc4865802 ("cxl/memdev: Add support for the Inject Poison mailbox command")
> Fixes: 9690b07748d1 ("cxl/memdev: Add support for the Clear Poison mailbox command")
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>

> ---
>  drivers/cxl/core/memdev.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 5ad1b13e780a..2f43d368ba07 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -331,10 +331,16 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
>  	if (!IS_ENABLED(CONFIG_DEBUG_FS))
>  		return 0;
>  
> -	rc = down_read_interruptible(&cxl_dpa_rwsem);
> +	rc = down_read_interruptible(&cxl_region_rwsem);
>  	if (rc)
>  		return rc;
>  
> +	rc = down_read_interruptible(&cxl_dpa_rwsem);
> +	if (rc) {
> +		up_read(&cxl_region_rwsem);
> +		return rc;
> +	}
> +
>  	rc = cxl_validate_poison_dpa(cxlmd, dpa);
>  	if (rc)
>  		goto out;
> @@ -362,6 +368,7 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
>  	trace_cxl_poison(cxlmd, cxlr, &record, 0, 0, CXL_POISON_TRACE_INJECT);
>  out:
>  	up_read(&cxl_dpa_rwsem);
> +	up_read(&cxl_region_rwsem);
>  
>  	return rc;
>  }
> @@ -379,10 +386,16 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
>  	if (!IS_ENABLED(CONFIG_DEBUG_FS))
>  		return 0;
>  
> -	rc = down_read_interruptible(&cxl_dpa_rwsem);
> +	rc = down_read_interruptible(&cxl_region_rwsem);
>  	if (rc)
>  		return rc;
>  
> +	rc = down_read_interruptible(&cxl_dpa_rwsem);
> +	if (rc) {
> +		up_read(&cxl_region_rwsem);
> +		return rc;
> +	}
> +
>  	rc = cxl_validate_poison_dpa(cxlmd, dpa);
>  	if (rc)
>  		goto out;
> @@ -419,6 +432,7 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
>  	trace_cxl_poison(cxlmd, cxlr, &record, 0, 0, CXL_POISON_TRACE_CLEAR);
>  out:
>  	up_read(&cxl_dpa_rwsem);
> +	up_read(&cxl_region_rwsem);
>  
>  	return rc;
>  }

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] cxl/core: Always hold region_rwsem while reading poison lists
  2023-11-27  0:09 ` [PATCH 1/2] cxl/core: Always hold region_rwsem while reading poison lists alison.schofield
  2023-11-27 17:40   ` Dave Jiang
@ 2023-11-27 21:20   ` Davidlohr Bueso
  2023-11-30  1:21   ` Dan Williams
  2 siblings, 0 replies; 8+ messages in thread
From: Davidlohr Bueso @ 2023-11-27 21:20 UTC (permalink / raw)
  To: alison.schofield
  Cc: Jonathan Cameron, Dave Jiang, Vishal Verma, Ira Weiny,
	Dan Williams, linux-cxl

On Sun, 26 Nov 2023, alison.schofield@intel.com wrote:

>From: Alison Schofield <alison.schofield@intel.com>
>
>A read of a device poison list is triggered via a sysfs attribute
>and the results are logged as kernel trace events of type cxl_poison.
>The work is managed by either: a) the region driver when one of more
>regions map the device, or by b) the memdev driver when no regions
>map the device.
>
>In the case of a) the region driver holds the region_rwsem while
>reading the poison by committed endpoint decoder mappings and for
>any unmapped resources. This makes sure that the cxl_poison trace
>event trace reports valid region info. (Region name, HPA, and UUID).
>
>In the case of b) the memdev driver holds the dpa_rwsem preventing
>new DPA resources from being attached to a region. However, it leaves
>a gap between region attach and decoder commit actions. If a DPA in
>the gap is in the poison list, the cxl_poison trace event will omit
>the region info.
>
>Close the gap by holding the region_rwsem and the dpa_rwsem when
>reading poison per memdev. Since both methods now hold both locks,
>down_read both from the caller. Doing so also addresses the lockdep
>assert that found this issue:
>Commit 458ba8189cb4 ("cxl: Add cxl_decoders_committed() helper")
>
>Fixes: 7ff6ad107588 ("cxl/memdev: Add trigger_poison_list sysfs attribute")
>Fixes: f0832a586396 ("cxl/region: Provide region info to the cxl_poison trace event")
>Signed-off-by: Alison Schofield <alison.schofield@intel.com>

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] cxl/memdev: Hold region_rwsem during inject and clear poison ops
  2023-11-27  0:09 ` [PATCH 2/2] cxl/memdev: Hold region_rwsem during inject and clear poison ops alison.schofield
  2023-11-27 17:58   ` Dave Jiang
@ 2023-11-28  0:58   ` Davidlohr Bueso
  1 sibling, 0 replies; 8+ messages in thread
From: Davidlohr Bueso @ 2023-11-28  0:58 UTC (permalink / raw)
  To: alison.schofield
  Cc: Jonathan Cameron, Dave Jiang, Vishal Verma, Ira Weiny,
	Dan Williams, linux-cxl

On Sun, 26 Nov 2023, alison.schofield@intel.com wrote:

>From: Alison Schofield <alison.schofield@intel.com>
>
>Poison inject and clear are supported via debugfs where a privileged
>user can inject and clear poison to a device physical address.
>
>Commit 458ba8189cb4 ("cxl: Add cxl_decoders_committed() helper")
>added a lockdep assert that highlighted a gap in poison inject and
>clear functions where holding the dpa_rwsem does not assure that a
>a DPA is not added to a region.
>
>The impact for inject and clear is that if the DPA address being
>injected or cleared has been attached to a region, but not yet
>committed, the dev_dbg() message intended to alert the debug user
>that they are acting on a mapped address is not emitted. Also, the
>cxl_poison trace event that serves as a log of the inject and clear
>activity will not include region info.
>
>Close this gap by snapshotting an unchangeable region state during
>poison inject and clear operations. That means holding both the
>region_rwsem and the dpa_rwsem during the inject and clear ops.
>
>Fixes: d2fbc4865802 ("cxl/memdev: Add support for the Inject Poison mailbox command")
>Fixes: 9690b07748d1 ("cxl/memdev: Add support for the Clear Poison mailbox command")
>Signed-off-by: Alison Schofield <alison.schofield@intel.com>

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH 1/2] cxl/core: Always hold region_rwsem while reading poison lists
  2023-11-27  0:09 ` [PATCH 1/2] cxl/core: Always hold region_rwsem while reading poison lists alison.schofield
  2023-11-27 17:40   ` Dave Jiang
  2023-11-27 21:20   ` Davidlohr Bueso
@ 2023-11-30  1:21   ` Dan Williams
  2 siblings, 0 replies; 8+ messages in thread
From: Dan Williams @ 2023-11-30  1:21 UTC (permalink / raw)
  To: alison.schofield, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
	Vishal Verma, Ira Weiny, Dan Williams
  Cc: linux-cxl

alison.schofield@ wrote:
> From: Alison Schofield <alison.schofield@intel.com>
> 
> A read of a device poison list is triggered via a sysfs attribute
> and the results are logged as kernel trace events of type cxl_poison.
> The work is managed by either: a) the region driver when one of more
> regions map the device, or by b) the memdev driver when no regions
> map the device.
> 
> In the case of a) the region driver holds the region_rwsem while
> reading the poison by committed endpoint decoder mappings and for
> any unmapped resources. This makes sure that the cxl_poison trace
> event trace reports valid region info. (Region name, HPA, and UUID).
> 
> In the case of b) the memdev driver holds the dpa_rwsem preventing
> new DPA resources from being attached to a region. However, it leaves
> a gap between region attach and decoder commit actions. If a DPA in
> the gap is in the poison list, the cxl_poison trace event will omit
> the region info.
> 
> Close the gap by holding the region_rwsem and the dpa_rwsem when
> reading poison per memdev. Since both methods now hold both locks,
> down_read both from the caller. Doing so also addresses the lockdep
> assert that found this issue:
> Commit 458ba8189cb4 ("cxl: Add cxl_decoders_committed() helper")
> 
> Fixes: 7ff6ad107588 ("cxl/memdev: Add trigger_poison_list sysfs attribute")

I am going to drop this Fixes: tag. Every kernel that has this commit
also has f0832a586396, and the there is nothing left to fix after
backporting this new commit.

> Fixes: f0832a586396 ("cxl/region: Provide region info to the cxl_poison trace event")
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> ---
>  drivers/cxl/core/memdev.c | 9 ++++++++-
>  drivers/cxl/core/region.c | 5 -----
>  2 files changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index fc5c2b414793..5ad1b13e780a 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -227,10 +227,16 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd)
>  	if (!port || !is_cxl_endpoint(port))
>  		return -EINVAL;
>  
> -	rc = down_read_interruptible(&cxl_dpa_rwsem);
> +	rc = down_read_interruptible(&cxl_region_rwsem);
>  	if (rc)
>  		return rc;
>  
> +	rc = down_read_interruptible(&cxl_dpa_rwsem);
> +	if (rc) {
> +		up_read(&cxl_region_rwsem);
> +		return rc;

I am going to leave this as is, but for future consideration I think it
is ok reason that cxl_region_rwsem has a higher chance for contention.
Compare how long cxl_region_rwsem is held vs cxl_dpa_rwsem. Once it is
acquired, just commit to being uninterruptible with plain down_read()
for cxl_dpa_rwsem.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-11-30  1:22 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-27  0:09 [PATCH 0/2] cxl/core: Hold region_rwsem during poison ops alison.schofield
2023-11-27  0:09 ` [PATCH 1/2] cxl/core: Always hold region_rwsem while reading poison lists alison.schofield
2023-11-27 17:40   ` Dave Jiang
2023-11-27 21:20   ` Davidlohr Bueso
2023-11-30  1:21   ` Dan Williams
2023-11-27  0:09 ` [PATCH 2/2] cxl/memdev: Hold region_rwsem during inject and clear poison ops alison.schofield
2023-11-27 17:58   ` Dave Jiang
2023-11-28  0:58   ` Davidlohr Bueso

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.