linux-cxl.vger.kernel.org archive mirror
* [PATCH] cxl/pmem: Fix reference counting for delayed work
@ 2021-10-29 19:55 Dan Williams
  2021-10-31 18:58 ` Ben Widawsky
  2021-11-01 11:30 ` Jonathan Cameron
From: Dan Williams @ 2021-10-29 19:55 UTC
  To: linux-cxl; +Cc: ira.weiny, ben.widawsky, vishal.l.verma, alison.schofield

There is a potential race between queue_work() returning and the
queued-work running that could result in put_device() running before
get_device(). Introduce the cxl_nvdimm_bridge_state_work() helper that
takes the reference unconditionally, but drops it if no new work was
queued, to keep the references balanced.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/pmem.c |   17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
index ceb2115981e5..38bcbb4e9409 100644
--- a/drivers/cxl/pmem.c
+++ b/drivers/cxl/pmem.c
@@ -266,14 +266,24 @@ static void cxl_nvb_update_state(struct work_struct *work)
 	put_device(&cxl_nvb->dev);
 }
 
+static void cxl_nvdimm_bridge_state_work(struct cxl_nvdimm_bridge *cxl_nvb)
+{
+	/*
+	 * Take a reference that the workqueue will drop if new work
+	 * gets queued.
+	 */
+	get_device(&cxl_nvb->dev);
+	if (!queue_work(cxl_pmem_wq, &cxl_nvb->state_work))
+		put_device(&cxl_nvb->dev);
+}
+
 static void cxl_nvdimm_bridge_remove(struct device *dev)
 {
 	struct cxl_nvdimm_bridge *cxl_nvb = to_cxl_nvdimm_bridge(dev);
 
 	if (cxl_nvb->state == CXL_NVB_ONLINE)
 		cxl_nvb->state = CXL_NVB_OFFLINE;
-	if (queue_work(cxl_pmem_wq, &cxl_nvb->state_work))
-		get_device(&cxl_nvb->dev);
+	cxl_nvdimm_bridge_state_work(cxl_nvb);
 }
 
 static int cxl_nvdimm_bridge_probe(struct device *dev)
@@ -294,8 +304,7 @@ static int cxl_nvdimm_bridge_probe(struct device *dev)
 	}
 
 	cxl_nvb->state = CXL_NVB_ONLINE;
-	if (queue_work(cxl_pmem_wq, &cxl_nvb->state_work))
-		get_device(&cxl_nvb->dev);
+	cxl_nvdimm_bridge_state_work(cxl_nvb);
 
 	return 0;
 }
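
To make the ordering problem in the commit message concrete, here is a minimal userspace sketch. It is not the kernel implementation: toy_dev, toy_get(), toy_put(), toy_queue_work(), schedule_old() and schedule_new() are hypothetical stand-ins for struct device, get_device(), put_device(), queue_work() and the two call patterns in the diff. The run_now flag is an assumption used to model the worker running before the queueing call returns.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for struct device refcounting; not the kernel API. */
struct toy_dev {
	int refs;      /* current reference count */
	int min_refs;  /* lowest count ever observed */
	bool pending;  /* work item is queued but has not run */
};

static void toy_get(struct toy_dev *d) { d->refs++; }

static void toy_put(struct toy_dev *d)
{
	d->refs--;
	if (d->refs < d->min_refs)
		d->min_refs = d->refs;
}

/* The work function drops the reference taken on its behalf. */
static void toy_work_fn(struct toy_dev *d)
{
	d->pending = false;
	toy_put(d);
}

/*
 * Returns true if the work was newly queued, false if already pending.
 * With run_now set, the work executes before this call returns, which
 * models the worker winning the race against the caller.
 */
static bool toy_queue_work(struct toy_dev *d, bool run_now)
{
	if (d->pending)
		return false;
	d->pending = true;
	if (run_now)
		toy_work_fn(d);
	return true;
}

/* Old ordering: the reference is taken only after queueing succeeds. */
static void schedule_old(struct toy_dev *d, bool run_now)
{
	if (toy_queue_work(d, run_now))
		toy_get(d);
}

/* New ordering: take the reference first, drop it if nothing was queued. */
static void schedule_new(struct toy_dev *d, bool run_now)
{
	toy_get(d);
	if (!toy_queue_work(d, run_now))
		toy_put(d);
}
```

Starting from a single reference, schedule_old() with run_now set lets the count touch zero before the late toy_get() restores it, which in the real driver would correspond to put_device() running before get_device(). schedule_new() keeps the count at or above its starting value in the same scenario, and the count still balances when the work was already pending.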



* Re: [PATCH] cxl/pmem: Fix reference counting for delayed work
  2021-10-29 19:55 [PATCH] cxl/pmem: Fix reference counting for delayed work Dan Williams
@ 2021-10-31 18:58 ` Ben Widawsky
  2021-10-31 19:27   ` Dan Williams
  2021-11-01 11:30 ` Jonathan Cameron
From: Ben Widawsky @ 2021-10-31 18:58 UTC
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, vishal.l.verma, alison.schofield

On 21-10-29 12:55:47, Dan Williams wrote:
> There is a potential race between queue_work() returning and the
> queued-work running that could result in put_device() running before
> get_device(). Introduce the cxl_nvdimm_bridge_state_work() helper that
> takes the reference unconditionally, but drops it if no new work was
> queued, to keep the references balanced.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Arguably fixes/stable?
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>



* Re: [PATCH] cxl/pmem: Fix reference counting for delayed work
  2021-10-31 18:58 ` Ben Widawsky
@ 2021-10-31 19:27   ` Dan Williams
From: Dan Williams @ 2021-10-31 19:27 UTC
  To: Ben Widawsky; +Cc: linux-cxl, Weiny, Ira, Vishal L Verma, Schofield, Alison

On Sun, Oct 31, 2021 at 11:58 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-10-29 12:55:47, Dan Williams wrote:
> > There is a potential race between queue_work() returning and the
> > queued-work running that could result in put_device() running before
> > get_device(). Introduce the cxl_nvdimm_bridge_state_work() helper that
> > takes the reference unconditionally, but drops it if no new work was
> > queued, to keep the references balanced.
> >
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>
> Arguably fixes/stable?

Whoops, yes, definitely needs a Fixes:. A stable cc is debatable, but
if autosel picked up this fix I wouldn't say no, so might as well cc
stable upfront.

> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>

Thanks!


* Re: [PATCH] cxl/pmem: Fix reference counting for delayed work
  2021-10-29 19:55 [PATCH] cxl/pmem: Fix reference counting for delayed work Dan Williams
  2021-10-31 18:58 ` Ben Widawsky
@ 2021-11-01 11:30 ` Jonathan Cameron
  2021-11-03  0:57   ` Dan Williams
From: Jonathan Cameron @ 2021-11-01 11:30 UTC
  To: Dan Williams
  Cc: linux-cxl, ira.weiny, ben.widawsky, vishal.l.verma, alison.schofield

On Fri, 29 Oct 2021 12:55:47 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> There is a potential race between queue_work() returning and the
> queued-work running that could result in put_device() running before
> get_device(). Introduce the cxl_nvdimm_bridge_state_work() helper that
> takes the reference unconditionally, but drops it if no new work was
> queued, to keep the references balanced.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Good spot, I'm guessing this was an inspection thing rather than a problem
you've managed to trigger, but either way.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>



* Re: [PATCH] cxl/pmem: Fix reference counting for delayed work
  2021-11-01 11:30 ` Jonathan Cameron
@ 2021-11-03  0:57   ` Dan Williams
From: Dan Williams @ 2021-11-03  0:57 UTC
  To: Jonathan Cameron
  Cc: linux-cxl, Weiny, Ira, Ben Widawsky, Vishal L Verma, Schofield, Alison

On Mon, Nov 1, 2021 at 4:30 AM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Fri, 29 Oct 2021 12:55:47 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
>
> > There is a potential race between queue_work() returning and the
> > queued-work running that could result in put_device() running before
> > get_device(). Introduce the cxl_nvdimm_bridge_state_work() helper that
> > takes the reference unconditionally, but drops it if no new work was
> > queued, to keep the references balanced.
> >
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>
> Good spot, I'm guessing this was an inspection thing rather than a problem
> you've managed to trigger, but either way.

Correct, it was by inspection. I expect it would be extremely
difficult to hit this race in practice.

> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Thanks!
