linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] cxl/mem: Fix memdev device setup
@ 2021-03-24 21:01 Dan Williams
  2021-03-24 21:01 ` [PATCH 1/4] cxl/mem: Use sysfs_emit() for attribute show routines Dan Williams
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Dan Williams @ 2021-03-24 21:01 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, Jason Gunthorpe, ira.weiny, vishal.l.verma,
	alison.schofield, linux-kernel

A small collection of fixes mostly inspired by Jason's recognition of
dev_set_name() error handling mistakes on other driver review.

dev_set_name() can fail and although device_add() might catch it that's
not a reliable assumption. While fixing that I noticed that the unwind
handling for cdev_device_add() failures leaked the device name.

The sysfs_emit() fixup and unpublishing of device power management files
are just sanity cleanups.

---

Dan Williams (4):
      cxl/mem: Use sysfs_emit() for attribute show routines
      cxl/mem: Fix cdev_device_add() error handling
      cxl/mem: Do not rely on device_add() side effects for dev_set_name() failures
      cxl/mem: Disable cxl device power management


 drivers/cxl/mem.c |   31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

base-commit: a38fd8748464831584a19438cbb3082b5a2dab15

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/4] cxl/mem: Use sysfs_emit() for attribute show routines
  2021-03-24 21:01 [PATCH 0/4] cxl/mem: Fix memdev device setup Dan Williams
@ 2021-03-24 21:01 ` Dan Williams
  2021-03-25 16:49   ` Jason Gunthorpe
  2021-03-24 21:01 ` [PATCH 2/4] cxl/mem: Fix cdev_device_add() error handling Dan Williams
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2021-03-24 21:01 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, Jason Gunthorpe, ira.weiny, vishal.l.verma,
	alison.schofield, linux-kernel

While none the CXL sysfs attributes are threatening to overrun a
PAGE_SIZE of output, it is good form to use the recommended helpers.

Fixes: b39cb1052a5c ("cxl/mem: Register CXL memX devices")
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/mem.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index ecfc9ccdba8d..30bf4f0f3c17 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -1066,7 +1066,7 @@ static ssize_t firmware_version_show(struct device *dev,
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
 	struct cxl_mem *cxlm = cxlmd->cxlm;
 
-	return sprintf(buf, "%.16s\n", cxlm->firmware_version);
+	return sysfs_emit(buf, "%.16s\n", cxlm->firmware_version);
 }
 static DEVICE_ATTR_RO(firmware_version);
 
@@ -1076,7 +1076,7 @@ static ssize_t payload_max_show(struct device *dev,
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
 	struct cxl_mem *cxlm = cxlmd->cxlm;
 
-	return sprintf(buf, "%zu\n", cxlm->payload_size);
+	return sysfs_emit(buf, "%zu\n", cxlm->payload_size);
 }
 static DEVICE_ATTR_RO(payload_max);
 
@@ -1087,7 +1087,7 @@ static ssize_t ram_size_show(struct device *dev, struct device_attribute *attr,
 	struct cxl_mem *cxlm = cxlmd->cxlm;
 	unsigned long long len = range_len(&cxlm->ram_range);
 
-	return sprintf(buf, "%#llx\n", len);
+	return sysfs_emit(buf, "%#llx\n", len);
 }
 
 static struct device_attribute dev_attr_ram_size =
@@ -1100,7 +1100,7 @@ static ssize_t pmem_size_show(struct device *dev, struct device_attribute *attr,
 	struct cxl_mem *cxlm = cxlmd->cxlm;
 	unsigned long long len = range_len(&cxlm->pmem_range);
 
-	return sprintf(buf, "%#llx\n", len);
+	return sysfs_emit(buf, "%#llx\n", len);
 }
 
 static struct device_attribute dev_attr_pmem_size =


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/4] cxl/mem: Fix cdev_device_add() error handling
  2021-03-24 21:01 [PATCH 0/4] cxl/mem: Fix memdev device setup Dan Williams
  2021-03-24 21:01 ` [PATCH 1/4] cxl/mem: Use sysfs_emit() for attribute show routines Dan Williams
@ 2021-03-24 21:01 ` Dan Williams
  2021-03-25 17:11   ` Jason Gunthorpe
  2021-03-24 21:02 ` [PATCH 3/4] cxl/mem: Do not rely on device_add() side effects for dev_set_name() failures Dan Williams
  2021-03-24 21:02 ` [PATCH 4/4] cxl/mem: Disable cxl device power management Dan Williams
  3 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2021-03-24 21:01 UTC (permalink / raw)
  To: linux-cxl
  Cc: Jason Gunthorpe, ira.weiny, vishal.l.verma, alison.schofield,
	linux-kernel

If cdev_device_add() fails then the allocation performed by
dev_set_name() is leaked. Use put_device(), not open coded release, for
device_add() failures.

The comment is obsolete because direct err_id failures need not worry
about the device being live.

The release method expects the percpu_ref is already dead, so
percpu_ref_kill() is needed before put_device(). However, given that the
cdev was partially live wait_for_completion() also belongs in the
release method.

Fixes: b39cb1052a5c ("cxl/mem: Register CXL memX devices")
Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/mem.c |   16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index 30bf4f0f3c17..e53d573ae4ab 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -1049,6 +1049,7 @@ static void cxl_memdev_release(struct device *dev)
 {
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
 
+	wait_for_completion(&cxlmd->ops_dead);
 	percpu_ref_exit(&cxlmd->ops_active);
 	ida_free(&cxl_memdev_ida, cxlmd->id);
 	kfree(cxlmd);
@@ -1157,7 +1158,6 @@ static void cxlmdev_unregister(void *_cxlmd)
 
 	percpu_ref_kill(&cxlmd->ops_active);
 	cdev_device_del(&cxlmd->cdev, dev);
-	wait_for_completion(&cxlmd->ops_dead);
 	cxlmd->cxlm = NULL;
 	put_device(dev);
 }
@@ -1210,20 +1210,16 @@ static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
 	cdev_init(cdev, &cxl_memdev_fops);
 
 	rc = cdev_device_add(cdev, dev);
-	if (rc)
-		goto err_add;
+	if (rc) {
+		percpu_ref_kill(&cxlmd->ops_active);
+		put_device(dev);
+		return rc;
+	}
 
 	return devm_add_action_or_reset(dev->parent, cxlmdev_unregister, cxlmd);
 
-err_add:
-	ida_free(&cxl_memdev_ida, cxlmd->id);
 err_id:
-	/*
-	 * Theoretically userspace could have already entered the fops,
-	 * so flush ops_active.
-	 */
 	percpu_ref_kill(&cxlmd->ops_active);
-	wait_for_completion(&cxlmd->ops_dead);
 	percpu_ref_exit(&cxlmd->ops_active);
 err_ref:
 	kfree(cxlmd);


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/4] cxl/mem: Do not rely on device_add() side effects for dev_set_name() failures
  2021-03-24 21:01 [PATCH 0/4] cxl/mem: Fix memdev device setup Dan Williams
  2021-03-24 21:01 ` [PATCH 1/4] cxl/mem: Use sysfs_emit() for attribute show routines Dan Williams
  2021-03-24 21:01 ` [PATCH 2/4] cxl/mem: Fix cdev_device_add() error handling Dan Williams
@ 2021-03-24 21:02 ` Dan Williams
  2021-03-25 17:14   ` Jason Gunthorpe
  2021-03-24 21:02 ` [PATCH 4/4] cxl/mem: Disable cxl device power management Dan Williams
  3 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2021-03-24 21:02 UTC (permalink / raw)
  To: linux-cxl
  Cc: Jason Gunthorpe, ira.weiny, vishal.l.verma, alison.schofield,
	linux-kernel

While device_add() will happen to catch dev_set_name() failures it is a
broken pattern to follow given that the core may try to fall back to a
different name.

Add explicit checking for dev_set_name() failures to be cleaned up by
put_device(). Skip cdev_device_add() and proceed directly to
put_device() if the name set failure.

Fixes: b39cb1052a5c ("cxl/mem: Register CXL memX devices")
Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/mem.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index e53d573ae4ab..d615f183520c 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -1204,12 +1204,14 @@ static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
 	dev->bus = &cxl_bus_type;
 	dev->devt = MKDEV(cxl_mem_major, cxlmd->id);
 	dev->type = &cxl_memdev_type;
-	dev_set_name(dev, "mem%d", cxlmd->id);
 
 	cdev = &cxlmd->cdev;
 	cdev_init(cdev, &cxl_memdev_fops);
 
-	rc = cdev_device_add(cdev, dev);
+	rc = dev_set_name(dev, "mem%d", cxlmd->id);
+	if (rc == 0)
+		rc = cdev_device_add(cdev, dev);
+
 	if (rc) {
 		percpu_ref_kill(&cxlmd->ops_active);
 		put_device(dev);


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/4] cxl/mem: Disable cxl device power management
  2021-03-24 21:01 [PATCH 0/4] cxl/mem: Fix memdev device setup Dan Williams
                   ` (2 preceding siblings ...)
  2021-03-24 21:02 ` [PATCH 3/4] cxl/mem: Do not rely on device_add() side effects for dev_set_name() failures Dan Williams
@ 2021-03-24 21:02 ` Dan Williams
  3 siblings, 0 replies; 11+ messages in thread
From: Dan Williams @ 2021-03-24 21:02 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, ira.weiny, vishal.l.verma, alison.schofield, linux-kernel

There is no power management of cxl virtual devices, disable
device-power-management and runtime-power-management to prevent
userspace from growing expectations of those attributes appearing. They
can be added back in the future if needed.

Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/mem.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index d615f183520c..338923a6b2ef 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -1204,6 +1204,7 @@ static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
 	dev->bus = &cxl_bus_type;
 	dev->devt = MKDEV(cxl_mem_major, cxlmd->id);
 	dev->type = &cxl_memdev_type;
+	device_set_pm_not_required(dev);
 
 	cdev = &cxlmd->cdev;
 	cdev_init(cdev, &cxl_memdev_fops);


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] cxl/mem: Use sysfs_emit() for attribute show routines
  2021-03-24 21:01 ` [PATCH 1/4] cxl/mem: Use sysfs_emit() for attribute show routines Dan Williams
@ 2021-03-25 16:49   ` Jason Gunthorpe
  0 siblings, 0 replies; 11+ messages in thread
From: Jason Gunthorpe @ 2021-03-25 16:49 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Ben Widawsky, ira.weiny, vishal.l.verma,
	alison.schofield, linux-kernel

On Wed, Mar 24, 2021 at 02:01:51PM -0700, Dan Williams wrote:
> While none the CXL sysfs attributes are threatening to overrun a
> PAGE_SIZE of output, it is good form to use the recommended helpers.
> 
> Fixes: b39cb1052a5c ("cxl/mem: Register CXL memX devices")
> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
> Reported-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/mem.c |    8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/4] cxl/mem: Fix cdev_device_add() error handling
  2021-03-24 21:01 ` [PATCH 2/4] cxl/mem: Fix cdev_device_add() error handling Dan Williams
@ 2021-03-25 17:11   ` Jason Gunthorpe
  2021-03-29 21:03     ` Dan Williams
  0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2021-03-25 17:11 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, ira.weiny, vishal.l.verma, alison.schofield, linux-kernel

On Wed, Mar 24, 2021 at 02:01:56PM -0700, Dan Williams wrote:
> If cdev_device_add() fails then the allocation performed by
> dev_set_name() is leaked. Use put_device(), not open coded release, for
> device_add() failures.
> 
> The comment is obsolete because direct err_id failures need not worry
> about the device being live.
> 
> The release method expects the percpu_ref is already dead, so
> percpu_ref_kill() is needed before put_device(). However, given that the
> cdev was partially live wait_for_completion() also belongs in the
> release method.
> 
> Fixes: b39cb1052a5c ("cxl/mem: Register CXL memX devices")
> Reported-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>  drivers/cxl/mem.c |   16 ++++++----------
>  1 file changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index 30bf4f0f3c17..e53d573ae4ab 100644
> +++ b/drivers/cxl/mem.c
> @@ -1049,6 +1049,7 @@ static void cxl_memdev_release(struct device *dev)
>  {
>  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>  
> +	wait_for_completion(&cxlmd->ops_dead);

This only works because the fops stuff is not right, a kref shouldn't
have a completion like this.

Also, don't use devm for unregister. That just makes it extra-hard to
write the driver remove function correctly.

> @@ -1157,7 +1158,6 @@ static void cxlmdev_unregister(void *_cxlmd)
>  
>  	percpu_ref_kill(&cxlmd->ops_active);
>  	cdev_device_del(&cxlmd->cdev, dev);
> -	wait_for_completion(&cxlmd->ops_dead);
>  	cxlmd->cxlm = NULL;
>  	put_device(dev);
>  }
> @@ -1210,20 +1210,16 @@ static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
>  	cdev_init(cdev, &cxl_memdev_fops);
>  
>  	rc = cdev_device_add(cdev, dev);
> -	if (rc)
> -		goto err_add;
> +	if (rc) {
> +		percpu_ref_kill(&cxlmd->ops_active);
> +		put_device(dev);

This must be one high performance ioctl to warrant the percpu ref.. If
it is not high performance use a rwsem, otherwise I'd suggest srcu as
a faster/simpler alternative.

This is a use-after-free:

static long cxl_memdev_ioctl(struct file *file, unsigned int cmd,
			     unsigned long arg)
{
	struct cxl_memdev *cxlmd;
	struct inode *inode;
	int rc = -ENOTTY;

	inode = file_inode(file);
	cxlmd = container_of(inode->i_cdev, typeof(*cxlmd), cdev);
       ^^^^^ can be freed memory

ioctl needs to store the cxlmd in file->private_data and
open()/release() need to do get/put device on it so the memory stays
around. This is why open gets the inode as an argument and ioctl/etc
does not.

The ordering cxlmdev_unregister should mirror the ordering in create
so cdev_device_del should be first

Jason

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/4] cxl/mem: Do not rely on device_add() side effects for dev_set_name() failures
  2021-03-24 21:02 ` [PATCH 3/4] cxl/mem: Do not rely on device_add() side effects for dev_set_name() failures Dan Williams
@ 2021-03-25 17:14   ` Jason Gunthorpe
  0 siblings, 0 replies; 11+ messages in thread
From: Jason Gunthorpe @ 2021-03-25 17:14 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, ira.weiny, vishal.l.verma, alison.schofield, linux-kernel

On Wed, Mar 24, 2021 at 02:02:01PM -0700, Dan Williams wrote:
> While device_add() will happen to catch dev_set_name() failures it is a
> broken pattern to follow given that the core may try to fall back to a
> different name.
> 
> Add explicit checking for dev_set_name() failures to be cleaned up by
> put_device(). Skip cdev_device_add() and proceed directly to
> put_device() if the name set failure.
> 
> Fixes: b39cb1052a5c ("cxl/mem: Register CXL memX devices")
> Reported-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>  drivers/cxl/mem.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index e53d573ae4ab..d615f183520c 100644
> +++ b/drivers/cxl/mem.c
> @@ -1204,12 +1204,14 @@ static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
>  	dev->bus = &cxl_bus_type;
>  	dev->devt = MKDEV(cxl_mem_major, cxlmd->id);
>  	dev->type = &cxl_memdev_type;
> -	dev_set_name(dev, "mem%d", cxlmd->id);
>  
>  	cdev = &cxlmd->cdev;
>  	cdev_init(cdev, &cxl_memdev_fops);
>  
> -	rc = cdev_device_add(cdev, dev);
> +	rc = dev_set_name(dev, "mem%d", cxlmd->id);
> +	if (rc == 0)
> +		rc = cdev_device_add(cdev, dev);

Success oriented flow please

This is much nicer if you split the allocation, then this flow is
clean and simple. cxl_alloc_memdev() is undone by cxl_memdev_release()

and I would reorder the code so they are above/below each other, it is
easy to check and understand when logically paired functions are on
the same screen.

static struct cxl_memdev *cxl_alloc_memdev(struct cxl_mem *cxlm)
{
	struct cxl_memdev *cxlmd;
	struct device *dev;

	cxlmd = kzalloc(sizeof(*cxlmd), GFP_KERNEL);
	if (!cxlmd)
		return PTR_ERR(-ENOMEM);

	init_completion(&cxlmd->ops_dead);
	cxlmd->cxlm = cxlm;

	rc = ida_alloc_range(&cxl_memdev_ida, 0, CXL_MEM_MAX_DEVS, GFP_KERNEL);
	if (rc < 0)
		goto err_free;
	cxlmd->id = rc;

	/*
	 * @cxlm is deallocated when the driver unbinds so operations
	 * that are using it need to hold a live reference.
	 */
	rc = percpu_ref_init(&cxlmd->ops_active, cxlmdev_ops_active_release, 0,
			     GFP_KERNEL);
	if (rc)
		goto err_id;

	dev = &cxlmd->dev;
	dev->parent = &pdev->dev;
	dev->bus = &cxl_bus_type;
	dev->devt = MKDEV(cxl_mem_major, cxlmd->id);
	dev->type = &cxl_memdev_type;
	device_initialize(dev);
	return cxlmd;

err_id:
	ida_free(&cxl_memdev_ida, cxlmd->id);
err_free:
	kfree(cxlmd);
	return PTR_ERR(rc);
}

static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
{
	struct pci_dev *pdev = cxlm->pdev;
	struct cdev *cdev;
	int rc;

	cxlmd = cxl_alloc_memdev(cxlm);
	if (IS_ERR(cxlmd))
		return ERR_PTR(cxlmd)

	rc = dev_set_name(dev, "mem%d", cxlmd->id);
	if (rc)
		goto err;

	cdev = &cxlmd->cdev;
	cdev_init(cdev, &cxl_memdev_fops);

	// Must be last
	rc = cdev_device_add(cdev, dev);
	if (rc)
		goto err;

	return devm_add_action_or_reset(dev->parent, cxlmdev_unregister, cxlmd);
err:
	put_device(&cxlmd->dev);
	return rc;
}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/4] cxl/mem: Fix cdev_device_add() error handling
  2021-03-25 17:11   ` Jason Gunthorpe
@ 2021-03-29 21:03     ` Dan Williams
  2021-03-29 22:44       ` Jason Gunthorpe
  0 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2021-03-29 21:03 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-cxl, Weiny, Ira, Vishal L Verma, Schofield, Alison,
	Linux Kernel Mailing List

On Thu, Mar 25, 2021 at 10:12 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Wed, Mar 24, 2021 at 02:01:56PM -0700, Dan Williams wrote:
> > If cdev_device_add() fails then the allocation performed by
> > dev_set_name() is leaked. Use put_device(), not open coded release, for
> > device_add() failures.
> >
> > The comment is obsolete because direct err_id failures need not worry
> > about the device being live.
> >
> > The release method expects the percpu_ref is already dead, so
> > percpu_ref_kill() is needed before put_device(). However, given that the
> > cdev was partially live wait_for_completion() also belongs in the
> > release method.
> >
> > Fixes: b39cb1052a5c ("cxl/mem: Register CXL memX devices")
> > Reported-by: Jason Gunthorpe <jgg@nvidia.com>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> >  drivers/cxl/mem.c |   16 ++++++----------
> >  1 file changed, 6 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> > index 30bf4f0f3c17..e53d573ae4ab 100644
> > +++ b/drivers/cxl/mem.c
> > @@ -1049,6 +1049,7 @@ static void cxl_memdev_release(struct device *dev)
> >  {
> >       struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> >
> > +     wait_for_completion(&cxlmd->ops_dead);
>
> This only works because the fops stuff is not right, a kref shouldn't
> have a completion like this.
>
> Also, don't use devm for unregister. That just makes it extra-hard to
> write the driver remove function correctly.

To date there is no driver remove function, however if that changes
then I expect all the devm needs to go.

>
> > @@ -1157,7 +1158,6 @@ static void cxlmdev_unregister(void *_cxlmd)
> >
> >       percpu_ref_kill(&cxlmd->ops_active);
> >       cdev_device_del(&cxlmd->cdev, dev);
> > -     wait_for_completion(&cxlmd->ops_dead);
> >       cxlmd->cxlm = NULL;
> >       put_device(dev);
> >  }
> > @@ -1210,20 +1210,16 @@ static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
> >       cdev_init(cdev, &cxl_memdev_fops);
> >
> >       rc = cdev_device_add(cdev, dev);
> > -     if (rc)
> > -             goto err_add;
> > +     if (rc) {
> > +             percpu_ref_kill(&cxlmd->ops_active);
> > +             put_device(dev);
>
> This must be one high performance ioctl to warrant the percpu ref.. If
> it is not high performance use a rwsem, otherwise I'd suggest srcu as
> a faster/simpler alternative.

The plan is to refactor and share the same reference counted fops
mechanism as debugfs and make that common infrastructure. However, in
the meantime I think global srcu is suitable.

>
> This is a use-after-free:
>
> static long cxl_memdev_ioctl(struct file *file, unsigned int cmd,
>                              unsigned long arg)
> {
>         struct cxl_memdev *cxlmd;
>         struct inode *inode;
>         int rc = -ENOTTY;
>
>         inode = file_inode(file);
>         cxlmd = container_of(inode->i_cdev, typeof(*cxlmd), cdev);
>        ^^^^^ can be freed memory
>
> ioctl needs to store the cxlmd in file->private_data and
> open()/release() need to do get/put device on it so the memory stays
> around. This is why open gets the inode as an argument and ioctl/etc
> does not.

Ugh, exactly why I was motivated to attempt to preclude this with new
core infrastructure that attempted to fix this centrally [1]. Remove
the  possibility of "others" getting this wrong. However after my
initial idea bounced off Greg then I ended up shipping this bug in the
local rewrite. I think the debugfs api gets this right in terms of
centralizing the reference count management, and I want to see
something similar for common driver ioctl patterns.

[1]: http://lore.kernel.org/r/CAPcyv4hGxLZGEkfnqdLfF-a1CzfEjLux-TBxXztbknFhEe9mYA@mail.gmail.com

>
> The ordering cxlmdev_unregister should mirror the ordering in create
> so cdev_device_del should be first

Sure.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/4] cxl/mem: Fix cdev_device_add() error handling
  2021-03-29 21:03     ` Dan Williams
@ 2021-03-29 22:44       ` Jason Gunthorpe
  2021-03-30  4:48         ` Dan Williams
  0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2021-03-29 22:44 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Weiny, Ira, Vishal L Verma, Schofield, Alison,
	Linux Kernel Mailing List

On Mon, Mar 29, 2021 at 02:03:37PM -0700, Dan Williams wrote:

> Ugh, exactly why I was motivated to attempt to preclude this with new
> core infrastructure that attempted to fix this centrally [1]. Remove
> the  possibility of "others" getting this wrong. However after my
> initial idea bounced off Greg then I ended up shipping this bug in the
> local rewrite. I think the debugfs api gets this right in terms of
> centralizing the reference count management, and I want to see
> something similar for common driver ioctl patterns.

There is a lot of variety here, I'm not sure how much valuable common
code there will be that could be lifted into the core.. srcu,
refcount, rwsem, percpu_ref, etc are all common implementations with
various properties.

The easist implementation is to just block driver destruction with a
refcount & completion pattern

The hardest is to allow the underlying HW driver to be removed from
the fops while the file remains open.

Usually whatever scheme is used has to flow into some in-kernel API as
well, so isolating it in cdev may no be entirely helpful.

The easisted helper API would be to add an 'unregistration lock' to
the struct device that blocks unregistration. A refcount & completion
for instance. I've seen that open coded enough times.

Jason

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/4] cxl/mem: Fix cdev_device_add() error handling
  2021-03-29 22:44       ` Jason Gunthorpe
@ 2021-03-30  4:48         ` Dan Williams
  0 siblings, 0 replies; 11+ messages in thread
From: Dan Williams @ 2021-03-30  4:48 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-cxl, Weiny, Ira, Vishal L Verma, Schofield, Alison,
	Linux Kernel Mailing List

On Mon, Mar 29, 2021 at 3:44 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Mon, Mar 29, 2021 at 02:03:37PM -0700, Dan Williams wrote:
>
> > Ugh, exactly why I was motivated to attempt to preclude this with new
> > core infrastructure that attempted to fix this centrally [1]. Remove
> > the  possibility of "others" getting this wrong. However after my
> > initial idea bounced off Greg then I ended up shipping this bug in the
> > local rewrite. I think the debugfs api gets this right in terms of
> > centralizing the reference count management, and I want to see
> > something similar for common driver ioctl patterns.
>
> There is a lot of variety here, I'm not sure how much valuable common
> code there will be that could be lifted into the core.. srcu,
> refcount, rwsem, percpu_ref, etc are all common implementations with
> various properties.
>
> The easist implementation is to just block driver destruction with a
> refcount & completion pattern
>
> The hardest is to allow the underlying HW driver to be removed from
> the fops while the file remains open.
>
> Usually whatever scheme is used has to flow into some in-kernel API as
> well, so isolating it in cdev may no be entirely helpful.
>
> The easisted helper API would be to add an 'unregistration lock' to
> the struct device that blocks unregistration. A refcount & completion
> for instance. I've seen that open coded enough times.

I do agree there is too much variety to widely unify. At the same time
it is a common enough pattern for devices that allow removal before
final close, especially devices that support hot-removal disconnecting
is a better pattern than blocking unregisteration.

Just the small matter of time to see this through...

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-03-30  4:49 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-24 21:01 [PATCH 0/4] cxl/mem: Fix memdev device setup Dan Williams
2021-03-24 21:01 ` [PATCH 1/4] cxl/mem: Use sysfs_emit() for attribute show routines Dan Williams
2021-03-25 16:49   ` Jason Gunthorpe
2021-03-24 21:01 ` [PATCH 2/4] cxl/mem: Fix cdev_device_add() error handling Dan Williams
2021-03-25 17:11   ` Jason Gunthorpe
2021-03-29 21:03     ` Dan Williams
2021-03-29 22:44       ` Jason Gunthorpe
2021-03-30  4:48         ` Dan Williams
2021-03-24 21:02 ` [PATCH 3/4] cxl/mem: Do not rely on device_add() side effects for dev_set_name() failures Dan Williams
2021-03-25 17:14   ` Jason Gunthorpe
2021-03-24 21:02 ` [PATCH 4/4] cxl/mem: Disable cxl device power management Dan Williams

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).