All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] nvme: fix module ref count Oops
@ 2020-09-04  2:39 Chaitanya Kulkarni
  2020-09-04  2:39 ` [PATCH 1/3] nvme: decouple nvme_ctrl_get_by_path() Chaitanya Kulkarni
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Chaitanya Kulkarni @ 2020-09-04  2:39 UTC (permalink / raw)
  To: linux-nvme; +Cc: kbusch, logang, hch, Chaitanya Kulkarni, sagi

Hi,

As per the comment on the latest version of the passthru cleanup and 
fixes V2 this patch seires isolates refcount bug-fix patches.

The patch series is formatted in to bottom up manner with first two
patches are the prep patches to create require infrastructure to fix
the module refcount bug and third patch actually fixes the bug.

Regards,
Chaitanya

Chaitanya Kulkarni (3):
  nvme: decouple nvme_ctrl_get_by_path()
  nvme: move get/put ctrl into dev open/release
  nvme-core: fix nvme module ref count Oops

 drivers/nvme/host/core.c       | 29 ++++++++++++++++++++---------
 drivers/nvme/host/nvme.h       |  2 +-
 drivers/nvme/target/nvmet.h    |  1 +
 drivers/nvme/target/passthru.c | 24 +++++++++++++++---------
 4 files changed, 37 insertions(+), 19 deletions(-)

-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/3] nvme: decouple nvme_ctrl_get_by_path()
  2020-09-04  2:39 [PATCH 0/3] nvme: fix module ref count Oops Chaitanya Kulkarni
@ 2020-09-04  2:39 ` Chaitanya Kulkarni
  2020-09-04 15:54   ` Logan Gunthorpe
  2020-09-04  2:39 ` [PATCH 2/3] nvme: move get/put ctrl into dev open/release Chaitanya Kulkarni
  2020-09-04  2:39 ` [PATCH 3/3] nvme-core: fix nvme module ref count Oops Chaitanya Kulkarni
  2 siblings, 1 reply; 16+ messages in thread
From: Chaitanya Kulkarni @ 2020-09-04  2:39 UTC (permalink / raw)
  To: linux-nvme; +Cc: kbusch, logang, hch, Chaitanya Kulkarni, sagi

Right now nvme_ctrl_get_by_path() accepts ctrl path, based on that it
opens a file and calls nvme_get_ctrl(). In order to take module
refcount it is important to distinguish the error between file_open()
and module file ops check so that we can unwind the code in the caller
nvmet_passthru_ctrl_enable() in the error path.

Rename nvme_ctrl_get_by_path() -> nvme_ctrl_get_by_file() and lift
the file opening and error handling in the caller so that we can unwind
appropriately in the error path.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/host/core.c       | 10 ++--------
 drivers/nvme/host/nvme.h       |  2 +-
 drivers/nvme/target/nvmet.h    |  1 +
 drivers/nvme/target/passthru.c | 17 +++++++++++++----
 4 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 5702a3843746..fa7c0def9184 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4641,14 +4641,9 @@ void nvme_sync_queues(struct nvme_ctrl *ctrl)
 }
 EXPORT_SYMBOL_GPL(nvme_sync_queues);
 
-struct nvme_ctrl *nvme_ctrl_get_by_path(const char *path)
+struct nvme_ctrl *nvme_ctrl_get_by_file(struct file *f)
 {
 	struct nvme_ctrl *ctrl;
-	struct file *f;
-
-	f = filp_open(path, O_RDWR, 0);
-	if (IS_ERR(f))
-		return ERR_CAST(f);
 
 	if (f->f_op != &nvme_dev_fops) {
 		ctrl = ERR_PTR(-EINVAL);
@@ -4659,10 +4654,9 @@ struct nvme_ctrl *nvme_ctrl_get_by_path(const char *path)
 	nvme_get_ctrl(ctrl);
 
 out_close:
-	filp_close(f, NULL);
 	return ctrl;
 }
-EXPORT_SYMBOL_NS_GPL(nvme_ctrl_get_by_path, NVME_TARGET_PASSTHRU);
+EXPORT_SYMBOL_NS_GPL(nvme_ctrl_get_by_file, NVME_TARGET_PASSTHRU);
 
 /*
  * Check we didn't inadvertently grow the command structure sizes:
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 2910f6caab7d..590cc880834a 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -836,7 +836,7 @@ static inline void nvme_hwmon_init(struct nvme_ctrl *ctrl) { }
 u32 nvme_command_effects(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 			 u8 opcode);
 void nvme_execute_passthru_rq(struct request *rq);
-struct nvme_ctrl *nvme_ctrl_get_by_path(const char *path);
+struct nvme_ctrl *nvme_ctrl_get_by_file(struct file *f);
 struct nvme_ns *nvme_find_get_ns(struct nvme_ctrl *ctrl, unsigned nsid);
 void nvme_put_ns(struct nvme_ns *ns);
 
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index 47ee3fb193bd..477439acb8e1 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -248,6 +248,7 @@ struct nvmet_subsys {
 #ifdef CONFIG_NVME_TARGET_PASSTHRU
 	struct nvme_ctrl	*passthru_ctrl;
 	char			*passthru_ctrl_path;
+	struct file             *passthru_ctrl_file;
 	struct config_group	passthru_group;
 #endif /* CONFIG_NVME_TARGET_PASSTHRU */
 };
diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index 8bd7f656e240..14a842408035 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -473,6 +473,7 @@ u16 nvmet_parse_passthru_admin_cmd(struct nvmet_req *req)
 
 int nvmet_passthru_ctrl_enable(struct nvmet_subsys *subsys)
 {
+	const char *pt_path = subsys->passthru_ctrl_path;
 	struct nvme_ctrl *ctrl;
 	int ret = -EINVAL;
 	void *old;
@@ -480,7 +481,7 @@ int nvmet_passthru_ctrl_enable(struct nvmet_subsys *subsys)
 	mutex_lock(&subsys->lock);
 	if (!subsys->passthru_ctrl_path)
 		goto out_unlock;
-	if (subsys->passthru_ctrl)
+	if (!pt_path)
 		goto out_unlock;
 
 	if (subsys->nr_namespaces) {
@@ -488,13 +489,18 @@ int nvmet_passthru_ctrl_enable(struct nvmet_subsys *subsys)
 		goto out_unlock;
 	}
 
-	ctrl = nvme_ctrl_get_by_path(subsys->passthru_ctrl_path);
+	subsys->passthru_ctrl_file = filp_open(pt_path, O_RDWR, 0);
+	if (IS_ERR(subsys->passthru_ctrl_file)) {
+		ret = PTR_ERR(subsys->passthru_ctrl_file);
+		goto out_unlock;
+	}
+
+	ctrl = nvme_ctrl_get_by_file(subsys->passthru_ctrl_file);
 	if (IS_ERR(ctrl)) {
 		ret = PTR_ERR(ctrl);
 		pr_err("failed to open nvme controller %s\n",
 		       subsys->passthru_ctrl_path);
-
-		goto out_unlock;
+		goto out_put_file;
 	}
 
 	old = xa_cmpxchg(&passthru_subsystems, ctrl->cntlid, NULL,
@@ -522,6 +528,8 @@ int nvmet_passthru_ctrl_enable(struct nvmet_subsys *subsys)
 
 out_put_ctrl:
 	nvme_put_ctrl(ctrl);
+out_put_file:
+	filp_close(subsys->passthru_ctrl_file, NULL);
 out_unlock:
 	mutex_unlock(&subsys->lock);
 	return ret;
@@ -531,6 +539,7 @@ static void __nvmet_passthru_ctrl_disable(struct nvmet_subsys *subsys)
 {
 	if (subsys->passthru_ctrl) {
 		xa_erase(&passthru_subsystems, subsys->passthru_ctrl->cntlid);
+		filp_close(subsys->passthru_ctrl_file, NULL);
 		nvme_put_ctrl(subsys->passthru_ctrl);
 	}
 	subsys->passthru_ctrl = NULL;
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/3] nvme: move get/put ctrl into dev open/release
  2020-09-04  2:39 [PATCH 0/3] nvme: fix module ref count Oops Chaitanya Kulkarni
  2020-09-04  2:39 ` [PATCH 1/3] nvme: decouple nvme_ctrl_get_by_path() Chaitanya Kulkarni
@ 2020-09-04  2:39 ` Chaitanya Kulkarni
  2020-09-04 15:55   ` Logan Gunthorpe
  2020-09-08  8:52   ` Christoph Hellwig
  2020-09-04  2:39 ` [PATCH 3/3] nvme-core: fix nvme module ref count Oops Chaitanya Kulkarni
  2 siblings, 2 replies; 16+ messages in thread
From: Chaitanya Kulkarni @ 2020-09-04  2:39 UTC (permalink / raw)
  To: linux-nvme; +Cc: kbusch, logang, hch, Chaitanya Kulkarni, sagi

Introduce nvme_dev_release ctrl file release callback and move ctrl get
and put operations from target passthru into host core in ctrl open and
release file operations respectively. This is needed to atomically
get/put ctrl and get/put ctrl module refcount without using any locks.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/host/core.c       | 12 +++++++++++-
 drivers/nvme/target/passthru.c |  7 ++-----
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index fa7c0def9184..a1707afcb710 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3262,6 +3262,16 @@ static int nvme_dev_open(struct inode *inode, struct file *file)
 	}
 
 	file->private_data = ctrl;
+	nvme_get_ctrl(ctrl);
+	return 0;
+}
+
+static int nvme_dev_release(struct inode *inode, struct file *file)
+{
+	struct nvme_ctrl *ctrl =
+		container_of(inode->i_cdev, struct nvme_ctrl, cdev);
+
+	nvme_put_ctrl(ctrl);
 	return 0;
 }
 
@@ -3327,6 +3337,7 @@ static long nvme_dev_ioctl(struct file *file, unsigned int cmd,
 static const struct file_operations nvme_dev_fops = {
 	.owner		= THIS_MODULE,
 	.open		= nvme_dev_open,
+	.release	= nvme_dev_release,
 	.unlocked_ioctl	= nvme_dev_ioctl,
 	.compat_ioctl	= compat_ptr_ioctl,
 };
@@ -4651,7 +4662,6 @@ struct nvme_ctrl *nvme_ctrl_get_by_file(struct file *f)
 	}
 
 	ctrl = f->private_data;
-	nvme_get_ctrl(ctrl);
 
 out_close:
 	return ctrl;
diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index 14a842408035..9cd8fd2ddc72 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -507,11 +507,11 @@ int nvmet_passthru_ctrl_enable(struct nvmet_subsys *subsys)
 			 subsys, GFP_KERNEL);
 	if (xa_is_err(old)) {
 		ret = xa_err(old);
-		goto out_put_ctrl;
+		goto out_put_file;
 	}
 
 	if (old)
-		goto out_put_ctrl;
+		goto out_put_file;
 
 	subsys->passthru_ctrl = ctrl;
 	subsys->ver = ctrl->vs;
@@ -526,8 +526,6 @@ int nvmet_passthru_ctrl_enable(struct nvmet_subsys *subsys)
 	mutex_unlock(&subsys->lock);
 	return 0;
 
-out_put_ctrl:
-	nvme_put_ctrl(ctrl);
 out_put_file:
 	filp_close(subsys->passthru_ctrl_file, NULL);
 out_unlock:
@@ -540,7 +538,6 @@ static void __nvmet_passthru_ctrl_disable(struct nvmet_subsys *subsys)
 	if (subsys->passthru_ctrl) {
 		xa_erase(&passthru_subsystems, subsys->passthru_ctrl->cntlid);
 		filp_close(subsys->passthru_ctrl_file, NULL);
-		nvme_put_ctrl(subsys->passthru_ctrl);
 	}
 	subsys->passthru_ctrl = NULL;
 	subsys->ver = NVMET_DEFAULT_VS;
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/3] nvme-core: fix nvme module ref count Oops
  2020-09-04  2:39 [PATCH 0/3] nvme: fix module ref count Oops Chaitanya Kulkarni
  2020-09-04  2:39 ` [PATCH 1/3] nvme: decouple nvme_ctrl_get_by_path() Chaitanya Kulkarni
  2020-09-04  2:39 ` [PATCH 2/3] nvme: move get/put ctrl into dev open/release Chaitanya Kulkarni
@ 2020-09-04  2:39 ` Chaitanya Kulkarni
  2020-09-04 15:57   ` Logan Gunthorpe
  2020-09-08  8:54   ` Christoph Hellwig
  2 siblings, 2 replies; 16+ messages in thread
From: Chaitanya Kulkarni @ 2020-09-04  2:39 UTC (permalink / raw)
  To: linux-nvme; +Cc: kbusch, logang, hch, Chaitanya Kulkarni, sagi

In the passthru controller enable path current code doesn't take the
reference to the passthru ctrl module. Which produces following Oops :-

Entering kdb (current=0xffff8887f8290000, pid 3128) on processor 30 Oops: (null)
due to oops @ 0xffffffffa01019ad
CPU: 30 PID: 3128 Comm: bash Tainted: G        W  OE     5.8.0-rc4nvme-5.9+ #35
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.4
RIP: 0010:nvme_free_ctrl+0x234/0x285 [nvme_core]
Code: 57 10 a0 e8 73 bf 02 e1 ba 3d 11 00 00 48 c7 c6 98 33 10 a0 48 c7 c7 1d 57 10 a0 e8 5b bf 02 e1 8
RSP: 0018:ffffc90001d63de0 EFLAGS: 00010246
RAX: ffffffffa05c0440 RBX: ffff8888119e45a0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff8888177e9550 RDI: ffff8888119e43b0
RBP: ffff8887d4768000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: ffffc90001d63c90 R12: ffff8888119e43b0
R13: ffff8888119e5108 R14: dead000000000100 R15: ffff8888119e5108
FS:  00007f1ef27b0740(0000) GS:ffff888817600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffa05c0470 CR3: 00000007f6bee000 CR4: 00000000003406e0
Call Trace:
 device_release+0x27/0x80
 kobject_put+0x98/0x170
 nvmet_passthru_ctrl_disable+0x4a/0x70 [nvmet]
 nvmet_passthru_enable_store+0x4c/0x90 [nvmet]
 configfs_write_file+0xe6/0x150
 vfs_write+0xba/0x1e0
 ksys_write+0x5f/0xe0
 do_syscall_64+0x52/0xb0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f1ef1eb2840
Code: Bad RIP value.
RSP: 002b:00007fffdbff0eb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f1ef1eb2840
RDX: 0000000000000002 RSI: 00007f1ef27d2000 RDI: 0000000000000001
RBP: 00007f1ef27d2000 R08: 000000000000000a R09: 00007f1ef27b0740
R10: 0000000000000001 R11: 0000000000000246 R12: 00007f1ef2186400
R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000000

We fix that by taking a module ref count in nvme_dev_open() and release
that ref count in nvme_dev_release() atomically with ctrl get/put
respectively.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/nvme/host/core.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index a1707afcb710..8445293c74e3 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3263,6 +3263,12 @@ static int nvme_dev_open(struct inode *inode, struct file *file)
 
 	file->private_data = ctrl;
 	nvme_get_ctrl(ctrl);
+	if (!try_module_get(ctrl->ops->module)) {
+		pr_err("try_module_get failed for cntlid 0x%x\n", ctrl->cntlid);
+		nvme_put_ctrl(ctrl);
+		return -EINVAL;
+	}
+
 	return 0;
 }
 
@@ -3271,6 +3277,7 @@ static int nvme_dev_release(struct inode *inode, struct file *file)
 	struct nvme_ctrl *ctrl =
 		container_of(inode->i_cdev, struct nvme_ctrl, cdev);
 
+	module_put(ctrl->ops->module);
 	nvme_put_ctrl(ctrl);
 	return 0;
 }
-- 
2.22.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] nvme: decouple nvme_ctrl_get_by_path()
  2020-09-04  2:39 ` [PATCH 1/3] nvme: decouple nvme_ctrl_get_by_path() Chaitanya Kulkarni
@ 2020-09-04 15:54   ` Logan Gunthorpe
  2020-09-05  7:20     ` Christoph Hellwig
  0 siblings, 1 reply; 16+ messages in thread
From: Logan Gunthorpe @ 2020-09-04 15:54 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-nvme; +Cc: kbusch, hch, sagi



On 2020-09-03 8:39 p.m., Chaitanya Kulkarni wrote:
> Right now nvme_ctrl_get_by_path() accepts ctrl path, based on that it
> opens a file and calls nvme_get_ctrl(). In order to take module
> refcount it is important to distinguish the error between file_open()
> and module file ops check so that we can unwind the code in the caller
> nvmet_passthru_ctrl_enable() in the error path.
> 
> Rename nvme_ctrl_get_by_path() -> nvme_ctrl_get_by_file() and lift
> the file opening and error handling in the caller so that we can unwind
> appropriately in the error path.

I still don't really like this idea. No need to keep a reference to a
whole other file.

Logan

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] nvme: move get/put ctrl into dev open/release
  2020-09-04  2:39 ` [PATCH 2/3] nvme: move get/put ctrl into dev open/release Chaitanya Kulkarni
@ 2020-09-04 15:55   ` Logan Gunthorpe
  2020-09-08  8:52   ` Christoph Hellwig
  1 sibling, 0 replies; 16+ messages in thread
From: Logan Gunthorpe @ 2020-09-04 15:55 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-nvme; +Cc: kbusch, hch, sagi



On 2020-09-03 8:39 p.m., Chaitanya Kulkarni wrote:
> Introduce nvme_dev_release ctrl file release callback and move ctrl get
> and put operations from target passthru into host core in ctrl open and
> release file operations respectively. This is needed to atomically
> get/put ctrl and get/put ctrl module refcount without using any locks.
> 
> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> ---
>  drivers/nvme/host/core.c       | 12 +++++++++++-
>  drivers/nvme/target/passthru.c |  7 ++-----
>  2 files changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index fa7c0def9184..a1707afcb710 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -3262,6 +3262,16 @@ static int nvme_dev_open(struct inode *inode, struct file *file)
>  	}
>  
>  	file->private_data = ctrl;
> +	nvme_get_ctrl(ctrl);
> +	return 0;
> +}
> +
> +static int nvme_dev_release(struct inode *inode, struct file *file)
> +{
> +	struct nvme_ctrl *ctrl =
> +		container_of(inode->i_cdev, struct nvme_ctrl, cdev);
> +
> +	nvme_put_ctrl(ctrl);
>  	return 0;
>  }
>  
> @@ -3327,6 +3337,7 @@ static long nvme_dev_ioctl(struct file *file, unsigned int cmd,
>  static const struct file_operations nvme_dev_fops = {
>  	.owner		= THIS_MODULE,
>  	.open		= nvme_dev_open,
> +	.release	= nvme_dev_release,
>  	.unlocked_ioctl	= nvme_dev_ioctl,
>  	.compat_ioctl	= compat_ptr_ioctl,
>  };

Once again, above this makes sense. Below this seems very much unrelated.



> @@ -4651,7 +4662,6 @@ struct nvme_ctrl *nvme_ctrl_get_by_file(struct file *f)
>  	}
>  
>  	ctrl = f->private_data;
> -	nvme_get_ctrl(ctrl);
>  
>  out_close:
>  	return ctrl;
> diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
> index 14a842408035..9cd8fd2ddc72 100644
> --- a/drivers/nvme/target/passthru.c
> +++ b/drivers/nvme/target/passthru.c
> @@ -507,11 +507,11 @@ int nvmet_passthru_ctrl_enable(struct nvmet_subsys *subsys)
>  			 subsys, GFP_KERNEL);
>  	if (xa_is_err(old)) {
>  		ret = xa_err(old);
> -		goto out_put_ctrl;
> +		goto out_put_file;
>  	}
>  
>  	if (old)
> -		goto out_put_ctrl;
> +		goto out_put_file;
>  
>  	subsys->passthru_ctrl = ctrl;
>  	subsys->ver = ctrl->vs;
> @@ -526,8 +526,6 @@ int nvmet_passthru_ctrl_enable(struct nvmet_subsys *subsys)
>  	mutex_unlock(&subsys->lock);
>  	return 0;
>  
> -out_put_ctrl:
> -	nvme_put_ctrl(ctrl);
>  out_put_file:
>  	filp_close(subsys->passthru_ctrl_file, NULL);
>  out_unlock:
> @@ -540,7 +538,6 @@ static void __nvmet_passthru_ctrl_disable(struct nvmet_subsys *subsys)
>  	if (subsys->passthru_ctrl) {
>  		xa_erase(&passthru_subsystems, subsys->passthru_ctrl->cntlid);
>  		filp_close(subsys->passthru_ctrl_file, NULL);
> -		nvme_put_ctrl(subsys->passthru_ctrl);
>  	}
>  	subsys->passthru_ctrl = NULL;
>  	subsys->ver = NVMET_DEFAULT_VS;
> 

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] nvme-core: fix nvme module ref count Oops
  2020-09-04  2:39 ` [PATCH 3/3] nvme-core: fix nvme module ref count Oops Chaitanya Kulkarni
@ 2020-09-04 15:57   ` Logan Gunthorpe
  2020-09-05 22:03     ` Chaitanya Kulkarni
  2020-09-08  8:54     ` Christoph Hellwig
  2020-09-08  8:54   ` Christoph Hellwig
  1 sibling, 2 replies; 16+ messages in thread
From: Logan Gunthorpe @ 2020-09-04 15:57 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-nvme; +Cc: kbusch, hch, sagi



On 2020-09-03 8:39 p.m., Chaitanya Kulkarni wrote:
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index a1707afcb710..8445293c74e3 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -3263,6 +3263,12 @@ static int nvme_dev_open(struct inode *inode, struct file *file)
>  
>  	file->private_data = ctrl;
>  	nvme_get_ctrl(ctrl);
> +	if (!try_module_get(ctrl->ops->module)) {
> +		pr_err("try_module_get failed for cntlid 0x%x\n", ctrl->cntlid);
> +		nvme_put_ctrl(ctrl);
> +		return -EINVAL;
> +	}
> +
>  	return 0;
>  }
>  
> @@ -3271,6 +3277,7 @@ static int nvme_dev_release(struct inode *inode, struct file *file)
>  	struct nvme_ctrl *ctrl =
>  		container_of(inode->i_cdev, struct nvme_ctrl, cdev);
>  
> +	module_put(ctrl->ops->module);
>  	nvme_put_ctrl(ctrl);
>  	return 0;
>  }

I'd probably just fold these changes into the previous patch. When we
get a reference to the controller we must also get a reference to the
transport module. So no sense getting the different references in
different patches.

Logan

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] nvme: decouple nvme_ctrl_get_by_path()
  2020-09-04 15:54   ` Logan Gunthorpe
@ 2020-09-05  7:20     ` Christoph Hellwig
  2020-09-05 22:57       ` Chaitanya Kulkarni
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2020-09-05  7:20 UTC (permalink / raw)
  To: Logan Gunthorpe; +Cc: kbusch, sagi, linux-nvme, Chaitanya Kulkarni, hch

On Fri, Sep 04, 2020 at 09:54:48AM -0600, Logan Gunthorpe wrote:
> 
> 
> On 2020-09-03 8:39 p.m., Chaitanya Kulkarni wrote:
> > Right now nvme_ctrl_get_by_path() accepts ctrl path, based on that it
> > opens a file and calls nvme_get_ctrl(). In order to take module
> > refcount it is important to distinguish the error between file_open()
> > and module file ops check so that we can unwind the code in the caller
> > nvmet_passthru_ctrl_enable() in the error path.
> > 
> > Rename nvme_ctrl_get_by_path() -> nvme_ctrl_get_by_file() and lift
> > the file opening and error handling in the caller so that we can unwind
> > appropriately in the error path.
> 
> I still don't really like this idea. No need to keep a reference to a
> whole other file.

We could still lift the filp_open into the caller, but drop the file
reference right after we got the controller reference.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] nvme-core: fix nvme module ref count Oops
  2020-09-04 15:57   ` Logan Gunthorpe
@ 2020-09-05 22:03     ` Chaitanya Kulkarni
  2020-09-08 15:33       ` Logan Gunthorpe
  2020-09-08  8:54     ` Christoph Hellwig
  1 sibling, 1 reply; 16+ messages in thread
From: Chaitanya Kulkarni @ 2020-09-05 22:03 UTC (permalink / raw)
  To: Logan Gunthorpe, linux-nvme; +Cc: kbusch, hch, sagi

On 9/4/20 08:57, Logan Gunthorpe wrote:
> I'd probably just fold these changes into the previous patch. When we
> get a reference to the controller we must also get a reference to the
> transport module. So no sense getting the different references in
> different patches.
> 
> Logan
> 

That is a preparation needed to actually fix the bug.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] nvme: decouple nvme_ctrl_get_by_path()
  2020-09-05  7:20     ` Christoph Hellwig
@ 2020-09-05 22:57       ` Chaitanya Kulkarni
  2020-09-08  8:49         ` Christoph Hellwig
  2020-09-08 15:36         ` Logan Gunthorpe
  0 siblings, 2 replies; 16+ messages in thread
From: Chaitanya Kulkarni @ 2020-09-05 22:57 UTC (permalink / raw)
  To: Christoph Hellwig, Logan Gunthorpe; +Cc: kbusch, sagi, linux-nvme

On 9/5/20 00:20, Christoph Hellwig wrote:
>> I still don't really like this idea. No need to keep a reference to a
>> whole other file.

File reference open/close is coupled with ctrl refcnt and module refcnt.
What is a problem with keeping the file reference open ?

> We could still lift the filp_open into the caller, but drop the file
> reference right after we got the controller reference.
> 

The file open call gets the ctrl and module reference atomically and on
close release the module and ctrl reference respectively.

If dropping the file reference means closing the file then it will lead
to loosing the module and ctrl reference.

That also means, moving the references get/put out of
nvme_dev_open()/nvme_dev_release() in host-core, it will fix the problem
only for the passthru case. Having done it in the core takes care of the
all the cases if any.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] nvme: decouple nvme_ctrl_get_by_path()
  2020-09-05 22:57       ` Chaitanya Kulkarni
@ 2020-09-08  8:49         ` Christoph Hellwig
  2020-09-08 15:36         ` Logan Gunthorpe
  1 sibling, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2020-09-08  8:49 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: kbusch, Logan Gunthorpe, Christoph Hellwig, linux-nvme, sagi

On Sat, Sep 05, 2020 at 10:57:42PM +0000, Chaitanya Kulkarni wrote:
> On 9/5/20 00:20, Christoph Hellwig wrote:
> >> I still don't really like this idea. No need to keep a reference to a
> >> whole other file.
> 
> File reference open/close is coupled with ctrl refcnt and module refcnt.
> What is a problem with keeping the file reference open ?
> 
> > We could still lift the filp_open into the caller, but drop the file
> > reference right after we got the controller reference.
> > 
> 
> The file open call gets the ctrl and module reference atomically and on
> close release the module and ctrl reference respectively.
> 
> If dropping the file reference means closing the file then it will lead
> to loosing the module and ctrl reference.

Not if you grab an extra reference before closing it.

> That also means, moving the references get/put out of
> nvme_dev_open()/nvme_dev_release() in host-core, it will fix the problem
> only for the passthru case. Having done it in the core takes care of the
> all the cases if any.

I don't see why that is an either/or instead of doing both.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] nvme: move get/put ctrl into dev open/release
  2020-09-04  2:39 ` [PATCH 2/3] nvme: move get/put ctrl into dev open/release Chaitanya Kulkarni
  2020-09-04 15:55   ` Logan Gunthorpe
@ 2020-09-08  8:52   ` Christoph Hellwig
  1 sibling, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2020-09-08  8:52 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: kbusch, logang, hch, linux-nvme, sagi

On Thu, Sep 03, 2020 at 07:39:28PM -0700, Chaitanya Kulkarni wrote:
> Introduce nvme_dev_release ctrl file release callback and move ctrl get
> and put operations from target passthru into host core in ctrl open and
> release file operations respectively. This is needed to atomically
> get/put ctrl and get/put ctrl module refcount without using any locks.

Besides the comment from Logan:  this doesn't move as said in the
subject but adds a reference.  I think this should be patch 1 as it
is a fixed that will need backports further back.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] nvme-core: fix nvme module ref count Oops
  2020-09-04  2:39 ` [PATCH 3/3] nvme-core: fix nvme module ref count Oops Chaitanya Kulkarni
  2020-09-04 15:57   ` Logan Gunthorpe
@ 2020-09-08  8:54   ` Christoph Hellwig
  1 sibling, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2020-09-08  8:54 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: kbusch, logang, hch, linux-nvme, sagi

This looks odd.  The char_dev code should grab the reference to the
module in 

On Thu, Sep 03, 2020 at 07:39:29PM -0700, Chaitanya Kulkarni wrote:
> In the passthru controller enable path current code doesn't take the
> reference to the passthru ctrl module. Which produces following Oops :-
> 
> Entering kdb (current=0xffff8887f8290000, pid 3128) on processor 30 Oops: (null)
> due to oops @ 0xffffffffa01019ad
> CPU: 30 PID: 3128 Comm: bash Tainted: G        W  OE     5.8.0-rc4nvme-5.9+ #35
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.4
> RIP: 0010:nvme_free_ctrl+0x234/0x285 [nvme_core]
> Code: 57 10 a0 e8 73 bf 02 e1 ba 3d 11 00 00 48 c7 c6 98 33 10 a0 48 c7 c7 1d 57 10 a0 e8 5b bf 02 e1 8
> RSP: 0018:ffffc90001d63de0 EFLAGS: 00010246
> RAX: ffffffffa05c0440 RBX: ffff8888119e45a0 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffff8888177e9550 RDI: ffff8888119e43b0
> RBP: ffff8887d4768000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: ffffc90001d63c90 R12: ffff8888119e43b0
> R13: ffff8888119e5108 R14: dead000000000100 R15: ffff8888119e5108
> FS:  00007f1ef27b0740(0000) GS:ffff888817600000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffa05c0470 CR3: 00000007f6bee000 CR4: 00000000003406e0
> Call Trace:
>  device_release+0x27/0x80
>  kobject_put+0x98/0x170
>  nvmet_passthru_ctrl_disable+0x4a/0x70 [nvmet]
>  nvmet_passthru_enable_store+0x4c/0x90 [nvmet]
>  configfs_write_file+0xe6/0x150
>  vfs_write+0xba/0x1e0
>  ksys_write+0x5f/0xe0
>  do_syscall_64+0x52/0xb0
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f1ef1eb2840
> Code: Bad RIP value.
> RSP: 002b:00007fffdbff0eb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f1ef1eb2840
> RDX: 0000000000000002 RSI: 00007f1ef27d2000 RDI: 0000000000000001
> RBP: 00007f1ef27d2000 R08: 000000000000000a R09: 00007f1ef27b0740
> R10: 0000000000000001 R11: 0000000000000246 R12: 00007f1ef2186400
> R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000000
> 
> We fix that by taking a module ref count in nvme_dev_open() and release
> that ref count in nvme_dev_release() atomically with ctrl get/put
> respectively.
> 
> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> ---
>  drivers/nvme/host/core.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index a1707afcb710..8445293c74e3 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -3263,6 +3263,12 @@ static int nvme_dev_open(struct inode *inode, struct file *file)
>  
>  	file->private_data = ctrl;
>  	nvme_get_ctrl(ctrl);
> +	if (!try_module_get(ctrl->ops->module)) {
> +		pr_err("try_module_get failed for cntlid 0x%x\n", ctrl->cntlid);

No need for a debug printk here..

Otherwise this looks good and should go to the front as it is a problem
even without the passthrough controller.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] nvme-core: fix nvme module ref count Oops
  2020-09-04 15:57   ` Logan Gunthorpe
  2020-09-05 22:03     ` Chaitanya Kulkarni
@ 2020-09-08  8:54     ` Christoph Hellwig
  1 sibling, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2020-09-08  8:54 UTC (permalink / raw)
  To: Logan Gunthorpe; +Cc: kbusch, sagi, linux-nvme, Chaitanya Kulkarni, hch

On Fri, Sep 04, 2020 at 09:57:02AM -0600, Logan Gunthorpe wrote:
> I'd probably just fold these changes into the previous patch. When we
> get a reference to the controller we must also get a reference to the
> transport module. So no sense getting the different references in
> different patches.

Yes, that sounds sensible.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] nvme-core: fix nvme module ref count Oops
  2020-09-05 22:03     ` Chaitanya Kulkarni
@ 2020-09-08 15:33       ` Logan Gunthorpe
  0 siblings, 0 replies; 16+ messages in thread
From: Logan Gunthorpe @ 2020-09-08 15:33 UTC (permalink / raw)
  To: Chaitanya Kulkarni, linux-nvme; +Cc: kbusch, hch, sagi



On 2020-09-05 4:03 p.m., Chaitanya Kulkarni wrote:
> On 9/4/20 08:57, Logan Gunthorpe wrote:
>> I'd probably just fold these changes into the previous patch. When we
>> get a reference to the controller we must also get a reference to the
>> transport module. So no sense getting the different references in
>> different patches.
>>
>> Logan
>>
> 
> That is a preparation needed to actually fix the bug.


Depends which bug you are talking about. I'd fix the bug with the char
dev file open which requires both taking a reference to the controller
and a reference to the transport module. Might as well do both in one patch.

Then I'd fix the bug with the passthru reference to the controller by
taking a reference to the module in that path somehow.

Logan

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] nvme: decouple nvme_ctrl_get_by_path()
  2020-09-05 22:57       ` Chaitanya Kulkarni
  2020-09-08  8:49         ` Christoph Hellwig
@ 2020-09-08 15:36         ` Logan Gunthorpe
  1 sibling, 0 replies; 16+ messages in thread
From: Logan Gunthorpe @ 2020-09-08 15:36 UTC (permalink / raw)
  To: Chaitanya Kulkarni, Christoph Hellwig; +Cc: kbusch, sagi, linux-nvme



On 2020-09-05 4:57 p.m., Chaitanya Kulkarni wrote:
> On 9/5/20 00:20, Christoph Hellwig wrote:
>>> I still don't really like this idea. No need to keep a reference to a
>>> whole other file.
> 
> File reference open/close is coupled with ctrl refcnt and module refcnt.
> What is a problem with keeping the file reference open ?

I find it a bit distasteful to keep an extra pointer and object around
when all we need is to add a module reference.

>> We could still lift the filp_open into the caller, but drop the file
>> reference right after we got the controller reference.

I'm not opposed to lifting the filp_open() into the caller if it makes
the code cleaner. I just think dropping the file reference should be kept.

Logan

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-09-08 15:36 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-04  2:39 [PATCH 0/3] nvme: fix module ref count Oops Chaitanya Kulkarni
2020-09-04  2:39 ` [PATCH 1/3] nvme: decouple nvme_ctrl_get_by_path() Chaitanya Kulkarni
2020-09-04 15:54   ` Logan Gunthorpe
2020-09-05  7:20     ` Christoph Hellwig
2020-09-05 22:57       ` Chaitanya Kulkarni
2020-09-08  8:49         ` Christoph Hellwig
2020-09-08 15:36         ` Logan Gunthorpe
2020-09-04  2:39 ` [PATCH 2/3] nvme: move get/put ctrl into dev open/release Chaitanya Kulkarni
2020-09-04 15:55   ` Logan Gunthorpe
2020-09-08  8:52   ` Christoph Hellwig
2020-09-04  2:39 ` [PATCH 3/3] nvme-core: fix nvme module ref count Oops Chaitanya Kulkarni
2020-09-04 15:57   ` Logan Gunthorpe
2020-09-05 22:03     ` Chaitanya Kulkarni
2020-09-08 15:33       ` Logan Gunthorpe
2020-09-08  8:54     ` Christoph Hellwig
2020-09-08  8:54   ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.