* Issue with namespace delete
@ 2019-05-16  1:23 Heitke, Kenneth
  2019-05-16 15:11 ` Keith Busch
  0 siblings, 1 reply; 5+ messages in thread
From: Heitke, Kenneth @ 2019-05-16  1:23 UTC (permalink / raw)


I have been doing some namespace testing with Ubuntu 18.04 (kernel 
4.15.0-43-generic). I'm running into an issue with namespace deletes 
where the driver seems to hang.

[  363.484013]  synchronize_srcu+0x57/0xdc
[  363.484016]  nvme_ns_remove+0xcc/0x180 [nvme_core]
[  363.484018]  nvme_remove_invalid_namespaces+0xb1/0xe0 [nvme_core]
[  363.484020]  nvme_user_cmd+0x282/0x370 [nvme_core]
[  363.484022]  nvme_ioctl+0xd0/0x1d0 [nvme_core]
[  363.484024]  blkdev_ioctl+0x3b8/0x980
[  363.484025]  block_ioctl+0x3d/0x50
[  363.484027]  do_vfs_ioctl+0xa8/0x620
[  363.484028]  ? ptrace_notify+0x5b/0x90
[  363.484030]  ? syscall_trace_enter+0x7b/0x2c0
[  363.484031]  SyS_ioctl+0x7a/0x90
[  363.484032]  do_syscall_64+0x73/0x130
[  363.484033]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

I don't understand RCU very well, but I found the following in the
documentation:

"Note that it is illegal to call synchronize_srcu from the corresponding 
SRCU read-side critical section; doing so will result in deadlock."
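
If I understand that correctly, the pattern being warned about boils down
to something like this (a made-up sketch assuming <linux/srcu.h>, not the
actual driver code):

	static void deadlock_sketch(struct srcu_struct *srcu)
	{
		int idx = srcu_read_lock(srcu);	/* enter read-side critical section */

		/*
		 * synchronize_srcu() waits for every reader of this srcu_struct
		 * to finish, but this task is still one of those readers, so
		 * the grace period never completes and the caller hangs.
		 */
		synchronize_srcu(srcu);

		srcu_read_unlock(srcu, idx);	/* never reached */
	}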

I noticed in the driver that when multi-path is enabled, the context for 
ioctl calls would be in a read-side critical section 
(nvme_get_ns_from_disk) and I believe that the synchronize_srcu() call 
is made in the same context.
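
As far as I can tell from the stack trace above, the call chain is roughly:

	nvme_ioctl()
	  nvme_get_ns_from_disk()             srcu_read_lock(&head->srcu)
	    nvme_user_cmd()
	      nvme_remove_invalid_namespaces()
	        nvme_ns_remove()
	          synchronize_srcu(&ns->head->srcu)   <-- waits on the reader above
	  nvme_put_ns_from_disk()             srcu_read_unlock(), never reached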

If I disable NVME_MULTIPATH, I don't see any issues when I try to delete 
a namespace.

I re-enabled multi-path and enabled DEBUG_LOCK_ALLOC. I used the
following patch to check whether the lock is held and to only call
synchronize_srcu() when it is not.
[I am not sure I trust this because srcu_read_lock_held() returns true by default]

@@ -3006,7 +3008,11 @@ static void nvme_ns_remove(struct nvme_ns *ns)
         list_del_init(&ns->list);
         up_write(&ns->ctrl->namespaces_rwsem);

-       synchronize_srcu(&ns->head->srcu);
+       WARN_ON(srcu_read_lock_held(&ns->head->srcu));
+
+       if (!srcu_read_lock_held(&ns->head->srcu))
+               synchronize_srcu(&ns->head->srcu);
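
For reference, my understanding of why I don't fully trust
srcu_read_lock_held() (paraphrased from memory, not the exact kernel source):

	static inline int srcu_read_lock_held(const struct srcu_struct *sp)
	{
		if (!debug_lockdep_rcu_enabled())
			return 1;	/* reports "held" whenever lockdep isn't watching */
		return lock_is_held(&sp->dep_map);
	}

so the WARN_ON() and the if above only mean anything with lockdep
(CONFIG_DEBUG_LOCK_ALLOC) enabled.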

I do get the warning and the namespace delete is successful.

[  136.316398] WARNING: CPU: 1 PID: 2201 at drivers/nvme/host/core.c:3013 nvme_ns_remove+0xf8/0x250 [nvme_core]
[  136.316489] Call Trace:
[  136.316494]  nvme_remove_invalid_namespaces+0xce/0x100 [nvme_core]
[  136.316498]  nvme_user_cmd+0x292/0x3a0 [nvme_core]
[  136.316507]  nvme_ioctl+0x123/0x220 [nvme_core]


Is there a possible issue here or am I off in the weeds?

Btw, I also see this issue with the 4.18 and 4.20 kernels.

Thanks!


* Issue with namespace delete
  2019-05-16  1:23 Issue with namespace delete Heitke, Kenneth
@ 2019-05-16 15:11 ` Keith Busch
  2019-05-16 15:53   ` Christoph Hellwig
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Busch @ 2019-05-16 15:11 UTC (permalink / raw)


On Wed, May 15, 2019 at 06:23:53PM -0700, Heitke, Kenneth wrote:
> I have been doing some namespace testing with Ubuntu 18.04 (kernel 
> 4.15.0-43-generic). I'm running into an issue with namespace deletes 
> where the driver seems to hang.
> 
> [  363.484013]  synchronize_srcu+0x57/0xdc
> [  363.484016]  nvme_ns_remove+0xcc/0x180 [nvme_core]
> [  363.484018]  nvme_remove_invalid_namespaces+0xb1/0xe0 [nvme_core]
> [  363.484020]  nvme_user_cmd+0x282/0x370 [nvme_core]
> [  363.484022]  nvme_ioctl+0xd0/0x1d0 [nvme_core]
> [  363.484024]  blkdev_ioctl+0x3b8/0x980
> [  363.484025]  block_ioctl+0x3d/0x50
> [  363.484027]  do_vfs_ioctl+0xa8/0x620
> [  363.484028]  ? ptrace_notify+0x5b/0x90
> [  363.484030]  ? syscall_trace_enter+0x7b/0x2c0
> [  363.484031]  SyS_ioctl+0x7a/0x90
> [  363.484032]  do_syscall_64+0x73/0x130
> [  363.484033]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> 
> I don't understand RCU very well, but I found the following in the
> documentation:
> 
> "Note that it is illegal to call synchronize_srcu from the corresponding 
> SRCU read-side critical section; doing so will result in deadlock."
> 
> I noticed in the driver that when multi-path is enabled, the context for 
> ioctl calls would be in a read-side critical section 
> (nvme_get_ns_from_disk) and I believe that the synchronize_srcu() call 
> is made in the same context.

Yeah, you're right.

You could have avoided this by sending the ioctl through the controller
char dev rather than the namespace block dev handle.
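
Something like this hand-written user-space sketch (illustrative only, not
taken from nvme-cli; the opcode and fields are my assumptions for a
namespace delete) goes straight through the controller char dev and never
touches the namespace head's srcu:

	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/ioctl.h>
	#include <linux/nvme_ioctl.h>

	int main(void)
	{
		struct nvme_admin_cmd cmd = {
			.opcode	= 0x0d,	/* Namespace Management */
			.nsid	= 1,	/* namespace to delete */
			.cdw10	= 1,	/* SEL = 1: delete */
		};
		int fd = open("/dev/nvme0", O_RDWR);

		if (fd < 0 || ioctl(fd, NVME_IOCTL_ADMIN_CMD, &cmd) < 0)
			perror("NVME_IOCTL_ADMIN_CMD");
		return 0;
	}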

I'm not sure what the best way to fix this might be right now.


* Issue with namespace delete
  2019-05-16 15:11 ` Keith Busch
@ 2019-05-16 15:53   ` Christoph Hellwig
  2019-05-16 16:03     ` Keith Busch
  2019-05-16 17:49     ` Heitke, Kenneth
  0 siblings, 2 replies; 5+ messages in thread
From: Christoph Hellwig @ 2019-05-16 15:53 UTC (permalink / raw)


On Thu, May 16, 2019 at 09:11:30AM -0600, Keith Busch wrote:
> You could have avoided this by sending the ioctl through the controller
> char dev rather than the namespace block dev handle.
> 
> I'm not sure what the best way to fix this might be right now.

We could try something like the changes below, although they are
completely untested for now and will need to be split up into
a few patches:


diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index a6644a2c3ef7..537cbef5bc4a 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1362,9 +1362,14 @@ static struct nvme_ns *nvme_get_ns_from_disk(struct gendisk *disk,
 {
 #ifdef CONFIG_NVME_MULTIPATH
 	if (disk->fops == &nvme_ns_head_ops) {
+		struct nvme_ns *ns;
+
 		*head = disk->private_data;
 		*srcu_idx = srcu_read_lock(&(*head)->srcu);
-		return nvme_find_path(*head);
+		ns = nvme_find_path(*head);
+		if (!ns)
+			srcu_read_unlock(&(*head)->srcu, *srcu_idx);
+		return ns;
 	}
 #endif
 	*head = NULL;
@@ -1384,8 +1389,6 @@ static int nvme_ns_ioctl(struct nvme_ns *ns, unsigned cmd, unsigned long arg)
 	case NVME_IOCTL_ID:
 		force_successful_syscall_return();
 		return ns->head->ns_id;
-	case NVME_IOCTL_ADMIN_CMD:
-		return nvme_user_cmd(ns->ctrl, NULL, (void __user *)arg);
 	case NVME_IOCTL_IO_CMD:
 		return nvme_user_cmd(ns->ctrl, ns, (void __user *)arg);
 	case NVME_IOCTL_SUBMIT_IO:
@@ -1395,9 +1398,6 @@ static int nvme_ns_ioctl(struct nvme_ns *ns, unsigned cmd, unsigned long arg)
 		if (ns->ndev)
 			return nvme_nvm_ioctl(ns, cmd, arg);
 #endif
-		if (is_sed_ioctl(cmd))
-			return sed_ioctl(ns->ctrl->opal_dev, cmd,
-					 (void __user *) arg);
 		return -ENOTTY;
 	}
 }
@@ -1405,16 +1405,30 @@ static int nvme_ns_ioctl(struct nvme_ns *ns, unsigned cmd, unsigned long arg)
 static int nvme_ioctl(struct block_device *bdev, fmode_t mode,
 		unsigned int cmd, unsigned long arg)
 {
+	void __user *argp = (void __user *)arg;
 	struct nvme_ns_head *head = NULL;
+	struct nvme_ctrl *ctrl = NULL;
 	struct nvme_ns *ns;
-	int srcu_idx, ret;
+	int srcu_idx, ret = 0;
 
 	ns = nvme_get_ns_from_disk(bdev->bd_disk, &head, &srcu_idx);
 	if (unlikely(!ns))
-		ret = -EWOULDBLOCK;
+		return -EWOULDBLOCK;
+
+	if (cmd == NVME_IOCTL_ADMIN_CMD || is_sed_ioctl(cmd))
+		ctrl = nvme_get_ctrl(ns->ctrl);
 	else
 		ret = nvme_ns_ioctl(ns, cmd, arg);
 	nvme_put_ns_from_disk(head, srcu_idx);
+
+	if (ctrl) {
+		if (cmd == NVME_IOCTL_ADMIN_CMD)
+			return nvme_user_cmd(ctrl, NULL, argp);
+		if (is_sed_ioctl(cmd))
+			return sed_ioctl(ctrl->opal_dev, cmd, argp);
+		nvme_put_ctrl(ctrl);
+	}
+
 	return ret;
 }
 
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 5ee75b5ff83f..86625767da8b 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -405,9 +405,10 @@ static inline void nvme_end_request(struct request *req, __le16 status,
 	blk_mq_complete_request(req);
 }
 
-static inline void nvme_get_ctrl(struct nvme_ctrl *ctrl)
+static inline struct nvme_ctrl *nvme_get_ctrl(struct nvme_ctrl *ctrl)
 {
 	get_device(ctrl->device);
+	return ctrl;
 }
 
 static inline void nvme_put_ctrl(struct nvme_ctrl *ctrl)
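
The idea is to take a reference on the controller while we still hold the
srcu read lock, drop the lock via nvme_put_ns_from_disk(), and only then run
the admin / SED commands, so a namespace removal triggered by them never
calls synchronize_srcu() from inside its own read-side critical section.
Making nvme_get_ctrl() return the controller just lets it be used in the
assignment above.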


* Issue with namespace delete
  2019-05-16 15:53   ` Christoph Hellwig
@ 2019-05-16 16:03     ` Keith Busch
  2019-05-16 17:49     ` Heitke, Kenneth
  1 sibling, 0 replies; 5+ messages in thread
From: Keith Busch @ 2019-05-16 16:03 UTC (permalink / raw)


On Thu, May 16, 2019 at 08:53:56AM -0700, Christoph Hellwig wrote:
> We could try something like the changes below, although they are
> completely untested for now and will need to be split up into
> a few patches:

Looks correct to me.


* Issue with namespace delete
  2019-05-16 15:53   ` Christoph Hellwig
  2019-05-16 16:03     ` Keith Busch
@ 2019-05-16 17:49     ` Heitke, Kenneth
  1 sibling, 0 replies; 5+ messages in thread
From: Heitke, Kenneth @ 2019-05-16 17:49 UTC (permalink / raw)


Thanks, Christoph. With my limited testing, your patch resolves my issue.

On 5/16/2019 9:53 AM, Christoph Hellwig wrote:
> On Thu, May 16, 2019 at 09:11:30AM -0600, Keith Busch wrote:
>> You could have avoided this by sending the ioctl through the controller
>> char dev rather than the namespace block dev handle.
>>
>> I'm not sure what the best way to fix this might be right now.
> 
> We could try something like the changes below, although they are
> completely untested for now and will need to be split up into
> a few patches:
> 
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index a6644a2c3ef7..537cbef5bc4a 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1362,9 +1362,14 @@ static struct nvme_ns *nvme_get_ns_from_disk(struct gendisk *disk,
>   {
>   #ifdef CONFIG_NVME_MULTIPATH
>   	if (disk->fops == &nvme_ns_head_ops) {
> +		struct nvme_ns *ns;
> +
>   		*head = disk->private_data;
>   		*srcu_idx = srcu_read_lock(&(*head)->srcu);
> -		return nvme_find_path(*head);
> +		ns = nvme_find_path(*head);
> +		if (!ns)
> +			srcu_read_unlock(&(*head)->srcu, *srcu_idx);
> +		return ns;
>   	}
>   #endif
>   	*head = NULL;
> @@ -1384,8 +1389,6 @@ static int nvme_ns_ioctl(struct nvme_ns *ns, unsigned cmd, unsigned long arg)
>   	case NVME_IOCTL_ID:
>   		force_successful_syscall_return();
>   		return ns->head->ns_id;
> -	case NVME_IOCTL_ADMIN_CMD:
> -		return nvme_user_cmd(ns->ctrl, NULL, (void __user *)arg);
>   	case NVME_IOCTL_IO_CMD:
>   		return nvme_user_cmd(ns->ctrl, ns, (void __user *)arg);
>   	case NVME_IOCTL_SUBMIT_IO:
> @@ -1395,9 +1398,6 @@ static int nvme_ns_ioctl(struct nvme_ns *ns, unsigned cmd, unsigned long arg)
>   		if (ns->ndev)
>   			return nvme_nvm_ioctl(ns, cmd, arg);
>   #endif
> -		if (is_sed_ioctl(cmd))
> -			return sed_ioctl(ns->ctrl->opal_dev, cmd,
> -					 (void __user *) arg);
>   		return -ENOTTY;
>   	}
>   }
> @@ -1405,16 +1405,30 @@ static int nvme_ns_ioctl(struct nvme_ns *ns, unsigned cmd, unsigned long arg)
>   static int nvme_ioctl(struct block_device *bdev, fmode_t mode,
>   		unsigned int cmd, unsigned long arg)
>   {
> +	void __user *argp = (void __user *)arg;
>   	struct nvme_ns_head *head = NULL;
> +	struct nvme_ctrl *ctrl = NULL;
>   	struct nvme_ns *ns;
> -	int srcu_idx, ret;
> +	int srcu_idx, ret = 0;
>   
>   	ns = nvme_get_ns_from_disk(bdev->bd_disk, &head, &srcu_idx);
>   	if (unlikely(!ns))
> -		ret = -EWOULDBLOCK;
> +		return -EWOULDBLOCK;
> +
> +	if (cmd == NVME_IOCTL_ADMIN_CMD || is_sed_ioctl(cmd))
> +		ctrl = nvme_get_ctrl(ns->ctrl);
>   	else
>   		ret = nvme_ns_ioctl(ns, cmd, arg);
>   	nvme_put_ns_from_disk(head, srcu_idx);
> +
> +	if (ctrl) {
> +		if (cmd == NVME_IOCTL_ADMIN_CMD)
> +			return nvme_user_cmd(ctrl, NULL, argp);
> +		if (is_sed_ioctl(cmd))
> +			return sed_ioctl(ctrl->opal_dev, cmd, argp);
> +		nvme_put_ctrl(ctrl);
> +	}
> +
>   	return ret;
>   }
>   
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 5ee75b5ff83f..86625767da8b 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -405,9 +405,10 @@ static inline void nvme_end_request(struct request *req, __le16 status,
>   	blk_mq_complete_request(req);
>   }
>   
> -static inline void nvme_get_ctrl(struct nvme_ctrl *ctrl)
> +static inline struct nvme_ctrl *nvme_get_ctrl(struct nvme_ctrl *ctrl)
>   {
>   	get_device(ctrl->device);
> +	return ctrl;
>   }
>   
>   static inline void nvme_put_ctrl(struct nvme_ctrl *ctrl)
> 

