[PATCH V2 0/2] scsi: fix hang when device state is set via sysfs

linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH V2 0/2] scsi: fix hang when device state is set via sysfs
@ 2021-11-05 22:10 Mike Christie
  2021-11-05 22:10 ` [PATCH 1/2] iscsi: Unblock session then wake up error handler Mike Christie
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Mike Christie @ 2021-11-05 22:10 UTC (permalink / raw)
  To: lduncan, cleech, martin.petersen, linux-scsi, jejb

The following patches fix an issue where during iscsi recovery we deadlock
due to iscsid calling into sysfs to set the device state to running, but
the scsi error handler is also just finishing up.

My first attempt was here:

https://patchwork.kernel.org/project/linux-scsi/patch/20211006043117.11121-1-michael.christie@oracle.com/

But the problem with that patch is that iscsid can still get stuck.

This version 2 of the patchset allows the transport class to always
set the device state to running after the iscsi layer has done a relogin.
It then relies on a check in the second patch where if the transport class
has already set the state to running, iscsid write to the state sysfs
file returns without doing a rescan. I think this is hacky, but I'm not
sure how else to fix the issue without breaking iscsid or the users of
the new rescan behavior.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/2] iscsi: Unblock session then wake up error handler
  2021-11-05 22:10 [PATCH V2 0/2] scsi: fix hang when device state is set via sysfs Mike Christie
@ 2021-11-05 22:10 ` Mike Christie
  2021-11-06 17:44   ` Lee Duncan
  2021-11-19  4:16   ` Martin K. Petersen
  2021-11-05 22:10 ` [PATCH 2/2] scsi: Fix hang when device state is set via sysfs Mike Christie
  2021-11-09  4:09 ` [PATCH V2 0/2] scsi: fix " Martin K. Petersen
  2 siblings, 2 replies; 11+ messages in thread
From: Mike Christie @ 2021-11-05 22:10 UTC (permalink / raw)
  To: lduncan, cleech, martin.petersen, linux-scsi, jejb; +Cc: Mike Christie

We can race where iscsi_session_recovery_timedout has woken up the error
handler thread and it's now setting the devices to offline, and
session_recovery_timedout's call to scsi_target_unblock is also trying to
set the device's state to transport-offline. We can then get a mix of
states.

For the case where we can't relogin we want the devices to be in
transport-offline so when we have repaired the connection
__iscsi_unblock_session can set the state back to running. So this patch
has us set the device state then call into libiscsi to wake up the error
handler.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/scsi_transport_iscsi.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 78343d3f9385..554b6f784223 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -1899,12 +1899,12 @@ static void session_recovery_timedout(struct work_struct *work)
 	}
 	spin_unlock_irqrestore(&session->lock, flags);
 
-	if (session->transport->session_recovery_timedout)
-		session->transport->session_recovery_timedout(session);
-
 	ISCSI_DBG_TRANS_SESSION(session, "Unblocking SCSI target\n");
 	scsi_target_unblock(&session->dev, SDEV_TRANSPORT_OFFLINE);
 	ISCSI_DBG_TRANS_SESSION(session, "Completed unblocking SCSI target\n");
+
+	if (session->transport->session_recovery_timedout)
+		session->transport->session_recovery_timedout(session);
 }
 
 static void __iscsi_unblock_session(struct work_struct *work)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/2] scsi: Fix hang when device state is set via sysfs
  2021-11-05 22:10 [PATCH V2 0/2] scsi: fix hang when device state is set via sysfs Mike Christie
  2021-11-05 22:10 ` [PATCH 1/2] iscsi: Unblock session then wake up error handler Mike Christie
@ 2021-11-05 22:10 ` Mike Christie
  2021-11-06 18:17   ` Lee Duncan
                     ` (2 more replies)
  2021-11-09  4:09 ` [PATCH V2 0/2] scsi: fix " Martin K. Petersen
  2 siblings, 3 replies; 11+ messages in thread
From: Mike Christie @ 2021-11-05 22:10 UTC (permalink / raw)
  To: lduncan, cleech, martin.petersen, linux-scsi, jejb
  Cc: Mike Christie, Bart Van Assche, lijinlin, Wu Bo

This fixes a regression added with:

commit f0f82e2476f6 ("scsi: core: Fix capacity set to zero after
offlinining device")

The problem is that after iSCSI recovery, iscsid will call into the kernel
to set the dev's state to running, and with that patch we now call
scsi_rescan_device with the state_mutex held. If the scsi error handler
thread is just starting to test the device in scsi_send_eh_cmnd then it's
going to try to grab the state_mutex.

We are then stuck, because when scsi_rescan_device tries to send its IO
scsi_queue_rq calls -> scsi_host_queue_ready -> scsi_host_in_recovery
which will return true (the host state is still in recovery) and IO will
just be requeued. scsi_send_eh_cmnd will then never be able to grab the
state_mutex to finish error handling.

To prevent the deadlock this moves the rescan related code to after we
drop the state_mutex.

This also adds a check for if we are already in the running state. This
prevents extra scans and helps the iscsid case where if the transport
class has already onlined the device during it's recovery process then we
don't need userspace to do it again plus possibly block that daemon.

Fixes: f0f82e2476f6 ("scsi: core: Fix capacity set to zero after
offlinining device")
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: lijinlin <lijinlin3@huawei.com>
Cc: Wu Bo <wubo40@huawei.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/scsi_sysfs.c | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index a35841b34bfd..53e23a7bc0d3 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -797,6 +797,7 @@ store_state_field(struct device *dev, struct device_attribute *attr,
 	int i, ret;
 	struct scsi_device *sdev = to_scsi_device(dev);
 	enum scsi_device_state state = 0;
+	bool rescan_dev = false;
 
 	for (i = 0; i < ARRAY_SIZE(sdev_states); i++) {
 		const int len = strlen(sdev_states[i].name);
@@ -815,20 +816,27 @@ store_state_field(struct device *dev, struct device_attribute *attr,
 	}
 
 	mutex_lock(&sdev->state_mutex);
-	ret = scsi_device_set_state(sdev, state);
-	/*
-	 * If the device state changes to SDEV_RUNNING, we need to
-	 * run the queue to avoid I/O hang, and rescan the device
-	 * to revalidate it. Running the queue first is necessary
-	 * because another thread may be waiting inside
-	 * blk_mq_freeze_queue_wait() and because that call may be
-	 * waiting for pending I/O to finish.
-	 */
-	if (ret == 0 && state == SDEV_RUNNING) {
+	if (sdev->sdev_state == SDEV_RUNNING && state == SDEV_RUNNING) {
+		ret = count;
+	} else {
+		ret = scsi_device_set_state(sdev, state);
+		if (ret == 0 && state == SDEV_RUNNING)
+			rescan_dev = true;
+	}
+	mutex_unlock(&sdev->state_mutex);
+
+	if (rescan_dev) {
+		/*
+		 * If the device state changes to SDEV_RUNNING, we need to
+		 * run the queue to avoid I/O hang, and rescan the device
+		 * to revalidate it. Running the queue first is necessary
+		 * because another thread may be waiting inside
+		 * blk_mq_freeze_queue_wait() and because that call may be
+		 * waiting for pending I/O to finish.
+		 */
 		blk_mq_run_hw_queues(sdev->request_queue, true);
 		scsi_rescan_device(dev);
 	}
-	mutex_unlock(&sdev->state_mutex);
 
 	return ret == 0 ? count : -EINVAL;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] iscsi: Unblock session then wake up error handler
  2021-11-05 22:10 ` [PATCH 1/2] iscsi: Unblock session then wake up error handler Mike Christie
@ 2021-11-06 17:44   ` Lee Duncan
  2021-11-19  4:16   ` Martin K. Petersen
  1 sibling, 0 replies; 11+ messages in thread
From: Lee Duncan @ 2021-11-06 17:44 UTC (permalink / raw)
  To: Mike Christie, cleech, martin.petersen, linux-scsi, jejb

On 11/5/21 3:10 PM, Mike Christie wrote:
> We can race where iscsi_session_recovery_timedout has woken up the error
> handler thread and it's now setting the devices to offline, and
> session_recovery_timedout's call to scsi_target_unblock is also trying to
> set the device's state to transport-offline. We can then get a mix of
> states.
> 
> For the case where we can't relogin we want the devices to be in
> transport-offline so when we have repaired the connection
> __iscsi_unblock_session can set the state back to running. So this patch
> has us set the device state then call into libiscsi to wake up the error
> handler.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/scsi/scsi_transport_iscsi.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
> index 78343d3f9385..554b6f784223 100644
> --- a/drivers/scsi/scsi_transport_iscsi.c
> +++ b/drivers/scsi/scsi_transport_iscsi.c
> @@ -1899,12 +1899,12 @@ static void session_recovery_timedout(struct work_struct *work)
>  	}
>  	spin_unlock_irqrestore(&session->lock, flags);
>  
> -	if (session->transport->session_recovery_timedout)
> -		session->transport->session_recovery_timedout(session);
> -
>  	ISCSI_DBG_TRANS_SESSION(session, "Unblocking SCSI target\n");
>  	scsi_target_unblock(&session->dev, SDEV_TRANSPORT_OFFLINE);
>  	ISCSI_DBG_TRANS_SESSION(session, "Completed unblocking SCSI target\n");
> +
> +	if (session->transport->session_recovery_timedout)
> +		session->transport->session_recovery_timedout(session);
>  }
>  
>  static void __iscsi_unblock_session(struct work_struct *work)
> 

Reviewed-by: Lee Duncan <lduncan@suse.com>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] scsi: Fix hang when device state is set via sysfs
  2021-11-05 22:10 ` [PATCH 2/2] scsi: Fix hang when device state is set via sysfs Mike Christie
@ 2021-11-06 18:17   ` Lee Duncan
  2021-11-08  1:50   ` Wu Bo
  2021-11-19 13:35   ` James Bottomley
  2 siblings, 0 replies; 11+ messages in thread
From: Lee Duncan @ 2021-11-06 18:17 UTC (permalink / raw)
  To: Mike Christie, cleech, martin.petersen, linux-scsi, jejb
  Cc: Bart Van Assche, lijinlin, Wu Bo

On 11/5/21 3:10 PM, Mike Christie wrote:
> This fixes a regression added with:
> 
> commit f0f82e2476f6 ("scsi: core: Fix capacity set to zero after
> offlinining device")
> 
> The problem is that after iSCSI recovery, iscsid will call into the kernel
> to set the dev's state to running, and with that patch we now call
> scsi_rescan_device with the state_mutex held. If the scsi error handler
> thread is just starting to test the device in scsi_send_eh_cmnd then it's
> going to try to grab the state_mutex.
> 
> We are then stuck, because when scsi_rescan_device tries to send its IO
> scsi_queue_rq calls -> scsi_host_queue_ready -> scsi_host_in_recovery
> which will return true (the host state is still in recovery) and IO will
> just be requeued. scsi_send_eh_cmnd will then never be able to grab the
> state_mutex to finish error handling.
> 
> To prevent the deadlock this moves the rescan related code to after we
> drop the state_mutex.
> 
> This also adds a check for if we are already in the running state. This
> prevents extra scans and helps the iscsid case where if the transport
> class has already onlined the device during it's recovery process then we
> don't need userspace to do it again plus possibly block that daemon.
> 
> Fixes: f0f82e2476f6 ("scsi: core: Fix capacity set to zero after
> offlinining device")
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: lijinlin <lijinlin3@huawei.com>
> Cc: Wu Bo <wubo40@huawei.com>
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/scsi/scsi_sysfs.c | 30 +++++++++++++++++++-----------
>  1 file changed, 19 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index a35841b34bfd..53e23a7bc0d3 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -797,6 +797,7 @@ store_state_field(struct device *dev, struct device_attribute *attr,
>  	int i, ret;
>  	struct scsi_device *sdev = to_scsi_device(dev);
>  	enum scsi_device_state state = 0;
> +	bool rescan_dev = false;
>  
>  	for (i = 0; i < ARRAY_SIZE(sdev_states); i++) {
>  		const int len = strlen(sdev_states[i].name);
> @@ -815,20 +816,27 @@ store_state_field(struct device *dev, struct device_attribute *attr,
>  	}
>  
>  	mutex_lock(&sdev->state_mutex);
> -	ret = scsi_device_set_state(sdev, state);
> -	/*
> -	 * If the device state changes to SDEV_RUNNING, we need to
> -	 * run the queue to avoid I/O hang, and rescan the device
> -	 * to revalidate it. Running the queue first is necessary
> -	 * because another thread may be waiting inside
> -	 * blk_mq_freeze_queue_wait() and because that call may be
> -	 * waiting for pending I/O to finish.
> -	 */
> -	if (ret == 0 && state == SDEV_RUNNING) {
> +	if (sdev->sdev_state == SDEV_RUNNING && state == SDEV_RUNNING) {
> +		ret = count;
> +	} else {
> +		ret = scsi_device_set_state(sdev, state);
> +		if (ret == 0 && state == SDEV_RUNNING)
> +			rescan_dev = true;
> +	}
> +	mutex_unlock(&sdev->state_mutex);
> +
> +	if (rescan_dev) {
> +		/*
> +		 * If the device state changes to SDEV_RUNNING, we need to
> +		 * run the queue to avoid I/O hang, and rescan the device
> +		 * to revalidate it. Running the queue first is necessary
> +		 * because another thread may be waiting inside
> +		 * blk_mq_freeze_queue_wait() and because that call may be
> +		 * waiting for pending I/O to finish.
> +		 */
>  		blk_mq_run_hw_queues(sdev->request_queue, true);
>  		scsi_rescan_device(dev);
>  	}
> -	mutex_unlock(&sdev->state_mutex);
>  
>  	return ret == 0 ? count : -EINVAL;
>  }
> 

Reviewed-by: Lee Duncan <lduncan@suse.com>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] scsi: Fix hang when device state is set via sysfs
  2021-11-05 22:10 ` [PATCH 2/2] scsi: Fix hang when device state is set via sysfs Mike Christie
  2021-11-06 18:17   ` Lee Duncan
@ 2021-11-08  1:50   ` Wu Bo
  2021-11-19 13:35   ` James Bottomley
  2 siblings, 0 replies; 11+ messages in thread
From: Wu Bo @ 2021-11-08  1:50 UTC (permalink / raw)
  To: Mike Christie, lduncan, cleech, martin.petersen, linux-scsi, jejb
  Cc: Bart Van Assche, lijinlin

On 2021/11/6 6:10, Mike Christie wrote:
> This fixes a regression added with:
> 
> commit f0f82e2476f6 ("scsi: core: Fix capacity set to zero after
> offlinining device")
> 
> The problem is that after iSCSI recovery, iscsid will call into the kernel
> to set the dev's state to running, and with that patch we now call
> scsi_rescan_device with the state_mutex held. If the scsi error handler
> thread is just starting to test the device in scsi_send_eh_cmnd then it's
> going to try to grab the state_mutex.
> 
> We are then stuck, because when scsi_rescan_device tries to send its IO
> scsi_queue_rq calls -> scsi_host_queue_ready -> scsi_host_in_recovery
> which will return true (the host state is still in recovery) and IO will
> just be requeued. scsi_send_eh_cmnd will then never be able to grab the
> state_mutex to finish error handling.
> 
> To prevent the deadlock this moves the rescan related code to after we
> drop the state_mutex.
> 
> This also adds a check for if we are already in the running state. This
> prevents extra scans and helps the iscsid case where if the transport
> class has already onlined the device during it's recovery process then we
> don't need userspace to do it again plus possibly block that daemon.
> 
> Fixes: f0f82e2476f6 ("scsi: core: Fix capacity set to zero after
> offlinining device")
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: lijinlin <lijinlin3@huawei.com>
> Cc: Wu Bo <wubo40@huawei.com>
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>   drivers/scsi/scsi_sysfs.c | 30 +++++++++++++++++++-----------
>   1 file changed, 19 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index a35841b34bfd..53e23a7bc0d3 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -797,6 +797,7 @@ store_state_field(struct device *dev, struct device_attribute *attr,
>   	int i, ret;
>   	struct scsi_device *sdev = to_scsi_device(dev);
>   	enum scsi_device_state state = 0;
> +	bool rescan_dev = false;
>   
>   	for (i = 0; i < ARRAY_SIZE(sdev_states); i++) {
>   		const int len = strlen(sdev_states[i].name);
> @@ -815,20 +816,27 @@ store_state_field(struct device *dev, struct device_attribute *attr,
>   	}
>   
>   	mutex_lock(&sdev->state_mutex);
> -	ret = scsi_device_set_state(sdev, state);
> -	/*
> -	 * If the device state changes to SDEV_RUNNING, we need to
> -	 * run the queue to avoid I/O hang, and rescan the device
> -	 * to revalidate it. Running the queue first is necessary
> -	 * because another thread may be waiting inside
> -	 * blk_mq_freeze_queue_wait() and because that call may be
> -	 * waiting for pending I/O to finish.
> -	 */
> -	if (ret == 0 && state == SDEV_RUNNING) {
> +	if (sdev->sdev_state == SDEV_RUNNING && state == SDEV_RUNNING) {
> +		ret = count;
> +	} else {
> +		ret = scsi_device_set_state(sdev, state);
> +		if (ret == 0 && state == SDEV_RUNNING)
> +			rescan_dev = true;
> +	}
> +	mutex_unlock(&sdev->state_mutex);
> +
> +	if (rescan_dev) {
> +		/*
> +		 * If the device state changes to SDEV_RUNNING, we need to
> +		 * run the queue to avoid I/O hang, and rescan the device
> +		 * to revalidate it. Running the queue first is necessary
> +		 * because another thread may be waiting inside
> +		 * blk_mq_freeze_queue_wait() and because that call may be
> +		 * waiting for pending I/O to finish.
> +		 */
>   		blk_mq_run_hw_queues(sdev->request_queue, true);
>   		scsi_rescan_device(dev);
>   	}
> -	mutex_unlock(&sdev->state_mutex);
>   
>   	return ret == 0 ? count : -EINVAL;
>   }
> 

Reviewed-by: Wu Bo <wubo40@huawei.com>

-- 
Wu Bo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH V2 0/2] scsi: fix hang when device state is set via sysfs
  2021-11-05 22:10 [PATCH V2 0/2] scsi: fix hang when device state is set via sysfs Mike Christie
  2021-11-05 22:10 ` [PATCH 1/2] iscsi: Unblock session then wake up error handler Mike Christie
  2021-11-05 22:10 ` [PATCH 2/2] scsi: Fix hang when device state is set via sysfs Mike Christie
@ 2021-11-09  4:09 ` Martin K. Petersen
  2 siblings, 0 replies; 11+ messages in thread
From: Martin K. Petersen @ 2021-11-09  4:09 UTC (permalink / raw)
  To: Mike Christie; +Cc: lduncan, cleech, martin.petersen, linux-scsi, jejb

Mike,

> The following patches fix an issue where during iscsi recovery we deadlock
> due to iscsid calling into sysfs to set the device state to running, but
> the scsi error handler is also just finishing up.

Applied to 5.16/scsi-staging, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] iscsi: Unblock session then wake up error handler
  2021-11-05 22:10 ` [PATCH 1/2] iscsi: Unblock session then wake up error handler Mike Christie
  2021-11-06 17:44   ` Lee Duncan
@ 2021-11-19  4:16   ` Martin K. Petersen
  1 sibling, 0 replies; 11+ messages in thread
From: Martin K. Petersen @ 2021-11-19  4:16 UTC (permalink / raw)
  To: cleech, Mike Christie, jejb, lduncan, linux-scsi; +Cc: Martin K . Petersen

On Fri, 5 Nov 2021 17:10:47 -0500, Mike Christie wrote:

> We can race where iscsi_session_recovery_timedout has woken up the error
> handler thread and it's now setting the devices to offline, and
> session_recovery_timedout's call to scsi_target_unblock is also trying to
> set the device's state to transport-offline. We can then get a mix of
> states.
> 
> For the case where we can't relogin we want the devices to be in
> transport-offline so when we have repaired the connection
> __iscsi_unblock_session can set the state back to running. So this patch
> has us set the device state then call into libiscsi to wake up the error
> handler.
> 
> [...]

Applied to 5.16/scsi-fixes, thanks!

[1/2] iscsi: Unblock session then wake up error handler
      https://git.kernel.org/mkp/scsi/c/a0c2f8b6709a
[2/2] scsi: Fix hang when device state is set via sysfs
      https://git.kernel.org/mkp/scsi/c/4edd8cd4e86d

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] scsi: Fix hang when device state is set via sysfs
  2021-11-05 22:10 ` [PATCH 2/2] scsi: Fix hang when device state is set via sysfs Mike Christie
  2021-11-06 18:17   ` Lee Duncan
  2021-11-08  1:50   ` Wu Bo
@ 2021-11-19 13:35   ` James Bottomley
  2021-11-19 17:13     ` Mike Christie
  2 siblings, 1 reply; 11+ messages in thread
From: James Bottomley @ 2021-11-19 13:35 UTC (permalink / raw)
  To: Mike Christie, lduncan, cleech, martin.petersen, linux-scsi
  Cc: Bart Van Assche, lijinlin, Wu Bo

On Fri, 2021-11-05 at 17:10 -0500, Mike Christie wrote:
> This fixes a regression added with:
> 
> commit f0f82e2476f6 ("scsi: core: Fix capacity set to zero after
> offlinining device")
> 
> The problem is that after iSCSI recovery, iscsid will call into the
> kernel
> to set the dev's state to running, and with that patch we now call
> scsi_rescan_device with the state_mutex held. If the scsi error
> handler
> thread is just starting to test the device in scsi_send_eh_cmnd then
> it's
> going to try to grab the state_mutex.
> 
> We are then stuck, because when scsi_rescan_device tries to send its
> IO
> scsi_queue_rq calls -> scsi_host_queue_ready -> scsi_host_in_recovery
> which will return true (the host state is still in recovery) and IO
> will
> just be requeued. scsi_send_eh_cmnd will then never be able to grab
> the
> state_mutex to finish error handling.
> 
> To prevent the deadlock this moves the rescan related code to after
> we
> drop the state_mutex.
> 
> This also adds a check for if we are already in the running state.
> This
> prevents extra scans and helps the iscsid case where if the transport
> class has already onlined the device during it's recovery process
> then we
> don't need userspace to do it again plus possibly block that daemon.
> 
> Fixes: f0f82e2476f6 ("scsi: core: Fix capacity set to zero after
> offlinining device")
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: lijinlin <lijinlin3@huawei.com>
> Cc: Wu Bo <wubo40@huawei.com>
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/scsi/scsi_sysfs.c | 30 +++++++++++++++++++-----------
>  1 file changed, 19 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index a35841b34bfd..53e23a7bc0d3 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -797,6 +797,7 @@ store_state_field(struct device *dev, struct
> device_attribute *attr,
>  	int i, ret;
>  	struct scsi_device *sdev = to_scsi_device(dev);
>  	enum scsi_device_state state = 0;
> +	bool rescan_dev = false;
>  
>  	for (i = 0; i < ARRAY_SIZE(sdev_states); i++) {
>  		const int len = strlen(sdev_states[i].name);
> @@ -815,20 +816,27 @@ store_state_field(struct device *dev, struct
> device_attribute *attr,
>  	}
>  
>  	mutex_lock(&sdev->state_mutex);
> -	ret = scsi_device_set_state(sdev, state);
> -	/*
> -	 * If the device state changes to SDEV_RUNNING, we need to
> -	 * run the queue to avoid I/O hang, and rescan the device
> -	 * to revalidate it. Running the queue first is necessary
> -	 * because another thread may be waiting inside
> -	 * blk_mq_freeze_queue_wait() and because that call may be
> -	 * waiting for pending I/O to finish.
> -	 */
> -	if (ret == 0 && state == SDEV_RUNNING) {
> +	if (sdev->sdev_state == SDEV_RUNNING && state == SDEV_RUNNING)
> {
> +		ret = count;

This looks wrong because of this

[...]
>  	return ret == 0 ? count : -EINVAL;

Don't we now return EINVAL on idempotent set state running because the
count is always non-zero?

I think the first statement should be 'ret = 0;' instead to cause
idempotent state setting to succeed as a nop.

James



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] scsi: Fix hang when device state is set via sysfs
  2021-11-19 13:35   ` James Bottomley
@ 2021-11-19 17:13     ` Mike Christie
  2021-11-19 18:19       ` James Bottomley
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Christie @ 2021-11-19 17:13 UTC (permalink / raw)
  To: jejb, lduncan, cleech, martin.petersen, linux-scsi
  Cc: Bart Van Assche, lijinlin, Wu Bo

On 11/19/21 7:35 AM, James Bottomley wrote:
> On Fri, 2021-11-05 at 17:10 -0500, Mike Christie wrote:
>> This fixes a regression added with:
>>
>> commit f0f82e2476f6 ("scsi: core: Fix capacity set to zero after
>> offlinining device")
>>
>> The problem is that after iSCSI recovery, iscsid will call into the
>> kernel
>> to set the dev's state to running, and with that patch we now call
>> scsi_rescan_device with the state_mutex held. If the scsi error
>> handler
>> thread is just starting to test the device in scsi_send_eh_cmnd then
>> it's
>> going to try to grab the state_mutex.
>>
>> We are then stuck, because when scsi_rescan_device tries to send its
>> IO
>> scsi_queue_rq calls -> scsi_host_queue_ready -> scsi_host_in_recovery
>> which will return true (the host state is still in recovery) and IO
>> will
>> just be requeued. scsi_send_eh_cmnd will then never be able to grab
>> the
>> state_mutex to finish error handling.
>>
>> To prevent the deadlock this moves the rescan related code to after
>> we
>> drop the state_mutex.
>>
>> This also adds a check for if we are already in the running state.
>> This
>> prevents extra scans and helps the iscsid case where if the transport
>> class has already onlined the device during it's recovery process
>> then we
>> don't need userspace to do it again plus possibly block that daemon.
>>
>> Fixes: f0f82e2476f6 ("scsi: core: Fix capacity set to zero after
>> offlinining device")
>> Cc: Bart Van Assche <bvanassche@acm.org>
>> Cc: lijinlin <lijinlin3@huawei.com>
>> Cc: Wu Bo <wubo40@huawei.com>
>> Signed-off-by: Mike Christie <michael.christie@oracle.com>
>> ---
>>  drivers/scsi/scsi_sysfs.c | 30 +++++++++++++++++++-----------
>>  1 file changed, 19 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
>> index a35841b34bfd..53e23a7bc0d3 100644
>> --- a/drivers/scsi/scsi_sysfs.c
>> +++ b/drivers/scsi/scsi_sysfs.c
>> @@ -797,6 +797,7 @@ store_state_field(struct device *dev, struct
>> device_attribute *attr,
>>  	int i, ret;
>>  	struct scsi_device *sdev = to_scsi_device(dev);
>>  	enum scsi_device_state state = 0;
>> +	bool rescan_dev = false;
>>  
>>  	for (i = 0; i < ARRAY_SIZE(sdev_states); i++) {
>>  		const int len = strlen(sdev_states[i].name);
>> @@ -815,20 +816,27 @@ store_state_field(struct device *dev, struct
>> device_attribute *attr,
>>  	}
>>  
>>  	mutex_lock(&sdev->state_mutex);
>> -	ret = scsi_device_set_state(sdev, state);
>> -	/*
>> -	 * If the device state changes to SDEV_RUNNING, we need to
>> -	 * run the queue to avoid I/O hang, and rescan the device
>> -	 * to revalidate it. Running the queue first is necessary
>> -	 * because another thread may be waiting inside
>> -	 * blk_mq_freeze_queue_wait() and because that call may be
>> -	 * waiting for pending I/O to finish.
>> -	 */
>> -	if (ret == 0 && state == SDEV_RUNNING) {
>> +	if (sdev->sdev_state == SDEV_RUNNING && state == SDEV_RUNNING)
>> {
>> +		ret = count;
> 
> This looks wrong because of this
> 
> [...]
>>  	return ret == 0 ? count : -EINVAL;
> 
> Don't we now return EINVAL on idempotent set state running because the
> count is always non-zero?
> 
> I think the first statement should be 'ret = 0;' instead to cause
> idempotent state setting to succeed as a nop.
> 

You're right.

I'll resend this patchset with James's comment handled.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] scsi: Fix hang when device state is set via sysfs
  2021-11-19 17:13     ` Mike Christie
@ 2021-11-19 18:19       ` James Bottomley
  0 siblings, 0 replies; 11+ messages in thread
From: James Bottomley @ 2021-11-19 18:19 UTC (permalink / raw)
  To: Mike Christie, lduncan, cleech, martin.petersen, linux-scsi
  Cc: Bart Van Assche, lijinlin, Wu Bo

On Fri, 2021-11-19 at 11:13 -0600, Mike Christie wrote:
> On 11/19/21 7:35 AM, James Bottomley wrote:
> > On Fri, 2021-11-05 at 17:10 -0500, Mike Christie wrote:
> > > This fixes a regression added with:
> > > 
> > > commit f0f82e2476f6 ("scsi: core: Fix capacity set to zero after
> > > offlinining device")
> > > 
> > > The problem is that after iSCSI recovery, iscsid will call into
> > > the kernel to set the dev's state to running, and with that patch
> > > we now call scsi_rescan_device with the state_mutex held. If the
> > > scsi error handler thread is just starting to test the device in
> > > scsi_send_eh_cmnd then it's going to try to grab the state_mutex.
> > > 
> > > We are then stuck, because when scsi_rescan_device tries to send
> > > its IO scsi_queue_rq calls -> scsi_host_queue_ready ->
> > > scsi_host_in_recovery which will return true (the host state is
> > > still in recovery) and IO will just be requeued.
> > > scsi_send_eh_cmnd will then never be able to grab the state_mutex
> > > to finish error handling.
> > > 
> > > To prevent the deadlock this moves the rescan related code to
> > > after we drop the state_mutex.
> > > 
> > > This also adds a check for if we are already in the running
> > > state. This prevents extra scans and helps the iscsid case where
> > > if the transport class has already onlined the device during it's
> > > recovery process then we don't need userspace to do it again plus
> > > possibly block that daemon.
> > > 
> > > Fixes: f0f82e2476f6 ("scsi: core: Fix capacity set to zero after
> > > offlinining device")
> > > Cc: Bart Van Assche <bvanassche@acm.org>
> > > Cc: lijinlin <lijinlin3@huawei.com>
> > > Cc: Wu Bo <wubo40@huawei.com>
> > > Signed-off-by: Mike Christie <michael.christie@oracle.com>
> > > ---
> > >  drivers/scsi/scsi_sysfs.c | 30 +++++++++++++++++++-----------
> > >  1 file changed, 19 insertions(+), 11 deletions(-)
> > > 
> > > diff --git a/drivers/scsi/scsi_sysfs.c
> > > b/drivers/scsi/scsi_sysfs.c
> > > index a35841b34bfd..53e23a7bc0d3 100644
> > > --- a/drivers/scsi/scsi_sysfs.c
> > > +++ b/drivers/scsi/scsi_sysfs.c
> > > @@ -797,6 +797,7 @@ store_state_field(struct device *dev, struct
> > > device_attribute *attr,
> > >  	int i, ret;
> > >  	struct scsi_device *sdev = to_scsi_device(dev);
> > >  	enum scsi_device_state state = 0;
> > > +	bool rescan_dev = false;
> > >  
> > >  	for (i = 0; i < ARRAY_SIZE(sdev_states); i++) {
> > >  		const int len = strlen(sdev_states[i].name);
> > > @@ -815,20 +816,27 @@ store_state_field(struct device *dev,
> > > struct
> > > device_attribute *attr,
> > >  	}
> > >  
> > >  	mutex_lock(&sdev->state_mutex);
> > > -	ret = scsi_device_set_state(sdev, state);
> > > -	/*
> > > -	 * If the device state changes to SDEV_RUNNING, we need to
> > > -	 * run the queue to avoid I/O hang, and rescan the device
> > > -	 * to revalidate it. Running the queue first is necessary
> > > -	 * because another thread may be waiting inside
> > > -	 * blk_mq_freeze_queue_wait() and because that call may be
> > > -	 * waiting for pending I/O to finish.
> > > -	 */
> > > -	if (ret == 0 && state == SDEV_RUNNING) {
> > > +	if (sdev->sdev_state == SDEV_RUNNING && state == SDEV_RUNNING)
> > > {
> > > +		ret = count;
> > 
> > This looks wrong because of this
> > 
> > [...]
> > >  	return ret == 0 ? count : -EINVAL;
> > 
> > Don't we now return EINVAL on idempotent set state running because
> > the count is always non-zero?
> > 
> > I think the first statement should be 'ret = 0;' instead to cause
> > idempotent state setting to succeed as a nop.
> > 
> 
> You're right.
> 
> I'll resend this patchset with James's comment handled.

Send an incremental patch because this is already in the queue for
Linus and he'd go ballistic if we had to rebase to remove it.

James



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-11-19 18:19 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-05 22:10 [PATCH V2 0/2] scsi: fix hang when device state is set via sysfs Mike Christie
2021-11-05 22:10 ` [PATCH 1/2] iscsi: Unblock session then wake up error handler Mike Christie
2021-11-06 17:44   ` Lee Duncan
2021-11-19  4:16   ` Martin K. Petersen
2021-11-05 22:10 ` [PATCH 2/2] scsi: Fix hang when device state is set via sysfs Mike Christie
2021-11-06 18:17   ` Lee Duncan
2021-11-08  1:50   ` Wu Bo
2021-11-19 13:35   ` James Bottomley
2021-11-19 17:13     ` Mike Christie
2021-11-19 18:19       ` James Bottomley
2021-11-09  4:09 ` [PATCH V2 0/2] scsi: fix " Martin K. Petersen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).