All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] scsi_dh_alua: Add return value and check for alua_rtpg_queue() to avoid DM devices I/Os hang
@ 2016-11-24 10:25 tang.junhui
  2016-12-01 18:12 ` Bart Van Assche
  0 siblings, 1 reply; 4+ messages in thread
From: tang.junhui @ 2016-11-24 10:25 UTC (permalink / raw)
  To: hare, matthew; +Cc: linux-scsi, zhang.kai16, tang.junhui

From: "tang.junhui" <tang.junhui@zte.com.cn>

Activate_complete fn() must be called in alua_activate() if
alua_rtpg_queue() failed, otherwise, it would cause I/Os hang in DM
devices. So this patch add return value and check for alua_rtpg_queue().

Signed-off-by: tang.junhui <tang.junhui@zte.com.cn>
---
 drivers/scsi/device_handler/scsi_dh_alua.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index 7bb2068..62075c7 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -113,7 +113,7 @@ struct alua_queue_data {
 #define ALUA_POLICY_SWITCH_ALL		1
 
 static void alua_rtpg_work(struct work_struct *work);
-static void alua_rtpg_queue(struct alua_port_group *pg,
+static int alua_rtpg_queue(struct alua_port_group *pg,
 			    struct scsi_device *sdev,
 			    struct alua_queue_data *qdata, bool force);
 static void alua_check(struct scsi_device *sdev, bool force);
@@ -862,7 +862,7 @@ static void alua_rtpg_work(struct work_struct *work)
 	kref_put(&pg->kref, release_port_group);
 }
 
-static void alua_rtpg_queue(struct alua_port_group *pg,
+static int alua_rtpg_queue(struct alua_port_group *pg,
 			    struct scsi_device *sdev,
 			    struct alua_queue_data *qdata, bool force)
 {
@@ -871,7 +871,7 @@ static void alua_rtpg_queue(struct alua_port_group *pg,
 	struct workqueue_struct *alua_wq = kaluad_wq;
 
 	if (!pg)
-		return;
+		return SCSI_DH_IO;
 
 	spin_lock_irqsave(&pg->lock, flags);
 	if (qdata) {
@@ -906,7 +906,10 @@ static void alua_rtpg_queue(struct alua_port_group *pg,
 		if (sdev)
 			scsi_device_put(sdev);
 		kref_put(&pg->kref, release_port_group);
+		return SCSI_DH_IO;
 	}
+
+	return SCSI_DH_OK;
 }
 
 /*
@@ -1007,11 +1010,12 @@ static int alua_activate(struct scsi_device *sdev,
 		mutex_unlock(&h->init_mutex);
 		goto out;
 	}
-	fn = NULL;
 	rcu_read_unlock();
 	mutex_unlock(&h->init_mutex);
 
-	alua_rtpg_queue(pg, sdev, qdata, true);
+	err = alua_rtpg_queue(pg, sdev, qdata, true);
+	if (!err)
+		fn = NULL;
 	kref_put(&pg->kref, release_port_group);
 out:
 	if (fn)
-- 
2.8.1.windows.1



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] scsi_dh_alua: Add return value and check for alua_rtpg_queue() to avoid DM devices I/Os hang
  2016-11-24 10:25 [PATCH] scsi_dh_alua: Add return value and check for alua_rtpg_queue() to avoid DM devices I/Os hang tang.junhui
@ 2016-12-01 18:12 ` Bart Van Assche
       [not found]   ` <OFADBB4E3F.FF85F07C-ON4825807D.0009F390-4825807D.0009FA01@zte.com.cn>
  0 siblings, 1 reply; 4+ messages in thread
From: Bart Van Assche @ 2016-12-01 18:12 UTC (permalink / raw)
  To: tang.junhui, hare, matthew; +Cc: linux-scsi, zhang.kai16

On 11/24/2016 02:25 AM, tang.junhui@zte.com.cn wrote:
> Activate_complete fn() must be called in alua_activate() if
> alua_rtpg_queue() failed, otherwise, it would cause I/Os hang in DM
> devices. So this patch add return value and check for alua_rtpg_queue().

Hello Tang,

Please drop this patch. I think the alua_rtpg_queue() caller should 
ensure that pg != NULL.

Bart.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] scsi_dh_alua: Add return value and check for alua_rtpg_queue() to avoid DM devices I/Os hang
       [not found]   ` <OFADBB4E3F.FF85F07C-ON4825807D.0009F390-4825807D.0009FA01@zte.com.cn>
@ 2016-12-02  2:58     ` Bart Van Assche
       [not found]       ` <OF78D5B916.DC66A918-ON4825807D.0010D67E-4825807D.001269A1@zte.com.cn>
  0 siblings, 1 reply; 4+ messages in thread
From: Bart Van Assche @ 2016-12-02  2:58 UTC (permalink / raw)
  To: tang.junhui; +Cc: hare, linux-scsi

On 12/01/16 17:49, tang.junhui@zte.com.cn wrote:
>> Bart wrote:
>> Please drop this patch. I think the alua_rtpg_queue() caller should
>> ensure that pg != NULL.
>
> Failure may also be occurred in queue_delayed_work(),
> since it would cause serious problems,
> so I think we are worth checking for it.

Hello Tang,

Have you been able to trigger the condition explained in the patch 
description or is this only something you think that can happen based on 
your interpretation of the source code? My comments about the checks 
that have been added are:
* All alua_rtpg_queue() callers pass a non-NULL pointer as first 
argument which means that the return statement under "if (!pg)" in 
alua_rtpg_queue() is never executed.
* Even if queue_delayed_work() returns 0 the qdata work passed to 
alua_rtpg_queue() is still added to the pg->rtpg_list and hence will be 
executed once the delayed work is executed. So I think that the 
condition you described (fn() not called) cannot happen. From 
alua_rtpg_work():

	list_splice_init(&pg->rtpg_list, &qdata_list);
         [ ... ]
	list_for_each_entry_safe(qdata, tmp, &qdata_list, entry) {
		list_del(&qdata->entry);
		if (qdata->callback_fn)
			qdata->callback_fn(qdata->callback_data, err);
		kfree(qdata);
	}

Bart.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] scsi_dh_alua: Add return value and check for alua_rtpg_queue() to avoid DM devices I/Os hang
       [not found]       ` <OF78D5B916.DC66A918-ON4825807D.0010D67E-4825807D.001269A1@zte.com.cn>
@ 2016-12-02  3:40         ` Bart Van Assche
  0 siblings, 0 replies; 4+ messages in thread
From: Bart Van Assche @ 2016-12-02  3:40 UTC (permalink / raw)
  To: tang.junhui; +Cc: hare, linux-scsi

On 12/01/16 19:21, tang.junhui@zte.com.cn wrote:
> Hello Bart,
>> * Even if queue_delayed_work() returns 0 the qdata work passed to
>> alua_rtpg_queue() is still added to the pg->rtpg_list and hence will be
>> executed once the delayed work is executed. So I think that the
>> condition you described (fn() not called) cannot happen.
>
> I find it by reading code.
>
> How did you think that it will be
> executed once the delayed work is executed?
> It is not re-queued to the pg->rtpg_work again.
> It is triggered by pgpath->activate_path.work in dm-mod,
> and maybe it would never run anymore.

Hello Tang,

Are you aware that if queue_delayed_work() returns 0 that that means 
that a work item has already been queued (pg->rtpg_work in this case)? 
 From kernel/workqueue.c:

/**
  * queue_delayed_work_on - queue work on specific CPU after delay
  * @cpu: CPU number to execute work on
  * @wq: workqueue to use
  * @dwork: work to queue
  * @delay: number of jiffies to wait before queueing
  *
  * Return: %false if @work was already on a queue, %true otherwise.  If
  * @delay is zero and @dwork is idle, it will be scheduled for immediate
  * execution.
  */

Bart.


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-12-02  3:40 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-24 10:25 [PATCH] scsi_dh_alua: Add return value and check for alua_rtpg_queue() to avoid DM devices I/Os hang tang.junhui
2016-12-01 18:12 ` Bart Van Assche
     [not found]   ` <OFADBB4E3F.FF85F07C-ON4825807D.0009F390-4825807D.0009FA01@zte.com.cn>
2016-12-02  2:58     ` Bart Van Assche
     [not found]       ` <OF78D5B916.DC66A918-ON4825807D.0010D67E-4825807D.001269A1@zte.com.cn>
2016-12-02  3:40         ` Bart Van Assche

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.