* [PATCH V2] scsi: storvsc: Allow only one remove lun work item to be issued per lun
@ 2017-10-17 17:35 Cathy Avery
  2017-10-19 15:35 ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Cathy Avery @ 2017-10-17 17:35 UTC (permalink / raw)
  To: kys, hch, haiyangz, jejb, martin.petersen, dan.carpenter
  Cc: devel, linux-kernel, linux-scsi

When running multipath on a VM, if all available paths go down, the
driver can schedule a large number of storvsc_remove_lun work items
for the same lun. storvsc typically responds to the failing paths by
taking host->scan_mutex and issuing a TUR per lun. If there has been
heavy IO to the failed device, all the failed IOs are returned from
the host, and a remove lun work item is issued per failed IO. If the
outstanding TURs do not complete in a timely manner, the scan_mutex
is released too late or never. Consequently the many remove lun work
items cannot complete, because scsi_remove_device also tries to take
host->scan_mutex. This drags the VM down, sometimes completely.

This patch only allows one remove lun to be issued to a particular
lun while it is an instantiated member of the scsi stack.

Changes since v1:

Use single threaded workqueue to serialize work in
storvsc_handle_error [Christoph Hellwig]

Signed-off-by: Cathy Avery <cavery@redhat.com>
---
 drivers/scsi/storvsc_drv.c | 27 ++++++++++++++++++++++-----
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 5e7200f..6febcdb 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -486,6 +486,8 @@ struct hv_host_device {
 	unsigned int port;
 	unsigned char path;
 	unsigned char target;
+	struct workqueue_struct *handle_error_wq;
+	char work_q_name[20];
 };
 
 struct storvsc_scan_work {
@@ -922,6 +924,7 @@ static void storvsc_handle_error(struct vmscsi_request *vm_srb,
 {
 	struct storvsc_scan_work *wrk;
 	void (*process_err_fn)(struct work_struct *work);
+	struct hv_host_device *host_dev = shost_priv(host);
 	bool do_work = false;
 
 	switch (SRB_STATUS(vm_srb->srb_status)) {
@@ -988,7 +991,7 @@ static void storvsc_handle_error(struct vmscsi_request *vm_srb,
 	wrk->lun = vm_srb->lun;
 	wrk->tgt_id = vm_srb->target_id;
 	INIT_WORK(&wrk->work, process_err_fn);
-	schedule_work(&wrk->work);
+	queue_work(host_dev->handle_error_wq, &wrk->work);
 }
 
 
@@ -1803,10 +1806,19 @@ static int storvsc_probe(struct hv_device *device,
 	if (stor_device->num_sc != 0)
 		host->nr_hw_queues = stor_device->num_sc + 1;
 
+	/*
+	 * Set the error handler work queue.
+	 */
+	snprintf(host_dev->work_q_name, sizeof(host_dev->work_q_name),
+		 "storvsc_error_wq_%d", host->host_no);
+	host_dev->handle_error_wq =
+			create_singlethread_workqueue(host_dev->work_q_name);
+	if (!host_dev->handle_error_wq)
+		goto err_out2;
 	/* Register the HBA and start the scsi bus scan */
 	ret = scsi_add_host(host, &device->device);
 	if (ret != 0)
-		goto err_out2;
+		goto err_out3;
 
 	if (!dev_is_ide) {
 		scsi_scan_host(host);
@@ -1815,7 +1827,7 @@ static int storvsc_probe(struct hv_device *device,
 			 device->dev_instance.b[4]);
 		ret = scsi_add_device(host, 0, target, 0);
 		if (ret)
-			goto err_out3;
+			goto err_out4;
 	}
 #if IS_ENABLED(CONFIG_SCSI_FC_ATTRS)
 	if (host->transportt == fc_transport_template) {
@@ -1827,14 +1839,17 @@ static int storvsc_probe(struct hv_device *device,
 		fc_host_port_name(host) = stor_device->port_name;
 		stor_device->rport = fc_remote_port_add(host, 0, &ids);
 		if (!stor_device->rport)
-			goto err_out3;
+			goto err_out4;
 	}
 #endif
 	return 0;
 
-err_out3:
+err_out4:
 	scsi_remove_host(host);
 
+err_out3:
+	destroy_workqueue(host_dev->handle_error_wq);
+
 err_out2:
 	/*
 	 * Once we have connected with the host, we would need to
@@ -1858,6 +1873,7 @@ static int storvsc_remove(struct hv_device *dev)
 {
 	struct storvsc_device *stor_device = hv_get_drvdata(dev);
 	struct Scsi_Host *host = stor_device->host;
+	struct hv_host_device *host_dev = shost_priv(host);
 
 #if IS_ENABLED(CONFIG_SCSI_FC_ATTRS)
 	if (host->transportt == fc_transport_template) {
@@ -1865,6 +1881,7 @@ static int storvsc_remove(struct hv_device *dev)
 		fc_remove_host(host);
 	}
 #endif
+	destroy_workqueue(host_dev->handle_error_wq);
 	scsi_remove_host(host);
 	storvsc_dev_remove(dev);
 	scsi_host_put(host);
-- 
2.5.0
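
The serialization described in the commit message can be sketched as a small user-space model. This is not driver code: `model_host`, `model_wq`, and the `model_*` functions are hypothetical stand-ins, and the sketch assumes the remove handler looks the device up and does nothing once it is gone. It only illustrates that when remove items for one lun run strictly one at a time, the first performs the removal and the rest return without further work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_LUNS 8
#define MODEL_MAX_WORK 64

/* Hypothetical stand-in for the SCSI layer's view of which luns exist. */
struct model_host {
	bool lun_present[MODEL_MAX_LUNS];
	unsigned int removals;	/* number of real device removals performed */
};

/* Hypothetical FIFO standing in for a single-threaded workqueue. */
struct model_wq {
	unsigned int lun[MODEL_MAX_WORK];
	size_t head;
	size_t tail;
};

/* Queue one remove-lun item; items beyond the model's capacity are dropped. */
static bool model_queue_remove(struct model_wq *wq, unsigned int lun)
{
	if (wq->tail == MODEL_MAX_WORK || lun >= MODEL_MAX_LUNS)
		return false;
	wq->lun[wq->tail++] = lun;
	return true;
}

/* Run queued items strictly one at a time, in submission order. */
static void model_drain(struct model_wq *wq, struct model_host *host)
{
	while (wq->head < wq->tail) {
		unsigned int lun = wq->lun[wq->head++];

		/* Only the first item for a lun still finds a device to remove. */
		if (host->lun_present[lun]) {
			host->lun_present[lun] = false;
			host->removals++;
		}
	}
}

/* Queue one remove item per failed IO, drain, and report real removals. */
static unsigned int model_fail_ios(unsigned int lun, unsigned int failed_ios)
{
	struct model_host host = { .removals = 0 };
	struct model_wq wq = { .head = 0 };
	unsigned int i;

	assert(lun < MODEL_MAX_LUNS);
	host.lun_present[lun] = true;

	for (i = 0; i < failed_ios; i++)
		model_queue_remove(&wq, lun);

	model_drain(&wq, &host);
	return host.removals;
}
```

Calling `model_fail_ios(2, 50)` queues fifty items for lun 2 and reports a single removal. The model says nothing about scan_mutex contention itself; it only shows why running the items in order bounds the real work to one removal per lun.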

* Re: [PATCH V2] scsi: storvsc: Allow only one remove lun work item to be issued per lun
  2017-10-17 17:35 [PATCH V2] scsi: storvsc: Allow only one remove lun work item to be issued per lun Cathy Avery
@ 2017-10-19 15:35 ` Christoph Hellwig
  2017-10-19 22:06   ` Long Li
  2017-10-21 15:44   ` Tejun Heo
  0 siblings, 2 replies; 7+ messages in thread
From: Christoph Hellwig @ 2017-10-19 15:35 UTC (permalink / raw)
  To: Cathy Avery
  Cc: kys, hch, haiyangz, jejb, martin.petersen, dan.carpenter, devel,
	linux-kernel, linux-scsi, Tejun Heo

On Tue, Oct 17, 2017 at 01:35:21PM -0400, Cathy Avery wrote:
> +	/*
> +	 * Set the error handler work queue.
> +	 */
> +	snprintf(host_dev->work_q_name, sizeof(host_dev->work_q_name),
> +		 "storvsc_error_wq_%d", host->host_no);
> +	host_dev->handle_error_wq =
> +			create_singlethread_workqueue(host_dev->work_q_name);

If you use alloc_ordered_workqueue directly instead of
create_singlethread_workqueue you can pass a format string and don't
need the separate allocation.

But I'm not sure if Tejun is fine with using __WQ_LEGACY directly..

Except for this nit this looks fine to me:

Reviewed-by: Christoph Hellwig <hch@lst.de>
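
The calling convention suggested above can be illustrated with a user-space sketch. `struct model_workqueue`, `model_alloc_ordered_workqueue()`, and `model_setup_error_wq()` are hypothetical stand-ins, not kernel code. They mimic only the shape of the kernel's `alloc_ordered_workqueue(fmt, flags, args...)`, where the allocator formats the name itself, so the caller would no longer need the `work_q_name` buffer or the `snprintf()` call. The 24-byte name field and the `%u` conversion are assumptions of this sketch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* User-space stand-in, NOT the kernel's struct workqueue_struct. */
struct model_workqueue {
	char name[24];		/* illustrative size for this sketch */
	unsigned int flags;
};

/*
 * Stand-in that mirrors only the call shape (fmt, flags, args...):
 * the name is formatted inside the allocator, not by the caller.
 */
static struct model_workqueue *
model_alloc_ordered_workqueue(const char *fmt, unsigned int flags, ...)
{
	struct model_workqueue *wq = calloc(1, sizeof(*wq));
	va_list ap;

	if (!wq)
		return NULL;

	va_start(ap, flags);
	vsnprintf(wq->name, sizeof(wq->name), fmt, ap);
	va_end(ap);

	wq->flags = flags;
	return wq;
}

static void model_destroy_workqueue(struct model_workqueue *wq)
{
	free(wq);
}

/* Hypothetical probe-time helper: one error workqueue per SCSI host. */
static struct model_workqueue *
model_setup_error_wq(unsigned int host_no, unsigned int flags)
{
	return model_alloc_ordered_workqueue("storvsc_error_wq_%u", flags,
					     host_no);
}
```

With this shape, `model_setup_error_wq(3, 0)` yields a queue named `storvsc_error_wq_3` without any name storage in the per-host structure. Which flags the real driver should pass is a separate question.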

* RE: [PATCH V2] scsi: storvsc: Allow only one remove lun work item to be issued per lun
  2017-10-19 15:35 ` Christoph Hellwig
@ 2017-10-19 22:06   ` Long Li
  2017-10-21 15:44   ` Tejun Heo
  1 sibling, 0 replies; 7+ messages in thread
From: Long Li @ 2017-10-19 22:06 UTC (permalink / raw)
  To: Christoph Hellwig, Cathy Avery
  Cc: jejb, linux-scsi, martin.petersen, Haiyang Zhang, linux-kernel,
	Tejun Heo, devel, dan.carpenter

> On Tue, Oct 17, 2017 at 01:35:21PM -0400, Cathy Avery wrote:
> > +	/*
> > +	 * Set the error handler work queue.
> > +	 */
> > +	snprintf(host_dev->work_q_name, sizeof(host_dev->work_q_name),
> > +		 "storvsc_error_wq_%d", host->host_no);
> > +	host_dev->handle_error_wq =
> > +			create_singlethread_workqueue(host_dev->work_q_name);
> 
> If you use alloc_ordered_workqueue directly instead of
> create_singlethread_workqueue you can pass a format string and don't need
> the separate allocation.
> 
> But I'm not sure if Tejun is fine with using __WQ_LEGACY directly..
> 
> Except for this nit this looks fine to me:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>

The storvsc_host_scan work (scheduled from storvsc_on_receive) should
also use this workqueue. We can do that in a separate patch.

Reviewed-by: Long Li <longli@microsoft.com>

* Re: [PATCH V2] scsi: storvsc: Allow only one remove lun work item to be issued per lun
  2017-10-19 15:35 ` Christoph Hellwig
  2017-10-19 22:06   ` Long Li
@ 2017-10-21 15:44   ` Tejun Heo
  2017-10-31 12:24       ` Martin K. Petersen
  1 sibling, 1 reply; 7+ messages in thread
From: Tejun Heo @ 2017-10-21 15:44 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Cathy Avery, kys, haiyangz, jejb, martin.petersen, dan.carpenter,
	devel, linux-kernel, linux-scsi

Hello,

On Thu, Oct 19, 2017 at 08:35:10AM -0700, Christoph Hellwig wrote:
> On Tue, Oct 17, 2017 at 01:35:21PM -0400, Cathy Avery wrote:
> > +	/*
> > +	 * Set the error handler work queue.
> > +	 */
> > +	snprintf(host_dev->work_q_name, sizeof(host_dev->work_q_name),
> > +		 "storvsc_error_wq_%d", host->host_no);
> > +	host_dev->handle_error_wq =
> > +			create_singlethread_workqueue(host_dev->work_q_name);
> 
> If you use alloc_ordered_workqueue directly instead of
> create_singlethread_workqueue you can pass a format string and don't
> need the separate allocation.
> 
> But I'm not sure if Tejun is fine with using __WQ_LEGACY directly..

The only thing that flag does is exempting the workqueue from possible
flush deadlock check as we don't know whether WQ_MEM_RECLAIM on a
legacy workqueue is intentional.  There's no reason to add it when
converting to alloc_ordered_workqueue().  Just decide whether it needs
forward progress guarantee and use WQ_MEM_RECLAIM if so.

Thanks.

-- 
tejun
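
The flag choice described above can be sketched as follows. The `MODEL_*` bit values are illustrative only and are not copied from `<linux/workqueue.h>`, and both functions are hypothetical helpers. The sketch assumes, as in the kernel header, that `create_singlethread_workqueue()` passes `__WQ_LEGACY | WQ_MEM_RECLAIM` on the caller's behalf. A direct conversion drops the legacy marker and makes the reclaim decision explicit.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values for this sketch; not the kernel's definitions. */
#define MODEL_WQ_MEM_RECLAIM	(1u << 0)	/* stands in for WQ_MEM_RECLAIM */
#define MODEL_WQ_LEGACY		(1u << 1)	/* stands in for __WQ_LEGACY */

/* Flags the legacy wrapper is assumed to pass for every caller. */
static unsigned int model_singlethread_flags(void)
{
	return MODEL_WQ_LEGACY | MODEL_WQ_MEM_RECLAIM;
}

/*
 * Flags for a direct ordered-workqueue conversion: never the legacy
 * marker, and the reclaim flag only when the queue must keep making
 * forward progress under memory pressure.
 */
static unsigned int model_converted_flags(bool needs_forward_progress)
{
	return needs_forward_progress ? MODEL_WQ_MEM_RECLAIM : 0;
}
```

Whether the storvsc error queue needs the forward progress guarantee is the driver author's call; the sketch only shows that the decision becomes a single explicit flag.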

* Re: [PATCH V2] scsi: storvsc: Allow only one remove lun work item to be issued per lun
  2017-10-21 15:44   ` Tejun Heo
@ 2017-10-31 12:24       ` Martin K. Petersen
  0 siblings, 0 replies; 7+ messages in thread
From: Martin K. Petersen @ 2017-10-31 12:24 UTC (permalink / raw)
  To: Cathy Avery
  Cc: Christoph Hellwig, Tejun Heo, kys, haiyangz, jejb,
	martin.petersen, dan.carpenter, devel, linux-kernel, linux-scsi


>> If you use alloc_ordered_workqueue directly instead of
>> create_singlethread_workqueue you can pass a format string and don't
>> need the separate allocation.
>> 
>> But I'm not sure if Tejun is fine with using __WQ_LEGACY directly..
>
> The only thing that flag does is exempting the workqueue from possible
> flush deadlock check as we don't know whether WQ_MEM_RECLAIM on a
> legacy workqueue is intentional.  There's no reason to add it when
> converting to alloc_ordered_workqueue().  Just decide whether it needs
> forward progress guarantee and use WQ_MEM_RECLAIM if so.

Cathy?

-- 
Martin K. Petersen	Oracle Linux Engineering

* Re: [PATCH V2] scsi: storvsc: Allow only one remove lun work item to be issued per lun
  2017-10-31 12:24       ` Martin K. Petersen
  (?)
@ 2017-10-31 12:31       ` Cathy Avery
  -1 siblings, 0 replies; 7+ messages in thread
From: Cathy Avery @ 2017-10-31 12:31 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, Tejun Heo, kys, haiyangz, jejb, dan.carpenter,
	devel, linux-kernel, linux-scsi

On 10/31/2017 08:24 AM, Martin K. Petersen wrote:
>>> If you use alloc_ordered_workqueue directly instead of
>>> create_singlethread_workqueue you can pass a format string and don't
>>> need the separate allocation.
>>>
>>> But I'm not sure if Tejun is fine with using __WQ_LEGACY directly..
>> The only thing that flag does is exempting the workqueue from possible
>> flush deadlock check as we don't know whether WQ_MEM_RECLAIM on a
>> legacy workqueue is intentional.  There's no reason to add it when
>> converting to alloc_ordered_workqueue().  Just decide whether it needs
>> forward progress guarantee and use WQ_MEM_RECLAIM if so.
> Cathy?
>

Sorry for the delay. Long was working on a similar problem and we needed 
to add a couple of extra patches. I was thinking of sending all three in 
series but I can send the V3 of this now and follow up with the 
additional patches. Does that make sense?
