* [PATCH] sd: use async_probe cookie to avoid deadlocks
@ 2017-03-21 12:14 Hannes Reinecke
  2017-03-21 13:02 ` Bart Van Assche
  2017-03-21 13:05 ` James Bottomley
  0 siblings, 2 replies; 10+ messages in thread
From: Hannes Reinecke @ 2017-03-21 12:14 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: hck, James Bottomley, Bart van Assche, linux-scsi,
	Hannes Reinecke, Hannes Reinecke

With the current design we're waiting for all async probes to
finish when removing any sd device.
This might lead to a livelock where the 'remove' call is blocking
for any probe calls to finish, and the probe calls are waiting for
a response, which will never be processed as the thread handling
the responses is waiting for the remove call to finish.
Which is completely pointless as we only _really_ care for the
probe on _this_ device to be completed; any other probing can
happily continue for all we care.
So save the async probing cookie in the structure and only wait
if this specific probe is still active.

Signed-off-by: Hannes Reinecke <hare@suse.com>
---
 drivers/scsi/sd.c | 7 ++++---
 drivers/scsi/sd.h | 3 +++
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index fb9b4d2..9f932e4 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -48,7 +48,6 @@
 #include <linux/delay.h>
 #include <linux/mutex.h>
 #include <linux/string_helpers.h>
-#include <linux/async.h>
 #include <linux/slab.h>
 #include <linux/pm_runtime.h>
 #include <linux/pr.h>
@@ -3217,7 +3216,8 @@ static int sd_probe(struct device *dev)
 	dev_set_drvdata(dev, sdkp);
 
 	get_device(&sdkp->dev);	/* prevent release before async_schedule */
-	async_schedule_domain(sd_probe_async, sdkp, &scsi_sd_probe_domain);
+	sdkp->async_probe = async_schedule_domain(sd_probe_async, sdkp,
+						  &scsi_sd_probe_domain);
 
 	return 0;
 
@@ -3256,7 +3256,8 @@ static int sd_remove(struct device *dev)
 	scsi_autopm_get_device(sdkp->device);
 
 	async_synchronize_full_domain(&scsi_sd_pm_domain);
-	async_synchronize_full_domain(&scsi_sd_probe_domain);
+	async_synchronize_cookie_domain(sdkp->async_probe,
+					&scsi_sd_probe_domain);
 	device_del(&sdkp->dev);
 	del_gendisk(sdkp->disk);
 	sd_shutdown(dev);
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 4dac35e..d4b5826 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -1,6 +1,8 @@
 #ifndef _SCSI_DISK_H
 #define _SCSI_DISK_H
 
+#include <linux/async.h>
+
 /*
  * More than enough for everybody ;)  The huge number of majors
  * is a leftover from 16bit dev_t days, we don't really need that
@@ -73,6 +75,7 @@ struct scsi_disk {
 	unsigned int	zones_optimal_nonseq;
 	unsigned int	zones_max_open;
 #endif
+	async_cookie_t	async_probe;
 	atomic_t	openers;
 	sector_t	capacity;	/* size in logical blocks */
 	u32		max_xfer_blocks;
-- 
1.8.5.6
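
A minimal userspace sketch of the difference between the two waits in sd_remove() follows. It assumes the checkpoint semantics given in the kernel-doc for async_synchronize_cookie_domain(), namely that it waits for calls submitted prior to the cookie. All model_* names are hypothetical and exist only for this illustration; this is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define MODEL_MAX_ENTRIES 16
#define MODEL_COOKIE_MAX  UINT64_MAX

struct model_entry {
	uint64_t cookie;	/* handed out in scheduling order */
	bool done;		/* set once the modelled probe has finished */
};

struct model_domain {
	struct model_entry entries[MODEL_MAX_ENTRIES];
	size_t count;
	uint64_t next_cookie;
};

static void model_domain_init(struct model_domain *d)
{
	d->count = 0;
	d->next_cookie = 1;
}

/* Queue one modelled probe and return its cookie. */
static uint64_t model_schedule(struct model_domain *d)
{
	assert(d->count < MODEL_MAX_ENTRIES);
	d->entries[d->count].cookie = d->next_cookie;
	d->entries[d->count].done = false;
	d->count++;
	return d->next_cookie++;
}

/* Mark the probe with the given cookie as finished. */
static void model_complete(struct model_domain *d, uint64_t cookie)
{
	for (size_t i = 0; i < d->count; i++)
		if (d->entries[i].cookie == cookie)
			d->entries[i].done = true;
}

/* Lowest cookie still in flight, or MODEL_COOKIE_MAX if none is. */
static uint64_t model_lowest_in_progress(const struct model_domain *d)
{
	uint64_t lowest = MODEL_COOKIE_MAX;

	for (size_t i = 0; i < d->count; i++)
		if (!d->entries[i].done && d->entries[i].cookie < lowest)
			lowest = d->entries[i].cookie;
	return lowest;
}

/* Models the full-domain wait: blocks while any probe is in flight. */
static bool model_full_sync_blocks(const struct model_domain *d)
{
	return model_lowest_in_progress(d) != MODEL_COOKIE_MAX;
}

/*
 * Models the checkpointed wait: blocks only while a probe that was
 * scheduled before the checkpoint is still in flight.
 */
static bool model_cookie_sync_blocks(const struct model_domain *d,
				     uint64_t checkpoint)
{
	return model_lowest_in_progress(d) < checkpoint;
}
```

With two probes scheduled and only the first one finished, the full wait still blocks on the second probe, while a checkpoint at the first probe's cookie does not. Under this model the checkpoint also still waits for every probe scheduled earlier, and a checkpoint equal to an entry's own cookie does not cover that entry; if the kernel behaves the same way, the cookie value passed to the wait is worth double-checking.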


* Re: [PATCH] sd: use async_probe cookie to avoid deadlocks
  2017-03-21 12:14 [PATCH] sd: use async_probe cookie to avoid deadlocks Hannes Reinecke
@ 2017-03-21 13:02 ` Bart Van Assche
  2017-03-21 13:05 ` James Bottomley
  1 sibling, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2017-03-21 13:02 UTC (permalink / raw)
  To: hare, martin.petersen; +Cc: linux-scsi, james.bottomley, hck, hare

On Tue, 2017-03-21 at 13:14 +0100, Hannes Reinecke wrote:
> With the current design we're waiting for all async probes to
> finish when removing any sd device.
> This might lead to a livelock where the 'remove' call is blocking
> for any probe calls to finish, and the probe calls are waiting for
> a response, which will never be processed as the thread handling
> the responses is waiting for the remove call to finish.
> Which is completely pointless as we only _really_ care for the
> probe on _this_ device to be completed; any other probing can
> happily continue for all we care.
> So save the async probing cookie in the structure and only wait
> if this specific probe is still active.

Nice work! This may even help to reduce system boot time.

Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>


* Re: [PATCH] sd: use async_probe cookie to avoid deadlocks
  2017-03-21 12:14 [PATCH] sd: use async_probe cookie to avoid deadlocks Hannes Reinecke
  2017-03-21 13:02 ` Bart Van Assche
@ 2017-03-21 13:05 ` James Bottomley
  2017-03-21 13:30   ` Bart Van Assche
  2017-03-21 15:25   ` Hannes Reinecke
  1 sibling, 2 replies; 10+ messages in thread
From: James Bottomley @ 2017-03-21 13:05 UTC (permalink / raw)
  To: Hannes Reinecke, Martin K. Petersen
  Cc: hck, Bart van Assche, linux-scsi, Hannes Reinecke

On Tue, 2017-03-21 at 13:14 +0100, Hannes Reinecke wrote:
> With the current design we're waiting for all async probes to
> finish when removing any sd device.
> This might lead to a livelock where the 'remove' call is blocking
> for any probe calls to finish, and the probe calls are waiting for
> a response, which will never be processed as the thread handling
> the responses is waiting for the remove call to finish.
> Which is completely pointless as we only _really_ care for the
> probe on _this_ device to be completed; any other probing can
> happily continue for all we care.
> So save the async probing cookie in the structure and only wait
> if this specific probe is still active.

How does this preserve ordering?  It looks like you have one cookie per
sdkp ... is there some sort of ordering guarantee I'm not seeing?

James


* Re: [PATCH] sd: use async_probe cookie to avoid deadlocks
  2017-03-21 13:05 ` James Bottomley
@ 2017-03-21 13:30   ` Bart Van Assche
  2017-03-21 13:33     ` James Bottomley
  2017-03-21 15:25   ` Hannes Reinecke
  1 sibling, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2017-03-21 13:30 UTC (permalink / raw)
  To: James.Bottomley, hare, martin.petersen; +Cc: linux-scsi, hck, hare

On Tue, 2017-03-21 at 09:05 -0400, James Bottomley wrote:
> How does this preserve ordering?  It looks like you have one cookie per
> sdkp ... is there some sort of ordering guarantee I'm not seeing?

Hello James,

Since the probe order depends on the order in which __async_probe() adds
entries to the "pending" list, and since the order of the __async_probe()
calls is not changed by this patch, shouldn't the probe order be preserved
by this patch?

Thanks,

Bart.


* Re: [PATCH] sd: use async_probe cookie to avoid deadlocks
  2017-03-21 13:30   ` Bart Van Assche
@ 2017-03-21 13:33     ` James Bottomley
  2017-03-21 13:42       ` Bart Van Assche
  2017-03-21 15:32       ` Hannes Reinecke
  0 siblings, 2 replies; 10+ messages in thread
From: James Bottomley @ 2017-03-21 13:33 UTC (permalink / raw)
  To: Bart Van Assche, hare, martin.petersen; +Cc: linux-scsi, hck, hare

On Tue, 2017-03-21 at 13:30 +0000, Bart Van Assche wrote:
> On Tue, 2017-03-21 at 09:05 -0400, James Bottomley wrote:
> > How does this preserve ordering?  It looks like you have one cookie 
> > per sdkp ... is there some sort of ordering guarantee I'm not
> > seeing?
> 
> Hello James,
> 
> Since the probe order depends on the order in which __async_probe() 
> adds entries to the "pending" list, and since the order of the
> __async_probe() calls is not changed by this patch, shouldn't the 
> probe order be preserved by this patch?

I don't know: that's what I'm asking.  I believe they complete in order
for a single domain.  I thought ordering isn't preserved between
domains?  So moving to multiple domains loses us ordering of disk
appearance.

James


* Re: [PATCH] sd: use async_probe cookie to avoid deadlocks
  2017-03-21 13:33     ` James Bottomley
@ 2017-03-21 13:42       ` Bart Van Assche
  2017-03-21 15:32       ` Hannes Reinecke
  1 sibling, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2017-03-21 13:42 UTC (permalink / raw)
  To: James.Bottomley, hare, martin.petersen; +Cc: linux-scsi, hare

On Tue, 2017-03-21 at 09:33 -0400, James Bottomley wrote:
> On Tue, 2017-03-21 at 13:30 +0000, Bart Van Assche wrote:
> > On Tue, 2017-03-21 at 09:05 -0400, James Bottomley wrote:
> > > How does this preserve ordering?  It looks like you have one cookie 
> > > per sdkp ... is there some sort of ordering guarantee I'm not
> > > seeing?
> > 
> > Hello James,
> > 
> > Since the probe order depends on the order in which __async_probe() 
> > adds entries to the "pending" list, and since the order of the
> > __async_probe() calls is not changed by this patch, shouldn't the 
> > probe order be preserved by this patch?
> 
> I don't know: that's what I'm asking.  I believe they complete in order
> for a single domain.  I thought ordering isn't preserved between
> domains?  So moving to multiple domains loses us ordering of disk
> appearance.

Right, since sd_remove() doesn't wait any longer for completion of probes
from other domains the multi-domain probing behavior may change due to this
patch. However, the multi-domain probing order was already dependent on the
duration of individual probes so I don't think that it is guaranteed today
that multi-domain probing happens in the same order during every boot. I
hope that the change introduced by this patch will be considered acceptable.

Bart.


* Re: [PATCH] sd: use async_probe cookie to avoid deadlocks
  2017-03-21 13:05 ` James Bottomley
  2017-03-21 13:30   ` Bart Van Assche
@ 2017-03-21 15:25   ` Hannes Reinecke
  2017-03-21 15:33     ` James Bottomley
  1 sibling, 1 reply; 10+ messages in thread
From: Hannes Reinecke @ 2017-03-21 15:25 UTC (permalink / raw)
  To: James Bottomley, Martin K. Petersen
  Cc: hck, Bart van Assche, linux-scsi, Hannes Reinecke

On 03/21/2017 02:05 PM, James Bottomley wrote:
> On Tue, 2017-03-21 at 13:14 +0100, Hannes Reinecke wrote:
>> With the current design we're waiting for all async probes to
>> finish when removing any sd device.
>> This might lead to a livelock where the 'remove' call is blocking
>> for any probe calls to finish, and the probe calls are waiting for
>> a response, which will never be processed as the thread handling
>> the responses is waiting for the remove call to finish.
>> Which is completely pointless as we only _really_ care for the
>> probe on _this_ device to be completed; any other probing can
>> happily continue for all we care.
>> So save the async probing cookie in the structure and only wait
>> if this specific probe is still active.
> 
> How does this preserve ordering?  It looks like you have one cookie per
> sdkp ... is there some sort of ordering guarantee I'm not seeing?
> 
Do we need one?
The only thing we care about here is that probing for _this_ device has finished.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


* Re: [PATCH] sd: use async_probe cookie to avoid deadlocks
  2017-03-21 13:33     ` James Bottomley
  2017-03-21 13:42       ` Bart Van Assche
@ 2017-03-21 15:32       ` Hannes Reinecke
  1 sibling, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2017-03-21 15:32 UTC (permalink / raw)
  To: James Bottomley, Bart Van Assche, martin.petersen; +Cc: linux-scsi, hck, hare

On 03/21/2017 02:33 PM, James Bottomley wrote:
> On Tue, 2017-03-21 at 13:30 +0000, Bart Van Assche wrote:
>> On Tue, 2017-03-21 at 09:05 -0400, James Bottomley wrote:
>>> How does this preserve ordering?  It looks like you have one cookie 
>>> per sdkp ... is there some sort of ordering guarantee I'm not
>>> seeing?
>>
>> Hello James,
>>
>> Since the probe order depends on the order in which __async_probe() 
>> adds entries to the "pending" list, and since the order of the
>> __async_probe() calls is not changed by this patch, shouldn't the 
>> probe order be preserved by this patch?
> 
> I don't know: that's what I'm asking.  I believe they complete in order
> for a single domain.  I thought ordering isn't preserved between
> domains?  So moving to multiple domains loses us ordering of disk
> appearance.
> 
Ah.
But we don't move to multiple domains, now do we?
We're just terminating the wait until _our_ probe is completed.
It's not that we're having an individual probe domain per device...

Unless I'm misunderstanding something...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


* Re: [PATCH] sd: use async_probe cookie to avoid deadlocks
  2017-03-21 15:25   ` Hannes Reinecke
@ 2017-03-21 15:33     ` James Bottomley
  2017-03-21 16:21       ` Hannes Reinecke
  0 siblings, 1 reply; 10+ messages in thread
From: James Bottomley @ 2017-03-21 15:33 UTC (permalink / raw)
  To: Hannes Reinecke, Martin K. Petersen
  Cc: hck, Bart van Assche, linux-scsi, Hannes Reinecke

On Tue, 2017-03-21 at 16:25 +0100, Hannes Reinecke wrote:
> On 03/21/2017 02:05 PM, James Bottomley wrote:
> > On Tue, 2017-03-21 at 13:14 +0100, Hannes Reinecke wrote:
> > > With the current design we're waiting for all async probes to
> > > finish when removing any sd device.
> > > This might lead to a livelock where the 'remove' call is blocking
> > > for any probe calls to finish, and the probe calls are waiting
> > > for
> > > a response, which will never be processed as the thread handling
> > > the responses is waiting for the remove call to finish.
> > > Which is completely pointless as we only _really_ care for the
> > > probe on _this_ device to be completed; any other probing can
> > > happily continue for all we care.
> > > So save the async probing cookie in the structure and only wait
> > > if this specific probe is still active.
> > 
> > How does this preserve ordering?  It looks like you have one cookie 
> > per sdkp ... is there some sort of ordering guarantee I'm not
> > seeing?
> > 
> Do we need one?
> The only thing we care about here is that probing for _this_ device has
> finished.

OK, so currently we guarantee the linear ordering of luns for individual
hbas.  We also guarantee no interleaving of sdX letters for individual
hbas.  We don't guarantee the scan order of the hbas themselves.
Preserve those guarantees and I'm happy with the patch.  If you can't
preserve them I think we need further discussion.

James


* Re: [PATCH] sd: use async_probe cookie to avoid deadlocks
  2017-03-21 15:33     ` James Bottomley
@ 2017-03-21 16:21       ` Hannes Reinecke
  0 siblings, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2017-03-21 16:21 UTC (permalink / raw)
  To: James Bottomley, Martin K. Petersen
  Cc: hck, Bart van Assche, linux-scsi, Hannes Reinecke

On 03/21/2017 04:33 PM, James Bottomley wrote:
> On Tue, 2017-03-21 at 16:25 +0100, Hannes Reinecke wrote:
>> On 03/21/2017 02:05 PM, James Bottomley wrote:
>>> On Tue, 2017-03-21 at 13:14 +0100, Hannes Reinecke wrote:
>>>> With the current design we're waiting for all async probes to
>>>> finish when removing any sd device.
>>>> This might lead to a livelock where the 'remove' call is blocking
>>>> for any probe calls to finish, and the probe calls are waiting
>>>> for
>>>> a response, which will never be processed as the thread handling
>>>> the responses is waiting for the remove call to finish.
>>>> Which is completely pointless as we only _really_ care for the
>>>> probe on _this_ device to be completed; any other probing can
>>>> happily continue for all we care.
>>>> So save the async probing cookie in the structure and only wait
>>>> if this specific probe is still active.
>>>
>>> How does this preserve ordering?  It looks like you have one cookie 
>>> per sdkp ... is there some sort of ordering guarantee I'm not
>>> seeing?
>>>
>> Do we need one?
>> The only thing we care about here is that probing for _this_ device has
>> finished.
> 
> OK, so currently we guarantee the linear ordering of luns for individual
> hbas.  We also guarantee no interleaving of sdX letters for individual
> hbas.  We don't guarantee the scan order of the hbas themselves. 
>  Preserve those guarantees and I'm happy with the patch.  If you can't
> preserve them I think we need further discussion.
> 
Which is actually not true.
If just some devices are removed from the hba (eg if they belong to the
same remote port) and we're rescanning devices once the port comes back
there is no guarantee that the devices will be getting the same device
letters. Nor that the device letters will be consecutive; just starting
'scsi_debug' with just one device before rescanning will mess up the
ordering. Even now.

So I don't see how we can be worse off than we are today.

Plus we (with me now speaking for SUSE) never promised our
customers anything regarding sdX stability :-)

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
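
The rescan scenario above can be sketched with a small model. It assumes disk indices are handed out lowest-free-first, which is how the ida-based index allocation in sd.c is understood to behave; that is an assumption of this sketch, not something stated in the thread. All model_* names are hypothetical, and only the single-letter names sda..sdz are modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of sdX index allocation; not kernel code. */
#define MODEL_MAX_DISKS 26	/* only sda..sdz are modelled */

struct model_index_pool {
	bool used[MODEL_MAX_DISKS];
};

static void model_pool_init(struct model_index_pool *p)
{
	for (int i = 0; i < MODEL_MAX_DISKS; i++)
		p->used[i] = false;
}

/* Hand out the lowest free index, or -1 if the pool is exhausted. */
static int model_index_get(struct model_index_pool *p)
{
	for (int i = 0; i < MODEL_MAX_DISKS; i++) {
		if (!p->used[i]) {
			p->used[i] = true;
			return i;
		}
	}
	return -1;
}

/* Release an index when the modelled disk is removed. */
static void model_index_put(struct model_index_pool *p, int index)
{
	assert(index >= 0 && index < MODEL_MAX_DISKS);
	p->used[index] = false;
}

/* Map an index to the last letter of its sdX name. */
static char model_disk_letter(int index)
{
	assert(index >= 0 && index < MODEL_MAX_DISKS);
	return (char)('a' + index);
}
```

In this model, three LUNs on one HBA start out as sda, sdb and sdc. If the last two are removed and an unrelated device (for example one from scsi_debug) is added before the rescan, it takes sdb, and the returning LUNs come back as sdc and sdd. The letters for that HBA are then neither the same as before nor consecutive.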


Thread overview: 10+ messages
2017-03-21 12:14 [PATCH] sd: use async_probe cookie to avoid deadlocks Hannes Reinecke
2017-03-21 13:02 ` Bart Van Assche
2017-03-21 13:05 ` James Bottomley
2017-03-21 13:30   ` Bart Van Assche
2017-03-21 13:33     ` James Bottomley
2017-03-21 13:42       ` Bart Van Assche
2017-03-21 15:32       ` Hannes Reinecke
2017-03-21 15:25   ` Hannes Reinecke
2017-03-21 15:33     ` James Bottomley
2017-03-21 16:21       ` Hannes Reinecke
