* [PATCH] scsi_dh_alua: add missing transitioning state support
@ 2010-08-17 19:05 Mike Snitzer
2010-08-17 19:23 ` Nicholas A. Bellinger
0 siblings, 1 reply; 14+ messages in thread
From: Mike Snitzer @ 2010-08-17 19:05 UTC (permalink / raw)
To: James Bottomley; +Cc: Mike Christie, linux-scsi
Handle transitioning in the prep_fn.
Handle transitioning in alua_rtpg's implicit alua code too.
These gaps were identified during controller failover testing of an
ALUA array.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++---
1 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index 1a970a7..c1eedc5 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h)
h->state == TPGS_STATE_STANDBY)
/* Useable path if active */
err = SCSI_DH_OK;
+ else if (h->state == TPGS_STATE_TRANSITIONING)
+ /* State transition, retry */
+ goto retry;
else
/* Path unuseable for unavailable/offline */
err = SCSI_DH_DEV_OFFLINED;
@@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req)
struct alua_dh_data *h = get_alua_data(sdev);
int ret = BLKPREP_OK;
- if (h->state != TPGS_STATE_OPTIMIZED &&
- h->state != TPGS_STATE_NONOPTIMIZED) {
+ if (h->state == TPGS_STATE_TRANSITIONING)
+ ret = BLKPREP_DEFER;
+ else if (h->state != TPGS_STATE_OPTIMIZED &&
+ h->state != TPGS_STATE_NONOPTIMIZED) {
ret = BLKPREP_KILL;
req->cmd_flags |= REQ_QUIET;
}
return ret;
-
}
static const struct scsi_dh_devlist alua_dev_list[] = {
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] scsi_dh_alua: add missing transitioning state support
2010-08-17 19:05 [PATCH] scsi_dh_alua: add missing transitioning state support Mike Snitzer
@ 2010-08-17 19:23 ` Nicholas A. Bellinger
2010-08-30 9:36 ` Hannes Reinecke
0 siblings, 1 reply; 14+ messages in thread
From: Nicholas A. Bellinger @ 2010-08-17 19:23 UTC (permalink / raw)
To: Mike Snitzer; +Cc: James Bottomley, Mike Christie, linux-scsi
On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote:
> Handle transitioning in the prep_fn.
> Handle transitioning in alua_rtpg's implicit alua code too.
>
> These gaps were identified during controller failover testing of an
> ALUA array.
>
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
> drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++---
> 1 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
> index 1a970a7..c1eedc5 100644
> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
> @@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h)
> h->state == TPGS_STATE_STANDBY)
> /* Useable path if active */
> err = SCSI_DH_OK;
> + else if (h->state == TPGS_STATE_TRANSITIONING)
> + /* State transition, retry */
> + goto retry;
> else
> /* Path unuseable for unavailable/offline */
> err = SCSI_DH_DEV_OFFLINED;
> @@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req)
> struct alua_dh_data *h = get_alua_data(sdev);
> int ret = BLKPREP_OK;
>
> - if (h->state != TPGS_STATE_OPTIMIZED &&
> - h->state != TPGS_STATE_NONOPTIMIZED) {
> + if (h->state == TPGS_STATE_TRANSITIONING)
> + ret = BLKPREP_DEFER;
> + else if (h->state != TPGS_STATE_OPTIMIZED &&
> + h->state != TPGS_STATE_NONOPTIMIZED) {
> ret = BLKPREP_KILL;
> req->cmd_flags |= REQ_QUIET;
> }
> return ret;
> -
> }
>
Makes sense to me..
Acked-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] scsi_dh_alua: add missing transitioning state support
2010-08-17 19:23 ` Nicholas A. Bellinger
@ 2010-08-30 9:36 ` Hannes Reinecke
2010-08-31 15:11 ` Mike Snitzer
0 siblings, 1 reply; 14+ messages in thread
From: Hannes Reinecke @ 2010-08-30 9:36 UTC (permalink / raw)
To: Nicholas A. Bellinger
Cc: Mike Snitzer, James Bottomley, Mike Christie, linux-scsi
Nicholas A. Bellinger wrote:
> On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote:
>> Handle transitioning in the prep_fn.
>> Handle transitioning in alua_rtpg's implicit alua code too.
>>
>> These gaps were identified during controller failover testing of an
>> ALUA array.
>>
>> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
>> ---
>> drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++---
>> 1 files changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
>> index 1a970a7..c1eedc5 100644
>> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
>> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
>> @@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h)
>> h->state == TPGS_STATE_STANDBY)
>> /* Useable path if active */
>> err = SCSI_DH_OK;
>> + else if (h->state == TPGS_STATE_TRANSITIONING)
>> + /* State transition, retry */
>> + goto retry;
>> else
>> /* Path unuseable for unavailable/offline */
>> err = SCSI_DH_DEV_OFFLINED;
>> @@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req)
>> struct alua_dh_data *h = get_alua_data(sdev);
>> int ret = BLKPREP_OK;
>>
>> - if (h->state != TPGS_STATE_OPTIMIZED &&
>> - h->state != TPGS_STATE_NONOPTIMIZED) {
>> + if (h->state == TPGS_STATE_TRANSITIONING)
>> + ret = BLKPREP_DEFER;
>> + else if (h->state != TPGS_STATE_OPTIMIZED &&
>> + h->state != TPGS_STATE_NONOPTIMIZED) {
>> ret = BLKPREP_KILL;
>> req->cmd_flags |= REQ_QUIET;
>> }
>> return ret;
>> -
>> }
>>
>
> Makes sense to me..
>
> Acked-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
>
Not so fast. There are two problems with this approach:
The path is retried indefinitely. Arrays are _supposed_ to be in 'transitioning'
only temporary; however, if the array is stuck due to a fw error we're stuck in 'defer',
too.
Secondly this path fails with 'directio' multipath checker. Remember that 'directio'
is using 'fs' requests, not block-pc ones. Hence for all I/O the prep_fn() callback
is evaluated, which will return 'DEFER' here once the path is in transitioning.
And the state is never updated as RTPG is never called.
I'm currently preparing a patch which addressed these situations, too.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support
2010-08-30 9:36 ` Hannes Reinecke
@ 2010-08-31 15:11 ` Mike Snitzer
2010-09-20 15:35 ` Mike Snitzer
0 siblings, 1 reply; 14+ messages in thread
From: Mike Snitzer @ 2010-08-31 15:11 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Nicholas A. Bellinger, James Bottomley, Mike Christie, linux-scsi
On Mon, Aug 30 2010 at 5:36am -0400,
Hannes Reinecke <hare@suse.de> wrote:
> Nicholas A. Bellinger wrote:
> > On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote:
> >> Handle transitioning in the prep_fn.
> >> Handle transitioning in alua_rtpg's implicit alua code too.
> >>
> >> These gaps were identified during controller failover testing of an
> >> ALUA array.
> >>
> >> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> >> ---
> >> drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++---
> >> 1 files changed, 7 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
> >> index 1a970a7..c1eedc5 100644
> >> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
> >> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
> >> @@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h)
> >> h->state == TPGS_STATE_STANDBY)
> >> /* Useable path if active */
> >> err = SCSI_DH_OK;
> >> + else if (h->state == TPGS_STATE_TRANSITIONING)
> >> + /* State transition, retry */
> >> + goto retry;
> >> else
> >> /* Path unuseable for unavailable/offline */
> >> err = SCSI_DH_DEV_OFFLINED;
> >> @@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req)
> >> struct alua_dh_data *h = get_alua_data(sdev);
> >> int ret = BLKPREP_OK;
> >>
> >> - if (h->state != TPGS_STATE_OPTIMIZED &&
> >> - h->state != TPGS_STATE_NONOPTIMIZED) {
> >> + if (h->state == TPGS_STATE_TRANSITIONING)
> >> + ret = BLKPREP_DEFER;
> >> + else if (h->state != TPGS_STATE_OPTIMIZED &&
> >> + h->state != TPGS_STATE_NONOPTIMIZED) {
> >> ret = BLKPREP_KILL;
> >> req->cmd_flags |= REQ_QUIET;
> >> }
> >> return ret;
> >> -
> >> }
> >>
> >
> > Makes sense to me..
> >
> > Acked-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
> >
> Not so fast. There are two problems with this approach:
>
> The path is retried indefinitely. Arrays are _supposed_ to be in 'transitioning'
> only temporary; however, if the array is stuck due to a fw error we're stuck in 'defer',
> too.
And what is the problem with that? The IO will eventually time out.
> Secondly this path fails with 'directio' multipath checker. Remember that 'directio'
> is using 'fs' requests, not block-pc ones. Hence for all I/O the prep_fn() callback
> is evaluated, which will return 'DEFER' here once the path is in transitioning.
> And the state is never updated as RTPG is never called.
Testing ALUA with directio path checker did not result in such immutable
state in the few instances that TPGS_STATE_TRANSITIONING was seen in
alua_prep_fn.
> I'm currently preparing a patch which addressed these situations, too.
OK, please share.
Thanks,
Mike
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support
2010-08-31 15:11 ` Mike Snitzer
@ 2010-09-20 15:35 ` Mike Snitzer
2010-09-21 2:27 ` Mike Christie
0 siblings, 1 reply; 14+ messages in thread
From: Mike Snitzer @ 2010-09-20 15:35 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Nicholas A. Bellinger, James Bottomley, Mike Christie, linux-scsi
Hi Hannes,
On Tue, Aug 31 2010 at 11:11am -0400,
Mike Snitzer <snitzer@redhat.com> wrote:
> On Mon, Aug 30 2010 at 5:36am -0400,
> Hannes Reinecke <hare@suse.de> wrote:
>
> > Nicholas A. Bellinger wrote:
> > > On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote:
> > >> Handle transitioning in the prep_fn.
> > >> Handle transitioning in alua_rtpg's implicit alua code too.
> > >>
> > >> These gaps were identified during controller failover testing of an
> > >> ALUA array.
> > >>
> > >> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > >> ---
> > >> drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++---
> > >> 1 files changed, 7 insertions(+), 3 deletions(-)
> > >>
> > >> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
> > >> index 1a970a7..c1eedc5 100644
> > >> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
> > >> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
> > >> @@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h)
> > >> h->state == TPGS_STATE_STANDBY)
> > >> /* Useable path if active */
> > >> err = SCSI_DH_OK;
> > >> + else if (h->state == TPGS_STATE_TRANSITIONING)
> > >> + /* State transition, retry */
> > >> + goto retry;
> > >> else
> > >> /* Path unuseable for unavailable/offline */
> > >> err = SCSI_DH_DEV_OFFLINED;
> > >> @@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req)
> > >> struct alua_dh_data *h = get_alua_data(sdev);
> > >> int ret = BLKPREP_OK;
> > >>
> > >> - if (h->state != TPGS_STATE_OPTIMIZED &&
> > >> - h->state != TPGS_STATE_NONOPTIMIZED) {
> > >> + if (h->state == TPGS_STATE_TRANSITIONING)
> > >> + ret = BLKPREP_DEFER;
> > >> + else if (h->state != TPGS_STATE_OPTIMIZED &&
> > >> + h->state != TPGS_STATE_NONOPTIMIZED) {
> > >> ret = BLKPREP_KILL;
> > >> req->cmd_flags |= REQ_QUIET;
> > >> }
> > >> return ret;
> > >> -
> > >> }
> > >>
> > >
> > > Makes sense to me..
> > >
> > > Acked-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
> > >
> > Not so fast. There are two problems with this approach:
> >
> > The path is retried indefinitely. Arrays are _supposed_ to be in 'transitioning'
> > only temporary; however, if the array is stuck due to a fw error we're stuck in 'defer',
> > too.
>
> And what is the problem with that? The IO will eventually time out.
To restate as a question: even though we'll retry in alua_rtpg();
shouldn't the SCSI command eventually time out (via
scsi_attempt_requeue_command)?
Note that my proposed change to alua_rtpg() just adds a
TPGS_STATE_TRANSITIONING handler for implicit ALUA -- explicit ALUA
already has such a handler.
Allowing implicit ALUA to fall through (as we do currently) causes a
return alua_rtpg() of SCSI_DH_DEV_OFFLINED -- which isn't the correct
state.
> > Secondly this path fails with 'directio' multipath checker. Remember that 'directio'
> > is using 'fs' requests, not block-pc ones. Hence for all I/O the prep_fn() callback
> > is evaluated, which will return 'DEFER' here once the path is in transitioning.
> > And the state is never updated as RTPG is never called.
>
> Testing ALUA with directio path checker did not result in such immutable
> state in the few instances that TPGS_STATE_TRANSITIONING was seen in
> alua_prep_fn.
I had another look and I see what you're saying. Thanks for catching this!
> > I'm currently preparing a patch which addressed these situations, too.
>
> OK, please share.
Do you have that patch you were preparing? I look forward to seeing
your solution to this.
Thanks,
Mike
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support
2010-09-20 15:35 ` Mike Snitzer
@ 2010-09-21 2:27 ` Mike Christie
2010-09-21 2:28 ` Mike Christie
2010-09-21 19:33 ` Mike Snitzer
0 siblings, 2 replies; 14+ messages in thread
From: Mike Christie @ 2010-09-21 2:27 UTC (permalink / raw)
To: Mike Snitzer
Cc: Hannes Reinecke, Nicholas A. Bellinger, James Bottomley, linux-scsi
On 09/20/2010 10:35 AM, Mike Snitzer wrote:
> Hi Hannes,
>
> On Tue, Aug 31 2010 at 11:11am -0400,
> Mike Snitzer<snitzer@redhat.com> wrote:
>
>> On Mon, Aug 30 2010 at 5:36am -0400,
>> Hannes Reinecke<hare@suse.de> wrote:
>>
>>> Nicholas A. Bellinger wrote:
>>>> On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote:
>>>>> Handle transitioning in the prep_fn.
>>>>> Handle transitioning in alua_rtpg's implicit alua code too.
>>>>>
>>>>> These gaps were identified during controller failover testing of an
>>>>> ALUA array.
>>>>>
>>>>> Signed-off-by: Mike Snitzer<snitzer@redhat.com>
>>>>> ---
>>>>> drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++---
>>>>> 1 files changed, 7 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
>>>>> index 1a970a7..c1eedc5 100644
>>>>> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
>>>>> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
>>>>> @@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h)
>>>>> h->state == TPGS_STATE_STANDBY)
>>>>> /* Useable path if active */
>>>>> err = SCSI_DH_OK;
>>>>> + else if (h->state == TPGS_STATE_TRANSITIONING)
>>>>> + /* State transition, retry */
>>>>> + goto retry;
>>>>> else
>>>>> /* Path unuseable for unavailable/offline */
>>>>> err = SCSI_DH_DEV_OFFLINED;
>>>>> @@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req)
>>>>> struct alua_dh_data *h = get_alua_data(sdev);
>>>>> int ret = BLKPREP_OK;
>>>>>
>>>>> - if (h->state != TPGS_STATE_OPTIMIZED&&
>>>>> - h->state != TPGS_STATE_NONOPTIMIZED) {
>>>>> + if (h->state == TPGS_STATE_TRANSITIONING)
>>>>> + ret = BLKPREP_DEFER;
>>>>> + else if (h->state != TPGS_STATE_OPTIMIZED&&
>>>>> + h->state != TPGS_STATE_NONOPTIMIZED) {
>>>>> ret = BLKPREP_KILL;
>>>>> req->cmd_flags |= REQ_QUIET;
>>>>> }
>>>>> return ret;
>>>>> -
>>>>> }
>>>>>
>>>>
>>>> Makes sense to me..
>>>>
>>>> Acked-by: Nicholas A. Bellinger<nab@linux-iscsi.org>
>>>>
>>> Not so fast. There are two problems with this approach:
>>>
>>> The path is retried indefinitely. Arrays are _supposed_ to be in 'transitioning'
>>> only temporary; however, if the array is stuck due to a fw error we're stuck in 'defer',
>>> too.
>>
>> And what is the problem with that? The IO will eventually time out.
>
> To restate as a question: even though we'll retry in alua_rtpg();
> shouldn't the SCSI command eventually time out (via
> scsi_attempt_requeue_command)?
That function is only in RHEL. Requests that are prepd and sent to the
scsi layer and driver would eventually timeout in scsi_softirq_done in
upstream.
alua_prep_fn prevents the IO from getting sent to the scsi layer so we
do not hit the check in scsi_softirq_done though.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support
2010-09-21 2:27 ` Mike Christie
@ 2010-09-21 2:28 ` Mike Christie
2010-09-21 19:33 ` Mike Snitzer
1 sibling, 0 replies; 14+ messages in thread
From: Mike Christie @ 2010-09-21 2:28 UTC (permalink / raw)
To: Mike Snitzer
Cc: Hannes Reinecke, Nicholas A. Bellinger, James Bottomley, linux-scsi
On 09/20/2010 09:27 PM, Mike Christie wrote:
> On 09/20/2010 10:35 AM, Mike Snitzer wrote:
>> Hi Hannes,
>>
>> On Tue, Aug 31 2010 at 11:11am -0400,
>> Mike Snitzer<snitzer@redhat.com> wrote:
>>
>>> On Mon, Aug 30 2010 at 5:36am -0400,
>>> Hannes Reinecke<hare@suse.de> wrote:
>>>
>>>> Nicholas A. Bellinger wrote:
>>>>> On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote:
>>>>>> Handle transitioning in the prep_fn.
>>>>>> Handle transitioning in alua_rtpg's implicit alua code too.
>>>>>>
>>>>>> These gaps were identified during controller failover testing of an
>>>>>> ALUA array.
>>>>>>
>>>>>> Signed-off-by: Mike Snitzer<snitzer@redhat.com>
>>>>>> ---
>>>>>> drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++---
>>>>>> 1 files changed, 7 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c
>>>>>> b/drivers/scsi/device_handler/scsi_dh_alua.c
>>>>>> index 1a970a7..c1eedc5 100644
>>>>>> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
>>>>>> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
>>>>>> @@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev,
>>>>>> struct alua_dh_data *h)
>>>>>> h->state == TPGS_STATE_STANDBY)
>>>>>> /* Useable path if active */
>>>>>> err = SCSI_DH_OK;
>>>>>> + else if (h->state == TPGS_STATE_TRANSITIONING)
>>>>>> + /* State transition, retry */
>>>>>> + goto retry;
>>>>>> else
>>>>>> /* Path unuseable for unavailable/offline */
>>>>>> err = SCSI_DH_DEV_OFFLINED;
>>>>>> @@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device
>>>>>> *sdev, struct request *req)
>>>>>> struct alua_dh_data *h = get_alua_data(sdev);
>>>>>> int ret = BLKPREP_OK;
>>>>>>
>>>>>> - if (h->state != TPGS_STATE_OPTIMIZED&&
>>>>>> - h->state != TPGS_STATE_NONOPTIMIZED) {
>>>>>> + if (h->state == TPGS_STATE_TRANSITIONING)
>>>>>> + ret = BLKPREP_DEFER;
>>>>>> + else if (h->state != TPGS_STATE_OPTIMIZED&&
>>>>>> + h->state != TPGS_STATE_NONOPTIMIZED) {
>>>>>> ret = BLKPREP_KILL;
>>>>>> req->cmd_flags |= REQ_QUIET;
>>>>>> }
>>>>>> return ret;
>>>>>> -
>>>>>> }
>>>>>>
>>>>>
>>>>> Makes sense to me..
>>>>>
>>>>> Acked-by: Nicholas A. Bellinger<nab@linux-iscsi.org>
>>>>>
>>>> Not so fast. There are two problems with this approach:
>>>>
>>>> The path is retried indefinitely. Arrays are _supposed_ to be in
>>>> 'transitioning'
>>>> only temporary; however, if the array is stuck due to a fw error
>>>> we're stuck in 'defer',
>>>> too.
>>>
>>> And what is the problem with that? The IO will eventually time out.
>>
>> To restate as a question: even though we'll retry in alua_rtpg();
>> shouldn't the SCSI command eventually time out (via
>> scsi_attempt_requeue_command)?
>
> That function is only in RHEL. Requests that are prepd and sent to the
> scsi layer and driver would eventually timeout in scsi_softirq_done in
> upstream.
Did we want to move this check to the block layer btw? If so it could
solve the problem.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support
2010-09-21 2:27 ` Mike Christie
2010-09-21 2:28 ` Mike Christie
@ 2010-09-21 19:33 ` Mike Snitzer
2010-09-21 21:14 ` Mike Christie
1 sibling, 1 reply; 14+ messages in thread
From: Mike Snitzer @ 2010-09-21 19:33 UTC (permalink / raw)
To: Mike Christie
Cc: Hannes Reinecke, Nicholas A. Bellinger, James Bottomley, linux-scsi
On Mon, Sep 20 2010 at 10:27pm -0400,
Mike Christie <michaelc@cs.wisc.edu> wrote:
> On 09/20/2010 10:35 AM, Mike Snitzer wrote:
> >Hi Hannes,
> >
> >On Tue, Aug 31 2010 at 11:11am -0400,
> >Mike Snitzer<snitzer@redhat.com> wrote:
> >
> >>On Mon, Aug 30 2010 at 5:36am -0400,
> >>Hannes Reinecke<hare@suse.de> wrote:
> >>
> >>>Nicholas A. Bellinger wrote:
> >>>>On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote:
> >>>>>Handle transitioning in the prep_fn.
> >>>>>Handle transitioning in alua_rtpg's implicit alua code too.
> >>>>>
> >>>>>These gaps were identified during controller failover testing of an
> >>>>>ALUA array.
> >>>>>
> >>>>>Signed-off-by: Mike Snitzer<snitzer@redhat.com>
> >>>>>---
> >>>>> drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++---
> >>>>> 1 files changed, 7 insertions(+), 3 deletions(-)
> >>>>>
> >>>>>diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
> >>>>>index 1a970a7..c1eedc5 100644
> >>>>>--- a/drivers/scsi/device_handler/scsi_dh_alua.c
> >>>>>+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
> >>>>>@@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h)
> >>>>> h->state == TPGS_STATE_STANDBY)
> >>>>> /* Useable path if active */
> >>>>> err = SCSI_DH_OK;
> >>>>>+ else if (h->state == TPGS_STATE_TRANSITIONING)
> >>>>>+ /* State transition, retry */
> >>>>>+ goto retry;
> >>>>> else
> >>>>> /* Path unuseable for unavailable/offline */
> >>>>> err = SCSI_DH_DEV_OFFLINED;
> >>>>>@@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req)
> >>>>> struct alua_dh_data *h = get_alua_data(sdev);
> >>>>> int ret = BLKPREP_OK;
> >>>>>
> >>>>>- if (h->state != TPGS_STATE_OPTIMIZED&&
> >>>>>- h->state != TPGS_STATE_NONOPTIMIZED) {
> >>>>>+ if (h->state == TPGS_STATE_TRANSITIONING)
> >>>>>+ ret = BLKPREP_DEFER;
> >>>>>+ else if (h->state != TPGS_STATE_OPTIMIZED&&
> >>>>>+ h->state != TPGS_STATE_NONOPTIMIZED) {
> >>>>> ret = BLKPREP_KILL;
> >>>>> req->cmd_flags |= REQ_QUIET;
> >>>>> }
> >>>>> return ret;
> >>>>>-
> >>>>> }
> >>>>>
> >>>>
> >>>>Makes sense to me..
> >>>>
> >>>>Acked-by: Nicholas A. Bellinger<nab@linux-iscsi.org>
> >>>>
> >>>Not so fast. There are two problems with this approach:
> >>>
> >>>The path is retried indefinitely. Arrays are _supposed_ to be in 'transitioning'
> >>>only temporary; however, if the array is stuck due to a fw error we're stuck in 'defer',
> >>>too.
> >>
> >>And what is the problem with that? The IO will eventually time out.
> >
> >To restate as a question: even though we'll retry in alua_rtpg();
> >shouldn't the SCSI command eventually time out (via
> >scsi_attempt_requeue_command)?
>
> That function is only in RHEL. Requests that are prepd and sent to
> the scsi layer and driver would eventually timeout in
> scsi_softirq_done in upstream.
>
> alua_prep_fn prevents the IO from getting sent to the scsi layer so
> we do not hit the check in scsi_softirq_done though.
That is only the case if alua_prep_fn were to return BLKPREP_DEFER
right?
Taking a step back:
1) the proposed alua_prep_fn BLKPREP_DEFER change is not correct and is
no longer being proposed...
2) the patch also modified alua_rtpg() so implicit ALUA would retry
(just like explicit ALUA currently does) if TPGS_STATE_TRANSITIONING
- so why should we avoid retry for implicit but do it for explicit?
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support
2010-09-21 19:33 ` Mike Snitzer
@ 2010-09-21 21:14 ` Mike Christie
2010-09-22 10:13 ` Hannes Reinecke
0 siblings, 1 reply; 14+ messages in thread
From: Mike Christie @ 2010-09-21 21:14 UTC (permalink / raw)
To: Mike Snitzer
Cc: Hannes Reinecke, Nicholas A. Bellinger, James Bottomley, linux-scsi
On 09/21/2010 02:33 PM, Mike Snitzer wrote:
> On Mon, Sep 20 2010 at 10:27pm -0400,
> Mike Christie<michaelc@cs.wisc.edu> wrote:
>
>> On 09/20/2010 10:35 AM, Mike Snitzer wrote:
>>> Hi Hannes,
>>>
>>> On Tue, Aug 31 2010 at 11:11am -0400,
>>> Mike Snitzer<snitzer@redhat.com> wrote:
>>>
>>>> On Mon, Aug 30 2010 at 5:36am -0400,
>>>> Hannes Reinecke<hare@suse.de> wrote:
>>>>
>>>>> Nicholas A. Bellinger wrote:
>>>>>> On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote:
>>>>>>> Handle transitioning in the prep_fn.
>>>>>>> Handle transitioning in alua_rtpg's implicit alua code too.
>>>>>>>
>>>>>>> These gaps were identified during controller failover testing of an
>>>>>>> ALUA array.
>>>>>>>
>>>>>>> Signed-off-by: Mike Snitzer<snitzer@redhat.com>
>>>>>>> ---
>>>>>>> drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++---
>>>>>>> 1 files changed, 7 insertions(+), 3 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
>>>>>>> index 1a970a7..c1eedc5 100644
>>>>>>> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
>>>>>>> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
>>>>>>> @@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h)
>>>>>>> h->state == TPGS_STATE_STANDBY)
>>>>>>> /* Useable path if active */
>>>>>>> err = SCSI_DH_OK;
>>>>>>> + else if (h->state == TPGS_STATE_TRANSITIONING)
>>>>>>> + /* State transition, retry */
>>>>>>> + goto retry;
>>>>>>> else
>>>>>>> /* Path unuseable for unavailable/offline */
>>>>>>> err = SCSI_DH_DEV_OFFLINED;
>>>>>>> @@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req)
>>>>>>> struct alua_dh_data *h = get_alua_data(sdev);
>>>>>>> int ret = BLKPREP_OK;
>>>>>>>
>>>>>>> - if (h->state != TPGS_STATE_OPTIMIZED&&
>>>>>>> - h->state != TPGS_STATE_NONOPTIMIZED) {
>>>>>>> + if (h->state == TPGS_STATE_TRANSITIONING)
>>>>>>> + ret = BLKPREP_DEFER;
>>>>>>> + else if (h->state != TPGS_STATE_OPTIMIZED&&
>>>>>>> + h->state != TPGS_STATE_NONOPTIMIZED) {
>>>>>>> ret = BLKPREP_KILL;
>>>>>>> req->cmd_flags |= REQ_QUIET;
>>>>>>> }
>>>>>>> return ret;
>>>>>>> -
>>>>>>> }
>>>>>>>
>>>>>>
>>>>>> Makes sense to me..
>>>>>>
>>>>>> Acked-by: Nicholas A. Bellinger<nab@linux-iscsi.org>
>>>>>>
>>>>> Not so fast. There are two problems with this approach:
>>>>>
>>>>> The path is retried indefinitely. Arrays are _supposed_ to be in 'transitioning'
>>>>> only temporary; however, if the array is stuck due to a fw error we're stuck in 'defer',
>>>>> too.
>>>>
>>>> And what is the problem with that? The IO will eventually time out.
>>>
>>> To restate as a question: even though we'll retry in alua_rtpg();
>>> shouldn't the SCSI command eventually time out (via
>>> scsi_attempt_requeue_command)?
>>
>> That function is only in RHEL. Requests that are prepd and sent to
>> the scsi layer and driver would eventually timeout in
>> scsi_softirq_done in upstream.
>>
>> alua_prep_fn prevents the IO from getting sent to the scsi layer so
>> we do not hit the check in scsi_softirq_done though.
>
> That is only the case if alua_prep_fn were to return BLKPREP_DEFER
> right?
Yeah. I misread the email above. Just to clarify..
For the alua case, we will always retry due the prep_fn issue I mentioned.
For the alua_rtpg() retry case you were discussing we will also always
retry, because the timer check in
scsi_attempt_requeue_command/scsi_softirq_done will break us from
retrying forever in that execution of the request started by the
submit_rtpg call. However, the added goto retry will would just end up
starting a another execution. To handle Hannes comment I think you would
want to add some retry/timer checks in alua_rtpg to prevent that from
retrying forever.
>
> 2) the patch also modified alua_rtpg() so implicit ALUA would retry
> (just like explicit ALUA currently does) if TPGS_STATE_TRANSITIONING
> - so why should we avoid retry for implicit but do it for explicit?
Leaving that for Hannes. I cannot think of a reason. Probably just did
not do it.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support
2010-09-21 21:14 ` Mike Christie
@ 2010-09-22 10:13 ` Hannes Reinecke
2010-09-22 12:29 ` Mike Snitzer
0 siblings, 1 reply; 14+ messages in thread
From: Hannes Reinecke @ 2010-09-22 10:13 UTC (permalink / raw)
To: Mike Christie
Cc: Mike Snitzer, Nicholas A. Bellinger, James Bottomley, linux-scsi
[-- Attachment #1: Type: text/plain, Size: 4863 bytes --]
Mike Christie wrote:
> On 09/21/2010 02:33 PM, Mike Snitzer wrote:
>> On Mon, Sep 20 2010 at 10:27pm -0400,
>> Mike Christie<michaelc@cs.wisc.edu> wrote:
>>
>>> On 09/20/2010 10:35 AM, Mike Snitzer wrote:
>>>> Hi Hannes,
>>>>
>>>> On Tue, Aug 31 2010 at 11:11am -0400,
>>>> Mike Snitzer<snitzer@redhat.com> wrote:
>>>>
>>>>> On Mon, Aug 30 2010 at 5:36am -0400,
>>>>> Hannes Reinecke<hare@suse.de> wrote:
>>>>>
>>>>>> Nicholas A. Bellinger wrote:
>>>>>>> On Tue, 2010-08-17 at 15:05 -0400, Mike Snitzer wrote:
>>>>>>>> Handle transitioning in the prep_fn.
>>>>>>>> Handle transitioning in alua_rtpg's implicit alua code too.
>>>>>>>>
>>>>>>>> These gaps were identified during controller failover testing of an
>>>>>>>> ALUA array.
>>>>>>>>
>>>>>>>> Signed-off-by: Mike Snitzer<snitzer@redhat.com>
>>>>>>>> ---
>>>>>>>> drivers/scsi/device_handler/scsi_dh_alua.c | 10 +++++++---
>>>>>>>> 1 files changed, 7 insertions(+), 3 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c
>>>>>>>> b/drivers/scsi/device_handler/scsi_dh_alua.c
>>>>>>>> index 1a970a7..c1eedc5 100644
>>>>>>>> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
>>>>>>>> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
>>>>>>>> @@ -616,6 +616,9 @@ static int alua_rtpg(struct scsi_device
>>>>>>>> *sdev, struct alua_dh_data *h)
>>>>>>>> h->state == TPGS_STATE_STANDBY)
>>>>>>>> /* Useable path if active */
>>>>>>>> err = SCSI_DH_OK;
>>>>>>>> + else if (h->state == TPGS_STATE_TRANSITIONING)
>>>>>>>> + /* State transition, retry */
>>>>>>>> + goto retry;
>>>>>>>> else
>>>>>>>> /* Path unuseable for unavailable/offline */
>>>>>>>> err = SCSI_DH_DEV_OFFLINED;
>>>>>>>> @@ -698,13 +701,14 @@ static int alua_prep_fn(struct scsi_device
>>>>>>>> *sdev, struct request *req)
>>>>>>>> struct alua_dh_data *h = get_alua_data(sdev);
>>>>>>>> int ret = BLKPREP_OK;
>>>>>>>>
>>>>>>>> - if (h->state != TPGS_STATE_OPTIMIZED&&
>>>>>>>> - h->state != TPGS_STATE_NONOPTIMIZED) {
>>>>>>>> + if (h->state == TPGS_STATE_TRANSITIONING)
>>>>>>>> + ret = BLKPREP_DEFER;
>>>>>>>> + else if (h->state != TPGS_STATE_OPTIMIZED&&
>>>>>>>> + h->state != TPGS_STATE_NONOPTIMIZED) {
>>>>>>>> ret = BLKPREP_KILL;
>>>>>>>> req->cmd_flags |= REQ_QUIET;
>>>>>>>> }
>>>>>>>> return ret;
>>>>>>>> -
>>>>>>>> }
>>>>>>>>
>>>>>>>
>>>>>>> Makes sense to me..
>>>>>>>
>>>>>>> Acked-by: Nicholas A. Bellinger<nab@linux-iscsi.org>
>>>>>>>
>>>>>> Not so fast. There are two problems with this approach:
>>>>>>
>>>>>> The path is retried indefinitely. Arrays are _supposed_ to be in
>>>>>> 'transitioning'
>>>>>> only temporary; however, if the array is stuck due to a fw error
>>>>>> we're stuck in 'defer',
>>>>>> too.
>>>>>
>>>>> And what is the problem with that? The IO will eventually time out.
>>>>
>>>> To restate as a question: even though we'll retry in alua_rtpg();
>>>> shouldn't the SCSI command eventually time out (via
>>>> scsi_attempt_requeue_command)?
>>>
>>> That function is only in RHEL. Requests that are prepd and sent to
>>> the scsi layer and driver would eventually timeout in
>>> scsi_softirq_done in upstream.
>>>
>>> alua_prep_fn prevents the IO from getting sent to the scsi layer so
>>> we do not hit the check in scsi_softirq_done though.
>>
>> That is only the case if alua_prep_fn were to return BLKPREP_DEFER
>> right?
>
> Yeah. I misread the email above. Just to clarify..
>
> For the alua case, we will always retry due the prep_fn issue I mentioned.
>
> For the alua_rtpg() retry case you were discussing we will also always
> retry, because the timer check in
> scsi_attempt_requeue_command/scsi_softirq_done will break us from
> retrying forever in that execution of the request started by the
> submit_rtpg call. However, the added goto retry will would just end up
> starting a another execution. To handle Hannes comment I think you would
> want to add some retry/timer checks in alua_rtpg to prevent that from
> retrying forever.
>
>>
>> 2) the patch also modified alua_rtpg() so implicit ALUA would retry
>> (just like explicit ALUA currently does) if TPGS_STATE_TRANSITIONING
>> - so why should we avoid retry for implicit but do it for explicit?
>
> Leaving that for Hannes. I cannot think of a reason. Probably just did
> not do it.
Finally I got around to answering this.
I've attached a patch which I made the other day which seems to work
reasonably well.
Looks better from my side, so if you agree I'll be sending it
upstream properly.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
[-- Attachment #2: scsi-dh-alua-handle-all-states --]
[-- Type: text/plain, Size: 4696 bytes --]
>From d3f02c90db3e3177309b78726d082e17dd772ee2 Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@suse.de>
Date: Wed, 22 Sep 2010 12:09:07 +0200
Subject: [PATCH] scsi_dh_alua: Handle all states correctly
For ALUA we should be handling all states, independent of whether
is explicit or implicit. For 'Transitioning' we should be retry
for a certain amount of time; after that an error should be
returned.
Signed-off-by: Hannes Reinecke <hare@suse.de>
diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index 1a970a7..c6f57e3 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -31,6 +31,7 @@
#define TPGS_STATE_NONOPTIMIZED 0x1
#define TPGS_STATE_STANDBY 0x2
#define TPGS_STATE_UNAVAILABLE 0x3
+#define TPGS_STATE_LBA_DEPENDENT 0x4
#define TPGS_STATE_OFFLINE 0xe
#define TPGS_STATE_TRANSITIONING 0xf
@@ -39,6 +40,7 @@
#define TPGS_SUPPORT_NONOPTIMIZED 0x02
#define TPGS_SUPPORT_STANDBY 0x04
#define TPGS_SUPPORT_UNAVAILABLE 0x08
+#define TPGS_SUPPORT_LBA_DEPENDENT 0x10
#define TPGS_SUPPORT_OFFLINE 0x40
#define TPGS_SUPPORT_TRANSITION 0x80
@@ -460,6 +462,8 @@ static char print_alua_state(int state)
return 'S';
case TPGS_STATE_UNAVAILABLE:
return 'U';
+ case TPGS_STATE_LBA_DEPENDENT:
+ return 'L';
case TPGS_STATE_OFFLINE:
return 'O';
case TPGS_STATE_TRANSITIONING:
@@ -542,7 +546,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h)
int len, k, off, valid_states = 0;
char *ucp;
unsigned err;
+ unsigned long expiry, interval = 10;
+ expiry = round_jiffies_up(jiffies + ALUA_FAILOVER_TIMEOUT);
retry:
err = submit_rtpg(sdev, h);
@@ -553,7 +559,7 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h)
return SCSI_DH_IO;
err = alua_check_sense(sdev, &sense_hdr);
- if (err == ADD_TO_MLQUEUE)
+ if (err == ADD_TO_MLQUEUE && time_before(jiffies, expiry))
goto retry;
sdev_printk(KERN_INFO, sdev,
"%s: rtpg sense code %02x/%02x/%02x\n",
@@ -587,38 +593,36 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h)
}
sdev_printk(KERN_INFO, sdev,
- "%s: port group %02x state %c supports %c%c%c%c%c%c\n",
+ "%s: port group %02x state %c supports %c%c%c%c%c%c%c\n",
ALUA_DH_NAME, h->group_id, print_alua_state(h->state),
valid_states&TPGS_SUPPORT_TRANSITION?'T':'t',
valid_states&TPGS_SUPPORT_OFFLINE?'O':'o',
+ valid_states&TPGS_SUPPORT_LBA_DEPENDENT?'L':'l',
valid_states&TPGS_SUPPORT_UNAVAILABLE?'U':'u',
valid_states&TPGS_SUPPORT_STANDBY?'S':'s',
valid_states&TPGS_SUPPORT_NONOPTIMIZED?'N':'n',
valid_states&TPGS_SUPPORT_OPTIMIZED?'A':'a');
- if (h->tpgs & TPGS_MODE_EXPLICIT) {
- switch (h->state) {
- case TPGS_STATE_TRANSITIONING:
+ switch (h->state) {
+ case TPGS_STATE_TRANSITIONING:
+ if (time_before(jiffies, expiry)) {
/* State transition, retry */
+ interval *= 10;
+ msleep(interval);
goto retry;
- break;
- case TPGS_STATE_OFFLINE:
- /* Path is offline, fail */
- err = SCSI_DH_DEV_OFFLINED;
- break;
- default:
- break;
}
- } else {
- /* Only Implicit ALUA support */
- if (h->state == TPGS_STATE_OPTIMIZED ||
- h->state == TPGS_STATE_NONOPTIMIZED ||
- h->state == TPGS_STATE_STANDBY)
- /* Useable path if active */
- err = SCSI_DH_OK;
- else
- /* Path unuseable for unavailable/offline */
- err = SCSI_DH_DEV_OFFLINED;
+ /* Transitioning time exceeded */
+ err = SCSI_DH_RETRY;
+ break;
+ case TPGS_STATE_OFFLINE:
+ case TPGS_STATE_UNAVAILABLE:
+ /* Path unuseable for unavailable/offline */
+ err = SCSI_DH_DEV_OFFLINED;
+ break;
+ default:
+ /* Useable path if active */
+ err = SCSI_DH_OK;
+ break;
}
return err;
}
@@ -672,7 +676,9 @@ static int alua_activate(struct scsi_device *sdev,
goto out;
}
- if (h->tpgs & TPGS_MODE_EXPLICIT && h->state != TPGS_STATE_OPTIMIZED) {
+ if (h->tpgs & TPGS_MODE_EXPLICIT &&
+ h->state != TPGS_STATE_OPTIMIZED &&
+ h->state != TPGS_STATE_LBA_DEPENDENT) {
h->callback_fn = fn;
h->callback_data = data;
err = submit_stpg(h);
@@ -698,8 +704,11 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req)
struct alua_dh_data *h = get_alua_data(sdev);
int ret = BLKPREP_OK;
- if (h->state != TPGS_STATE_OPTIMIZED &&
- h->state != TPGS_STATE_NONOPTIMIZED) {
+ if (h->state == TPGS_STATE_TRANSITIONING)
+ ret = BLKPREP_DEFER;
+ else if (h->state != TPGS_STATE_OPTIMIZED &&
+ h->state != TPGS_STATE_NONOPTIMIZED &&
+ h->state != TPGS_STATE_LBA_DEPENDENT) {
ret = BLKPREP_KILL;
req->cmd_flags |= REQ_QUIET;
}
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support
2010-09-22 10:13 ` Hannes Reinecke
@ 2010-09-22 12:29 ` Mike Snitzer
2010-09-23 7:15 ` Hannes Reinecke
0 siblings, 1 reply; 14+ messages in thread
From: Mike Snitzer @ 2010-09-22 12:29 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Mike Christie, Nicholas A. Bellinger, James Bottomley, linux-scsi
On Wed, Sep 22 2010 at 6:13am -0400,
Hannes Reinecke <hare@suse.de> wrote:
> Mike Christie wrote:
> > On 09/21/2010 02:33 PM, Mike Snitzer wrote:
> >> 2) the patch also modified alua_rtpg() so implicit ALUA would retry
> >> (just like explicit ALUA currently does) if TPGS_STATE_TRANSITIONING
> >> - so why should we avoid retry for implicit but do it for explicit?
> >
> > Leaving that for Hannes. I cannot think of a reason. Probably just did
> > not do it.
>
> Finally I got around to answering this.
>
> I've attached a patch which I made the other day which seems to work
> reasonably well.
> Looks better from my side, so if you agree I'll be sending it
> upstream properly.
Looks good for covering this "2)" change above.
But "1)" change you had concern with (in my previous patch) was that
alua_prep_fn would return BLKPREP_DEFER if TPGS_STATE_TRANSITIONING.
This was bad in that FS requests (like the directio path checker) would
always call ->prep_fn and that if TPGS_STATE_TRANSITIONING the RTPG
state would never be reevaluated.. leaving us stuck in BLKPREP_DEFER.
Seems I'm missing a new flow that proves this is no longer a concern.
Please advise, thanks.
Mike
> From d3f02c90db3e3177309b78726d082e17dd772ee2 Mon Sep 17 00:00:00 2001
> From: Hannes Reinecke <hare@suse.de>
> Date: Wed, 22 Sep 2010 12:09:07 +0200
> Subject: [PATCH] scsi_dh_alua: Handle all states correctly
>
> For ALUA we should be handling all states, independent of whether
> is explicit or implicit. For 'Transitioning' we should be retry
> for a certain amount of time; after that an error should be
> returned.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
>
> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
> index 1a970a7..c6f57e3 100644
> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
...
> @@ -698,8 +704,11 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req)
> struct alua_dh_data *h = get_alua_data(sdev);
> int ret = BLKPREP_OK;
>
> - if (h->state != TPGS_STATE_OPTIMIZED &&
> - h->state != TPGS_STATE_NONOPTIMIZED) {
> + if (h->state == TPGS_STATE_TRANSITIONING)
> + ret = BLKPREP_DEFER;
> + else if (h->state != TPGS_STATE_OPTIMIZED &&
> + h->state != TPGS_STATE_NONOPTIMIZED &&
> + h->state != TPGS_STATE_LBA_DEPENDENT) {
> ret = BLKPREP_KILL;
> req->cmd_flags |= REQ_QUIET;
> }
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support
2010-09-22 12:29 ` Mike Snitzer
@ 2010-09-23 7:15 ` Hannes Reinecke
2010-09-23 13:44 ` Mike Snitzer
0 siblings, 1 reply; 14+ messages in thread
From: Hannes Reinecke @ 2010-09-23 7:15 UTC (permalink / raw)
To: Mike Snitzer
Cc: Mike Christie, Nicholas A. Bellinger, James Bottomley, linux-scsi
[-- Attachment #1: Type: text/plain, Size: 1827 bytes --]
Mike Snitzer wrote:
>
> On Wed, Sep 22 2010 at 6:13am -0400,
> Hannes Reinecke <hare@suse.de> wrote:
>
>> Mike Christie wrote:
>>> On 09/21/2010 02:33 PM, Mike Snitzer wrote:
>>>> 2) the patch also modified alua_rtpg() so implicit ALUA would retry
>>>> (just like explicit ALUA currently does) if TPGS_STATE_TRANSITIONING
>>>> - so why should we avoid retry for implicit but do it for explicit?
>>> Leaving that for Hannes. I cannot think of a reason. Probably just did
>>> not do it.
>> Finally I got around to answering this.
>>
>> I've attached a patch which I made the other day which seems to work
>> reasonably well.
>> Looks better from my side, so if you agree I'll be sending it
>> upstream properly.
>
> Looks good for covering this "2)" change above.
>
> But "1)" change you had concern with (in my previous patch) was that
> alua_prep_fn would return BLKPREP_DEFER if TPGS_STATE_TRANSITIONING.
> This was bad in that FS requests (like the directio path checker) would
> always call ->prep_fn and that if TPGS_STATE_TRANSITIONING the RTPG
> state would never be reevaluated.. leaving us stuck in BLKPREP_DEFER.
>
> Seems I'm missing a new flow that proves this is no longer a concern.
>
The device_handler will keep retrying any path in 'transitioning'.
After ALUA_FAILOVER_TIMEOUT the device handler will either have
updated the port state to something other than 'transitioning' or
returned with 'SCSI_DH_RETRY'.
But yes, you are right; we should be setting the port to something
else than 'transitioning' then. Probably 'standby' is a good choice
here.
New patch attached.
Thanks for the review.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
[-- Attachment #2: scsi-dh-alua-handle-all-states --]
[-- Type: text/plain, Size: 4832 bytes --]
>From db02ba3c33a4a1dbfcdc4de1b73e378f7b3d8925 Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@suse.de>
Date: Wed, 22 Sep 2010 12:09:07 +0200
Subject: [PATCH] scsi_dh_alua: Handle all states correctly
For ALUA we should be handling all states, independent of whether
is explicit or implicit. For 'Transitioning' we should be retry
for a certain amount of time; after that we're setting the port
to 'Standby' and return SCSI_DH_RETRY to signal upper layers
a retry is in order here.
Signed-off-by: Hannes Reinecke <hare@suse.de>
diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index 1a970a7..c7e11ed 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -31,6 +31,7 @@
#define TPGS_STATE_NONOPTIMIZED 0x1
#define TPGS_STATE_STANDBY 0x2
#define TPGS_STATE_UNAVAILABLE 0x3
+#define TPGS_STATE_LBA_DEPENDENT 0x4
#define TPGS_STATE_OFFLINE 0xe
#define TPGS_STATE_TRANSITIONING 0xf
@@ -39,6 +40,7 @@
#define TPGS_SUPPORT_NONOPTIMIZED 0x02
#define TPGS_SUPPORT_STANDBY 0x04
#define TPGS_SUPPORT_UNAVAILABLE 0x08
+#define TPGS_SUPPORT_LBA_DEPENDENT 0x10
#define TPGS_SUPPORT_OFFLINE 0x40
#define TPGS_SUPPORT_TRANSITION 0x80
@@ -460,6 +462,8 @@ static char print_alua_state(int state)
return 'S';
case TPGS_STATE_UNAVAILABLE:
return 'U';
+ case TPGS_STATE_LBA_DEPENDENT:
+ return 'L';
case TPGS_STATE_OFFLINE:
return 'O';
case TPGS_STATE_TRANSITIONING:
@@ -542,7 +546,9 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h)
int len, k, off, valid_states = 0;
char *ucp;
unsigned err;
+ unsigned long expiry, interval = 10;
+ expiry = round_jiffies_up(jiffies + ALUA_FAILOVER_TIMEOUT);
retry:
err = submit_rtpg(sdev, h);
@@ -553,7 +559,7 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h)
return SCSI_DH_IO;
err = alua_check_sense(sdev, &sense_hdr);
- if (err == ADD_TO_MLQUEUE)
+ if (err == ADD_TO_MLQUEUE && time_before(jiffies, expiry))
goto retry;
sdev_printk(KERN_INFO, sdev,
"%s: rtpg sense code %02x/%02x/%02x\n",
@@ -587,38 +593,37 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_dh_data *h)
}
sdev_printk(KERN_INFO, sdev,
- "%s: port group %02x state %c supports %c%c%c%c%c%c\n",
+ "%s: port group %02x state %c supports %c%c%c%c%c%c%c\n",
ALUA_DH_NAME, h->group_id, print_alua_state(h->state),
valid_states&TPGS_SUPPORT_TRANSITION?'T':'t',
valid_states&TPGS_SUPPORT_OFFLINE?'O':'o',
+ valid_states&TPGS_SUPPORT_LBA_DEPENDENT?'L':'l',
valid_states&TPGS_SUPPORT_UNAVAILABLE?'U':'u',
valid_states&TPGS_SUPPORT_STANDBY?'S':'s',
valid_states&TPGS_SUPPORT_NONOPTIMIZED?'N':'n',
valid_states&TPGS_SUPPORT_OPTIMIZED?'A':'a');
- if (h->tpgs & TPGS_MODE_EXPLICIT) {
- switch (h->state) {
- case TPGS_STATE_TRANSITIONING:
+ switch (h->state) {
+ case TPGS_STATE_TRANSITIONING:
+ if (time_before(jiffies, expiry)) {
/* State transition, retry */
+ interval *= 10;
+ msleep(interval);
goto retry;
- break;
- case TPGS_STATE_OFFLINE:
- /* Path is offline, fail */
- err = SCSI_DH_DEV_OFFLINED;
- break;
- default:
- break;
}
- } else {
- /* Only Implicit ALUA support */
- if (h->state == TPGS_STATE_OPTIMIZED ||
- h->state == TPGS_STATE_NONOPTIMIZED ||
- h->state == TPGS_STATE_STANDBY)
- /* Useable path if active */
- err = SCSI_DH_OK;
- else
- /* Path unuseable for unavailable/offline */
- err = SCSI_DH_DEV_OFFLINED;
+ /* Transitioning time exceeded, set port to standby */
+ err = SCSI_DH_RETRY;
+ h->state = TPGS_STATE_STANDBY;
+ break;
+ case TPGS_STATE_OFFLINE:
+ case TPGS_STATE_UNAVAILABLE:
+ /* Path unuseable for unavailable/offline */
+ err = SCSI_DH_DEV_OFFLINED;
+ break;
+ default:
+ /* Useable path if active */
+ err = SCSI_DH_OK;
+ break;
}
return err;
}
@@ -672,7 +677,9 @@ static int alua_activate(struct scsi_device *sdev,
goto out;
}
- if (h->tpgs & TPGS_MODE_EXPLICIT && h->state != TPGS_STATE_OPTIMIZED) {
+ if (h->tpgs & TPGS_MODE_EXPLICIT &&
+ h->state != TPGS_STATE_OPTIMIZED &&
+ h->state != TPGS_STATE_LBA_DEPENDENT) {
h->callback_fn = fn;
h->callback_data = data;
err = submit_stpg(h);
@@ -698,8 +705,11 @@ static int alua_prep_fn(struct scsi_device *sdev, struct request *req)
struct alua_dh_data *h = get_alua_data(sdev);
int ret = BLKPREP_OK;
- if (h->state != TPGS_STATE_OPTIMIZED &&
- h->state != TPGS_STATE_NONOPTIMIZED) {
+ if (h->state == TPGS_STATE_TRANSITIONING)
+ ret = BLKPREP_DEFER;
+ else if (h->state != TPGS_STATE_OPTIMIZED &&
+ h->state != TPGS_STATE_NONOPTIMIZED &&
+ h->state != TPGS_STATE_LBA_DEPENDENT) {
ret = BLKPREP_KILL;
req->cmd_flags |= REQ_QUIET;
}
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support
2010-09-23 7:15 ` Hannes Reinecke
@ 2010-09-23 13:44 ` Mike Snitzer
2010-09-23 18:53 ` Mike Snitzer
0 siblings, 1 reply; 14+ messages in thread
From: Mike Snitzer @ 2010-09-23 13:44 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Mike Christie, Nicholas A. Bellinger, James Bottomley, linux-scsi
On Thu, Sep 23 2010 at 3:15am -0400,
Hannes Reinecke <hare@suse.de> wrote:
> The device_handler will keep retrying any path in 'transitioning'.
> After ALUA_FAILOVER_TIMEOUT the device handler will either have
> updated the port state to something other than 'transitioning' or
> returned with 'SCSI_DH_RETRY'.
> But yes, you are right; we should be setting the port to something
> else than 'transitioning' then. Probably 'standby' is a good choice
> here.
> New patch attached.
>
> Thanks for the review.
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke zSeries & Storage
> hare@suse.de +49 911 74053 688
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Markus Rex, HRB 16746 (AG Nürnberg)
> From db02ba3c33a4a1dbfcdc4de1b73e378f7b3d8925 Mon Sep 17 00:00:00 2001
> From: Hannes Reinecke <hare@suse.de>
> Date: Wed, 22 Sep 2010 12:09:07 +0200
> Subject: [PATCH] scsi_dh_alua: Handle all states correctly
>
> For ALUA we should be handling all states, independent of whether
> is explicit or implicit. For 'Transitioning' we should be retry
> for a certain amount of time; after that we're setting the port
> to 'Standby' and return SCSI_DH_RETRY to signal upper layers
> a retry is in order here.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: scsi_dh_alua: add missing transitioning state support
2010-09-23 13:44 ` Mike Snitzer
@ 2010-09-23 18:53 ` Mike Snitzer
0 siblings, 0 replies; 14+ messages in thread
From: Mike Snitzer @ 2010-09-23 18:53 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Mike Christie, Nicholas A. Bellinger, James Bottomley, linux-scsi
On Thu, Sep 23 2010 at 9:44am -0400,
Mike Snitzer <snitzer@redhat.com> wrote:
> > From db02ba3c33a4a1dbfcdc4de1b73e378f7b3d8925 Mon Sep 17 00:00:00 2001
> > From: Hannes Reinecke <hare@suse.de>
> > Date: Wed, 22 Sep 2010 12:09:07 +0200
> > Subject: [PATCH] scsi_dh_alua: Handle all states correctly
> >
> > For ALUA we should be handling all states, independent of whether
> > is explicit or implicit. For 'Transitioning' we should be retry
> > for a certain amount of time; after that we're setting the port
> > to 'Standby' and return SCSI_DH_RETRY to signal upper layers
> > a retry is in order here.
> >
> > Signed-off-by: Hannes Reinecke <hare@suse.de>
>
> Acked-by: Mike Snitzer <snitzer@redhat.com>
BTW, we're missing the include for the msleep() introduced in this
patch:
#include <linux/delay.h>
James, any chance you'd add that?
Or should Hannes send a revised patch?
Mike
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2010-09-23 18:53 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-08-17 19:05 [PATCH] scsi_dh_alua: add missing transitioning state support Mike Snitzer
2010-08-17 19:23 ` Nicholas A. Bellinger
2010-08-30 9:36 ` Hannes Reinecke
2010-08-31 15:11 ` Mike Snitzer
2010-09-20 15:35 ` Mike Snitzer
2010-09-21 2:27 ` Mike Christie
2010-09-21 2:28 ` Mike Christie
2010-09-21 19:33 ` Mike Snitzer
2010-09-21 21:14 ` Mike Christie
2010-09-22 10:13 ` Hannes Reinecke
2010-09-22 12:29 ` Mike Snitzer
2010-09-23 7:15 ` Hannes Reinecke
2010-09-23 13:44 ` Mike Snitzer
2010-09-23 18:53 ` Mike Snitzer
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.