* [PATCH] scsi_dh_alua: handle RTPG sense code correctly during state transitions
From: Hannes Reinecke @ 2019-10-07 13:57 UTC
To: Martin K. Petersen
Cc: Christoph Hellwig, James Bottomley, Martin Wilck, linux-scsi, Hannes Reinecke

From: Hannes Reinecke <hare@suse.com>

Some arrays are not capable of returning RTPG data during state
transitioning, but rather return an 'LUN not accessible, asymmetric
access state transition' sense code. In these cases we
can set the state to 'transitioning' directly and don't need to
evaluate the RTPG data (which we won't have anyway).

Signed-off-by: Hannes Reinecke <hare@suse.com>
---
 drivers/scsi/device_handler/scsi_dh_alua.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index 4971104b1817..f32da0ca529e 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -512,6 +512,7 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
 	unsigned int tpg_desc_tbl_off;
 	unsigned char orig_transition_tmo;
 	unsigned long flags;
+	bool transitioning_sense = false;
 
 	if (!pg->expiry) {
 		unsigned long transition_tmo = ALUA_FAILOVER_TIMEOUT * HZ;
@@ -572,13 +573,19 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
 			goto retry;
 		}
 		/*
-		 * Retry on ALUA state transition or if any
-		 * UNIT ATTENTION occurred.
+		 * If the array returns with 'ALUA state transition'
+		 * sense code here it cannot return RTPG data during
+		 * transition. So set the state to 'transitioning' directly.
 		 */
 		if (sense_hdr.sense_key == NOT_READY &&
-		    sense_hdr.asc == 0x04 && sense_hdr.ascq == 0x0a)
-			err = SCSI_DH_RETRY;
-		else if (sense_hdr.sense_key == UNIT_ATTENTION)
+		    sense_hdr.asc == 0x04 && sense_hdr.ascq == 0x0a) {
+			transitioning_sense = true;
+			goto skip_rtpg;
+		}
+		/*
+		 * Retry on any other UNIT ATTENTION occurred.
+		 */
+		if (sense_hdr.sense_key == UNIT_ATTENTION)
 			err = SCSI_DH_RETRY;
 		if (err == SCSI_DH_RETRY &&
 		    pg->expiry != 0 && time_before(jiffies, pg->expiry)) {
@@ -666,7 +673,11 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
 			off = 8 + (desc[7] * 4);
 	}
 
+ skip_rtpg:
 	spin_lock_irqsave(&pg->lock, flags);
+	if (transitioning_sense)
+		pg->state = SCSI_ACCESS_STATE_TRANSITIONING;
+
 	sdev_printk(KERN_INFO, sdev,
 		    "%s: port group %02x state %c %s supports %c%c%c%c%c%c%c\n",
 		    ALUA_DH_NAME, pg->group_id, print_alua_state(pg->state),
-- 
2.16.4

* Re: [PATCH] scsi_dh_alua: handle RTPG sense code correctly during state transitions
From: Laurence Oberman @ 2019-10-07 14:15 UTC
To: Hannes Reinecke, Martin K. Petersen
Cc: Christoph Hellwig, James Bottomley, Martin Wilck, linux-scsi, Hannes Reinecke

On Mon, 2019-10-07 at 15:57 +0200, Hannes Reinecke wrote:
> From: Hannes Reinecke <hare@suse.com>
>
> Some arrays are not capable of returning RTPG data during state
> transitioning, but rather return an 'LUN not accessible, asymmetric
> access state transition' sense code. In these cases we
> can set the state to 'transitioning' directly and don't need to
> evaluate the RTPG data (which we won't have anyway).
>
> Signed-off-by: Hannes Reinecke <hare@suse.com>
> ---
>  drivers/scsi/device_handler/scsi_dh_alua.c | 21 ++++++++++++++++-----
>  1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
> index 4971104b1817..f32da0ca529e 100644
> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
> @@ -512,6 +512,7 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
>  	unsigned int tpg_desc_tbl_off;
>  	unsigned char orig_transition_tmo;
>  	unsigned long flags;
> +	bool transitioning_sense = false;
>
>  	if (!pg->expiry) {
>  		unsigned long transition_tmo = ALUA_FAILOVER_TIMEOUT * HZ;
> @@ -572,13 +573,19 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
>  			goto retry;
>  		}
>  		/*
> -		 * Retry on ALUA state transition or if any
> -		 * UNIT ATTENTION occurred.
> +		 * If the array returns with 'ALUA state transition'
> +		 * sense code here it cannot return RTPG data during
> +		 * transition. So set the state to 'transitioning' directly.
>  		 */
>  		if (sense_hdr.sense_key == NOT_READY &&
> -		    sense_hdr.asc == 0x04 && sense_hdr.ascq == 0x0a)
> -			err = SCSI_DH_RETRY;
> -		else if (sense_hdr.sense_key == UNIT_ATTENTION)
> +		    sense_hdr.asc == 0x04 && sense_hdr.ascq == 0x0a) {
> +			transitioning_sense = true;
> +			goto skip_rtpg;
> +		}
> +		/*
> +		 * Retry on any other UNIT ATTENTION occurred.
> +		 */
> +		if (sense_hdr.sense_key == UNIT_ATTENTION)
>  			err = SCSI_DH_RETRY;
>  		if (err == SCSI_DH_RETRY &&
>  		    pg->expiry != 0 && time_before(jiffies, pg->expiry)) {
> @@ -666,7 +673,11 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
>  			off = 8 + (desc[7] * 4);
>  	}
>
> + skip_rtpg:
>  	spin_lock_irqsave(&pg->lock, flags);
> +	if (transitioning_sense)
> +		pg->state = SCSI_ACCESS_STATE_TRANSITIONING;
> +
>  	sdev_printk(KERN_INFO, sdev,
>  		    "%s: port group %02x state %c %s supports %c%c%c%c%c%c%c\n",
>  		    ALUA_DH_NAME, pg->group_id, print_alua_state(pg->state),

This makes sense to me and has affected recovery timeouts in the past.
Code looks correct to me.

Reviewed-by: Laurence Oberman <loberman@redhat.com>

* Re: [PATCH] scsi_dh_alua: handle RTPG sense code correctly during state transitions
From: Ewan D. Milne @ 2019-10-07 20:45 UTC
To: Hannes Reinecke, Martin K. Petersen
Cc: Christoph Hellwig, James Bottomley, Martin Wilck, linux-scsi, Hannes Reinecke

See below.

On Mon, 2019-10-07 at 15:57 +0200, Hannes Reinecke wrote:
> From: Hannes Reinecke <hare@suse.com>
>
> Some arrays are not capable of returning RTPG data during state
> transitioning, but rather return an 'LUN not accessible, asymmetric
> access state transition' sense code. In these cases we
> can set the state to 'transitioning' directly and don't need to
> evaluate the RTPG data (which we won't have anyway).
>
> Signed-off-by: Hannes Reinecke <hare@suse.com>
> ---
>  drivers/scsi/device_handler/scsi_dh_alua.c | 21 ++++++++++++++++-----
>  1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
> index 4971104b1817..f32da0ca529e 100644
> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
> @@ -512,6 +512,7 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
>  	unsigned int tpg_desc_tbl_off;
>  	unsigned char orig_transition_tmo;
>  	unsigned long flags;
> +	bool transitioning_sense = false;
>
>  	if (!pg->expiry) {
>  		unsigned long transition_tmo = ALUA_FAILOVER_TIMEOUT * HZ;
> @@ -572,13 +573,19 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
>  			goto retry;
>  		}
>  		/*
> -		 * Retry on ALUA state transition or if any
> -		 * UNIT ATTENTION occurred.
> +		 * If the array returns with 'ALUA state transition'
> +		 * sense code here it cannot return RTPG data during
> +		 * transition. So set the state to 'transitioning' directly.
>  		 */
>  		if (sense_hdr.sense_key == NOT_READY &&
> -		    sense_hdr.asc == 0x04 && sense_hdr.ascq == 0x0a)
> -			err = SCSI_DH_RETRY;
> -		else if (sense_hdr.sense_key == UNIT_ATTENTION)
> +		    sense_hdr.asc == 0x04 && sense_hdr.ascq == 0x0a) {
> +			transitioning_sense = true;
> +			goto skip_rtpg;
> +		}
> +		/*
> +		 * Retry on any other UNIT ATTENTION occurred.
> +		 */
> +		if (sense_hdr.sense_key == UNIT_ATTENTION)
>  			err = SCSI_DH_RETRY;
>  		if (err == SCSI_DH_RETRY &&
>  		    pg->expiry != 0 && time_before(jiffies, pg->expiry)) {
> @@ -666,7 +673,11 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
>  			off = 8 + (desc[7] * 4);
>  	}
>
> + skip_rtpg:
>  	spin_lock_irqsave(&pg->lock, flags);
> +	if (transitioning_sense)
> +		pg->state = SCSI_ACCESS_STATE_TRANSITIONING;
> +
>  	sdev_printk(KERN_INFO, sdev,
>  		    "%s: port group %02x state %c %s supports %c%c%c%c%c%c%c\n",
>  		    ALUA_DH_NAME, pg->group_id, print_alua_state(pg->state),

The patch itself looks OK, but I was wondering about a couple of things:

- There are other places in scsi_dh_alua where the ASC/ASCQ 04 0A is checked
  and we retry, I understand that this is a particular case you are solving
  but is the changing of the state to -> transitioning (because that's what
  the device said the state was) applicable in those other cases?

- The code originally seems to have been under the assumption that the
  transitioning state was a transient event, so the retry would pick up
  the eventual state. Now, some storage arrays spend a long time in the
  transitioning state, but if we don't send another command are we going to
  get the sense (or the UA) that triggers entry to the eventual ALUA state?

-Ewan

* Re: [PATCH] scsi_dh_alua: handle RTPG sense code correctly during state transitions
From: Hannes Reinecke @ 2019-10-08 6:21 UTC
To: Ewan D. Milne, Martin K. Petersen
Cc: Christoph Hellwig, James Bottomley, Martin Wilck, linux-scsi, Hannes Reinecke

On 10/7/19 10:45 PM, Ewan D. Milne wrote:
> See below.
>
> On Mon, 2019-10-07 at 15:57 +0200, Hannes Reinecke wrote:
>> From: Hannes Reinecke <hare@suse.com>
>>
>> Some arrays are not capable of returning RTPG data during state
>> transitioning, but rather return an 'LUN not accessible, asymmetric
>> access state transition' sense code. In these cases we
>> can set the state to 'transitioning' directly and don't need to
>> evaluate the RTPG data (which we won't have anyway).
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.com>
>> ---
>>  drivers/scsi/device_handler/scsi_dh_alua.c | 21 ++++++++++++++++-----
>>  1 file changed, 16 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
>> index 4971104b1817..f32da0ca529e 100644
>> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
>> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
>> @@ -512,6 +512,7 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
>>  	unsigned int tpg_desc_tbl_off;
>>  	unsigned char orig_transition_tmo;
>>  	unsigned long flags;
>> +	bool transitioning_sense = false;
>>
>>  	if (!pg->expiry) {
>>  		unsigned long transition_tmo = ALUA_FAILOVER_TIMEOUT * HZ;
>> @@ -572,13 +573,19 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
>>  			goto retry;
>>  		}
>>  		/*
>> -		 * Retry on ALUA state transition or if any
>> -		 * UNIT ATTENTION occurred.
>> +		 * If the array returns with 'ALUA state transition'
>> +		 * sense code here it cannot return RTPG data during
>> +		 * transition. So set the state to 'transitioning' directly.
>>  		 */
>>  		if (sense_hdr.sense_key == NOT_READY &&
>> -		    sense_hdr.asc == 0x04 && sense_hdr.ascq == 0x0a)
>> -			err = SCSI_DH_RETRY;
>> -		else if (sense_hdr.sense_key == UNIT_ATTENTION)
>> +		    sense_hdr.asc == 0x04 && sense_hdr.ascq == 0x0a) {
>> +			transitioning_sense = true;
>> +			goto skip_rtpg;
>> +		}
>> +		/*
>> +		 * Retry on any other UNIT ATTENTION occurred.
>> +		 */
>> +		if (sense_hdr.sense_key == UNIT_ATTENTION)
>>  			err = SCSI_DH_RETRY;
>>  		if (err == SCSI_DH_RETRY &&
>>  		    pg->expiry != 0 && time_before(jiffies, pg->expiry)) {
>> @@ -666,7 +673,11 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
>>  			off = 8 + (desc[7] * 4);
>>  	}
>>
>> + skip_rtpg:
>>  	spin_lock_irqsave(&pg->lock, flags);
>> +	if (transitioning_sense)
>> +		pg->state = SCSI_ACCESS_STATE_TRANSITIONING;
>> +
>>  	sdev_printk(KERN_INFO, sdev,
>>  		    "%s: port group %02x state %c %s supports %c%c%c%c%c%c%c\n",
>>  		    ALUA_DH_NAME, pg->group_id, print_alua_state(pg->state),
>
> The patch itself looks OK, but I was wondering about a couple of things:
>
> - There are other places in scsi_dh_alua where the ASC/ASCQ 04 0A is checked
>   and we retry, I understand that this is a particular case you are solving
>   but is the changing of the state to -> transitioning (because that's what
>   the device said the state was) applicable in those other cases?

No. The original code was built around the assumption that RTPG would
return the status of the device; consequently we would have to retry
RTPG until we get a final status. But as mentioned, there are arrays
which cannot return RTPG data during transitioning, so the code would
never be able to detect a transitioning state.
With this patch we set the state directly once the said sense code is
received.
But this applies _only_ to the RTPG command, as this is required to move
the state machine along.
None of the other commands are affected.

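For readers who want the RTPG-only decision described above in isolation, here is a minimal standalone sketch. It is a simplified model and not the kernel code: `struct sense`, `enum rtpg_action`, and `classify_rtpg_sense()` are hypothetical names introduced for illustration, and the other checks the real `alua_rtpg()` performs around this point are left out. Only the sense key values (NOT READY = 0x02, UNIT ATTENTION = 0x06) and the ASC/ASCQ pair 04h/0Ah come from the SCSI standard and the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Standard SCSI sense key values. */
#define NOT_READY      0x02
#define UNIT_ATTENTION 0x06

/* Hypothetical outcome names; these are not the kernel's SCSI_DH_* codes. */
enum rtpg_action {
	RTPG_FAIL,               /* treat as an error */
	RTPG_RETRY,              /* resend the RTPG command */
	RTPG_MARK_TRANSITIONING, /* skip RTPG parsing, record 'transitioning' */
};

/* Hypothetical, reduced stand-in for the decoded sense header. */
struct sense {
	uint8_t key;
	uint8_t asc;
	uint8_t ascq;
};

/* Decide what to do with a failed RTPG, following the patch's logic. */
static enum rtpg_action classify_rtpg_sense(struct sense s)
{
	/* 'LUN not accessible, asymmetric access state transition' */
	if (s.key == NOT_READY && s.asc == 0x04 && s.ascq == 0x0a)
		return RTPG_MARK_TRANSITIONING;

	/* Any UNIT ATTENTION is still retried. */
	if (s.key == UNIT_ATTENTION)
		return RTPG_RETRY;

	return RTPG_FAIL;
}
```

Before the patch, the first branch would have returned the retry outcome as well, which is why an array that never returns RTPG data while transitioning could not be seen in the 'transitioning' state.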
> - The code originally seems to have been under the assumption that the
>   transitioning state was a transient event, so the retry would pick up
>   the eventual state. Now, some storage arrays spend a long time in the
>   transitioning state, but if we don't send another command are we going to
>   get the sense (or the UA) that triggers entry to the eventual ALUA state?
>
Note, there are two types of retries.
The one is the 'normal' command retry, where we resend a command a given
number of times to retrieve the final status.
This is precisely the error which caused this patch.

And then there is a scheduled retry; here we essentially poll the array
with sending RTPG in regular intervals until the 'transitioning' state
is gone. (Check for 'alua_rtpg()' and the handling of the SCSI_DH_RETRY
return value). With the patch we continue to trigger that second type of
retries, which will eventually clear the transitioning state.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                   Teamlead Storage & Networking
hare@suse.de                                      +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 247165 (AG München), GF: Felix Imendörffer

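The second, scheduled kind of retry described in the message above can be pictured with a small standalone model. This is only a sketch under stated assumptions: `struct fake_array`, `fake_rtpg()`, and `poll_until_settled()` are hypothetical, the array is simulated, and a plain loop stands in for whatever deferred rescheduling the driver actually uses between polls.

```c
#include <assert.h>

/* Reduced, hypothetical set of access states for the model. */
enum access_state {
	STATE_OPTIMAL,
	STATE_TRANSITIONING,
};

/* Hypothetical array that reports 'transitioning' for a fixed number of polls. */
struct fake_array {
	int polls_until_settled;
};

/* Simulated RTPG: returns the state the array would report right now. */
static enum access_state fake_rtpg(struct fake_array *a)
{
	if (a->polls_until_settled > 0) {
		a->polls_until_settled--;
		return STATE_TRANSITIONING;
	}
	return STATE_OPTIMAL;
}

/*
 * Poll until the array leaves the transitioning state.
 * Returns the number of polls issued, or -1 if max_polls was exhausted.
 */
static int poll_until_settled(struct fake_array *a, int max_polls)
{
	for (int n = 1; n <= max_polls; n++) {
		if (fake_rtpg(a) != STATE_TRANSITIONING)
			return n;
		/* A real driver would wait for the next interval here. */
	}
	return -1;
}
```

With an array that needs three polls to settle, the fourth poll observes the final state; if the poll budget runs out first, the caller gets -1 and can treat the transition as timed out.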
* Re: [PATCH] scsi_dh_alua: handle RTPG sense code correctly during state transitions
From: Ewan D. Milne @ 2019-10-08 15:58 UTC
To: Hannes Reinecke, Martin K. Petersen
Cc: Christoph Hellwig, James Bottomley, Martin Wilck, linux-scsi, Hannes Reinecke

On Tue, 2019-10-08 at 08:21 +0200, Hannes Reinecke wrote:
> On 10/7/19 10:45 PM, Ewan D. Milne wrote:
> >
> > The patch itself looks OK, but I was wondering about a couple of things:
> >
> > - There are other places in scsi_dh_alua where the ASC/ASCQ 04 0A is checked
> >   and we retry, I understand that this is a particular case you are solving
> >   but is the changing of the state to -> transitioning (because that's what
> >   the device said the state was) applicable in those other cases?
>
> No. The original code was built around the assumption that RTPG would
> return the status of the device; consequently we would have to retry
> RTPG until we get a final status. But as mentioned, there are arrays
> which cannot return RTPG data during transitioning, so the code would
> never be able to detect a transitioning state.
> With this patch we set the state directly once the said sense code is
> received.
> But this applies _only_ to the RTPG command, as this is required to move
> the state machine along.
> None of the other commands are affected.
>
> > - The code originally seems to have been under the assumption that the
> >   transitioning state was a transient event, so the retry would pick up
> >   the eventual state. Now, some storage arrays spend a long time in the
> >   transitioning state, but if we don't send another command are we going to
> >   get the sense (or the UA) that triggers entry to the eventual ALUA state?
> >
>
> Note, there are two types of retries.
> The one is the 'normal' command retry, where we resend a command a given
> number of times to retrieve the final status.
> This is precisely the error which caused this patch.
>
> And then there is a scheduled retry; here we essentially poll the array
> with sending RTPG in regular intervals until the 'transitioning' state
> is gone. (Check for 'alua_rtpg()' and the handling of the SCSI_DH_RETRY
> return value). With the patch we continue to trigger that second type of
> retries, which will eventually clear the transitioning state.
>
> Cheers,
>
> Hannes

Thanks for the explanation. The patch looks good.

Reviewed-by: Ewan D. Milne <emilne@redhat.com>

* Re: [PATCH] scsi_dh_alua: handle RTPG sense code correctly during state transitions
From: Bart Van Assche @ 2019-10-09 16:31 UTC
To: Hannes Reinecke, Martin K. Petersen
Cc: Christoph Hellwig, James Bottomley, Martin Wilck, linux-scsi, Hannes Reinecke

On 10/7/19 6:57 AM, Hannes Reinecke wrote:
> Some arrays are not capable of returning RTPG data during state
> transitioning, but rather return an 'LUN not accessible, asymmetric
> access state transition' sense code. In these cases we
> can set the state to 'transitioning' directly and don't need to
> evaluate the RTPG data (which we won't have anyway).

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

* Re: [PATCH] scsi_dh_alua: handle RTPG sense code correctly during state transitions
From: Martin K. Petersen @ 2019-10-10 2:43 UTC
To: Hannes Reinecke
Cc: Martin K. Petersen, Christoph Hellwig, James Bottomley, Martin Wilck, linux-scsi, Hannes Reinecke

Hannes,

> Some arrays are not capable of returning RTPG data during state
> transitioning, but rather return an 'LUN not accessible, asymmetric
> access state transition' sense code. In these cases we can set the
> state to 'transitioning' directly and don't need to evaluate the RTPG
> data (which we won't have anyway).

Applied to 5.4/scsi-fixes, thank you!

-- 
Martin K. Petersen	Oracle Linux Engineering