* [LSF/MM Topic] SCSI Unit Attention Handling
@ 2011-02-06 20:44 Richard Sharpe
  2011-02-06 22:32 ` Shyam_Iyer
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Sharpe @ 2011-02-06 20:44 UTC (permalink / raw)
  To: lsf-pc, linux-scsi, Hannes Reinecke

I would like to propose a topic around SCSI Unit Attention Handling.

The current scsi_error.c:scsi_check_sense handling of UNIT ATTENTION
consists of explicitly printing warnings for ASC=0x3f events and
then returning SOFT_ERROR, which scsi_error.c:scsi_decide_disposition
ignores because it returns SUCCESS to SOFT_ERROR being returned from
scsi_check_sense on a CHECK_CONDITION.

There are a number of cases where we might want to perform further
processing on a UNIT ATTENTION. For example, ASC/ASCQ 0x3f/0x0e
REPORTED LUNS DATA HAS CHANGED or 0x2a/0x09 CAPACITY DATA HAS CHANGED,
0x28/0x03 IMPORT/EXPORT ELEMENT ACCESSED, MEDIUM CHANGED, etc. When
the LUNS have changed it would be useful to have a rescan performed
automatically. If capacity data has changed, it would be useful if
someone could react to that and perhaps resize the file system on that
LUN if possible, and so forth.

It is not clear that any of these items should be handled in the
kernel anyway, and perhaps they should be exported to user-space for
correct handling, but rather than just the raw SENSE data being
exported, perhaps some sort of relevant event should be exported.

To avoid having to code all of the relevant combinations in the above
routine, Hannes and I have been discussing a framework for handling
this. Hannes suggested a notifier chain of some sort to deal with
this, and points out that because the above routine is called in a
softirq context we don't want to be performing lots of processing in
that context.

It seems that we need to defer processing of these items as well as
provide some mechanism for drivers (sd.c, st.c, etc.) to register the
UNIT ATTENTIONs they are interested in. The registration seems quite
straightforward ... each driver can provide a list of the ASC/ASCQ
pairs they are interested in and a mapping to an event of some sort,
but the issue then is how to defer this processing. One approach I
have thought of is to extend the error handler thread to handle these
sorts of events and on a UNIT ATTENTION give the command to the error
handling thread. However, others might suggest that the processing
done in the error handler thread should be moved to work queues
anyway, and overloading the error handling thread like this is the
wrong way to do this and that they would rather see the error handling
thread go away.
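
To make the registration and deferral idea concrete, here is a minimal
user-space sketch in Python. It is not kernel code. The names UARegistry,
register, dispatch and run_deferred, and the event names, are hypothetical;
the pending queue merely stands in for whichever mechanism (error handler
thread or work queue) ends up doing the work outside softirq context.

```python
from collections import defaultdict, deque

UNIT_ATTENTION = 0x06  # SCSI sense key value for UNIT ATTENTION


class UARegistry:
    """Hypothetical model: drivers register ASC/ASCQ pairs and an event name."""

    def __init__(self):
        # (asc, ascq) -> list of (driver, event) registrations
        self._table = defaultdict(list)
        # Work queued by the fast path for later processing
        self._pending = deque()

    def register(self, driver, asc, ascq, event):
        self._table[(asc, ascq)].append((driver, event))

    def dispatch(self, device, sense_key, asc, ascq):
        """Fast path: look up and queue only; returns the number queued."""
        if sense_key != UNIT_ATTENTION:
            return 0
        matches = self._table.get((asc, ascq), [])
        for driver, event in matches:
            self._pending.append((driver, device, event))
        return len(matches)

    def run_deferred(self, handler):
        """Slow path: drain the queue from a context that is allowed to block."""
        handled = 0
        while self._pending:
            driver, device, event = self._pending.popleft()
            handler(driver, device, event)
            handled += 1
        return handled


registry = UARegistry()
registry.register("sd", 0x2A, 0x09, "CAPACITY_CHANGED")
registry.register("sd", 0x3F, 0x0E, "LUNS_CHANGED")
registry.register("st", 0x28, 0x03, "MEDIUM_CHANGED")

queued = registry.dispatch("sda", UNIT_ATTENTION, 0x2A, 0x09)
seen = []
registry.run_deferred(lambda drv, dev, ev: seen.append((drv, dev, ev)))
```

The point of the split is that dispatch only does a table lookup and an
append, while anything expensive happens in run_deferred.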

So, I would like to have a discussion around the issues involved in
providing some sort of a framework for letting drivers indicate what
UNIT ATTENTIONS they are interested in and how to handle those, either
by exporting them to userspace or providing a callback or other
mechanism for handling them. We also need some discussion around
communicating with user space: whether to use uevent/udev, netlink
(Hannes suggests this has issues in heavy memory use cases), relayfs,
etc.

-- 
Regards,
Richard Sharpe


* RE: [LSF/MM Topic] SCSI Unit Attention Handling
  2011-02-06 20:44 [LSF/MM Topic] SCSI Unit Attention Handling Richard Sharpe
@ 2011-02-06 22:32 ` Shyam_Iyer
  2011-02-07  7:46   ` Hannes Reinecke
  2011-02-08  2:00   ` Richard Sharpe
  0 siblings, 2 replies; 8+ messages in thread
From: Shyam_Iyer @ 2011-02-06 22:32 UTC (permalink / raw)
  To: realrichardsharpe, lsf-pc, linux-scsi, hare



> -----Original Message-----
> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
> owner@vger.kernel.org] On Behalf Of Richard Sharpe
> Sent: Sunday, February 06, 2011 3:44 PM
> To: lsf-pc@lists.linuxfoundation.org; linux-scsi; Hannes Reinecke
> Subject: [LSF/MM Topic] SCSI Unit Attention Handling
> 
> I would like to propose a topic around SCSI Unit Attention Handling.
> 
> The current scsi_error.c:scsi_check_sense handling of UNIT ATTENTION
> consists of explicitly printing warnings for ASC=0x3f events and
> then returning SOFT_ERROR, which scsi_error.c:scsi_decide_disposition
> ignores because it returns SUCCESS to SOFT_ERROR being returned from
> scsi_check_sense on a CHECK_CONDITION.
> 
> There are a number of cases where we might want to perform further
> processing on a UNIT ATTENTION. For example, ASC/ASCQ 0x3f/0x0e
> REPORTED LUNS DATA HAS CHANGED or 0x2a/0x09 CAPACITY DATA HAS CHANGED,
> 0x28/0x03 IMPORT/EXPORT ELEMENT ACCESSED, MEDIUM CHANGED, etc. When
> the LUNS have changed it would be useful to have a rescan performed
> automatically. If capacity data has changed, it would be useful if
> someone could react to that and perhaps resize the file system on that
> LUN if possible, and so forth.
> 
> It is not clear that any of these items should be handled in the
> kernel anyway, and perhaps they should be exported to user-space for
> correct handling, but rather than just the raw SENSE data being
> exported, perhaps some sort of relevant event should be exported.
> 
We spoke about this at the Plumbers conference last November as well, and the few ideas we had then were to handle them via scsi netlink.
I see that Hannes is working on a relayfs method to handle them.

Some of the new problems that we can see with handling such events are -

If the thin provisioned LUN is snapshotted or cloned then you can also get a flurry of UNIT ATTENTIONs for the same data that has been replicated.


> To avoid having to code all of the relevant combinations in the above
> routine, Hannes and I have been discussing a framework for handling
> this. Hannes suggested a notifier chain of some sort to deal with
> this, and points out that because the above routine is called in a
> softirq context we don't want to be performing lots of processing in
> that context.
> 
I am curious why the scsi_netlink framework was abandoned, or possibly not considered.

> It seems that we need to defer processing of these items as well as
> provide some mechanism for drivers (sd.c, st.c, etc.) to register the
> UNIT ATTENTIONs they are interested in. The registration seems quite
> straightforward ... each driver can provide a list of the ASC/ASCQ
> pairs they are interested in and a mapping to an event of some sort,
> but the issue then is how to defer this processing. One approach I
> have thought of is to extend the error handler thread to handle these
> sorts of events and on a UNIT ATTENTION give the command to the error
> handling thread. However, others might suggest that the processing
> done in the error handler thread should be moved to work queues
> anyway, and overloading the error handling thread like this is the
> wrong way to do this and that they would rather see the error handling
> thread go away.
> 
> So, I would like to have a discussion around the issues involved in
> providing some sort of a framework for letting drivers indicate what
> UNIT ATTENTIONS they are interested in and how to handle those, either
> by exporting them to userspace or providing a callback or other
> mechanism for handling them. We also need some discussion around
> communicating with user space. Whether to use uevent/udev, use netlink
> (Hannes suggests this has issues in heavy memory use cases), relayfs,
> etc.
I see that the uevent method has been tried in the past. I am not inclined towards any particular mechanism at the moment, but although the events will follow T10 guidelines, their frequency is vendor dependent and user configurable, so they need to be tied to a thin profile.

In another thread Douglas Gilbert talks about improving efficiency of sparse files and I think that such events can be very closely tied to creating profiles per LUN before formatting them and taking dynamic corrective actions.

-Shyam
> 
> --
> Regards,
> Richard Sharpe
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi"
> in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [LSF/MM Topic] SCSI Unit Attention Handling
  2011-02-06 22:32 ` Shyam_Iyer
@ 2011-02-07  7:46   ` Hannes Reinecke
  2011-02-08  2:00   ` Richard Sharpe
  1 sibling, 0 replies; 8+ messages in thread
From: Hannes Reinecke @ 2011-02-07  7:46 UTC (permalink / raw)
  To: Shyam_Iyer; +Cc: realrichardsharpe, lsf-pc, linux-scsi

On 02/06/2011 11:32 PM, Shyam_Iyer@Dell.com wrote:
> 
> 
>> -----Original Message-----
>> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
>> owner@vger.kernel.org] On Behalf Of Richard Sharpe
>> Sent: Sunday, February 06, 2011 3:44 PM
>> To: lsf-pc@lists.linuxfoundation.org; linux-scsi; Hannes Reinecke
>> Subject: [LSF/MM Topic] SCSI Unit Attention Handling
>>
>> I would like to propose a topic around SCSI Unit Attention Handling.
>>
>> The current scsi_error.c:scsi_check_sense handling of UNIT ATTENTION
>> consists of explicitly printing warnings for ASC=0x3f events and
>> then returning SOFT_ERROR, which scsi_error.c:scsi_decide_disposition
>> ignores because it returns SUCCESS to SOFT_ERROR being returned from
>> scsi_check_sense on a CHECK_CONDITION.
>>
>> There are a number of cases where we might want to perform further
>> processing on a UNIT ATTENTION. For example, ASC/ASCQ 0x3f/0x0e
>> REPORTED LUNS DATA HAS CHANGED or 0x2a/0x09 CAPACITY DATA HAS CHANGED,
>> 0x28/0x03 IMPORT/EXPORT ELEMENT ACCESSED, MEDIUM CHANGED, etc. When
>> the LUNS have changed it would be useful to have a rescan performed
>> automatically. If capacity data has changed, it would be useful if
>> someone could react to that and perhaps resize the file system on that
>> LUN if possible, and so forth.
>>
>> It is not clear that any of these items should be handled in the
>> kernel anyway, and perhaps they should be exported to user-space for
>> correct handling, but rather than just the raw SENSE data being
>> exported, perhaps some sort of relevant event should be exported.
>>
> We spoke about this in the plumbers conf last November as well and
> the few ideas then was to handle them via scsi netlink.
> I see that Hannes is working on a relayfs method to handle them.
>
I made an initial framework using netlink some way back (and it's
actually part of SLES11 :-), but I figured it's not the best way of
handling things.

> Some of the new problems that we can see with handling such events are -
> 
> If the thin provisioned LUN is snapshotted or cloned then you can also
> get a flurry of UNIT attentions for the same data that has been
> replicated.
>
Yes.


> 
>> To avoid having to code all of the relevant combinations in the above
>> routine, Hannes and I have been discussing a framework for handling
>> this. Hannes suggested a notifier chain of some sort to deal with
>> this, and points out that because the above routine is called in a
>> softirq context we don't want to be performing lots of processing in
>> that context.
>>
> I guess my curiosity would be on why the scsi_netlink framework
> abandoned or possibly not considered..
>
It was (see above).
There are two major issues with it:
- memory allocation: For each and every event you have to allocate
  skbs. Either you do it in-line (ie at the time when the event
  happens), which means you have to do a memory allocation in the
  interrupt service routine. Or you do it asynchronously, in which
  case you have to have a separate memory area into which the event
  can be stored temporarily before the skb is allocated.
  But then you already have some sort of ring-buffer here, which
  you might as well use directly and do away with the skbs
  altogether -> relayfs.
- Scalability. I'm not sure how well netlink behaves under pressure,
  and what does happen with those events (blame me for not being a
  network guy). ISTR that netlink will just drop events if the
  buffer is full.
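
To illustrate the ring-buffer argument above, here is a small user-space
sketch in Python. It is not the relayfs API; the class EventRing and its
methods are hypothetical. It models a buffer whose slots are set up once, so
that recording an event needs no per-event buffer like an skb, and it makes
the overflow behaviour explicit by counting overwritten events instead of
losing them silently.

```python
class EventRing:
    """Hypothetical fixed-size ring; all slots are created once, up front."""

    def __init__(self, capacity):
        self._slots = [None] * capacity  # storage set up at init time
        self._capacity = capacity
        self._head = 0        # index of the oldest unread event
        self._count = 0       # number of unread events
        self.overwritten = 0  # events lost because the reader was too slow

    def produce(self, event):
        """Models the interrupt side: stores into an existing slot."""
        tail = (self._head + self._count) % self._capacity
        self._slots[tail] = event
        if self._count == self._capacity:
            # Ring was full, so the oldest event has just been overwritten.
            self._head = (self._head + 1) % self._capacity
            self.overwritten += 1
        else:
            self._count += 1

    def consume(self):
        """Models the reader side; returns None when the ring is empty."""
        if self._count == 0:
            return None
        event = self._slots[self._head]
        self._head = (self._head + 1) % self._capacity
        self._count -= 1
        return event


ring = EventRing(3)
for seq in range(5):
    ring.produce(seq)

drained = []
while True:
    item = ring.consume()
    if item is None:
        break
    drained.append(item)
```

With a capacity of 3 and 5 events produced, the reader sees the three newest
events and the counter records that two were overwritten.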

>> It seems that we need to defer processing of these items as well as
>> provide some mechanism for drivers (sd.c, st.c, etc.) to register the
>> UNIT ATTENTIONs they are interested in. The registration seems quite
>> straightforward ... each driver can provide a list of the ASC/ASCQ
>> pairs they are interested in and a mapping to an event of some sort,
>> but the issue then is how to defer this processing. One approach I
>> have thought of is to extend the error handler thread to handle these
>> sorts of events and on a UNIT ATTENTION give the command to the error
>> handling thread. However, others might suggest that the processing
>> done in the error handler thread should be moved to work queues
>> anyway, and overloading the error handling thread like this is the
>> wrong way to do this and that they would rather see the error handling
>> thread go away.
>>
>> So, I would like to have a discussion around the issues involved in
>> providing some sort of a framework for letting drivers indicate what
>> UNIT ATTENTIONS they are interested in and how to handle those, either
>> by exporting them to userspace or providing a callback or other
>> mechanism for handling them. We also need some discussion around
>> communicating with user space. Whether to use uevent/udev, use netlink
>> (Hannes suggests this has issues in heavy memory use cases), relayfs,
>> etc.
> I see that the uevent method has been tried in the past.. and I am not
> currently inclined to anything at the moment but I can think that
> although the events will follow T10 guidelines, the frequency of the
> events is vendor dependent and user configurable.
> So they need to be tied to a thin profile.
Errm. Yes to the former, but I'd be very surprised if we can _set_
the frequency of the events.

> 
> In another thread Douglas Gilbert talks about improving efficiency
> of sparse files and I think that such events can be very closely tied
> to creating profiles per LUN before formatting them and taking dynamic
> corrective actions.
> 
Profiles for LUNs? Vendor-independent?

Now _that_ would be a very interesting topic to talk about :-)

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)


* Re: [LSF/MM Topic] SCSI Unit Attention Handling
  2011-02-06 22:32 ` Shyam_Iyer
  2011-02-07  7:46   ` Hannes Reinecke
@ 2011-02-08  2:00   ` Richard Sharpe
  2011-02-08  5:06     ` Shyam_Iyer
  1 sibling, 1 reply; 8+ messages in thread
From: Richard Sharpe @ 2011-02-08  2:00 UTC (permalink / raw)
  To: Shyam_Iyer; +Cc: lsf-pc, linux-scsi, hare

On Sun, Feb 6, 2011 at 5:32 PM,  <Shyam_Iyer@dell.com> wrote:
>
>
>> -----Original Message-----
>> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
>> owner@vger.kernel.org] On Behalf Of Richard Sharpe
>> Sent: Sunday, February 06, 2011 3:44 PM
>> To: lsf-pc@lists.linuxfoundation.org; linux-scsi; Hannes Reinecke
>> Subject: [LSF/MM Topic] SCSI Unit Attention Handling
>>
>> I would like to propose a topic around SCSI Unit Attention Handling.
>>
>> The current scsi_error.c:scsi_check_sense handling of UNIT ATTENTION
>> consists of explicitly printing warnings for ASC=0x3f events and
>> then returning SOFT_ERROR, which scsi_error.c:scsi_decide_disposition
>> ignores because it returns SUCCESS to SOFT_ERROR being returned from
>> scsi_check_sense on a CHECK_CONDITION.
>>
>> There are a number of cases where we might want to perform further
>> processing on a UNIT ATTENTION. For example, ASC/ASCQ 0x3f/0x0e
>> REPORTED LUNS DATA HAS CHANGED or 0x2a/0x09 CAPACITY DATA HAS CHANGED,
>> 0x28/0x03 IMPORT/EXPORT ELEMENT ACCESSED, MEDIUM CHANGED, etc. When
>> the LUNS have changed it would be useful to have a rescan performed
>> automatically. If capacity data has changed, it would be useful if
>> someone could react to that and perhaps resize the file system on that
>> LUN if possible, and so forth.
>>
>> It is not clear that any of these items should be handled in the
>> kernel anyway, and perhaps they should be exported to user-space for
>> correct handling, but rather than just the raw SENSE data being
>> exported, perhaps some sort of relevant event should be exported.
>>
> We spoke about this in the plumbers conf last November as well and the few ideas then was to handle them via scsi netlink.
> I see that Hannes is working on a relayfs method to handle them.
>
> Some of the new problems that we can see with handling such events are -
>
> If the thin provisioned LUN is snapshotted or cloned then you can also get a flurry of UNIT attentions for the same data
> that has been replicated.

So, I wonder if adding just the ability for SCSI upper drivers (sd,
st, etc) to register interest in different UNIT ATTENTIONS is all that
interesting and whether vendors would rather have the ability to tell
drivers (via an ioctl, say) the UNIT ATTENTIONS they are interested
in, and how they should be mapped.

It might be more useful to allow user-land utilities to perform the re-scanning.

I would imagine that you will get unit attentions saying that REPORTED
LUNS DATA HAS CHANGED, but what other UNIT ATTENTIONS would you get?
If you add storage to a LUN, then perhaps CAPACITY DATA HAS CHANGED.

Perhaps there is also a need to say things like, for these ASC/ASCQ
values, take the device off line, and all the rest are just advisory
but pass them all to user land as well.
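
As a rough illustration of that last idea, here is a minimal Python sketch of
a policy lookup. The function decide, the constants, and the choice of which
ASC/ASCQ pair maps to "offline" are all hypothetical and purely illustrative;
the only behaviour taken from the paragraph above is that listed pairs take
the device offline, everything else is advisory, and every UA is still passed
to user land.

```python
OFFLINE = "offline"
ADVISORY = "advisory"

# Hypothetical policy: the pair listed here is an arbitrary example only.
OFFLINE_PAIRS = {(0x3F, 0x0E)}


def decide(asc, ascq, offline_pairs=OFFLINE_PAIRS):
    """Return (action, notify_userland) for a UNIT ATTENTION."""
    action = OFFLINE if (asc, ascq) in offline_pairs else ADVISORY
    # Every UA is passed to user land, whatever the in-kernel action is.
    return action, True


decisions = [decide(0x3F, 0x0E), decide(0x2A, 0x09)]
```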

-- 
Regards,
Richard Sharpe


* RE: [LSF/MM Topic] SCSI Unit Attention Handling
  2011-02-08  2:00   ` Richard Sharpe
@ 2011-02-08  5:06     ` Shyam_Iyer
  2011-02-08  5:51       ` Richard Sharpe
  0 siblings, 1 reply; 8+ messages in thread
From: Shyam_Iyer @ 2011-02-08  5:06 UTC (permalink / raw)
  To: realrichardsharpe; +Cc: lsf-pc, linux-scsi, hare

> -----Original Message-----
> From: Richard Sharpe [mailto:realrichardsharpe@gmail.com]
> Sent: Monday, February 07, 2011 9:01 PM
> To: Iyer, Shyam
> Cc: lsf-pc@lists.linuxfoundation.org; linux-scsi@vger.kernel.org;
> hare@suse.de
> Subject: Re: [LSF/MM Topic] SCSI Unit Attention Handling
> 
> On Sun, Feb 6, 2011 at 5:32 PM,  <Shyam_Iyer@dell.com> wrote:
> >
> >
> >> -----Original Message-----
> >> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
> >> owner@vger.kernel.org] On Behalf Of Richard Sharpe
> >> Sent: Sunday, February 06, 2011 3:44 PM
> >> To: lsf-pc@lists.linuxfoundation.org; linux-scsi; Hannes Reinecke
> >> Subject: [LSF/MM Topic] SCSI Unit Attention Handling
> >>
> >> I would like to propose a topic around SCSI Unit Attention Handling.
> >>
> >> The current scsi_error.c:scsi_check_sense handling of UNIT ATTENTION
> >> consists of explicitly printing warnings for ASC=0x3f events and
> >> then returning SOFT_ERROR, which scsi_error.c:scsi_decide_disposition
> >> ignores because it returns SUCCESS to SOFT_ERROR being returned from
> >> scsi_check_sense on a CHECK_CONDITION.
> >>
> >> There are a number of cases where we might want to perform further
> >> processing on a UNIT ATTENTION. For example, ASC/ASCQ 0x3f/0x0e
> >> REPORTED LUNS DATA HAS CHANGED or 0x2a/0x09 CAPACITY DATA HAS CHANGED,
> >> 0x28/0x03 IMPORT/EXPORT ELEMENT ACCESSED, MEDIUM CHANGED, etc. When
> >> the LUNS have changed it would be useful to have a rescan performed
> >> automatically. If capacity data has changed, it would be useful if
> >> someone could react to that and perhaps resize the file system on that
> >> LUN if possible, and so forth.
> >>
> >> It is not clear that any of these items should be handled in the
> >> kernel anyway, and perhaps they should be exported to user-space for
> >> correct handling, but rather than just the raw SENSE data being
> >> exported, perhaps some sort of relevant event should be exported.
> >>
> > We spoke about this in the plumbers conf last November as well and
> > the few ideas then was to handle them via scsi netlink.
> > I see that Hannes is working on a relayfs method to handle them.
> >
> > Some of the new problems that we can see with handling such events
> > are -
> >
> > If the thin provisioned LUN is snapshotted or cloned then you can
> > also get a flurry of UNIT attentions for the same data
> > that has been replicated.
> 
> So, I wonder if adding just the ability for SCSI upper drivers (sd,
> st, etc) to register interest in different UNIT ATTENTIONS is all that
> interesting and whether vendors would rather have the ability to tell
> drivers (via an ioctl, say) the UNIT ATTENTIONS they are interested
> in, and how they should be mapped.
> 

An ioctl implementation would not be elegant.

Even if registering for UAs per vendor were envisioned, there are scenarios that can cause a flurry of UAs too.
(I initially proposed a vendor-specific implementation that logged scsi_netlink events from the scsi_device handler; it was gloriously shot down ;-))

Consider this scenario:

Above water mark  -> Unit Attention
Discard to free up space
Below water mark  -> Unit Attention

Consider a ripple scenario where this repeats.
(Although this cannot happen too often, it is very much akin to a thrashing scenario.)

The UAs should be hints for the filesystem to optimize online. This is where the thin profile can reduce the UAs.
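
One way a profile could damp the above/below water mark ripple is a simple
hold-off per device and event. The Python sketch below is only an
illustration of that idea: the class UACoalescer, the event name, and the
hold-off parameter are hypothetical, and timestamps are passed in explicitly
so the behaviour is easy to follow.

```python
class UACoalescer:
    """Hypothetical filter: forward at most one event per hold-off window."""

    def __init__(self, holdoff_seconds):
        self._holdoff = holdoff_seconds
        # (device, event) -> timestamp of the last forwarded event
        self._last_forwarded = {}
        self.suppressed = 0

    def should_forward(self, device, event, now):
        key = (device, event)
        last = self._last_forwarded.get(key)
        if last is not None and now - last < self._holdoff:
            # Same event on the same device inside the window: drop it.
            self.suppressed += 1
            return False
        self._last_forwarded[key] = now
        return True


coalescer = UACoalescer(holdoff_seconds=10.0)
results = [
    coalescer.should_forward("sda", "ABOVE_WATER_MARK", t)
    for t in (0.0, 2.0, 9.9, 10.0, 25.0)
]
```

Here the events at 2.0 and 9.9 seconds fall inside the window opened at 0.0
and are suppressed; the one at 10.0 opens a new window.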

Also, when you delete a file, you need to select a good age after which to discard the associated blocks (debatable, and worth any good algorithm writer's salt).
I am not sure whether the filesystem should run an in-kernel thread to do this profile management.

> It might be more useful to allow user-land utilities to perform the re-
> scanning.
> 
> I would imagine that you will get unit attentions saying that REPORTED
> LUNS DATA HAS CHANGED, but what other UNIT ATTENTIONS would you get?
> If you add storage to a LUN, then perhaps CAPACITY DATA HAS CHANGED.
>
> Perhaps there is also a need to say things like, for these ASC/ASCQ
> values, take the device off line, and all the rest are just advisory
> but pass them all to user land as well.
> 

This is the kind of policy that needs to go into the thin profile. Although storage arrays do take the device offline on reaching certain hard limits, there is nothing like mounting a filesystem read-only ;-)

-Shyam

> --
> Regards,
> Richard Sharpe


* Re: [LSF/MM Topic] SCSI Unit Attention Handling
  2011-02-08  5:06     ` Shyam_Iyer
@ 2011-02-08  5:51       ` Richard Sharpe
  2011-02-09  5:02         ` Shyam_Iyer
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Sharpe @ 2011-02-08  5:51 UTC (permalink / raw)
  To: Shyam_Iyer; +Cc: lsf-pc, linux-scsi, hare

On Tue, Feb 8, 2011 at 12:06 AM,  <Shyam_Iyer@dell.com> wrote:
>> -----Original Message-----
>> From: Richard Sharpe [mailto:realrichardsharpe@gmail.com]

[snip]

>>
>> So, I wonder if adding just the ability for SCSI upper drivers (sd,
>> st, etc) to register interest in different UNIT ATTENTIONS is all that
>> interesting and whether vendors would rather have the ability to tell
>> drivers (via an ioctl, say) the UNIT ATTENTIONS they are interested
>> in, and how they should be mapped.
>>
>
> An ioctl implementation would not be elegant.

I was only suggesting an ioctl for informing the driver of the
ASC/ASCQ pairs of interest. To get the stream of UAs out of the kernel
a different mechanism would be used, like, say, the relayfs.

> Even if registering for UAs per vendor was envisioned there are scenarios that can cause a flurry of UAs too..
> (I initially opined to have a vendor specific implementation of logging scsi_netlink events from the scsi_device handler,
> it was gloriously shot down ;-))
>
> Consider this scenario..
>
> Above water mark.. --> Unit Attention
> Discard to free up space
> Below water mark ... -> Unit Attention
>
> Consider a ripple scenario where this repeats..
> (Although this can not happen too often it is very much akin to a thrashing scenario)
>
> The UA should be hints for the filesystem to optimize online. Here is where the thin profile can reduce the UAs.
>
> Also, you delete a file - select a good age time to discard the associated blocks(debatable and worth any good algorithm writer's salt).
> Now I am not sure if the filesystem should run an inkernel thread to do this profile management..
>
>> It might be more useful to allow user-land utilities to perform the re-
>> scanning.
>>
>> I would imagine that you will get unit attentions saying that REPORTED
>> LUNS DATA HAS CHANGED, but what other UNIT ATTENTIONS would you get?
>> If you add storage to a LUN, then perhaps CAPACITY DATA HAS CHANGED.
>>
>> Perhaps there is also a need to say things like, for these ASC/ASCQ
>> values, take the device off line, and all the rest are just advisory
>> but pass them all to user land as well.
>>
>
> This is a kind of policy that needs to go into the thin profile although Storage Arrays do take the device offline on reaching
> certain hard limits there is nothing like mounting a filesystem read-only ;-)

Well, yes, but Ext3/4 and XFS tend to remount the fs RO when writes to
the journal fail as well because the SCSI stack takes the device
offline :-(

If the device has lied in its response to a READ_CAPACITY or
READ_CAPACITY16 that is hard to prevent unless the file system has the
concept of a lying reserve ...

-- 
Regards,
Richard Sharpe


* RE: [LSF/MM Topic] SCSI Unit Attention Handling
  2011-02-08  5:51       ` Richard Sharpe
@ 2011-02-09  5:02         ` Shyam_Iyer
  2011-02-09 15:44           ` Hannes Reinecke
  0 siblings, 1 reply; 8+ messages in thread
From: Shyam_Iyer @ 2011-02-09  5:02 UTC (permalink / raw)
  To: realrichardsharpe, Shyam_Iyer; +Cc: lsf-pc, linux-scsi, hare

> -----Original Message-----
> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
> owner@vger.kernel.org] On Behalf Of Richard Sharpe
> Sent: Tuesday, February 08, 2011 12:51 AM
> To: Iyer, Shyam
> Cc: lsf-pc@lists.linuxfoundation.org; linux-scsi@vger.kernel.org;
> hare@suse.de
> Subject: Re: [LSF/MM Topic] SCSI Unit Attention Handling
> 
> On Tue, Feb 8, 2011 at 12:06 AM,  <Shyam_Iyer@dell.com> wrote:
> >> -----Original Message-----
> >> From: Richard Sharpe [mailto:realrichardsharpe@gmail.com]
> 
> [snip]
> 
> >>
> >> So, I wonder if adding just the ability for SCSI upper drivers (sd,
> >> st, etc) to register interest in different UNIT ATTENTIONS is all that
> >> interesting and whether vendors would rather have the ability to tell
> >> drivers (via an ioctl, say) the UNIT ATTENTIONS they are interested
> >> in, and how they should be mapped.
> >>
> >
> > An ioctl implementation would not be elegant.
> 
> I was only suggesting an ioctl for informing the driver of the
> ASC/ASCQ pairs of interest. To get the stream of UAs out of the kernel
> a different mechanism would be used, like, say, the relayfs.

I get you there. The ioctl implementation to inform the driver will not stop the storage from sending the UAs.

The LUN could be multipathed, so if UAs come through one sdX path and not through the other, that adds complication.
Also, if you are using persistent reservations and one of the paths goes down, the UAs could be going to a path that has been excluded.
We are introducing scenarios for bugs here.

> 
> > Even if registering for UAs per vendor was envisioned there are
> > scenarios that can cause a flurry of UAs too..
> > (I initially opined to have a vendor specific implementation of
> > logging scsi_netlink events from the scsi_device handler,
> > it was gloriously shot down ;-))
> >
> > Consider this scenario..
> >
> > Above water mark.. --> Unit Attention
> > Discard to free up space
> > Below water mark ... -> Unit Attention
> >
> > Consider a ripple scenario where this repeats..
> > (Although this can not happen too often it is very much akin to a
> > thrashing scenario)
> >
> > The UA should be hints for the filesystem to optimize online. Here is
> > where the thin profile can reduce the UAs.
> >
> > Also, you delete a file - select a good age time to discard the
> > associated blocks(debatable and worth any good algorithm writer's
> > salt).
> > Now I am not sure if the filesystem should run an inkernel thread to
> > do this profile management..
> >
> >> It might be more useful to allow user-land utilities to perform the
> >> re-scanning.
> >>
> >> I would imagine that you will get unit attentions saying that
> >> REPORTED LUNS DATA HAS CHANGED, but what other UNIT ATTENTIONS would
> >> you get?
> >> If you add storage to a LUN, then perhaps CAPACITY DATA HAS CHANGED.
> >>
> >> Perhaps there is also a need to say things like, for these ASC/ASCQ
> >> values, take the device off line, and all the rest are just advisory
> >> but pass them all to user land as well.
> >>
> >
> > This is a kind of policy that needs to go into the thin profile
> > although Storage Arrays do take the device offline on reaching
> > certain hard limits there is nothing like mounting a filesystem
> > read-only ;-)
> 
> Well, yes, but Ext3/4 and XFS tend to remount the fs RO when writes to
> the journal fail as well because the SCSI stack takes the device
> offline :-(
> 
> If the device has lied in its response to a READ_CAPACITY or
> READ_CAPACITY16 that is hard to prevent unless the file system has the
> concept of a lying reserve ...
The lying reserve is again a profile/policy setting, akin to the SWAP concept.

If the device has lied in either READ_CAPACITY_16 or GET_LBA_STATUS, then we are not consistent to a tee with the profile anyway. Putting my open-source hat on, that is a carrot-and-stick bait.
-Shyam


* Re: [LSF/MM Topic] SCSI Unit Attention Handling
  2011-02-09  5:02         ` Shyam_Iyer
@ 2011-02-09 15:44           ` Hannes Reinecke
  0 siblings, 0 replies; 8+ messages in thread
From: Hannes Reinecke @ 2011-02-09 15:44 UTC (permalink / raw)
  To: Shyam_Iyer; +Cc: realrichardsharpe, linux-scsi

Hi all,

On 02/09/2011 06:02 AM, Shyam_Iyer@Dell.com wrote:
[ .. ]
> 
> I get you there. The ioctl implementation to inform the driver will
> not plug the storage from sending the UAs.
> 
> The LUN could be multipathed so then if you have UAs come through
> one sdX path and not through the other that is adding complication.
> Also, if you are using persistent reservations and one of the path
> goes down the UAs could be going to a path that has been excluded.
> We are introducing scenarios for bugs here.
> 
Hence using debugfs; with this we would be getting an entire
configfs space for free, which would allow us to set this kind of thing.
ioctls are evil. Avoid at all cost.

>>
>>> Even if registering for UAs per vendor was envisioned there are
>>> scenarios that can cause a flurry of UAs too..
>>> (I initially opined to have a vendor specific implementation of
>>> logging scsi_netlink events from the scsi_device handler,
>>> it was gloriously shot down ;-))
>>>
>>> Consider this scenario..
>>>
>>> Above water mark.. --> Unit Attention
>>> Discard to free up space
>>> Below water mark ... -> Unit Attention
>>>
>>> Consider a ripple scenario where this repeats..
>>> (Although this can not happen too often it is very much akin to a
>>> thrashing scenario)
>>>
>>> The UA should be hints for the filesystem to optimize online. Here is
>>> where the thin profile can reduce the UAs.
>>>
>>> Also, you delete a file - select a good age time to discard the
>>> associated blocks(debatable and worth any good algorithm writer's
>>> salt).
>>> Now I am not sure if the filesystem should run an inkernel thread to
>>> do this profile management..
>>>
>>>> It might be more useful to allow user-land utilities to perform the
>>>> re-scanning.
>>>>
>>>> I would imagine that you will get unit attentions saying that
>>>> REPORTED LUNS DATA HAS CHANGED, but what other UNIT ATTENTIONS would
>>>> you get?
>>>> If you add storage to a LUN, then perhaps CAPACITY DATA HAS CHANGED.
>>>>
>>>> Perhaps there is also a need to say things like, for these ASC/ASCQ
>>>> values, take the device off line, and all the rest are just advisory
>>>> but pass them all to user land as well.
>>>>
>>>
>>> This is a kind of policy that needs to go into the thin profile
>>> although Storage Arrays do take the device offline on reaching
>>> certain hard limits there is nothing like mounting a filesystem read-
>>> only ;-)
>>
>> Well, yes, but Ext3/4 and XFS tend to remount the fs RO when writes to
>> the journal fail as well because the SCSI stack takes the device
>> offline :-(
>>
>> If the device has lied in its response to a READ_CAPACITY or
>> READ_CAPACITY16 that is hard to prevent unless the file system has the
>> concept of a lying reserve ...
> The lying reserve is again a profile/policy setting aka like a SWAP concept.
> 
> If the device has lied in either READ_CAPACITY_16 or GET_LBA_STATUS..
> then we are anyways not consistent to the tee on the profile. Putting
> my open-source hat on that is a Carrot and stick bait.

Quite. Currently we know of about three events / event classes which
need to be handled:

REPORTED LUNS DATA HAS CHANGED
CAPACITY DATA HAS CHANGED
thin provisioning water mark warnings

Everything else is pretty much handled by the SCSI stack nowadays
anyway.

However, currently we don't handle them at all and hence don't have
any experience as to how often they would occur. Which would be
pretty much vendor-specific anyway.
So we need to design something which is
a) capable of handling even a large number of events
b) selectable per device
c) modular enough to have further sense codes added
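
A rough user-space sketch of requirements b) and c), in Python, might look
like the following. Everything here is hypothetical: the table EVENT_TABLE,
the class DeviceEventFilter and the event names are made up for illustration,
and the 0x38/0x07 entry for the thin provisioning soft threshold is my
reading of the ASC/ASCQ assignments and should be treated as an assumption.
Requirement a) is about the transport and is not modelled here.

```python
# Hypothetical decode table; supporting a further sense code is one new entry (c).
EVENT_TABLE = {
    (0x3F, 0x0E): "REPORTED_LUNS_DATA_HAS_CHANGED",
    (0x2A, 0x09): "CAPACITY_DATA_HAS_CHANGED",
    (0x38, 0x07): "THIN_PROVISIONING_SOFT_THRESHOLD_REACHED",  # assumed pair
}


class DeviceEventFilter:
    """Hypothetical per-device selection of which events are reported (b)."""

    def __init__(self):
        self._enabled = {}  # device -> set of enabled event names

    def enable(self, device, event):
        self._enabled.setdefault(device, set()).add(event)

    def disable(self, device, event):
        self._enabled.get(device, set()).discard(event)

    def classify(self, device, asc, ascq):
        """Return the event name, or None if unknown or not enabled."""
        event = EVENT_TABLE.get((asc, ascq))
        if event is None:
            return None
        if event not in self._enabled.get(device, set()):
            return None
        return event


selector = DeviceEventFilter()
selector.enable("sda", "CAPACITY_DATA_HAS_CHANGED")

on_sda = selector.classify("sda", 0x2A, 0x09)
on_sdb = selector.classify("sdb", 0x2A, 0x09)
unknown = selector.classify("sda", 0x29, 0x00)
```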

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
