* [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE
@ 2022-02-03  6:40 Song Liu
  2022-02-03  6:40 ` [PATCH 1/2] block: introduce BLK_STS_OFFLINE Song Liu
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Song Liu @ 2022-02-03  6:40 UTC (permalink / raw)
  To: linx-block, linux-scsi
  Cc: kernel-team, jejb, martin.petersen, axboe, Song Liu

We have a use case where HDDs are regularly powered on/off to preserve power.
When a drive is being removed, we often see errors like

   [  172.803279] I/O error, dev sda, sector 3137184

These messages are confusing for automations that grep dmesg, as they look
very similar to real HDD errors.

Solve this issue with a new block state BLK_STS_OFFLINE. After the change,
the error message looks like

   [  172.803279] device offline error, dev sda, sector 3137184

so that the automations won't confuse them with real I/O errors.
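
As an illustration of how a log-scanning tool might tell the two messages
apart, here is a minimal sketch in C. The enum and function names are
hypothetical and not part of this series; the matched substrings are taken
from the two example messages above.

```c
#include <string.h>

/* Hypothetical classification, used only for this sketch. */
enum log_class {
	LOG_OTHER,
	LOG_IO_ERROR,
	LOG_DEVICE_OFFLINE,
};

/*
 * Classify one dmesg line by looking for the message text shown in the
 * cover letter. The timestamp prefix is ignored because strstr() searches
 * the whole line.
 */
static enum log_class classify_dmesg_line(const char *line)
{
	if (strstr(line, "device offline error, dev "))
		return LOG_DEVICE_OFFLINE;
	if (strstr(line, "I/O error, dev "))
		return LOG_IO_ERROR;
	return LOG_OTHER;
}
```

A tool built this way could alert only on LOG_IO_ERROR and treat
LOG_DEVICE_OFFLINE as an expected event during planned power cycling.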

Song Liu (2):
  block: introduce BLK_STS_OFFLINE
  scsi: use BLK_STS_OFFLINE for not fully online devices

 block/blk-core.c          | 1 +
 drivers/scsi/scsi_lib.c   | 2 +-
 include/linux/blk_types.h | 7 +++++++
 3 files changed, 9 insertions(+), 1 deletion(-)

--
2.30.2


* [PATCH 1/2] block: introduce BLK_STS_OFFLINE
  2022-02-03  6:40 [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE Song Liu
@ 2022-02-03  6:40 ` Song Liu
  2022-02-03  6:52   ` Song Liu
  2022-02-03  6:40 ` [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices Song Liu
  2022-02-03  6:52 ` [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE Song Liu
  2 siblings, 1 reply; 12+ messages in thread
From: Song Liu @ 2022-02-03  6:40 UTC (permalink / raw)
  To: linx-block, linux-scsi
  Cc: kernel-team, jejb, martin.petersen, axboe, Song Liu

Currently, drivers report BLK_STS_IOERR for devices that are not fully
online or are being removed. This behavior could cause confusion for users,
as these are not really I/O errors from the device.

Solve this issue with a new state BLK_STS_OFFLINE, which reports "device
offline error" in dmesg instead of "I/O error".
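
For readers unfamiliar with the table touched by this patch, the following
userspace sketch models how a status value could map to an errno and a
message prefix. It is not the kernel code: the model_* names are made up for
illustration, and apart from the value 17 used by this patch the status
numbers are stand-ins.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in status values for this model; only 17 comes from the patch. */
enum {
	MODEL_STS_OK      = 0,
	MODEL_STS_IOERR   = 10,
	MODEL_STS_OFFLINE = 17,
};

/* Each status carries an errno for callers and a name for the log line. */
static const struct {
	int err;
	const char *name;
} model_errors[] = {
	[MODEL_STS_OK]      = { 0,    "" },
	[MODEL_STS_IOERR]   = { -EIO, "I/O" },
	[MODEL_STS_OFFLINE] = { -EIO, "device offline" },
};

#define MODEL_NR (sizeof(model_errors) / sizeof(model_errors[0]))

/* Unknown statuses fall back to -EIO in this model. */
static int model_status_to_errno(unsigned int status)
{
	if (status >= MODEL_NR || !model_errors[status].name)
		return -EIO;
	return model_errors[status].err;
}

/* Build a log line in the same shape as the examples in this series. */
static int model_format_error(char *buf, size_t len, unsigned int status,
			      const char *dev, unsigned long long sector)
{
	const char *name = "I/O";

	if (status < MODEL_NR && model_errors[status].name)
		name = model_errors[status].name;
	return snprintf(buf, len, "%s error, dev %s, sector %llu",
			name, dev, sector);
}
```

In this model both statuses return -EIO to the caller, so only the logged
text differs, which matches the intent described above.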

Signed-off-by: Song Liu <song@kernel.org>
---
 block/blk-core.c          | 1 +
 include/linux/blk_types.h | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index 61f6a0dc4511..24035dd2eef1 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -164,6 +164,7 @@ static const struct {
 	[BLK_STS_RESOURCE]	= { -ENOMEM,	"kernel resource" },
 	[BLK_STS_DEV_RESOURCE]	= { -EBUSY,	"device resource" },
 	[BLK_STS_AGAIN]		= { -EAGAIN,	"nonblocking retry" },
+	[BLK_STS_OFFLINE]	= { -EIO,	"device offline" },
 
 	/* device mapper special case, should not leak out: */
 	[BLK_STS_DM_REQUEUE]	= { -EREMCHG, "dm internal retry" },
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index fe065c394fff..5561e58d158a 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
  */
 #define BLK_STS_ZONE_ACTIVE_RESOURCE	((__force blk_status_t)16)
 
+/*
+ * BLK_STS_OFFLINE is returned from the driver when the target device is offline
+ * or is being taken offline. This could help differentiate the case where a
+ * device is intentionally being shut down from a real I/O error.
+ */
+#define BLK_STS_OFFLINE		((__force blk_status_t)17)
+
 /**
  * blk_path_error - returns true if error may be path related
  * @error: status the request was completed with
-- 
2.30.2



* [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices
  2022-02-03  6:40 [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE Song Liu
  2022-02-03  6:40 ` [PATCH 1/2] block: introduce BLK_STS_OFFLINE Song Liu
@ 2022-02-03  6:40 ` Song Liu
  2022-02-03  6:53   ` Song Liu
  2022-02-03  6:52 ` [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE Song Liu
  2 siblings, 1 reply; 12+ messages in thread
From: Song Liu @ 2022-02-03  6:40 UTC (permalink / raw)
  To: linx-block, linux-scsi
  Cc: kernel-team, jejb, martin.petersen, axboe, Song Liu

The new error message for such a case looks like

[  172.809565] device offline error, dev sda, sector 3138208 ...

which will not be confused with a regular I/O error (BLK_STS_IOERR).
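
The branch changed by this patch can be modeled in userspace as follows.
This is only a sketch: struct model_request, its is_pm field (standing in
for the RQF_PM flag test), and the MODEL_STS_* values are hypothetical and
exist purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the block layer status codes. */
enum model_status {
	MODEL_STS_OK,
	MODEL_STS_IOERR,
	MODEL_STS_OFFLINE,
};

/* Hypothetical request: only records whether it is a power management one. */
struct model_request {
	bool is_pm;
};

/*
 * Model of the check for a device that is not fully online: ordinary
 * requests are rejected with the new offline status, while power
 * management requests (and a missing request) are let through.
 */
static enum model_status model_not_online_check(const struct model_request *req)
{
	if (req && !req->is_pm)
		return MODEL_STS_OFFLINE;
	return MODEL_STS_OK;
}
```

Before the patch, the same branch would have returned the I/O error status
instead of the offline one; nothing else in the decision changes.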

Signed-off-by: Song Liu <song@kernel.org>
---
 drivers/scsi/scsi_lib.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0a70aa763a96..e30bc51578e9 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1276,7 +1276,7 @@ scsi_device_state_check(struct scsi_device *sdev, struct request *req)
 		 * power management commands.
 		 */
 		if (req && !(req->rq_flags & RQF_PM))
-			return BLK_STS_IOERR;
+			return BLK_STS_OFFLINE;
 		return BLK_STS_OK;
 	}
 }
-- 
2.30.2



* Re: [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE
  2022-02-03  6:40 [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE Song Liu
  2022-02-03  6:40 ` [PATCH 1/2] block: introduce BLK_STS_OFFLINE Song Liu
  2022-02-03  6:40 ` [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices Song Liu
@ 2022-02-03  6:52 ` Song Liu
  2 siblings, 0 replies; 12+ messages in thread
From: Song Liu @ 2022-02-03  6:52 UTC (permalink / raw)
  To: linux-scsi, linux-block
  Cc: Kernel Team, James E.J. Bottomley, Martin K. Petersen, Jens Axboe

CC linux-block (it was a typo in the original email)

On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>
> We have a use case where HDDs are regularly powered on/off to preserve power.
> When a drive is being removed, we often see errors like
>
>    [  172.803279] I/O error, dev sda, sector 3137184
>
> These messages are confusing for automations that grep dmesg, as they look
> very similar to real HDD errors.
>
> Solve this issue with a new block state BLK_STS_OFFLINE. After the change,
> the error message looks like
>
>    [  172.803279] device offline error, dev sda, sector 3137184
>
> so that the automations won't confuse them with real I/O errors.
>
> Song Liu (2):
>   block: introduce BLK_STS_OFFLINE
>   scsi: use BLK_STS_OFFLINE for not fully online devices
>
>  block/blk-core.c          | 1 +
>  drivers/scsi/scsi_lib.c   | 2 +-
>  include/linux/blk_types.h | 7 +++++++
>  3 files changed, 9 insertions(+), 1 deletion(-)
>
> --
> 2.30.2


* Re: [PATCH 1/2] block: introduce BLK_STS_OFFLINE
  2022-02-03  6:40 ` [PATCH 1/2] block: introduce BLK_STS_OFFLINE Song Liu
@ 2022-02-03  6:52   ` Song Liu
  2022-02-03  7:24     ` Hannes Reinecke
  0 siblings, 1 reply; 12+ messages in thread
From: Song Liu @ 2022-02-03  6:52 UTC (permalink / raw)
  To: linux-scsi, linux-block
  Cc: Kernel Team, James E.J. Bottomley, Martin K. Petersen, Jens Axboe

CC linux-block (it was a typo in the original email)

On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>
> Currently, drivers report BLK_STS_IOERR for devices that are not fully
> online or are being removed. This behavior could cause confusion for users,
> as these are not really I/O errors from the device.
>
> Solve this issue with a new state BLK_STS_OFFLINE, which reports "device
> offline error" in dmesg instead of "I/O error".
>
> Signed-off-by: Song Liu <song@kernel.org>
> ---
>  block/blk-core.c          | 1 +
>  include/linux/blk_types.h | 7 +++++++
>  2 files changed, 8 insertions(+)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 61f6a0dc4511..24035dd2eef1 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -164,6 +164,7 @@ static const struct {
>         [BLK_STS_RESOURCE]      = { -ENOMEM,    "kernel resource" },
>         [BLK_STS_DEV_RESOURCE]  = { -EBUSY,     "device resource" },
>         [BLK_STS_AGAIN]         = { -EAGAIN,    "nonblocking retry" },
> +       [BLK_STS_OFFLINE]       = { -EIO,       "device offline" },
>
>         /* device mapper special case, should not leak out: */
>         [BLK_STS_DM_REQUEUE]    = { -EREMCHG, "dm internal retry" },
> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
> index fe065c394fff..5561e58d158a 100644
> --- a/include/linux/blk_types.h
> +++ b/include/linux/blk_types.h
> @@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
>   */
>  #define BLK_STS_ZONE_ACTIVE_RESOURCE   ((__force blk_status_t)16)
>
> +/*
> + * BLK_STS_OFFLINE is returned from the driver when the target device is offline
> + * or is being taken offline. This could help differentiate the case where a
> + * device is intentionally being shut down from a real I/O error.
> + */
> +#define BLK_STS_OFFLINE                ((__force blk_status_t)17)
> +
>  /**
>   * blk_path_error - returns true if error may be path related
>   * @error: status the request was completed with
> --
> 2.30.2
>


* Re: [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices
  2022-02-03  6:40 ` [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices Song Liu
@ 2022-02-03  6:53   ` Song Liu
  2022-02-03  7:24     ` Hannes Reinecke
  0 siblings, 1 reply; 12+ messages in thread
From: Song Liu @ 2022-02-03  6:53 UTC (permalink / raw)
  To: linux-scsi, linux-block
  Cc: Kernel Team, James E.J. Bottomley, Martin K. Petersen, Jens Axboe

CC linux-block (it was a typo in the original email)

On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>
> The new error message for such a case looks like
>
> [  172.809565] device offline error, dev sda, sector 3138208 ...
>
> which will not be confused with a regular I/O error (BLK_STS_IOERR).
>
> Signed-off-by: Song Liu <song@kernel.org>
> ---
>  drivers/scsi/scsi_lib.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 0a70aa763a96..e30bc51578e9 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1276,7 +1276,7 @@ scsi_device_state_check(struct scsi_device *sdev, struct request *req)
>                  * power management commands.
>                  */
>                 if (req && !(req->rq_flags & RQF_PM))
> -                       return BLK_STS_IOERR;
> +                       return BLK_STS_OFFLINE;
>                 return BLK_STS_OK;
>         }
>  }
> --
> 2.30.2
>


* Re: [PATCH 1/2] block: introduce BLK_STS_OFFLINE
  2022-02-03  6:52   ` Song Liu
@ 2022-02-03  7:24     ` Hannes Reinecke
  2022-02-03 13:47       ` Jens Axboe
  0 siblings, 1 reply; 12+ messages in thread
From: Hannes Reinecke @ 2022-02-03  7:24 UTC (permalink / raw)
  To: Song Liu, linux-scsi, linux-block
  Cc: Kernel Team, James E.J. Bottomley, Martin K. Petersen, Jens Axboe

On 2/3/22 07:52, Song Liu wrote:
> CC linux-block (it was a typo in the original email)
> 
> On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>>
>> Currently, drivers report BLK_STS_IOERR for devices that are not fully
>> online or are being removed. This behavior could cause confusion for users,
>> as these are not really I/O errors from the device.
>>
>> Solve this issue with a new state BLK_STS_OFFLINE, which reports "device
>> offline error" in dmesg instead of "I/O error".
>>
>> Signed-off-by: Song Liu <song@kernel.org>
>> ---
>>   block/blk-core.c          | 1 +
>>   include/linux/blk_types.h | 7 +++++++
>>   2 files changed, 8 insertions(+)
>>
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 61f6a0dc4511..24035dd2eef1 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -164,6 +164,7 @@ static const struct {
>>          [BLK_STS_RESOURCE]      = { -ENOMEM,    "kernel resource" },
>>          [BLK_STS_DEV_RESOURCE]  = { -EBUSY,     "device resource" },
>>          [BLK_STS_AGAIN]         = { -EAGAIN,    "nonblocking retry" },
>> +       [BLK_STS_OFFLINE]       = { -EIO,       "device offline" },
>>
>>          /* device mapper special case, should not leak out: */
>>          [BLK_STS_DM_REQUEUE]    = { -EREMCHG, "dm internal retry" },
>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>> index fe065c394fff..5561e58d158a 100644
>> --- a/include/linux/blk_types.h
>> +++ b/include/linux/blk_types.h
>> @@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
>>    */
>>   #define BLK_STS_ZONE_ACTIVE_RESOURCE   ((__force blk_status_t)16)
>>
>> +/*
>> + * BLK_STS_OFFLINE is returned from the driver when the target device is offline
>> + * or is being taken offline. This could help differentiate the case where a
>> + * device is intentionally being shut down from a real I/O error.
>> + */
>> +#define BLK_STS_OFFLINE                ((__force blk_status_t)17)
>> +
>>   /**
>>    * blk_path_error - returns true if error may be path related
>>    * @error: status the request was completed with
>> --
>> 2.30.2
>>
Please do not overload EIO here.
EIO already is a catch-all error if we don't know any better, but for 
the 'device offline' case we do (or rather should).
Please map it onto 'ENODEV' or 'ENXIO'.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


* Re: [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices
  2022-02-03  6:53   ` Song Liu
@ 2022-02-03  7:24     ` Hannes Reinecke
  0 siblings, 0 replies; 12+ messages in thread
From: Hannes Reinecke @ 2022-02-03  7:24 UTC (permalink / raw)
  To: Song Liu, linux-scsi, linux-block
  Cc: Kernel Team, James E.J. Bottomley, Martin K. Petersen, Jens Axboe

On 2/3/22 07:53, Song Liu wrote:
> CC linux-block (it was a typo in the original email)
> 
> On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>>
>> The new error message for such a case looks like
>>
>> [  172.809565] device offline error, dev sda, sector 3138208 ...
>>
>> which will not be confused with a regular I/O error (BLK_STS_IOERR).
>>
>> Signed-off-by: Song Liu <song@kernel.org>
>> ---
>>   drivers/scsi/scsi_lib.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>> index 0a70aa763a96..e30bc51578e9 100644
>> --- a/drivers/scsi/scsi_lib.c
>> +++ b/drivers/scsi/scsi_lib.c
>> @@ -1276,7 +1276,7 @@ scsi_device_state_check(struct scsi_device *sdev, struct request *req)
>>                   * power management commands.
>>                   */
>>                  if (req && !(req->rq_flags & RQF_PM))
>> -                       return BLK_STS_IOERR;
>> +                       return BLK_STS_OFFLINE;
>>                  return BLK_STS_OK;
>>          }
>>   }
>> --
>> 2.30.2
>>
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes

-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


* Re: [PATCH 1/2] block: introduce BLK_STS_OFFLINE
  2022-02-03  7:24     ` Hannes Reinecke
@ 2022-02-03 13:47       ` Jens Axboe
  2022-02-03 17:23         ` Song Liu
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2022-02-03 13:47 UTC (permalink / raw)
  To: Hannes Reinecke, Song Liu, linux-scsi, linux-block
  Cc: Kernel Team, James E.J. Bottomley, Martin K. Petersen

On 2/3/22 12:24 AM, Hannes Reinecke wrote:
> On 2/3/22 07:52, Song Liu wrote:
>> CC linux-block (it was a typo in the original email)
>>
>> On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>>>
>>> Currently, drivers report BLK_STS_IOERR for devices that are not fully
>>> online or are being removed. This behavior could cause confusion for users,
>>> as these are not really I/O errors from the device.
>>>
>>> Solve this issue with a new state BLK_STS_OFFLINE, which reports "device
>>> offline error" in dmesg instead of "I/O error".
>>>
>>> Signed-off-by: Song Liu <song@kernel.org>
>>> ---
>>>   block/blk-core.c          | 1 +
>>>   include/linux/blk_types.h | 7 +++++++
>>>   2 files changed, 8 insertions(+)
>>>
>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>> index 61f6a0dc4511..24035dd2eef1 100644
>>> --- a/block/blk-core.c
>>> +++ b/block/blk-core.c
>>> @@ -164,6 +164,7 @@ static const struct {
>>>          [BLK_STS_RESOURCE]      = { -ENOMEM,    "kernel resource" },
>>>          [BLK_STS_DEV_RESOURCE]  = { -EBUSY,     "device resource" },
>>>          [BLK_STS_AGAIN]         = { -EAGAIN,    "nonblocking retry" },
>>> +       [BLK_STS_OFFLINE]       = { -EIO,       "device offline" },
>>>
>>>          /* device mapper special case, should not leak out: */
>>>          [BLK_STS_DM_REQUEUE]    = { -EREMCHG, "dm internal retry" },
>>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>>> index fe065c394fff..5561e58d158a 100644
>>> --- a/include/linux/blk_types.h
>>> +++ b/include/linux/blk_types.h
>>> @@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
>>>    */
>>>   #define BLK_STS_ZONE_ACTIVE_RESOURCE   ((__force blk_status_t)16)
>>>
>>> +/*
>>> + * BLK_STS_OFFLINE is returned from the driver when the target device is offline
>>> + * or is being taken offline. This could help differentiate the case where a
>>> + * device is intentionally being shut down from a real I/O error.
>>> + */
>>> +#define BLK_STS_OFFLINE                ((__force blk_status_t)17)
>>> +
>>>   /**
>>>    * blk_path_error - returns true if error may be path related
>>>    * @error: status the request was completed with
>>> --
>>> 2.30.2
>>>
> Please do not overload EIO here.
> EIO already is a catch-all error if we don't know any better, but for 
> the 'device offline' case we do (or rather should).
> Please map it onto 'ENODEV' or 'ENXIO'.

It's deliberately EIO as not to force a change in behavior. I don't mind
using something else, but that should be a separate change then.
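
To illustrate the behavior concern, consider a hypothetical userspace caller
that branches on errno after a failed read. The enum and function below are
assumptions made for illustration only; they show that code keyed on EIO
would take a different path if the mapping became ENODEV or ENXIO.

```c
#include <errno.h>

/* Hypothetical actions an application might take after a failed read. */
enum io_action {
	IO_RETRY_LATER,
	IO_REPORT_FAILURE,
	IO_DEVICE_GONE,
};

/*
 * Map a positive errno value to an action. An application written against
 * today's behavior sees EIO for an offline device and reports a failure;
 * with ENODEV or ENXIO it would follow the "device gone" path instead.
 */
static enum io_action action_for_errno(int err)
{
	switch (err) {
	case EAGAIN:
		return IO_RETRY_LATER;
	case ENODEV:
	case ENXIO:
		return IO_DEVICE_GONE;
	case EIO:
	default:
		return IO_REPORT_FAILURE;
	}
}
```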

-- 
Jens Axboe



* Re: [PATCH 1/2] block: introduce BLK_STS_OFFLINE
  2022-02-03 13:47       ` Jens Axboe
@ 2022-02-03 17:23         ` Song Liu
  2022-02-03 18:51           ` Jens Axboe
  2022-02-04  7:14           ` Hannes Reinecke
  0 siblings, 2 replies; 12+ messages in thread
From: Song Liu @ 2022-02-03 17:23 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Hannes Reinecke, Song Liu, linux-scsi, linux-block, Kernel Team,
	James E.J. Bottomley, Martin K. Petersen

Hi Hannes and Jens,

> On Feb 3, 2022, at 5:47 AM, Jens Axboe <axboe@kernel.dk> wrote:
> 
> On 2/3/22 12:24 AM, Hannes Reinecke wrote:
>> On 2/3/22 07:52, Song Liu wrote:
>>> CC linux-block (it was a typo in the original email)
>>> 
>>> On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>>>> 
>>>> Currently, drivers report BLK_STS_IOERR for devices that are not fully
>>>> online or are being removed. This behavior could cause confusion for users,
>>>> as these are not really I/O errors from the device.
>>>> 
>>>> Solve this issue with a new state BLK_STS_OFFLINE, which reports "device
>>>> offline error" in dmesg instead of "I/O error".
>>>> 
>>>> Signed-off-by: Song Liu <song@kernel.org>
>>>> ---
>>>>  block/blk-core.c          | 1 +
>>>>  include/linux/blk_types.h | 7 +++++++
>>>>  2 files changed, 8 insertions(+)
>>>> 
>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>> index 61f6a0dc4511..24035dd2eef1 100644
>>>> --- a/block/blk-core.c
>>>> +++ b/block/blk-core.c
>>>> @@ -164,6 +164,7 @@ static const struct {
>>>>         [BLK_STS_RESOURCE]      = { -ENOMEM,    "kernel resource" },
>>>>         [BLK_STS_DEV_RESOURCE]  = { -EBUSY,     "device resource" },
>>>>         [BLK_STS_AGAIN]         = { -EAGAIN,    "nonblocking retry" },
>>>> +       [BLK_STS_OFFLINE]       = { -EIO,       "device offline" },
>>>> 
>>>>         /* device mapper special case, should not leak out: */
>>>>         [BLK_STS_DM_REQUEUE]    = { -EREMCHG, "dm internal retry" },
>>>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>>>> index fe065c394fff..5561e58d158a 100644
>>>> --- a/include/linux/blk_types.h
>>>> +++ b/include/linux/blk_types.h
>>>> @@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
>>>>   */
>>>>  #define BLK_STS_ZONE_ACTIVE_RESOURCE   ((__force blk_status_t)16)
>>>> 
>>>> +/*
>>>> + * BLK_STS_OFFLINE is returned from the driver when the target device is offline
>>>> + * or is being taken offline. This could help differentiate the case where a
>>>> + * device is intentionally being shut down from a real I/O error.
>>>> + */
>>>> +#define BLK_STS_OFFLINE                ((__force blk_status_t)17)
>>>> +
>>>>  /**
>>>>   * blk_path_error - returns true if error may be path related
>>>>   * @error: status the request was completed with
>>>> --
>>>> 2.30.2
>>>> 
>> Please do not overload EIO here.
>> EIO already is a catch-all error if we don't know any better, but for 
>> the 'device offline' case we do (or rather should).
>> Please map it onto 'ENODEV' or 'ENXIO'.
> 
> It's deliberately EIO as not to force a change in behavior. I don't mind
> using something else, but that should be a separate change then.

Thanks for the feedback. Shall I send v2 with an extra patch that
changes EIO to ENODEV/ENXIO? Or shall we do that in a follow-up patch?
Also, any preference between ENODEV and ENXIO?

Thanks,
Song


* Re: [PATCH 1/2] block: introduce BLK_STS_OFFLINE
  2022-02-03 17:23         ` Song Liu
@ 2022-02-03 18:51           ` Jens Axboe
  2022-02-04  7:14           ` Hannes Reinecke
  1 sibling, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2022-02-03 18:51 UTC (permalink / raw)
  To: Song Liu, Jens Axboe
  Cc: Hannes Reinecke, Song Liu, linux-scsi, linux-block, Kernel Team,
	James E.J. Bottomley, Martin K. Petersen

On 2/3/22 10:23 AM, Song Liu wrote:
> Hi Hannes and Jens,
> 
>> On Feb 3, 2022, at 5:47 AM, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 2/3/22 12:24 AM, Hannes Reinecke wrote:
>>> On 2/3/22 07:52, Song Liu wrote:
>>>> CC linux-block (it was a typo in the original email)
>>>>
>>>> On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>>>>>
>>>>> Currently, drivers report BLK_STS_IOERR for devices that are not fully
>>>>> online or are being removed. This behavior could cause confusion for users,
>>>>> as these are not really I/O errors from the device.
>>>>>
>>>>> Solve this issue with a new state BLK_STS_OFFLINE, which reports "device
>>>>> offline error" in dmesg instead of "I/O error".
>>>>>
>>>>> Signed-off-by: Song Liu <song@kernel.org>
>>>>> ---
>>>>>  block/blk-core.c          | 1 +
>>>>>  include/linux/blk_types.h | 7 +++++++
>>>>>  2 files changed, 8 insertions(+)
>>>>>
>>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>>> index 61f6a0dc4511..24035dd2eef1 100644
>>>>> --- a/block/blk-core.c
>>>>> +++ b/block/blk-core.c
>>>>> @@ -164,6 +164,7 @@ static const struct {
>>>>>         [BLK_STS_RESOURCE]      = { -ENOMEM,    "kernel resource" },
>>>>>         [BLK_STS_DEV_RESOURCE]  = { -EBUSY,     "device resource" },
>>>>>         [BLK_STS_AGAIN]         = { -EAGAIN,    "nonblocking retry" },
>>>>> +       [BLK_STS_OFFLINE]       = { -EIO,       "device offline" },
>>>>>
>>>>>         /* device mapper special case, should not leak out: */
>>>>>         [BLK_STS_DM_REQUEUE]    = { -EREMCHG, "dm internal retry" },
>>>>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>>>>> index fe065c394fff..5561e58d158a 100644
>>>>> --- a/include/linux/blk_types.h
>>>>> +++ b/include/linux/blk_types.h
>>>>> @@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
>>>>>   */
>>>>>  #define BLK_STS_ZONE_ACTIVE_RESOURCE   ((__force blk_status_t)16)
>>>>>
>>>>> +/*
>>>>> + * BLK_STS_OFFLINE is returned from the driver when the target device is offline
>>>>> + * or is being taken offline. This could help differentiate the case where a
>>>>> + * device is intentionally being shut down from a real I/O error.
>>>>> + */
>>>>> +#define BLK_STS_OFFLINE                ((__force blk_status_t)17)
>>>>> +
>>>>>  /**
>>>>>   * blk_path_error - returns true if error may be path related
>>>>>   * @error: status the request was completed with
>>>>> --
>>>>> 2.30.2
>>>>>
>>> Please do not overload EIO here.
>>> EIO already is a catch-all error if we don't know any better, but for 
>>> the 'device offline' case we do (or rather should).
>>> Please map it onto 'ENODEV' or 'ENXIO'.
>>
>> It's deliberately EIO as not to force a change in behavior. I don't mind
>> using something else, but that should be a separate change then.
> 
> Thanks for the feedback. Shall I send v2 with an extra patch that
> changes EIO to ENODEV/ENXIO? Or shall we do that in a follow-up patch?
> Also, any preference between ENODEV and ENXIO?

Yeah, I think so, and perhaps mention in this patch that EIO was chosen
so as not to change the user-visible return value.

-- 
Jens Axboe



* Re: [PATCH 1/2] block: introduce BLK_STS_OFFLINE
  2022-02-03 17:23         ` Song Liu
  2022-02-03 18:51           ` Jens Axboe
@ 2022-02-04  7:14           ` Hannes Reinecke
  1 sibling, 0 replies; 12+ messages in thread
From: Hannes Reinecke @ 2022-02-04  7:14 UTC (permalink / raw)
  To: Song Liu, Jens Axboe
  Cc: Song Liu, linux-scsi, linux-block, Kernel Team,
	James E.J. Bottomley, Martin K. Petersen

On 2/3/22 18:23, Song Liu wrote:
> Hi Hannes and Jens,
> 
>> On Feb 3, 2022, at 5:47 AM, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 2/3/22 12:24 AM, Hannes Reinecke wrote:
>>> On 2/3/22 07:52, Song Liu wrote:
>>>> CC linux-block (it was a typo in the original email)
>>>>
>>>> On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>>>>>
>>>>> Currently, drivers report BLK_STS_IOERR for devices that are not fully
>>>>> online or are being removed. This behavior could cause confusion for users,
>>>>> as these are not really I/O errors from the device.
>>>>>
>>>>> Solve this issue with a new state BLK_STS_OFFLINE, which reports "device
>>>>> offline error" in dmesg instead of "I/O error".
>>>>>
>>>>> Signed-off-by: Song Liu <song@kernel.org>
>>>>> ---
>>>>>   block/blk-core.c          | 1 +
>>>>>   include/linux/blk_types.h | 7 +++++++
>>>>>   2 files changed, 8 insertions(+)
>>>>>
>>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>>> index 61f6a0dc4511..24035dd2eef1 100644
>>>>> --- a/block/blk-core.c
>>>>> +++ b/block/blk-core.c
>>>>> @@ -164,6 +164,7 @@ static const struct {
>>>>>          [BLK_STS_RESOURCE]      = { -ENOMEM,    "kernel resource" },
>>>>>          [BLK_STS_DEV_RESOURCE]  = { -EBUSY,     "device resource" },
>>>>>          [BLK_STS_AGAIN]         = { -EAGAIN,    "nonblocking retry" },
>>>>> +       [BLK_STS_OFFLINE]       = { -EIO,       "device offline" },
>>>>>
>>>>>          /* device mapper special case, should not leak out: */
>>>>>          [BLK_STS_DM_REQUEUE]    = { -EREMCHG, "dm internal retry" },
>>>>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>>>>> index fe065c394fff..5561e58d158a 100644
>>>>> --- a/include/linux/blk_types.h
>>>>> +++ b/include/linux/blk_types.h
>>>>> @@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
>>>>>    */
>>>>>   #define BLK_STS_ZONE_ACTIVE_RESOURCE   ((__force blk_status_t)16)
>>>>>
>>>>> +/*
>>>>> + * BLK_STS_OFFLINE is returned from the driver when the target device is offline
>>>>> + * or is being taken offline. This could help differentiate the case where a
>>>>> + * device is intentionally being shut down from a real I/O error.
>>>>> + */
>>>>> +#define BLK_STS_OFFLINE                ((__force blk_status_t)17)
>>>>> +
>>>>>   /**
>>>>>    * blk_path_error - returns true if error may be path related
>>>>>    * @error: status the request was completed with
>>>>> --
>>>>> 2.30.2
>>>>>
>>> Please do not overload EIO here.
>>> EIO already is a catch-all error if we don't know any better, but for
>>> the 'device offline' case we do (or rather should).
>>> Please map it onto 'ENODEV' or 'ENXIO'.
>>
>> It's deliberately EIO as not to force a change in behavior. I don't mind
>> using something else, but that should be a separate change then.
> 
> Thanks for the feedback. Shall I send v2 with an extra patch that
> changes EIO to ENODEV/ENXIO? Or shall we do that in a follow-up patch?
> Also, any preference between ENODEV and ENXIO?
> 
Please make it an additional patch, and use ENODEV as a return value.
For this patch you can add:

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

