* [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE
@ 2022-02-03 6:40 Song Liu
2022-02-03 6:40 ` [PATCH 1/2] block: introduce BLK_STS_OFFLINE Song Liu
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Song Liu @ 2022-02-03 6:40 UTC (permalink / raw)
To: linx-block, linux-scsi
Cc: kernel-team, jejb, martin.petersen, axboe, Song Liu
We have a use case where HDDs are regularly powered on and off to preserve power.
When a drive is being removed, we often see errors like
[ 172.803279] I/O error, dev sda, sector 3137184
These messages are confusing for automation that greps dmesg, as they look
very similar to real HDD errors.
Solve this issue with a new block status, BLK_STS_OFFLINE. After the change,
the error message looks like
[ 172.803279] device offline error, dev sda, sector 3137184
so that automation won't confuse these messages with real I/O errors.
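For illustration only, here is a minimal sketch of how log-scanning
automation could tell the two messages apart once the series is applied.
The enum and the function name are hypothetical and not part of the
series; only the two message strings come from the examples above.

```c
#include <string.h>

/* Hypothetical classification used only for this sketch. */
enum log_kind { LOG_OTHER, LOG_IO_ERROR, LOG_DEVICE_OFFLINE };

/*
 * Classify one dmesg line by the status text printed before ", dev".
 * The two patterns do not overlap, so the order of the checks does not
 * affect the result.
 */
enum log_kind classify_dmesg_line(const char *line)
{
	if (strstr(line, "device offline error, dev "))
		return LOG_DEVICE_OFFLINE;
	if (strstr(line, "I/O error, dev "))
		return LOG_IO_ERROR;
	return LOG_OTHER;
}
```

A monitoring script built this way would count only LOG_IO_ERROR lines
toward a drive's failure score and could ignore LOG_DEVICE_OFFLINE lines
seen during a planned power-down.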
Song Liu (2):
block: introduce BLK_STS_OFFLINE
scsi: use BLK_STS_OFFLINE for not fully online devices
block/blk-core.c | 1 +
drivers/scsi/scsi_lib.c | 2 +-
include/linux/blk_types.h | 7 +++++++
3 files changed, 9 insertions(+), 1 deletion(-)
--
2.30.2
* [PATCH 1/2] block: introduce BLK_STS_OFFLINE
2022-02-03 6:40 [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE Song Liu
@ 2022-02-03 6:40 ` Song Liu
2022-02-03 6:52 ` Song Liu
2022-02-03 6:40 ` [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices Song Liu
2022-02-03 6:52 ` [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE Song Liu
2 siblings, 1 reply; 12+ messages in thread
From: Song Liu @ 2022-02-03 6:40 UTC (permalink / raw)
To: linx-block, linux-scsi
Cc: kernel-team, jejb, martin.petersen, axboe, Song Liu
Currently, drivers report BLK_STS_IOERR for devices that are not fully
online or are being removed. This behavior could confuse users,
as these are not really I/O errors from the device.
Solve this issue with a new status, BLK_STS_OFFLINE, which reports "device
offline error" in dmesg instead of "I/O error".
Signed-off-by: Song Liu <song@kernel.org>
---
block/blk-core.c | 1 +
include/linux/blk_types.h | 7 +++++++
2 files changed, 8 insertions(+)
diff --git a/block/blk-core.c b/block/blk-core.c
index 61f6a0dc4511..24035dd2eef1 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -164,6 +164,7 @@ static const struct {
[BLK_STS_RESOURCE] = { -ENOMEM, "kernel resource" },
[BLK_STS_DEV_RESOURCE] = { -EBUSY, "device resource" },
[BLK_STS_AGAIN] = { -EAGAIN, "nonblocking retry" },
+ [BLK_STS_OFFLINE] = { -EIO, "device offline" },
/* device mapper special case, should not leak out: */
[BLK_STS_DM_REQUEUE] = { -EREMCHG, "dm internal retry" },
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index fe065c394fff..5561e58d158a 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
*/
#define BLK_STS_ZONE_ACTIVE_RESOURCE ((__force blk_status_t)16)
+/*
+ * BLK_STS_OFFLINE is returned from the driver when the target device is offline
+ * or is being taken offline. This could help differentiate the case where a
+ * device is intentionally being shut down from a real I/O error.
+ */
+#define BLK_STS_OFFLINE ((__force blk_status_t)17)
+
/**
* blk_path_error - returns true if error may be path related
* @error: status the request was completed with
--
2.30.2
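As a rough user-space model of what this patch adds, the sketch below
mimics a status table that maps each status code to an errno and to the
text printed before " error" in the log. All names with a model_/MODEL_
prefix are hypothetical, and the numbering other than 17 is a
placeholder; this is not the kernel implementation.

```c
#include <errno.h>
#include <stdio.h>

/* Stand-in status codes for this sketch; only 17 is taken from the patch. */
enum {
	MODEL_STS_OK = 0,
	MODEL_STS_IOERR = 10,
	MODEL_STS_OFFLINE = 17,
};

struct model_status_info {
	int err;          /* errno handed back to the submitter */
	const char *name; /* text printed before " error" in the log */
};

/* Unlisted slots stay zeroed, so name == NULL marks an unknown status. */
static const struct model_status_info model_errors[] = {
	[MODEL_STS_OK]      = { 0,    "" },
	[MODEL_STS_IOERR]   = { -EIO, "I/O" },
	[MODEL_STS_OFFLINE] = { -EIO, "device offline" },
};

#define MODEL_NR_STATUS (sizeof(model_errors) / sizeof(model_errors[0]))

/* Map a status code to an errno, falling back to -EIO when unknown. */
int model_status_to_errno(unsigned int status)
{
	if (status >= MODEL_NR_STATUS || !model_errors[status].name)
		return -EIO;
	return model_errors[status].err;
}

/* Build the log line body; returns the length snprintf() reports. */
int model_format_error(char *buf, size_t len, unsigned int status,
		       const char *disk, unsigned long long sector)
{
	const char *name = "unknown";

	if (status < MODEL_NR_STATUS && model_errors[status].name)
		name = model_errors[status].name;
	return snprintf(buf, len, "%s error, dev %s, sector %llu",
			name, disk, sector);
}
```

In this model both the I/O and the offline status return -EIO, so callers
see the same errno as before; only the logged text differs.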
* [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices
2022-02-03 6:40 [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE Song Liu
2022-02-03 6:40 ` [PATCH 1/2] block: introduce BLK_STS_OFFLINE Song Liu
@ 2022-02-03 6:40 ` Song Liu
2022-02-03 6:53 ` Song Liu
2022-02-03 6:52 ` [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE Song Liu
2 siblings, 1 reply; 12+ messages in thread
From: Song Liu @ 2022-02-03 6:40 UTC (permalink / raw)
To: linx-block, linux-scsi
Cc: kernel-team, jejb, martin.petersen, axboe, Song Liu
The new error message for such cases looks like
[ 172.809565] device offline error, dev sda, sector 3138208 ...
which will not be confused with a regular I/O error (BLK_STS_IOERR).
Signed-off-by: Song Liu <song@kernel.org>
---
drivers/scsi/scsi_lib.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0a70aa763a96..e30bc51578e9 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1276,7 +1276,7 @@ scsi_device_state_check(struct scsi_device *sdev, struct request *req)
* power management commands.
*/
if (req && !(req->rq_flags & RQF_PM))
- return BLK_STS_IOERR;
+ return BLK_STS_OFFLINE;
return BLK_STS_OK;
}
}
--
2.30.2
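The changed branch can be modelled in user space roughly as follows. This
is a sketch under stated assumptions: the struct, the flag bit, and the
enum are hypothetical stand-ins for struct request, RQF_PM, and
blk_status_t, and the function covers only the branch touched by the diff.

```c
#include <stddef.h>

#define MODEL_RQF_PM (1u << 0) /* hypothetical stand-in for RQF_PM */

enum model_status { MODEL_OK, MODEL_IOERR, MODEL_OFFLINE };

struct model_request {
	unsigned int rq_flags;
};

/*
 * Model of the branch for a device that is not fully online: only
 * power management requests are let through; everything else now
 * completes with the offline status instead of the generic I/O error.
 */
enum model_status model_not_online_check(const struct model_request *req)
{
	if (req && !(req->rq_flags & MODEL_RQF_PM))
		return MODEL_OFFLINE;
	return MODEL_OK;
}
```

A regular read or write request carries no PM flag and gets
MODEL_OFFLINE, while a request flagged MODEL_RQF_PM, or a call with no
request at all, still returns MODEL_OK.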
* Re: [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE
2022-02-03 6:40 [PATCH 0/2] block: scsi: introduce and use BLK_STS_OFFLINE Song Liu
2022-02-03 6:40 ` [PATCH 1/2] block: introduce BLK_STS_OFFLINE Song Liu
2022-02-03 6:40 ` [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices Song Liu
@ 2022-02-03 6:52 ` Song Liu
2 siblings, 0 replies; 12+ messages in thread
From: Song Liu @ 2022-02-03 6:52 UTC (permalink / raw)
To: linux-scsi, linux-block
Cc: Kernel Team, James E.J. Bottomley, Martin K. Petersen, Jens Axboe
CC linux-block (it was a typo in the original email)
On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>
> We have a use case where HDDs are regularly powered on and off to preserve power.
> When a drive is being removed, we often see errors like
>
> [ 172.803279] I/O error, dev sda, sector 3137184
>
> These messages are confusing for automation that greps dmesg, as they look
> very similar to real HDD errors.
>
> Solve this issue with a new block status, BLK_STS_OFFLINE. After the change,
> the error message looks like
>
> [ 172.803279] device offline error, dev sda, sector 3137184
>
> so that automation won't confuse these messages with real I/O errors.
>
> Song Liu (2):
> block: introduce BLK_STS_OFFLINE
> scsi: use BLK_STS_OFFLINE for not fully online devices
>
> block/blk-core.c | 1 +
> drivers/scsi/scsi_lib.c | 2 +-
> include/linux/blk_types.h | 7 +++++++
> 3 files changed, 9 insertions(+), 1 deletion(-)
>
> --
> 2.30.2
* Re: [PATCH 1/2] block: introduce BLK_STS_OFFLINE
2022-02-03 6:40 ` [PATCH 1/2] block: introduce BLK_STS_OFFLINE Song Liu
@ 2022-02-03 6:52 ` Song Liu
2022-02-03 7:24 ` Hannes Reinecke
0 siblings, 1 reply; 12+ messages in thread
From: Song Liu @ 2022-02-03 6:52 UTC (permalink / raw)
To: linux-scsi, linux-block
Cc: Kernel Team, James E.J. Bottomley, Martin K. Petersen, Jens Axboe
CC linux-block (it was a typo in the original email)
On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>
> Currently, drivers report BLK_STS_IOERR for devices that are not fully
> online or are being removed. This behavior could confuse users,
> as these are not really I/O errors from the device.
>
> Solve this issue with a new status, BLK_STS_OFFLINE, which reports "device
> offline error" in dmesg instead of "I/O error".
>
> Signed-off-by: Song Liu <song@kernel.org>
> ---
> block/blk-core.c | 1 +
> include/linux/blk_types.h | 7 +++++++
> 2 files changed, 8 insertions(+)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 61f6a0dc4511..24035dd2eef1 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -164,6 +164,7 @@ static const struct {
> [BLK_STS_RESOURCE] = { -ENOMEM, "kernel resource" },
> [BLK_STS_DEV_RESOURCE] = { -EBUSY, "device resource" },
> [BLK_STS_AGAIN] = { -EAGAIN, "nonblocking retry" },
> + [BLK_STS_OFFLINE] = { -EIO, "device offline" },
>
> /* device mapper special case, should not leak out: */
> [BLK_STS_DM_REQUEUE] = { -EREMCHG, "dm internal retry" },
> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
> index fe065c394fff..5561e58d158a 100644
> --- a/include/linux/blk_types.h
> +++ b/include/linux/blk_types.h
> @@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
> */
> #define BLK_STS_ZONE_ACTIVE_RESOURCE ((__force blk_status_t)16)
>
> +/*
> + * BLK_STS_OFFLINE is returned from the driver when the target device is offline
> + * or is being taken offline. This could help differentiate the case where a
> + * device is intentionally being shut down from a real I/O error.
> + */
> +#define BLK_STS_OFFLINE ((__force blk_status_t)17)
> +
> /**
> * blk_path_error - returns true if error may be path related
> * @error: status the request was completed with
> --
> 2.30.2
>
* Re: [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices
2022-02-03 6:40 ` [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices Song Liu
@ 2022-02-03 6:53 ` Song Liu
2022-02-03 7:24 ` Hannes Reinecke
0 siblings, 1 reply; 12+ messages in thread
From: Song Liu @ 2022-02-03 6:53 UTC (permalink / raw)
To: linux-scsi, linux-block
Cc: Kernel Team, James E.J. Bottomley, Martin K. Petersen, Jens Axboe
CC linux-block (it was a typo in the original email)
On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>
> The new error message for such cases looks like
>
> [ 172.809565] device offline error, dev sda, sector 3138208 ...
>
> which will not be confused with a regular I/O error (BLK_STS_IOERR).
>
> Signed-off-by: Song Liu <song@kernel.org>
> ---
> drivers/scsi/scsi_lib.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 0a70aa763a96..e30bc51578e9 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1276,7 +1276,7 @@ scsi_device_state_check(struct scsi_device *sdev, struct request *req)
> * power management commands.
> */
> if (req && !(req->rq_flags & RQF_PM))
> - return BLK_STS_IOERR;
> + return BLK_STS_OFFLINE;
> return BLK_STS_OK;
> }
> }
> --
> 2.30.2
>
* Re: [PATCH 1/2] block: introduce BLK_STS_OFFLINE
2022-02-03 6:52 ` Song Liu
@ 2022-02-03 7:24 ` Hannes Reinecke
2022-02-03 13:47 ` Jens Axboe
0 siblings, 1 reply; 12+ messages in thread
From: Hannes Reinecke @ 2022-02-03 7:24 UTC (permalink / raw)
To: Song Liu, linux-scsi, linux-block
Cc: Kernel Team, James E.J. Bottomley, Martin K. Petersen, Jens Axboe
On 2/3/22 07:52, Song Liu wrote:
> CC linux-block (it was a typo in the original email)
>
> On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>>
>> Currently, drivers report BLK_STS_IOERR for devices that are not fully
>> online or are being removed. This behavior could confuse users,
>> as these are not really I/O errors from the device.
>>
>> Solve this issue with a new status, BLK_STS_OFFLINE, which reports "device
>> offline error" in dmesg instead of "I/O error".
>>
>> Signed-off-by: Song Liu <song@kernel.org>
>> ---
>> block/blk-core.c | 1 +
>> include/linux/blk_types.h | 7 +++++++
>> 2 files changed, 8 insertions(+)
>>
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 61f6a0dc4511..24035dd2eef1 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -164,6 +164,7 @@ static const struct {
>> [BLK_STS_RESOURCE] = { -ENOMEM, "kernel resource" },
>> [BLK_STS_DEV_RESOURCE] = { -EBUSY, "device resource" },
>> [BLK_STS_AGAIN] = { -EAGAIN, "nonblocking retry" },
>> + [BLK_STS_OFFLINE] = { -EIO, "device offline" },
>>
>> /* device mapper special case, should not leak out: */
>> [BLK_STS_DM_REQUEUE] = { -EREMCHG, "dm internal retry" },
>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>> index fe065c394fff..5561e58d158a 100644
>> --- a/include/linux/blk_types.h
>> +++ b/include/linux/blk_types.h
>> @@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
>> */
>> #define BLK_STS_ZONE_ACTIVE_RESOURCE ((__force blk_status_t)16)
>>
>> +/*
>> + * BLK_STS_OFFLINE is returned from the driver when the target device is offline
>> + * or is being taken offline. This could help differentiate the case where a
>> + * device is intentionally being shut down from a real I/O error.
>> + */
>> +#define BLK_STS_OFFLINE ((__force blk_status_t)17)
>> +
>> /**
>> * blk_path_error - returns true if error may be path related
>> * @error: status the request was completed with
>> --
>> 2.30.2
>>
Please do not overload EIO here.
EIO already is a catch-all error if we don't know any better, but for
the 'device offline' case we do (or rather should).
Please map it onto 'ENODEV' or 'ENXIO'.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
* Re: [PATCH 2/2] scsi: use BLK_STS_OFFLINE for not fully online devices
2022-02-03 6:53 ` Song Liu
@ 2022-02-03 7:24 ` Hannes Reinecke
0 siblings, 0 replies; 12+ messages in thread
From: Hannes Reinecke @ 2022-02-03 7:24 UTC (permalink / raw)
To: Song Liu, linux-scsi, linux-block
Cc: Kernel Team, James E.J. Bottomley, Martin K. Petersen, Jens Axboe
On 2/3/22 07:53, Song Liu wrote:
> CC linux-block (it was a typo in the original email)
>
> On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>>
>> The new error message for such cases looks like
>>
>> [ 172.809565] device offline error, dev sda, sector 3138208 ...
>>
>> which will not be confused with a regular I/O error (BLK_STS_IOERR).
>>
>> Signed-off-by: Song Liu <song@kernel.org>
>> ---
>> drivers/scsi/scsi_lib.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>> index 0a70aa763a96..e30bc51578e9 100644
>> --- a/drivers/scsi/scsi_lib.c
>> +++ b/drivers/scsi/scsi_lib.c
>> @@ -1276,7 +1276,7 @@ scsi_device_state_check(struct scsi_device *sdev, struct request *req)
>> * power management commands.
>> */
>> if (req && !(req->rq_flags & RQF_PM))
>> - return BLK_STS_IOERR;
>> + return BLK_STS_OFFLINE;
>> return BLK_STS_OK;
>> }
>> }
>> --
>> 2.30.2
>>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
* Re: [PATCH 1/2] block: introduce BLK_STS_OFFLINE
2022-02-03 7:24 ` Hannes Reinecke
@ 2022-02-03 13:47 ` Jens Axboe
2022-02-03 17:23 ` Song Liu
0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2022-02-03 13:47 UTC (permalink / raw)
To: Hannes Reinecke, Song Liu, linux-scsi, linux-block
Cc: Kernel Team, James E.J. Bottomley, Martin K. Petersen
On 2/3/22 12:24 AM, Hannes Reinecke wrote:
> On 2/3/22 07:52, Song Liu wrote:
>> CC linux-block (it was a typo in the original email)
>>
>> On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>>>
>>> Currently, drivers report BLK_STS_IOERR for devices that are not fully
>>> online or are being removed. This behavior could confuse users,
>>> as these are not really I/O errors from the device.
>>>
>>> Solve this issue with a new status, BLK_STS_OFFLINE, which reports "device
>>> offline error" in dmesg instead of "I/O error".
>>>
>>> Signed-off-by: Song Liu <song@kernel.org>
>>> ---
>>> block/blk-core.c | 1 +
>>> include/linux/blk_types.h | 7 +++++++
>>> 2 files changed, 8 insertions(+)
>>>
>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>> index 61f6a0dc4511..24035dd2eef1 100644
>>> --- a/block/blk-core.c
>>> +++ b/block/blk-core.c
>>> @@ -164,6 +164,7 @@ static const struct {
>>> [BLK_STS_RESOURCE] = { -ENOMEM, "kernel resource" },
>>> [BLK_STS_DEV_RESOURCE] = { -EBUSY, "device resource" },
>>> [BLK_STS_AGAIN] = { -EAGAIN, "nonblocking retry" },
>>> + [BLK_STS_OFFLINE] = { -EIO, "device offline" },
>>>
>>> /* device mapper special case, should not leak out: */
>>> [BLK_STS_DM_REQUEUE] = { -EREMCHG, "dm internal retry" },
>>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>>> index fe065c394fff..5561e58d158a 100644
>>> --- a/include/linux/blk_types.h
>>> +++ b/include/linux/blk_types.h
>>> @@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
>>> */
>>> #define BLK_STS_ZONE_ACTIVE_RESOURCE ((__force blk_status_t)16)
>>>
>>> +/*
>>> + * BLK_STS_OFFLINE is returned from the driver when the target device is offline
>>> + * or is being taken offline. This could help differentiate the case where a
>>> + * device is intentionally being shut down from a real I/O error.
>>> + */
>>> +#define BLK_STS_OFFLINE ((__force blk_status_t)17)
>>> +
>>> /**
>>> * blk_path_error - returns true if error may be path related
>>> * @error: status the request was completed with
>>> --
>>> 2.30.2
>>>
> Please do not overload EIO here.
> EIO already is a catch-all error if we don't know any better, but for
> the 'device offline' case we do (or rather should).
> Please map it onto 'ENODEV' or 'ENXIO'.
It's deliberately EIO so as not to force a change in behavior. I don't mind
using something else, but that should be a separate change then.
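To illustrate why the errno is user-visible behavior, here is a small
sketch of a hypothetical user-space error handler. The enum and function
names are invented for this example. A program written like this treats
EIO and ENODEV/ENXIO differently, so remapping the offline status to a
new errno would change which branch it takes.

```c
#include <errno.h>

/* Hypothetical reactions of a storage daemon to a failed read(). */
enum io_action {
	ACT_NONE,
	ACT_REPORT_DISK_FAILURE,
	ACT_REPORT_DEVICE_GONE,
	ACT_OTHER,
};

/* Pick a reaction from the positive errno value seen by user space. */
enum io_action handle_read_errno(int err)
{
	switch (err) {
	case 0:
		return ACT_NONE;
	case EIO:
		return ACT_REPORT_DISK_FAILURE;
	case ENODEV:
	case ENXIO:
		return ACT_REPORT_DEVICE_GONE;
	default:
		return ACT_OTHER;
	}
}
```

With the mapping kept at EIO, such a program behaves exactly as it did
before the series; a later switch to ENODEV or ENXIO would move it to the
device-gone branch, which is the behavior change being deferred to a
separate patch.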
--
Jens Axboe
* Re: [PATCH 1/2] block: introduce BLK_STS_OFFLINE
2022-02-03 13:47 ` Jens Axboe
@ 2022-02-03 17:23 ` Song Liu
2022-02-03 18:51 ` Jens Axboe
2022-02-04 7:14 ` Hannes Reinecke
0 siblings, 2 replies; 12+ messages in thread
From: Song Liu @ 2022-02-03 17:23 UTC (permalink / raw)
To: Jens Axboe
Cc: Hannes Reinecke, Song Liu, linux-scsi, linux-block, Kernel Team,
James E.J. Bottomley, Martin K. Petersen
Hi Hannes and Jens,
> On Feb 3, 2022, at 5:47 AM, Jens Axboe <axboe@kernel.dk> wrote:
>
> On 2/3/22 12:24 AM, Hannes Reinecke wrote:
>> On 2/3/22 07:52, Song Liu wrote:
>>> CC linux-block (it was a typo in the original email)
>>>
>>> On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>>>>
>>>> Currently, drivers report BLK_STS_IOERR for devices that are not fully
>>>> online or are being removed. This behavior could confuse users,
>>>> as these are not really I/O errors from the device.
>>>>
>>>> Solve this issue with a new status, BLK_STS_OFFLINE, which reports "device
>>>> offline error" in dmesg instead of "I/O error".
>>>>
>>>> Signed-off-by: Song Liu <song@kernel.org>
>>>> ---
>>>> block/blk-core.c | 1 +
>>>> include/linux/blk_types.h | 7 +++++++
>>>> 2 files changed, 8 insertions(+)
>>>>
>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>> index 61f6a0dc4511..24035dd2eef1 100644
>>>> --- a/block/blk-core.c
>>>> +++ b/block/blk-core.c
>>>> @@ -164,6 +164,7 @@ static const struct {
>>>> [BLK_STS_RESOURCE] = { -ENOMEM, "kernel resource" },
>>>> [BLK_STS_DEV_RESOURCE] = { -EBUSY, "device resource" },
>>>> [BLK_STS_AGAIN] = { -EAGAIN, "nonblocking retry" },
>>>> + [BLK_STS_OFFLINE] = { -EIO, "device offline" },
>>>>
>>>> /* device mapper special case, should not leak out: */
>>>> [BLK_STS_DM_REQUEUE] = { -EREMCHG, "dm internal retry" },
>>>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>>>> index fe065c394fff..5561e58d158a 100644
>>>> --- a/include/linux/blk_types.h
>>>> +++ b/include/linux/blk_types.h
>>>> @@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
>>>> */
>>>> #define BLK_STS_ZONE_ACTIVE_RESOURCE ((__force blk_status_t)16)
>>>>
>>>> +/*
>>>> + * BLK_STS_OFFLINE is returned from the driver when the target device is offline
>>>> + * or is being taken offline. This could help differentiate the case where a
>>>> + * device is intentionally being shut down from a real I/O error.
>>>> + */
>>>> +#define BLK_STS_OFFLINE ((__force blk_status_t)17)
>>>> +
>>>> /**
>>>> * blk_path_error - returns true if error may be path related
>>>> * @error: status the request was completed with
>>>> --
>>>> 2.30.2
>>>>
>> Please do not overload EIO here.
>> EIO already is a catch-all error if we don't know any better, but for
>> the 'device offline' case we do (or rather should).
>> Please map it onto 'ENODEV' or 'ENXIO'.
>
> It's deliberately EIO so as not to force a change in behavior. I don't mind
> using something else, but that should be a separate change then.
Thanks for the feedback. Shall I send v2 with an extra patch that
changes EIO to ENODEV/ENXIO? Or shall we do that in a follow-up patch?
Also, any preference between ENODEV and ENXIO?
Thanks,
Song
* Re: [PATCH 1/2] block: introduce BLK_STS_OFFLINE
2022-02-03 17:23 ` Song Liu
@ 2022-02-03 18:51 ` Jens Axboe
2022-02-04 7:14 ` Hannes Reinecke
1 sibling, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2022-02-03 18:51 UTC (permalink / raw)
To: Song Liu, Jens Axboe
Cc: Hannes Reinecke, Song Liu, linux-scsi, linux-block, Kernel Team,
James E.J. Bottomley, Martin K. Petersen
On 2/3/22 10:23 AM, Song Liu wrote:
> Hi Hannes and Jens,
>
>> On Feb 3, 2022, at 5:47 AM, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 2/3/22 12:24 AM, Hannes Reinecke wrote:
>>> On 2/3/22 07:52, Song Liu wrote:
>>>> CC linux-block (it was a typo in the original email)
>>>>
>>>> On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>>>>>
>>>>> Currently, drivers report BLK_STS_IOERR for devices that are not fully
>>>>> online or are being removed. This behavior could confuse users,
>>>>> as these are not really I/O errors from the device.
>>>>>
>>>>> Solve this issue with a new status, BLK_STS_OFFLINE, which reports "device
>>>>> offline error" in dmesg instead of "I/O error".
>>>>>
>>>>> Signed-off-by: Song Liu <song@kernel.org>
>>>>> ---
>>>>> block/blk-core.c | 1 +
>>>>> include/linux/blk_types.h | 7 +++++++
>>>>> 2 files changed, 8 insertions(+)
>>>>>
>>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>>> index 61f6a0dc4511..24035dd2eef1 100644
>>>>> --- a/block/blk-core.c
>>>>> +++ b/block/blk-core.c
>>>>> @@ -164,6 +164,7 @@ static const struct {
>>>>> [BLK_STS_RESOURCE] = { -ENOMEM, "kernel resource" },
>>>>> [BLK_STS_DEV_RESOURCE] = { -EBUSY, "device resource" },
>>>>> [BLK_STS_AGAIN] = { -EAGAIN, "nonblocking retry" },
>>>>> + [BLK_STS_OFFLINE] = { -EIO, "device offline" },
>>>>>
>>>>> /* device mapper special case, should not leak out: */
>>>>> [BLK_STS_DM_REQUEUE] = { -EREMCHG, "dm internal retry" },
>>>>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>>>>> index fe065c394fff..5561e58d158a 100644
>>>>> --- a/include/linux/blk_types.h
>>>>> +++ b/include/linux/blk_types.h
>>>>> @@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
>>>>> */
>>>>> #define BLK_STS_ZONE_ACTIVE_RESOURCE ((__force blk_status_t)16)
>>>>>
>>>>> +/*
>>>>> + * BLK_STS_OFFLINE is returned from the driver when the target device is offline
>>>>> + * or is being taken offline. This could help differentiate the case where a
>>>>> + * device is intentionally being shut down from a real I/O error.
>>>>> + */
>>>>> +#define BLK_STS_OFFLINE ((__force blk_status_t)17)
>>>>> +
>>>>> /**
>>>>> * blk_path_error - returns true if error may be path related
>>>>> * @error: status the request was completed with
>>>>> --
>>>>> 2.30.2
>>>>>
>>> Please do not overload EIO here.
>>> EIO already is a catch-all error if we don't know any better, but for
>>> the 'device offline' case we do (or rather should).
>>> Please map it onto 'ENODEV' or 'ENXIO'.
>>
>> It's deliberately EIO so as not to force a change in behavior. I don't mind
>> using something else, but that should be a separate change then.
>
> Thanks for the feedback. Shall I send v2 with an extra patch that
> changes EIO to ENODEV/ENXIO? Or shall we do that in a follow-up patch?
> Also, any preference between ENODEV and ENXIO?
Yeah, I think so, and perhaps mention in this patch that EIO was
chosen so as not to change the user-visible return value.
--
Jens Axboe
* Re: [PATCH 1/2] block: introduce BLK_STS_OFFLINE
2022-02-03 17:23 ` Song Liu
2022-02-03 18:51 ` Jens Axboe
@ 2022-02-04 7:14 ` Hannes Reinecke
1 sibling, 0 replies; 12+ messages in thread
From: Hannes Reinecke @ 2022-02-04 7:14 UTC (permalink / raw)
To: Song Liu, Jens Axboe
Cc: Song Liu, linux-scsi, linux-block, Kernel Team,
James E.J. Bottomley, Martin K. Petersen
On 2/3/22 18:23, Song Liu wrote:
> Hi Hannes and Jens,
>
>> On Feb 3, 2022, at 5:47 AM, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 2/3/22 12:24 AM, Hannes Reinecke wrote:
>>> On 2/3/22 07:52, Song Liu wrote:
>>>> CC linux-block (it was a typo in the original email)
>>>>
>>>> On Wed, Feb 2, 2022 at 10:40 PM Song Liu <song@kernel.org> wrote:
>>>>>
>>>>> Currently, drivers report BLK_STS_IOERR for devices that are not fully
>>>>> online or are being removed. This behavior could confuse users,
>>>>> as these are not really I/O errors from the device.
>>>>>
>>>>> Solve this issue with a new status, BLK_STS_OFFLINE, which reports "device
>>>>> offline error" in dmesg instead of "I/O error".
>>>>>
>>>>> Signed-off-by: Song Liu <song@kernel.org>
>>>>> ---
>>>>> block/blk-core.c | 1 +
>>>>> include/linux/blk_types.h | 7 +++++++
>>>>> 2 files changed, 8 insertions(+)
>>>>>
>>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>>> index 61f6a0dc4511..24035dd2eef1 100644
>>>>> --- a/block/blk-core.c
>>>>> +++ b/block/blk-core.c
>>>>> @@ -164,6 +164,7 @@ static const struct {
>>>>> [BLK_STS_RESOURCE] = { -ENOMEM, "kernel resource" },
>>>>> [BLK_STS_DEV_RESOURCE] = { -EBUSY, "device resource" },
>>>>> [BLK_STS_AGAIN] = { -EAGAIN, "nonblocking retry" },
>>>>> + [BLK_STS_OFFLINE] = { -EIO, "device offline" },
>>>>>
>>>>> /* device mapper special case, should not leak out: */
>>>>> [BLK_STS_DM_REQUEUE] = { -EREMCHG, "dm internal retry" },
>>>>> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
>>>>> index fe065c394fff..5561e58d158a 100644
>>>>> --- a/include/linux/blk_types.h
>>>>> +++ b/include/linux/blk_types.h
>>>>> @@ -153,6 +153,13 @@ typedef u8 __bitwise blk_status_t;
>>>>> */
>>>>> #define BLK_STS_ZONE_ACTIVE_RESOURCE ((__force blk_status_t)16)
>>>>>
>>>>> +/*
>>>>> + * BLK_STS_OFFLINE is returned from the driver when the target device is offline
>>>>> + * or is being taken offline. This could help differentiate the case where a
>>>>> + * device is intentionally being shut down from a real I/O error.
>>>>> + */
>>>>> +#define BLK_STS_OFFLINE ((__force blk_status_t)17)
>>>>> +
>>>>> /**
>>>>> * blk_path_error - returns true if error may be path related
>>>>> * @error: status the request was completed with
>>>>> --
>>>>> 2.30.2
>>>>>
>>> Please do not overload EIO here.
>>> EIO already is a catch-all error if we don't know any better, but for
>>> the 'device offline' case we do (or rather should).
>>> Please map it onto 'ENODEV' or 'ENXIO'.
>>
>> It's deliberately EIO so as not to force a change in behavior. I don't mind
>> using something else, but that should be a separate change then.
>
> Thanks for the feedback. Shall I send v2 with an extra patch that
> changes EIO to ENODEV/ENXIO? Or shall we do that in a follow-up patch?
> Also, any preference between ENODEV and ENXIO?
>
Please make it an additional patch, and use ENODEV as the return value.
For this patch you can add:
Reviewed-by: Hannes Reinecke <hare@suse.de>
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer