* Best practices for handling drive failures during a run?

From: Nick Neumann @ 2022-09-07 15:58 UTC
To: fio

I was wondering if there were any recommendations/suggestions on
handling drive failures during a fio run. I hit one yesterday with a
60 second mixed use test on an SSD. 51 seconds in, the drive basically
stopped responding. (A separate program that periodically calls
smartctl to get drive state also showed something was up, as data like
temperature was missing.)

At 107 seconds, a read completed, and fio exited.

It made me wonder what would have happened if the test was not time
limited - e.g., a full drive write. Would it have just hung, waiting
forever? Or would the OS eventually get back to fio and tell it the
submitted operations have failed and fio would exit?

Any ideas on ways to test the behavior, or areas of the code to look at?

I'm basically looking for input on how to make sure fio does not hang
in such situations. And even better would be if I could get fio to
return an error if it does happen - I could see the controls for
reporting error being configurable - e.g., if an operation doesn't
return for N seconds, stop the job and return an error. I'm happy to
work on implementing stuff to help with this, and wanted to see where
things currently are at and what others thought about the general
issue.

Thanks,
Nick
* Re: Best practices for handling drive failures during a run?

From: Damien Le Moal @ 2022-09-07 21:01 UTC
To: Nick Neumann, fio

On 9/8/22 00:58, Nick Neumann wrote:
> I was wondering if there were any recommendations/suggestions on
> handling drive failures during a fio run. I hit one yesterday with a
> 60 second mixed use test on an SSD. 51 seconds in, the drive basically
> stopped responding. (A separate program that periodically calls
> smartctl to get drive state also showed something was up, as data like
> temperature was missing.)
>
> At 107 seconds, a read completed, and fio exited.
>
> It made me wonder what would have happened if the test was not time
> limited - e.g., a full drive write. Would it have just hung, waiting
> forever? Or would the OS eventually get back to fio and tell it the
> submitted operations have failed and fio would exit?

Unless you are using continue_on_error=io (or "all"), fio will stop if it
sees an IO error, or at least the job that gets the IO error will stop.
The IO error will come from the kernel when your drive stops responding
(IO timeout and is failed and the drive is reset in that case).

> Any ideas on ways to test the behavior, or areas of the code to look at?

Which behavior? That fio stops? You can try continue_on_error=none and
fio will not stop until it reaches the time or size limit, even if some
IOs fail.

> I'm basically looking for input on how to make sure fio does not hang
> in such situations. And even better would be if I could get fio to
> return an error if it does happen - I could see the controls for
> reporting error being configurable - e.g., if an operation doesn't
> return for N seconds, stop the job and return an error. I'm happy to
> work on implementing stuff to help with this, and wanted to see where
> things currently are at and what others thought about the general
> issue.

The default IO timeout for the kernel is 30s. If your drive stops
responding for more than that, IOs will be aborted and failed (the user
sees an error) and drive reset.

> Thanks,
> Nick

--
Damien Le Moal
Western Digital Research
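
For readers following the thread, a minimal sketch of the two behaviors
Damien describes. The device path and job options are placeholders, not
taken from the thread; continue_on_error and error_dump are documented
fio options, but check the HOWTO for your fio version:

# Default error handling: the first IO error stops the job and fio exits
# with a non-zero status.
sudo fio --name=errtest --filename=/dev/sdX --rw=randread --bs=4k \
    --ioengine=libaio --iodepth=16 --direct=1 --runtime=60 --time_based

# Keep running past IO errors and report them at the end instead.
sudo fio --name=errtest --filename=/dev/sdX --rw=randread --bs=4k \
    --ioengine=libaio --iodepth=16 --direct=1 --runtime=60 --time_based \
    --continue_on_error=all --error_dump=1
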
* Re: Best practices for handling drive failures during a run?

From: Nick Neumann @ 2022-09-08 16:24 UTC
To: Damien Le Moal; +Cc: fio

On Wed, Sep 7, 2022 at 4:01 PM Damien Le Moal
<damien.lemoal@opensource.wdc.com> wrote:
> Unless you are using continue_on_error=io (or "all"), fio will stop if it
> sees an IO error, or at least the job that gets the IO error will stop.
> The IO error will come from the kernel when your drive stops responding
> (IO timeout and is failed and the drive is reset in that case).

Thanks for this info.

> Which behavior? That fio stops? You can try continue_on_error=none and
> fio will not stop until it reaches the time or size limit, even if some
> IOs fail.

I would like fio to fail and exit when the I/O error happens. I was
wondering about a way to setup a scenario where an artificial IO error
will occur to make sure it does, if that makes sense.

> The default IO timeout for the kernel is 30s. If your drive stops
> responding for more than that, IOs will be aborted and failed (the user
> sees an error) and drive reset.

Hmm. I had 65 seconds between any I/O; it sounds like that would've
been enough to fail things, but fio returned immediately after that 65
second delayed I/O, and with no error.

I also found the drive timeout error in syslog:

Sep 7 12:37:43 localhost kernel: [ 4354.600211] nvme nvme0: I/O 870 QID 4 timeout, aborting
Sep 7 12:37:43 localhost kernel: [ 4354.615429] nvme nvme0: Abort status: 0x0
Sep 7 12:38:15 localhost kernel: [ 4386.600297] nvme nvme0: I/O 870 QID 4 timeout, reset controller
Sep 7 12:38:17 localhost kernel: [ 4388.050831] nvme nvme0: 7/0/0 default/read/poll queues
Sep 7 12:38:18 localhost kernel: [ 4389.437287] nvme0n1: AHDI p1 p2 p4
Sep 7 12:38:18 localhost kernel: [ 4389.437347] nvme0n1: p2 start 2240010287 is beyond EOD, truncated
Sep 7 12:38:18 localhost kernel: [ 4389.437350] nvme0n1: p4 start 2472081425 is beyond EOD, truncated

Combining the fio and syslog, the chain of events appears to be:
4332 seconds - drive IO stops
4353 seconds - syslog entry for timeout/abort
4386 seconds - syslog entry for timeout/reset
4387 seconds - read completes and fio exits without error

Thanks,
Nick
* Re: Best practices for handling drive failures during a run?

From: Damien Le Moal @ 2022-09-10 8:27 UTC
To: Nick Neumann; +Cc: fio

On 2022/09/09 1:24, Nick Neumann wrote:
> On Wed, Sep 7, 2022 at 4:01 PM Damien Le Moal
> <damien.lemoal@opensource.wdc.com> wrote:
>
>> Unless you are using continue_on_error=io (or "all"), fio will stop if it
>> sees an IO error, or at least the job that gets the IO error will stop.
>> The IO error will come from the kernel when your drive stops responding
>> (IO timeout and is failed and the drive is reset in that case).
>
> Thanks for this info.
>
>> Which behavior? That fio stops? You can try continue_on_error=none and
>> fio will not stop until it reaches the time or size limit, even if some
>> IOs fail.
>
> I would like fio to fail and exit when the I/O error happens. I was
> wondering about a way to setup a scenario where an artificial IO error
> will occur to make sure it does, if that makes sense.

You can use write-long, to "destroy" sectors: you will get errors when
attempting to read the affected sectors. But that is a really big hammer.
A simpler solution is to use dm-flakey to create "soft" IO errors.

>> The default IO timeout for the kernel is 30s. If your drive stops
>> responding for more than that, IOs will be aborted and failed (the user
>> sees an error) and drive reset.
>
> Hmm. I had 65 seconds between any I/O; it sounds like that would've
> been enough to fail things, but fio returned immediately after that 65
> second delayed I/O, and with no error.

The IO was likely retried.

> I also found the drive timeout error in syslog:
> Sep 7 12:37:43 localhost kernel: [ 4354.600211] nvme nvme0: I/O 870 QID 4 timeout, aborting
> Sep 7 12:37:43 localhost kernel: [ 4354.615429] nvme nvme0: Abort status: 0x0
> Sep 7 12:38:15 localhost kernel: [ 4386.600297] nvme nvme0: I/O 870 QID 4 timeout, reset controller
> Sep 7 12:38:17 localhost kernel: [ 4388.050831] nvme nvme0: 7/0/0 default/read/poll queues
> Sep 7 12:38:18 localhost kernel: [ 4389.437287] nvme0n1: AHDI p1 p2 p4
> Sep 7 12:38:18 localhost kernel: [ 4389.437347] nvme0n1: p2 start 2240010287 is beyond EOD, truncated
> Sep 7 12:38:18 localhost kernel: [ 4389.437350] nvme0n1: p4 start 2472081425 is beyond EOD, truncated
>
> Combining the fio and syslog, the chain of events appears to be:
> 4332 seconds - drive IO stops
> 4353 seconds - syslog entry for timeout/abort
> 4386 seconds - syslog entry for timeout/reset
> 4387 seconds - read completes and fio exits without error
>
> Thanks,
> Nick

--
Damien Le Moal
Western Digital Research
* Re: Best practices for handling drive failures during a run?

From: Damien Le Moal @ 2022-09-10 8:37 UTC
To: Nick Neumann; +Cc: fio

On 2022/09/10 17:27, Damien Le Moal wrote:
> On 2022/09/09 1:24, Nick Neumann wrote:
>> On Wed, Sep 7, 2022 at 4:01 PM Damien Le Moal
>> <damien.lemoal@opensource.wdc.com> wrote:
>>
>>> Unless you are using continue_on_error=io (or "all"), fio will stop if it
>>> sees an IO error, or at least the job that gets the IO error will stop.
>>> The IO error will come from the kernel when your drive stops responding
>>> (IO timeout and is failed and the drive is reset in that case).
>>
>> Thanks for this info.
>>
>>> Which behavior? That fio stops? You can try continue_on_error=none and
>>> fio will not stop until it reaches the time or size limit, even if some
>>> IOs fail.
>>
>> I would like fio to fail and exit when the I/O error happens. I was
>> wondering about a way to setup a scenario where an artificial IO error
>> will occur to make sure it does, if that makes sense.
>
> You can use write-long, to "destroy" sectors: you will get errors when
> attempting to read the affected sectors. But that is a really big hammer.

Note: write long is for ATA drives only. That does not apply to nvme.

> A simpler solution is to use dm-flakey to create "soft" IO errors.

And Vincent also pointed out null_blk error injection. dm-flakey can go
on top of any block device.

>>> The default IO timeout for the kernel is 30s. If your drive stops
>>> responding for more than that, IOs will be aborted and failed (the user
>>> sees an error) and drive reset.
>>
>> Hmm. I had 65 seconds between any I/O; it sounds like that would've
>> been enough to fail things, but fio returned immediately after that 65
>> second delayed I/O, and with no error.
>
> The IO was likely retried.
>
>> I also found the drive timeout error in syslog:
>> Sep 7 12:37:43 localhost kernel: [ 4354.600211] nvme nvme0: I/O 870 QID 4 timeout, aborting
>> Sep 7 12:37:43 localhost kernel: [ 4354.615429] nvme nvme0: Abort status: 0x0
>> Sep 7 12:38:15 localhost kernel: [ 4386.600297] nvme nvme0: I/O 870 QID 4 timeout, reset controller
>> Sep 7 12:38:17 localhost kernel: [ 4388.050831] nvme nvme0: 7/0/0 default/read/poll queues
>> Sep 7 12:38:18 localhost kernel: [ 4389.437287] nvme0n1: AHDI p1 p2 p4
>> Sep 7 12:38:18 localhost kernel: [ 4389.437347] nvme0n1: p2 start 2240010287 is beyond EOD, truncated
>> Sep 7 12:38:18 localhost kernel: [ 4389.437350] nvme0n1: p4 start 2472081425 is beyond EOD, truncated
>>
>> Combining the fio and syslog, the chain of events appears to be:
>> 4332 seconds - drive IO stops
>> 4353 seconds - syslog entry for timeout/abort
>> 4386 seconds - syslog entry for timeout/reset
>> 4387 seconds - read completes and fio exits without error
>>
>> Thanks,
>> Nick

--
Damien Le Moal
Western Digital Research
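
As a rough illustration of the dm-flakey approach suggested above (the
device path and the up/down intervals are placeholders; see the kernel's
Documentation/admin-guide/device-mapper/dm-flakey.rst for the full table
syntax), the mapping below passes IO through normally for 30 seconds and
then fails all IO for 5 seconds, repeating:

# Size of the backing test device in 512-byte sectors.
SECTORS=$(sudo blockdev --getsz /dev/nullb0)

# 0 <length> flakey <dev> <offset> <up interval> <down interval>
sudo dmsetup create flaky --table "0 $SECTORS flakey /dev/nullb0 0 30 5"

# Run fio against the mapper device instead of the raw device.
sudo fio --name=flaky-test --filename=/dev/mapper/flaky --rw=randread \
    --bs=4k --ioengine=libaio --direct=1 --runtime=60 --time_based

# Tear down when finished.
sudo dmsetup remove flaky
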
* Re: Best practices for handling drive failures during a run?

From: Nick Neumann @ 2022-09-22 23:11 UTC
To: Damien Le Moal; +Cc: fio

On Sat, Sep 10, 2022 at 3:28 AM Damien Le Moal
<damien.lemoal@opensource.wdc.com> wrote:
> You can use write-long, to "destroy" sectors: you will get errors when
> attempting to read the affected sectors. But that is a really big hammer.
> A simpler solution is to use dm-flakey to create "soft" IO errors.

Thank you for mentioning this - I'm not a linux veteran so I did not
know about these tools.

I tried dm-flakey, but when the device is down, the errors are
returned immediately. I also looked at dm-delay, and that actually
worked pretty well for getting fio to sit and wait on an I/O.

Unfortunately I have a hard time getting the delay to be "big". The
time it takes to add the delay rule appears to be a linear function of
the amount of delay, with a very big constant factor. A half second
delay takes 11 seconds to add, and a 5 second delay takes 112 seconds:

sudo time dmsetup create test9 --table "0 1024 delay /dev/nullb1 0 500 /dev/nullb1 0 0"
0.00user 0.00system 0:11.28elapsed
...
sudo time dmsetup create test10 --table "0 1024 delay /dev/nullb1 0 5000 /dev/nullb1 0 0"
0.00user 0.00system 1:52.70elapsed

And unfortunately something breaks at some point, as my attempt to do
a 70 second delay had not finished after 2 hours. I'm experimenting
right now to try to find a smaller but still big value that is useful
for testing the nvme timeout/retry defaults. I've seen code snippets
online though that set the delay to 100 seconds, so I'm at a loss why
the time to do it is growing so large on my system.
* Re: Best practices for handling drive failures during a run?

From: Bryan Gurney @ 2022-09-26 19:14 UTC
To: Nick Neumann; +Cc: fio

On Thu, Sep 22, 2022 at 7:11 PM Nick Neumann <nick@pcpartpicker.com> wrote:
>
> On Sat, Sep 10, 2022 at 3:28 AM Damien Le Moal
> <damien.lemoal@opensource.wdc.com> wrote:
> > You can use write-long, to "destroy" sectors: you will get errors when
> > attempting to read the affected sectors. But that is a really big hammer.
> > A simpler solution is to use dm-flakey to create "soft" IO errors.
>
> Thank you for mentioning this - I'm not a linux veteran so I did not
> know about these tools.
>
> I tried dm-flakey, but when the device is down, the errors are
> returned immediately. I also looked at dm-delay, and that actually
> worked pretty well for getting fio to sit and wait on an I/O.
>
> Unfortunately I have a hard time getting the delay to be "big". The
> time it takes to add the delay rule appears to be a linear function of
> the amount of delay, with a very big constant factor. A half second
> delay takes 11 seconds to add, and a 5 second delay takes 112 seconds:
>
> sudo time dmsetup create test9 --table "0 1024 delay /dev/nullb1 0 500 /dev/nullb1 0 0"
> 0.00user 0.00system 0:11.28elapsed
> ...
> sudo time dmsetup create test10 --table "0 1024 delay /dev/nullb1 0 5000 /dev/nullb1 0 0"
> 0.00user 0.00system 1:52.70elapsed
>
> And unfortunately something breaks at some point, as my attempt to do
> a 70 second delay had not finished after 2 hours. I'm experimenting
> right now to try to find a smaller but still big value that is useful
> for testing the nvme timeout/retry defaults. I've seen code snippets
> online though that set the delay to 100 seconds, so I'm at a loss why
> the time to do it is growing so large on my system.

Hi Nick,

If you're trying to create an error at an arbitrary location, at an
arbitrary time, you might be interested in using the dm-dust target.
The documentation in the admin-guide for dm-dust [1] has information on
the command interface that the target uses (via the "dmsetup message"
command) in order to set up a specific failure scenario for a test
device.

Thanks,

Bryan

[1] https://www.kernel.org/doc/html/v5.19/admin-guide/device-mapper/dm-dust.html
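
A minimal dm-dust sketch along the lines of the admin-guide Bryan links
(the device path and block numbers are placeholders): bad blocks are
managed with "dmsetup message", and they only start failing once the
target is enabled, so the failure can be switched on in the middle of a
fio run.

SECTORS=$(sudo blockdev --getsz /dev/nullb0)

# Stack a dust target with 512-byte blocks on top of the test device.
sudo dmsetup create dust1 --table "0 $SECTORS dust /dev/nullb0 0 512"

# Register a few bad blocks, then enable failures.
sudo dmsetup message dust1 0 addbadblock 60
sudo dmsetup message dust1 0 addbadblock 67
sudo dmsetup message dust1 0 enable

# Reads of the listed blocks through /dev/mapper/dust1 should now fail.
sudo dmsetup message dust1 0 countbadblocks
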
* RE: Best practices for handling drive failures during a run?

From: Vincent Fu @ 2022-09-08 14:59 UTC
To: Nick Neumann, fio

> -----Original Message-----
> From: Nick Neumann [mailto:nick@pcpartpicker.com]
> Sent: Wednesday, September 7, 2022 11:58 AM
> To: fio@vger.kernel.org
> Subject: Best practices for handling drive failures during a run?
>
> I was wondering if there were any recommendations/suggestions on
> handling drive failures during a fio run. I hit one yesterday with a
> 60 second mixed use test on an SSD. 51 seconds in, the drive basically
> stopped responding. (A separate program that periodically calls
> smartctl to get drive state also showed something was up, as data like
> temperature was missing.)
>
> At 107 seconds, a read completed, and fio exited.
>
> It made me wonder what would have happened if the test was not time
> limited - e.g., a full drive write. Would it have just hung, waiting
> forever? Or would the OS eventually get back to fio and tell it the
> submitted operations have failed and fio would exit?
>
> Any ideas on ways to test the behavior, or areas of the code to look at?

The null_blk device supports error injection via the badblocks configfs
variable, so you could use it for testing. There is a help guide for
setting up null_blk devices via configfs at
https://zonedstorage.io/docs/getting-started/nullblk

Vincent
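
Roughly, the configfs sequence for null_blk bad-block injection looks
like the following (the device number and sector range are arbitrary
examples; Nick posts his complete working setup later in the thread):

sudo modprobe null_blk nr_devices=0
sudo mkdir /sys/kernel/config/nullb/nullb0
echo 1 | sudo tee /sys/kernel/config/nullb/nullb0/memory_backed
echo "+1-100" | sudo tee /sys/kernel/config/nullb/nullb0/badblocks   # IO to sectors 1-100 will fail
echo 1 | sudo tee /sys/kernel/config/nullb/nullb0/power
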
[parent not found: <CADqNVTpvftJZaJ2AepPMcbaJJr7vYMaLdivCf85j8Rwrq_02Fw@mail.gmail.com>]
[parent not found: <CADqNVTpkMUVx2+tUXTRivNqsejbP_Sto90pihzm+L210=MC25A@mail.gmail.com>]
* RE: Best practices for handling drive failures during a run?

From: Vincent Fu @ 2022-09-10 3:56 UTC
To: Nick Neumann, fio

> -----Original Message-----
> From: Nick Neumann [mailto:nick@pcpartpicker.com]
> Sent: Friday, September 9, 2022 1:36 PM
> To: Vincent Fu <vincent.fu@samsung.com>
> Subject: Re: Best practices for handling drive failures during a run?
>
> On Thu, Sep 8, 2022 at 9:59 AM Vincent Fu <vincent.fu@samsung.com> wrote:
> > The null_blk device supports error injection via the badblocks configfs
> > variable. So you could use it for testing. There is a help guide for
> > setting up null_blk devices via configfs at
> > https://zonedstorage.io/docs/getting-started/nullblk
>
> So this was really nice to learn about and pretty easy to use.
> (Although I will say I saw all kinds of weird behavior, like the device
> saying it didn't support O_DIRECT, and other wacky behavior - I believe
> due to making config changes while the device was powered on.)
>
> With it, fio did return immediately after the error, return an error
> code, print error messages above the json output, and set error to 5
> in the json for the job.
>
> Unfortunately, the same did not happen with the drive hang/abort/reset
> I hit. Which must mean no I/O error was actually returned to fio.
> Checking the fio latency log, that last read reported a latency of
> 63.6 seconds.
>
> I'm guessing fio sat in wait_for_completion all of this time. For some
> reason the drive's behavior wasn't enough to cause an I/O error -
> perhaps it would have eventually.
>
> Any other thoughts on why the OS was willing to let this read go for
> so long without an I/O error? I verified
> /sys/module/nvme_core/parameters/io_timeout is 30, but
> /sys/module/nvme_core/parameters/max_retries is 5, so maybe that is
> the issue.
>
> Thanks,
> Nick

You could test your theory about max_retries by creating an NVMe fabrics
loopback device backed by null_blk with error injection. Then try to
access one of the bad blocks via the nvme device and see if the delay
before fio sees the error depends on io_timeout and max_retries in the
way that you expect.

I'm cc'ing the list on this reply in case anyone else wants to chime in.

Vincent
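
For reference, the knobs Nick mentions can be inspected as shown below.
The suggested values are only examples, the parameters apply to every
NVMe device on the system, and depending on kernel version a runtime
write to io_timeout may only affect devices probed afterwards:

cat /sys/module/nvme_core/parameters/io_timeout    # seconds, default 30
cat /sys/module/nvme_core/parameters/max_retries   # default 5

# For a dedicated test box they can be shortened via module options, e.g.
# nvme_core.io_timeout=10 nvme_core.max_retries=1 on the kernel command
# line (or in a modprobe.d conf file), followed by a reboot or module
# reload.
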
* Re: Best practices for handling drive failures during a run?

From: Nick Neumann @ 2022-09-10 4:03 UTC
To: Vincent Fu; +Cc: fio

On Fri, Sep 9, 2022 at 10:56 PM Vincent Fu <vincent.fu@samsung.com> wrote:
> You could test your theory about max_retries by creating an NVMe fabrics
> loopback device backed by null_blk with error injection. Then try to
> access one of the bad blocks via the nvme device and see if the delay
> before fio sees the error depends on io_timeout and max_retries in the
> way that you expect.

Oooh, that sounds great. Thanks for the suggestion. I'll get to it
Monday if I don't find some time this weekend.

Coincidentally, one of the things I found while googling was someone
using NVMe fabrics complaining that nvme_core/io_timeout and
nvme_core/max_retries were not being honored. It was from 2019 but
seemed relevant:
https://lore.kernel.org/all/EA2BFA4D4BAD49629F533A98F74DCE42@alyakaslap/T/#m26b5c91ec59de5159961a26a6cb0340c32a05ec9

I'll report back with what I see.

Thanks,
Nick
* Re: Best practices for handling drive failures during a run?

From: Nick Neumann @ 2022-09-22 20:29 UTC
To: Vincent Fu; +Cc: fio

On Fri, Sep 9, 2022 at 10:56 PM Vincent Fu <vincent.fu@samsung.com> wrote:
> You could test your theory about max_retries by creating an NVMe fabrics
> loopback device backed by null_blk with error injection. Then try to
> access one of the bad blocks via the nvme device and see if the delay
> before fio sees the error depends on io_timeout and max_retries in the
> way that you expect.

I finally got a chance to try this. I had to learn enough about nvme
fabrics and combine it with the null_blk bad blocks as before. I think
I'm doing everything right, but I'm wondering if I missed something,
because the behavior when writing to the nvme fabrics device backed by
the null_blk device is the same as when writing to the null_blk device
directly - an immediate error and termination of fio.

I wonder if this should really surprise me, though, since the underlying
device doesn't experience a timeout and its error is immediately
propagated to the client over the nvme fabric (apologies if I'm using
any terminology wrong).

This was my basic setup:

sudo modprobe null_blk nr_devices=0
sudo mkdir /sys/kernel/config/nullb/nullb0
echo 1 | sudo tee -a /sys/kernel/config/nullb/nullb0/memory_backed
echo "+1-100" | sudo tee -a /sys/kernel/config/nullb/nullb0/badblocks
echo 1 | sudo tee -a /sys/kernel/config/nullb/nullb0/power

# First fio run directly on the null device returns an error immediately
sudo fio --filename=/dev/nullb0 --name=job --ioengine=libaio --direct=1 --size=1M --rw=rw --rwmixwrite=100 --bs=128K

sudo modprobe nvme_tcp
sudo modprobe nvmet-tcp
sudo mkdir /sys/kernel/config/nvmet/subsystems/nvmet-test
cd /sys/kernel/config/nvmet/subsystems/nvmet-test
echo 1 | sudo tee -a attr_allow_any_host
sudo mkdir namespaces/1
cd namespaces/1
echo -n /dev/nullb0 | sudo tee -a device_path
echo 1 | sudo tee -a enable

sudo mkdir /sys/kernel/config/nvmet/ports/1
echo 127.0.0.1 | sudo tee -a /sys/kernel/config/nvmet/ports/1/addr_traddr
echo tcp | sudo tee -a /sys/kernel/config/nvmet/ports/1/addr_trtype
echo 4420 | sudo tee -a /sys/kernel/config/nvmet/ports/1/addr_trsvcid
echo ipv4 | sudo tee -a /sys/kernel/config/nvmet/ports/1/addr_adrfam
sudo ln -s /sys/kernel/config/nvmet/subsystems/nvmet-test /sys/kernel/config/nvmet/ports/1/subsystems/nvmet-test
sudo dmesg | grep nvmet_tcp

sudo modprobe nvme
sudo nvme discover -t tcp -a 127.0.0.1 -s 4420
sudo nvme connect -t tcp -n nvmet-test -a 127.0.0.1 -s 4420
sudo nvme list
cat /proc/partitions | grep nvme

# This one also returns an error immediately
sudo fio --filename=/dev/nvme0n1 --name=job --ioengine=libaio --direct=1 --size=1M --rw=rw --rwmixwrite=100 --bs=128K