* Best practices for handling drive failures during a run?

From: Nick Neumann @ 2022-09-07 15:58 UTC
To: fio

I was wondering if there were any recommendations/suggestions on
handling drive failures during a fio run. I hit one yesterday with a
60 second mixed use test on an SSD. 51 seconds in, the drive basically
stopped responding. (A separate program that periodically calls
smartctl to get drive state also showed something was up, as data like
temperature was missing.)

At 107 seconds, a read completed, and fio exited.

It made me wonder what would have happened if the test was not time
limited - e.g., a full drive write. Would it have just hung, waiting
forever? Or would the OS eventually get back to fio and tell it the
submitted operations have failed and fio would exit?

Any ideas on ways to test the behavior, or areas of the code to look at?

I'm basically looking for input on how to make sure fio does not hang
in such situations. And even better would be if I could get fio to
return an error if it does happen - I could see the controls for
reporting error being configurable - e.g., if an operation doesn't
return for N seconds, stop the job and return an error. I'm happy to
work on implementing stuff to help with this, and wanted to see where
things currently are at and what others thought about the general
issue.

Thanks,
Nick
* Re: Best practices for handling drive failures during a run?

From: Damien Le Moal @ 2022-09-07 21:01 UTC
To: Nick Neumann, fio

On 9/8/22 00:58, Nick Neumann wrote:
> I was wondering if there were any recommendations/suggestions on
> handling drive failures during a fio run. I hit one yesterday with a
> 60 second mixed use test on an SSD. 51 seconds in, the drive basically
> stopped responding. (A separate program that periodically calls
> smartctl to get drive state also showed something was up, as data like
> temperature was missing.)
>
> At 107 seconds, a read completed, and fio exited.
>
> It made me wonder what would have happened if the test was not time
> limited - e.g., a full drive write. Would it have just hung, waiting
> forever? Or would the OS eventually get back to fio and tell it the
> submitted operations have failed and fio would exit?

Unless you are using continue_on_error=io (or "all"), fio will stop if it
sees an IO error, or at least the job that gets the IO error will stop.
The IO error will come from the kernel when your drive stops responding
(IO timeout and is failed and the drive is reset in that case).

> Any ideas on ways to test the behavior, or areas of the code to look at?

Which behavior? That fio stops? You can try continue_on_error=none and
fio will not stop until it reaches the time or size limit, even if some
IOs fail.

> I'm basically looking for input on how to make sure fio does not hang
> in such situations. And even better would be if I could get fio to
> return an error if it does happen - I could see the controls for
> reporting error being configurable - e.g., if an operation doesn't
> return for N seconds, stop the job and return an error. I'm happy to
> work on implementing stuff to help with this, and wanted to see where
> things currently are at and what others thought about the general
> issue.

The default IO timeout for the kernel is 30s. If your drive stops
responding for more than that, IOs will be aborted and failed (the user
sees an error) and drive reset.

> Thanks,
> Nick

--
Damien Le Moal
Western Digital Research
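
For readers following the thread, a minimal sketch of the two behaviors
Damien describes. The device path and job options are placeholders, not
taken from the thread; continue_on_error and error_dump are documented
fio options, but check the HOWTO for your fio version:

# Default error handling: the first IO error stops the job and fio exits
# with a non-zero status.
sudo fio --name=errtest --filename=/dev/sdX --rw=randread --bs=4k \
    --ioengine=libaio --iodepth=16 --direct=1 --runtime=60 --time_based

# Keep running past IO errors and report them at the end instead.
sudo fio --name=errtest --filename=/dev/sdX --rw=randread --bs=4k \
    --ioengine=libaio --iodepth=16 --direct=1 --runtime=60 --time_based \
    --continue_on_error=all --error_dump=1
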
* Re: Best practices for handling drive failures during a run?

From: Nick Neumann @ 2022-09-08 16:24 UTC
To: Damien Le Moal; +Cc: fio

On Wed, Sep 7, 2022 at 4:01 PM Damien Le Moal
<damien.lemoal@opensource.wdc.com> wrote:
> Unless you are using continue_on_error=io (or "all"), fio will stop if it
> sees an IO error, or at least the job that gets the IO error will stop.
> The IO error will come from the kernel when your drive stops responding
> (IO timeout and is failed and the drive is reset in that case).

Thanks for this info.

> Which behavior? That fio stops? You can try continue_on_error=none and
> fio will not stop until it reaches the time or size limit, even if some
> IOs fail.

I would like fio to fail and exit when the I/O error happens. I was
wondering about a way to setup a scenario where an artificial IO error
will occur to make sure it does, if that makes sense.

> The default IO timeout for the kernel is 30s. If your drive stops
> responding for more than that, IOs will be aborted and failed (the user
> sees an error) and drive reset.

Hmm. I had 65 seconds between any I/O; it sounds like that would've
been enough to fail things, but fio returned immediately after that 65
second delayed I/O, and with no error.

I also found the drive timeout error in syslog:

Sep 7 12:37:43 localhost kernel: [ 4354.600211] nvme nvme0: I/O 870 QID 4 timeout, aborting
Sep 7 12:37:43 localhost kernel: [ 4354.615429] nvme nvme0: Abort status: 0x0
Sep 7 12:38:15 localhost kernel: [ 4386.600297] nvme nvme0: I/O 870 QID 4 timeout, reset controller
Sep 7 12:38:17 localhost kernel: [ 4388.050831] nvme nvme0: 7/0/0 default/read/poll queues
Sep 7 12:38:18 localhost kernel: [ 4389.437287] nvme0n1: AHDI p1 p2 p4
Sep 7 12:38:18 localhost kernel: [ 4389.437347] nvme0n1: p2 start 2240010287 is beyond EOD, truncated
Sep 7 12:38:18 localhost kernel: [ 4389.437350] nvme0n1: p4 start 2472081425 is beyond EOD, truncated

Combining the fio and syslog, the chain of events appears to be:
4332 seconds - drive IO stops
4353 seconds - syslog entry for timeout/abort
4386 seconds - syslog entry for timeout/reset
4387 seconds - read completes and fio exits without error

Thanks,
Nick
* Re: Best practices for handling drive failures during a run?

From: Damien Le Moal @ 2022-09-10 8:27 UTC
To: Nick Neumann; +Cc: fio

On 2022/09/09 1:24, Nick Neumann wrote:
> On Wed, Sep 7, 2022 at 4:01 PM Damien Le Moal
> <damien.lemoal@opensource.wdc.com> wrote:
>
>> Unless you are using continue_on_error=io (or "all"), fio will stop if it
>> sees an IO error, or at least the job that gets the IO error will stop.
>> The IO error will come from the kernel when your drive stops responding
>> (IO timeout and is failed and the drive is reset in that case).
>
> Thanks for this info.
>
>> Which behavior? That fio stops? You can try continue_on_error=none and
>> fio will not stop until it reaches the time or size limit, even if some
>> IOs fail.
>
> I would like fio to fail and exit when the I/O error happens. I was
> wondering about a way to setup a scenario where an artificial IO error
> will occur to make sure it does, if that makes sense.

You can use write-long, to "destroy" sectors: you will get errors when
attempting to read the affected sectors. But that is a really big hammer.
A simpler solution is to use dm-flakey to create "soft" IO errors.

>> The default IO timeout for the kernel is 30s. If your drive stops
>> responding for more than that, IOs will be aborted and failed (the user
>> sees an error) and drive reset.
>
> Hmm. I had 65 seconds between any I/O; it sounds like that would've
> been enough to fail things, but fio returned immediately after that 65
> second delayed I/O, and with no error.

The IO was likely retried.

> I also found the drive timeout error in syslog:
> Sep 7 12:37:43 localhost kernel: [ 4354.600211] nvme nvme0: I/O 870 QID 4 timeout, aborting
> Sep 7 12:37:43 localhost kernel: [ 4354.615429] nvme nvme0: Abort status: 0x0
> Sep 7 12:38:15 localhost kernel: [ 4386.600297] nvme nvme0: I/O 870 QID 4 timeout, reset controller
> Sep 7 12:38:17 localhost kernel: [ 4388.050831] nvme nvme0: 7/0/0 default/read/poll queues
> Sep 7 12:38:18 localhost kernel: [ 4389.437287] nvme0n1: AHDI p1 p2 p4
> Sep 7 12:38:18 localhost kernel: [ 4389.437347] nvme0n1: p2 start 2240010287 is beyond EOD, truncated
> Sep 7 12:38:18 localhost kernel: [ 4389.437350] nvme0n1: p4 start 2472081425 is beyond EOD, truncated
>
> Combining the fio and syslog, the chain of events appears to be:
> 4332 seconds - drive IO stops
> 4353 seconds - syslog entry for timeout/abort
> 4386 seconds - syslog entry for timeout/reset
> 4387 seconds - read completes and fio exits without error
>
> Thanks,
> Nick

--
Damien Le Moal
Western Digital Research
* Re: Best practices for handling drive failures during a run?

From: Damien Le Moal @ 2022-09-10 8:37 UTC
To: Nick Neumann; +Cc: fio

On 2022/09/10 17:27, Damien Le Moal wrote:
> On 2022/09/09 1:24, Nick Neumann wrote:
>> On Wed, Sep 7, 2022 at 4:01 PM Damien Le Moal
>> <damien.lemoal@opensource.wdc.com> wrote:
>>
>>> Unless you are using continue_on_error=io (or "all"), fio will stop if it
>>> sees an IO error, or at least the job that gets the IO error will stop.
>>> The IO error will come from the kernel when your drive stops responding
>>> (IO timeout and is failed and the drive is reset in that case).
>>
>> Thanks for this info.
>>
>>> Which behavior? That fio stops? You can try continue_on_error=none and
>>> fio will not stop until it reaches the time or size limit, even if some
>>> IOs fail.
>>
>> I would like fio to fail and exit when the I/O error happens. I was
>> wondering about a way to setup a scenario where an artificial IO error
>> will occur to make sure it does, if that makes sense.
>
> You can use write-long, to "destroy" sectors: you will get errors when
> attempting to read the affected sectors. But that is a really big hammer.

Note: write long is for ATA drives only. That does not apply to nvme.

> A simpler solution is to use dm-flakey to create "soft" IO errors.

And Vincent also pointed out null_blk error injection. dm-flakey can go
on top of any block device.

>>> The default IO timeout for the kernel is 30s. If your drive stops
>>> responding for more than that, IOs will be aborted and failed (the user
>>> sees an error) and drive reset.
>>
>> Hmm. I had 65 seconds between any I/O; it sounds like that would've
>> been enough to fail things, but fio returned immediately after that 65
>> second delayed I/O, and with no error.
>
> The IO was likely retried.
>
>> I also found the drive timeout error in syslog:
>> Sep 7 12:37:43 localhost kernel: [ 4354.600211] nvme nvme0: I/O 870 QID 4 timeout, aborting
>> Sep 7 12:37:43 localhost kernel: [ 4354.615429] nvme nvme0: Abort status: 0x0
>> Sep 7 12:38:15 localhost kernel: [ 4386.600297] nvme nvme0: I/O 870 QID 4 timeout, reset controller
>> Sep 7 12:38:17 localhost kernel: [ 4388.050831] nvme nvme0: 7/0/0 default/read/poll queues
>> Sep 7 12:38:18 localhost kernel: [ 4389.437287] nvme0n1: AHDI p1 p2 p4
>> Sep 7 12:38:18 localhost kernel: [ 4389.437347] nvme0n1: p2 start 2240010287 is beyond EOD, truncated
>> Sep 7 12:38:18 localhost kernel: [ 4389.437350] nvme0n1: p4 start 2472081425 is beyond EOD, truncated
>>
>> Combining the fio and syslog, the chain of events appears to be:
>> 4332 seconds - drive IO stops
>> 4353 seconds - syslog entry for timeout/abort
>> 4386 seconds - syslog entry for timeout/reset
>> 4387 seconds - read completes and fio exits without error
>>
>> Thanks,
>> Nick

--
Damien Le Moal
Western Digital Research
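
As a rough illustration of the dm-flakey approach suggested above (the
device path and the up/down intervals are placeholders; see the kernel's
Documentation/admin-guide/device-mapper/dm-flakey.rst for the full table
syntax), the mapping below passes IO through normally for 30 seconds and
then fails all IO for 5 seconds, repeating:

# Size of the backing test device in 512-byte sectors.
SECTORS=$(sudo blockdev --getsz /dev/nullb0)

# 0 <length> flakey <dev> <offset> <up interval> <down interval>
sudo dmsetup create flaky --table "0 $SECTORS flakey /dev/nullb0 0 30 5"

# Run fio against the mapper device instead of the raw device.
sudo fio --name=flaky-test --filename=/dev/mapper/flaky --rw=randread \
    --bs=4k --ioengine=libaio --direct=1 --runtime=60 --time_based

# Tear down when finished.
sudo dmsetup remove flaky
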
* Re: Best practices for handling drive failures during a run?

From: Nick Neumann @ 2022-09-22 23:11 UTC
To: Damien Le Moal; +Cc: fio

On Sat, Sep 10, 2022 at 3:28 AM Damien Le Moal
<damien.lemoal@opensource.wdc.com> wrote:
> You can use write-long, to "destroy" sectors: you will get errors when
> attempting to read the affected sectors. But that is a really big hammer.
> A simpler solution is to use dm-flakey to create "soft" IO errors.

Thank you for mentioning this - I'm not a linux veteran so I did not
know about these tools.

I tried dm-flakey, but when the device is down, the errors are
returned immediately. I also looked at dm-delay, and that actually
worked pretty well for getting fio to sit and wait on an I/O.

Unfortunately I have a hard time getting the delay to be "big". The
time it takes to add the delay rule appears to be a linear function of
the amount of delay, with a very big constant factor. A half second
delay takes 11 seconds to add, and a 5 second delay takes 112 seconds:

sudo time dmsetup create test9 --table "0 1024 delay /dev/nullb1 0 500 /dev/nullb1 0 0"
0.00user 0.00system 0:11.28elapsed
...
sudo time dmsetup create test10 --table "0 1024 delay /dev/nullb1 0 5000 /dev/nullb1 0 0"
0.00user 0.00system 1:52.70elapsed

And unfortunately something breaks at some point, as my attempt to do
a 70 second delay had not finished after 2 hours. I'm experimenting
right now to try to find a smaller but still big value that is useful
for testing the nvme timeout/retry defaults. I've seen code snippets
online though that set the delay to 100 seconds, so I'm at a loss why
the time to do it is growing so large on my system.
* Re: Best practices for handling drive failures during a run?

From: Bryan Gurney @ 2022-09-26 19:14 UTC
To: Nick Neumann; +Cc: fio

On Thu, Sep 22, 2022 at 7:11 PM Nick Neumann <nick@pcpartpicker.com> wrote:
>
> On Sat, Sep 10, 2022 at 3:28 AM Damien Le Moal
> <damien.lemoal@opensource.wdc.com> wrote:
> > You can use write-long, to "destroy" sectors: you will get errors when
> > attempting to read the affected sectors. But that is a really big hammer.
> > A simpler solution is to use dm-flakey to create "soft" IO errors.
>
> Thank you for mentioning this - I'm not a linux veteran so I did not
> know about these tools.
>
> I tried dm-flakey, but when the device is down, the errors are
> returned immediately. I also looked at dm-delay, and that actually
> worked pretty well for getting fio to sit and wait on an I/O.
>
> Unfortunately I have a hard time getting the delay to be "big". The
> time it takes to add the delay rule appears to be a linear function of
> the amount of delay, with a very big constant factor. A half second
> delay takes 11 seconds to add, and a 5 second delay takes 112 seconds:
>
> sudo time dmsetup create test9 --table "0 1024 delay /dev/nullb1 0 500 /dev/nullb1 0 0"
> 0.00user 0.00system 0:11.28elapsed
> ...
> sudo time dmsetup create test10 --table "0 1024 delay /dev/nullb1 0 5000 /dev/nullb1 0 0"
> 0.00user 0.00system 1:52.70elapsed
>
> And unfortunately something breaks at some point, as my attempt to do
> a 70 second delay had not finished after 2 hours. I'm experimenting
> right now to try to find a smaller but still big value that is useful
> for testing the nvme timeout/retry defaults. I've seen code snippets
> online though that set the delay to 100 seconds, so I'm at a loss why
> the time to do it is growing so large on my system.

Hi Nick,

If you're trying to create an error at an arbitrary location, at an
arbitrary time, you might be interested in using the dm-dust target.
The documentation in the admin-guide for dm-dust [1] has information on
the command interface that the target uses (via the "dmsetup message"
command) in order to set up a specific failure scenario for a test
device.

Thanks,

Bryan

[1] https://www.kernel.org/doc/html/v5.19/admin-guide/device-mapper/dm-dust.html
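
A minimal dm-dust sketch along the lines of the admin-guide Bryan links
(the device path and block numbers are placeholders): bad blocks are
managed with "dmsetup message", and they only start failing once the
target is enabled, so the failure can be switched on in the middle of a
fio run.

SECTORS=$(sudo blockdev --getsz /dev/nullb0)

# Stack a dust target with 512-byte blocks on top of the test device.
sudo dmsetup create dust1 --table "0 $SECTORS dust /dev/nullb0 0 512"

# Register a few bad blocks, then enable failures.
sudo dmsetup message dust1 0 addbadblock 60
sudo dmsetup message dust1 0 addbadblock 67
sudo dmsetup message dust1 0 enable

# Reads of the listed blocks through /dev/mapper/dust1 should now fail.
sudo dmsetup message dust1 0 countbadblocks
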
* RE: Best practices for handling drive failures during a run?

From: Vincent Fu @ 2022-09-08 14:59 UTC
To: Nick Neumann, fio

> -----Original Message-----
> From: Nick Neumann [mailto:nick@pcpartpicker.com]
> Sent: Wednesday, September 7, 2022 11:58 AM
> To: fio@vger.kernel.org
> Subject: Best practices for handling drive failures during a run?
>
> I was wondering if there were any recommendations/suggestions on
> handling drive failures during a fio run. I hit one yesterday with a
> 60 second mixed use test on an SSD. 51 seconds in, the drive basically
> stopped responding. (A separate program that periodically calls
> smartctl to get drive state also showed something was up, as data like
> temperature was missing.)
>
> At 107 seconds, a read completed, and fio exited.
>
> It made me wonder what would have happened if the test was not time
> limited - e.g., a full drive write. Would it have just hung, waiting
> forever? Or would the OS eventually get back to fio and tell it the
> submitted operations have failed and fio would exit?
>
> Any ideas on ways to test the behavior, or areas of the code to look at?

The null_blk device supports error injection via the badblocks configfs
variable, so you could use it for testing. There is a help guide for
setting up null_blk devices via configfs at
https://zonedstorage.io/docs/getting-started/nullblk

Vincent
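
Roughly, the configfs sequence for null_blk bad-block injection looks
like the following (the device number and sector range are arbitrary
examples; Nick posts his complete working setup later in the thread):

sudo modprobe null_blk nr_devices=0
sudo mkdir /sys/kernel/config/nullb/nullb0
echo 1 | sudo tee /sys/kernel/config/nullb/nullb0/memory_backed
echo "+1-100" | sudo tee /sys/kernel/config/nullb/nullb0/badblocks   # IO to sectors 1-100 will fail
echo 1 | sudo tee /sys/kernel/config/nullb/nullb0/power
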
[parent not found: <CADqNVTpvftJZaJ2AepPMcbaJJr7vYMaLdivCf85j8Rwrq_02Fw@mail.gmail.com>]
[parent not found: <CADqNVTpkMUVx2+tUXTRivNqsejbP_Sto90pihzm+L210=MC25A@mail.gmail.com>]
* RE: Best practices for handling drive failures during a run?

From: Vincent Fu @ 2022-09-10 3:56 UTC
To: Nick Neumann, fio

> -----Original Message-----
> From: Nick Neumann [mailto:nick@pcpartpicker.com]
> Sent: Friday, September 9, 2022 1:36 PM
> To: Vincent Fu <vincent.fu@samsung.com>
> Subject: Re: Best practices for handling drive failures during a run?
>
> On Thu, Sep 8, 2022 at 9:59 AM Vincent Fu <vincent.fu@samsung.com> wrote:
> > The null_blk device supports error injection via the badblocks configfs
> > variable. So you could use it for testing. There is a help guide for
> > setting up null_blk devices via configfs at
> > https://zonedstorage.io/docs/getting-started/nullblk
>
> So this was really nice to learn about and pretty easy to use.
> (Although I will say I saw all kinds of weird behavior, like the device
> saying it didn't support O_DIRECT, and other wacky behavior - I believe
> due to making config changes while the device was powered on.)
>
> With it, fio did return immediately after the error, return an error
> code, print error messages above the json output, and set error to 5
> in the json for the job.
>
> Unfortunately, the same did not happen with the drive hang/abort/reset
> I hit. Which must mean no I/O error was actually returned to fio.
> Checking the fio latency log, that last read reported a latency of
> 63.6 seconds.
>
> I'm guessing fio sat in wait_for_completion all of this time. For some
> reason the drive's behavior wasn't enough to cause an I/O error -
> perhaps it would have eventually.
>
> Any other thoughts on why the OS was willing to let this read go for
> so long without an I/O error? I verified
> /sys/module/nvme_core/parameters/io_timeout is 30, but
> /sys/module/nvme_core/parameters/max_retries is 5, so maybe that is
> the issue.
>
> Thanks,
> Nick

You could test your theory about max_retries by creating an NVMe fabrics
loopback device backed by null_blk with error injection. Then try to
access one of the bad blocks via the nvme device and see if the delay
before fio sees the error depends on io_timeout and max_retries in the
way that you expect.

I'm cc'ing the list on this reply in case anyone else wants to chime in.

Vincent
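
For reference, the knobs Nick mentions can be inspected as shown below.
The suggested values are only examples, the parameters apply to every
NVMe device on the system, and depending on kernel version a runtime
write to io_timeout may only affect devices probed afterwards:

cat /sys/module/nvme_core/parameters/io_timeout    # seconds, default 30
cat /sys/module/nvme_core/parameters/max_retries   # default 5

# For a dedicated test box they can be shortened via module options, e.g.
# nvme_core.io_timeout=10 nvme_core.max_retries=1 on the kernel command
# line (or in a modprobe.d conf file), followed by a reboot or module
# reload.
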
* Re: Best practices for handling drive failures during a run?

From: Nick Neumann @ 2022-09-10 4:03 UTC
To: Vincent Fu; +Cc: fio

On Fri, Sep 9, 2022 at 10:56 PM Vincent Fu <vincent.fu@samsung.com> wrote:
> You could test your theory about max_retries by creating an NVMe fabrics
> loopback device backed by null_blk with error injection. Then try to
> access one of the bad blocks via the nvme device and see if the delay
> before fio sees the error depends on io_timeout and max_retries in the
> way that you expect.

Oooh, that sounds great. Thanks for the suggestion. I'll get to it
Monday if I don't find some time this weekend.

Coincidentally, one of the things I found while googling was someone
using NVMe fabrics complaining that nvme_core/io_timeout and
nvme_core/max_retries were not being honored. It was from 2019 but
seemed relevant:
https://lore.kernel.org/all/EA2BFA4D4BAD49629F533A98F74DCE42@alyakaslap/T/#m26b5c91ec59de5159961a26a6cb0340c32a05ec9

I'll report back with what I see.

Thanks,
Nick
* Re: Best practices for handling drive failures during a run?

From: Nick Neumann @ 2022-09-22 20:29 UTC
To: Vincent Fu; +Cc: fio

On Fri, Sep 9, 2022 at 10:56 PM Vincent Fu <vincent.fu@samsung.com> wrote:
> You could test your theory about max_retries by creating an NVMe fabrics
> loopback device backed by null_blk with error injection. Then try to
> access one of the bad blocks via the nvme device and see if the delay
> before fio sees the error depends on io_timeout and max_retries in the
> way that you expect.

I finally got a chance to try this. I had to learn enough about nvme
fabrics and combine it with the null_blk bad blocks as before. I think
I'm doing everything right, but I'm wondering if I missed something,
because the behavior when writing to the nvme fabrics device backed by
the null_blk device is the same as when writing to the null_blk device
directly - an immediate error and termination of fio.

I wonder if this should really surprise me, though, since the underlying
device doesn't experience a timeout and its error is immediately
propagated to the client over the nvme fabric (apologies if I'm using
any terminology wrong).

This was my basic setup:

sudo modprobe null_blk nr_devices=0
sudo mkdir /sys/kernel/config/nullb/nullb0
echo 1 | sudo tee -a /sys/kernel/config/nullb/nullb0/memory_backed
echo "+1-100" | sudo tee -a /sys/kernel/config/nullb/nullb0/badblocks
echo 1 | sudo tee -a /sys/kernel/config/nullb/nullb0/power

# First fio run directly on the null device returns an error immediately
sudo fio --filename=/dev/nullb0 --name=job --ioengine=libaio --direct=1 --size=1M --rw=rw --rwmixwrite=100 --bs=128K

sudo modprobe nvme_tcp
sudo modprobe nvmet-tcp
sudo mkdir /sys/kernel/config/nvmet/subsystems/nvmet-test
cd /sys/kernel/config/nvmet/subsystems/nvmet-test
echo 1 | sudo tee -a attr_allow_any_host
sudo mkdir namespaces/1
cd namespaces/1
echo -n /dev/nullb0 | sudo tee -a device_path
echo 1 | sudo tee -a enable

sudo mkdir /sys/kernel/config/nvmet/ports/1
echo 127.0.0.1 | sudo tee -a /sys/kernel/config/nvmet/ports/1/addr_traddr
echo tcp | sudo tee -a /sys/kernel/config/nvmet/ports/1/addr_trtype
echo 4420 | sudo tee -a /sys/kernel/config/nvmet/ports/1/addr_trsvcid
echo ipv4 | sudo tee -a /sys/kernel/config/nvmet/ports/1/addr_adrfam
sudo ln -s /sys/kernel/config/nvmet/subsystems/nvmet-test /sys/kernel/config/nvmet/ports/1/subsystems/nvmet-test
sudo dmesg | grep nvmet_tcp

sudo modprobe nvme
sudo nvme discover -t tcp -a 127.0.0.1 -s 4420
sudo nvme connect -t tcp -n nvmet-test -a 127.0.0.1 -s 4420
sudo nvme list
cat /proc/partitions | grep nvme

# This one also returns an error immediately
sudo fio --filename=/dev/nvme0n1 --name=job --ioengine=libaio --direct=1 --size=1M --rw=rw --rwmixwrite=100 --bs=128K