* during fio scenario with verify=meta some jobs fails
From: Samuel Shapiro @ 2015-04-08 10:37 UTC
To: fio

Hi All,

I recently started using fio for NAS FS tests, and it's a really great
tool, both for loading the system and for finding various protocol bugs.
Thank You :)

One of the scenarios I'm using is "fsx", slightly modified to run with
16 jobs:

; This job file works pretty similarly to running fsx-linux
; with -r 4096 -w 4096 -Z -N 500000
[global]
verify=meta
verify_dump=1

[fsx-file-job1]
directory=/mnt/FIO-some-mountpoint-to-NAS-01
ioengine=libaio
iodepth=256
rw=randrw
size=256k
bs=4k
norandommap
direct=1
loops=500000

[fsx-file-job2]
directory=/mnt/FIO-some-mountpoint-to-NAS-02
ioengine=libaio
iodepth=256
rw=randrw
size=256k
bs=4k
norandommap
direct=1
loops=500000

.
.
.

[fsx-file-job16]
directory=/mnt/FIO-some-mountpoint-to-NAS-03
etc....

After several minutes of running the scenario, jobs start to fail on
"verify meta", generating ".state" files.
When I tried to run the same scenario with verify=md5, there were no
more verify failures reported.

So I have a couple of questions:
1. What is the possible reason verify=meta can fail?
2. Should I suspect my File System of data corruption, or is there some
   known fio issue?
3. How can I parse the ".state" file so I could compare its data to the
   original file on storage?

Thank You,
Samuel

* Re: during fio scenario with verify=meta some jobs fails
From: Sitsofe Wheeler @ 2015-04-10 18:29 UTC
To: Samuel Shapiro; +Cc: fio

On 8 April 2015 at 11:37, Samuel Shapiro <samuel.sh79@gmail.com> wrote:
> Hi All,
>
> I'd recently started using fio for NAS FS tests, and it's really great
> tool, as for loading system,
> so for finding various protocols bugs. Thank You :)
> One of the scenarios, I'm using is "fsx", that is slightly modified to
> run with 16 jobs:
>
> ; This job file works pretty works similarly to running fsx-linux
> ; with -r 4096 -w 4096 -Z -N 500000
> [global]
> verify=meta
> verify_dump=1
>
> [fsx-file-job1]
> directory=/mnt/FIO-some-mountpoint-to-NAS-01
> ioengine=libaio
> iodepth=256
> rw=randrw
> size=256k
> bs=4k
> norandommap

^^^ This allows the same block to be overwritten multiple times. Are you
sure this is what you want?

-- 
Sitsofe | http://sucs.org/~sits/

* Re: during fio scenario with verify=meta some jobs fails
From: Samuel Shapiro @ 2015-04-11 3:04 UTC
To: Sitsofe Wheeler; +Cc: fio

Well, I guess it could be a good test for data integrity, so why not? :)

* Re: during fio scenario with verify=meta some jobs fails
From: Sitsofe Wheeler @ 2015-04-11 6:10 UTC
To: Samuel Shapiro; +Cc: fio

On 11 April 2015 at 04:04, Samuel Shapiro <samuel.sh79@gmail.com> wrote:
> Well i guess it could be a good test for data integrity, so why not? :)
>
> On Fri, Apr 10, 2015 at 9:29 PM, Sitsofe Wheeler <sitsofe@gmail.com> wrote:
>> On 8 April 2015 at 11:37, Samuel Shapiro <samuel.sh79@gmail.com> wrote:
>>> Hi All,
>>>
>>> [fsx-file-job1]
>>> directory=/mnt/FIO-some-mountpoint-to-NAS-01
>>> ioengine=libaio
>>> iodepth=256
>>> rw=randrw
>>> size=256k
>>> bs=4k
>>> norandommap
>>
>> ^^^ This allows the same block to be overwritten multiple times. Are
>> you sure this is what you want?

I could be wide of the mark but my concern would be that meta tries to
verify block number (I guess this doesn't need to be saved and can be
calculated by just following a sequence). If I've overwritten a block
how do I verify that the block number of the earlier write was correct -
the block will in fact contain the block number of the last write to
it...

-- 
Sitsofe | http://sucs.org/~sits/

* Re: during fio scenario with verify=meta some jobs fails
From: Sitsofe Wheeler @ 2015-04-11 6:15 UTC
To: Samuel Shapiro; +Cc: fio

On 11 April 2015 at 07:10, Sitsofe Wheeler <sitsofe@gmail.com> wrote:
> On 11 April 2015 at 04:04, Samuel Shapiro <samuel.sh79@gmail.com> wrote:
>> Well i guess it could be a good test for data integrity, so why not? :)
>>
>> On Fri, Apr 10, 2015 at 9:29 PM, Sitsofe Wheeler <sitsofe@gmail.com> wrote:
>>> On 8 April 2015 at 11:37, Samuel Shapiro <samuel.sh79@gmail.com> wrote:
>>>> Hi All,
>>>>
>>>> [fsx-file-job1]
>>>> directory=/mnt/FIO-some-mountpoint-to-NAS-01
>>>> ioengine=libaio
>>>> iodepth=256
>>>> rw=randrw
>>>> size=256k
>>>> bs=4k
>>>> norandommap
>>>
>>> ^^^ This allows the same block to be overwritten multiple times. Are
>>> you sure this is what you want?
>
> I could be wide of the mark but my concern would be that meta tries to
> verify block number (I guess this doesn't need to be saved and can be
> calculated by just following a sequence). If I've overwritten a block
> how do I verify that the block number of the earlier write was correct
> - the block will in fact contain the block number of the last write to
> it...

Apologies, where I said block number I should have said "io sequence
number" (because you're using verify=meta):

Assuming I only do my verification at the end of a run, if I've
overwritten a block, how do I verify that the io sequence numbers of the
earlier writes to that block were correct? Wouldn't the block contain
only the io sequence number of the last write to it?

-- 
Sitsofe | http://sucs.org/~sits/
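
The concern about overwritten blocks can be made concrete with a toy model. This is only a sketch of the reasoning in the message above, not of fio's real header layout or verification logic: it assumes each write stamps its block with an io sequence number and that verification happens once, at the end of the run. The function names are made up for the illustration.

```python
def replay_writes(block_numbers):
    """Replay a write sequence and return the stamp each block ends up with.

    Toy assumption: write number `seq` stamps its target block with `seq`,
    replacing whatever stamp was there before.
    """
    on_disk = {}
    for seq, block in enumerate(block_numbers):
        on_disk[block] = seq
    return on_disk


def unverifiable_writes(block_numbers):
    """Return the sequence numbers whose stamp no longer exists on disk."""
    survivors = set(replay_writes(block_numbers).values())
    return [seq for seq in range(len(block_numbers)) if seq not in survivors]


# Random writes without a random map may hit the same block repeatedly.
writes = [3, 7, 3, 1, 7, 3]
print(replay_writes(writes))        # {3: 5, 7: 4, 1: 3}
print(unverifiable_writes(writes))  # [0, 1, 2]
```

In this model half of the writes leave no trace by the end of the run, so an end-of-run check can only say something about the last write to each block.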

* Re: during fio scenario with verify=meta some jobs fails
From: Sitsofe Wheeler @ 2015-04-11 22:06 UTC
To: Samuel Shapiro; +Cc: fio

On 11 April 2015 at 17:48, Samuel Shapiro <samuel.sh79@gmail.com> wrote:
> Thanks Sitsofe,
> It make sense, if previous block meta isn't really saved....
> So maybe the right question to ask, in which scenarios should I prefer
> "meta" verification method over "md5" etc...?

Without looking at the source I would guess they will catch different
issues and have different overheads.

For example, if I use md5 I write a header to each "block" that tells me
what the data should be. When it comes to verification time I read the
header and check the body matches the checksum. Reading the body and
calculating the checksum take time and CPU. If the error is that two
blocks of the same size (including their headers) have been switched I
won't catch that because the checksum of the body is still correct with
respect to the header (this is speculation on my part)...

If I use meta my header contains things like timestamp, block number, io
sequence number. I don't need to spend time verifying the body and
unless I used verify_pattern I don't even know what the correct data in
the body would look like if I were to check/checksum it. I can detect
identically sized block swaps because the block number and io sequence
number in the header will be wrong. If a problem is detected I've got a
better chance of working out where the problem header came from (because
more meta data is available to me).

If I'm using same sized blocks and I'm worried about entire block swaps,
or I want to do as minimal a check as possible, I think I could use
meta. If I'm more worried about sub block sized data being wrong I would
say I'm better off with one of the checksum routines (preferably one
that is hardware accelerated). If I'm overwriting my data within the
same pass then I probably can't use meta any more. meta may give me more
information about where bad data came from if the header is intact and I
know how to decode it. However, this is all speculation on my part!

-- 
Sitsofe | http://sucs.org/~sits/
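
The trade-off speculated about above can be sketched with two toy block layouts. Both layouts are assumptions invented for this illustration (a 16-byte md5 digest in front of the body, and a 16-byte block number plus io sequence number in front of the body); they are not fio's on-disk header format, and fio's real checks may well cover more than either toy does.

```python
import hashlib
import struct

HEADER_SIZE = 16  # both toy headers happen to be 16 bytes


def make_md5_block(body):
    """Toy checksum block: md5 digest of the body, then the body."""
    return hashlib.md5(body).digest() + body


def md5_block_ok(block, _position):
    """A checksum header only says whether header and body agree."""
    header, body = block[:HEADER_SIZE], block[HEADER_SIZE:]
    return hashlib.md5(body).digest() == header


def make_meta_block(body, block_number, io_seq):
    """Toy meta block: block number and io sequence number, then the body."""
    return struct.pack("<QQ", block_number, io_seq) + body


def meta_block_ok(block, position):
    """A meta header says where the block belongs, nothing about the body."""
    block_number, _io_seq = struct.unpack("<QQ", block[:HEADER_SIZE])
    return block_number == position


def failing_positions(blocks, block_ok):
    """Return the positions whose block fails the given check."""
    return [pos for pos, blk in enumerate(blocks) if not block_ok(blk, pos)]


def swap(blocks, a, b):
    """Return a copy of blocks with positions a and b exchanged."""
    out = list(blocks)
    out[a], out[b] = out[b], out[a]
    return out


def corrupt_last_byte(blocks, pos):
    """Return a copy of blocks with the last body byte at pos flipped."""
    out = list(blocks)
    damaged = bytearray(out[pos])
    damaged[-1] ^= 0xFF
    out[pos] = bytes(damaged)
    return out


bodies = [bytes([i]) * 32 for i in range(4)]
md5_blocks = [make_md5_block(b) for b in bodies]
meta_blocks = [make_meta_block(b, i, i) for i, b in enumerate(bodies)]

# Whole-block swap: invisible to the toy checksum, visible to the toy meta check.
print(failing_positions(swap(md5_blocks, 1, 2), md5_block_ok))    # []
print(failing_positions(swap(meta_blocks, 1, 2), meta_block_ok))  # [1, 2]

# Damage inside a body: the reverse.
print(failing_positions(corrupt_last_byte(md5_blocks, 0), md5_block_ok))    # [0]
print(failing_positions(corrupt_last_byte(meta_blocks, 0), meta_block_ok))  # []
```

Under these assumptions each scheme misses exactly the failure the other one catches, which matches the "they will catch different issues" guess in the message.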

* Re: during fio scenario with verify=meta some jobs fails
From: Samuel Shapiro @ 2015-04-12 10:12 UTC
To: Sitsofe Wheeler; +Cc: fio

Thanks for the great answer Sitsofe :)
I will stick to md5 verification while testing FS data integrity.

Thanks
Samuel

* Re: during fio scenario with verify=meta some jobs fails
From: Sitsofe Wheeler @ 2015-04-12 10:51 UTC
To: Samuel Shapiro; +Cc: fio

On 12 April 2015 at 11:12, Samuel Shapiro <samuel.sh79@gmail.com> wrote:
> Thanks for the great answer Sitsofe :)
> Will stick to md5 verification while testing FS data integrity.

If fio is in any shape or form CPU bound during writing or verification,
I'd hint that crc32c-intel might be a better tradeoff if fio is running
on recent Intel hardware. If you can't use that (because you aren't on
that hardware) you could use xxhash instead. Both crc32c-intel and
xxhash will "detect" fewer corruptions than md5 (their checksum is
smaller) but they can be dramatically faster, perhaps allowing you to
get more runs / more concurrency in, and will still detect common
problems.

I failed to answer question 3. of your original mail:

> 3. How can I parse ".state" file so I could compare its data to
> original file on storage?

You only need to use this file if the writing pass is interrupted before
it could be fully completed (so you know when to stop verifying). I
don't think it records the contents of the data being written but rather
the starting parameters and how far along a given data generation
sequence you got (so you have enough to replay it later)...

Generally speaking, and assuming the verification header part of a block
isn't corrupt, verify_dump can be used to make fio dump the expected and
mismatched data into two files for later inspection when a mismatch is
found. If memory serves, non-important parts of the header section may
not match between the expected and actual dumps (e.g. padding between
the real values might not match up).

-- 
Sitsofe | http://sucs.org/~sits/
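
Once verify_dump has produced an expected dump and a mismatched dump, one way to inspect them is to list the byte ranges where they differ. The helper below is a minimal sketch: `mismatch_ranges` is a made-up name, the two file paths are supplied by the caller rather than assumed, and nothing here interprets the verification header, so header padding differences of the kind mentioned above would show up as ordinary mismatches.

```python
import sys
from pathlib import Path


def mismatch_ranges(expected, received):
    """Return [start, end) byte ranges where the two buffers differ.

    A difference in length is reported as one final range covering the tail.
    """
    ranges = []
    start = None
    common = min(len(expected), len(received))
    for i in range(common):
        if expected[i] != received[i]:
            if start is None:
                start = i  # a new run of differing bytes begins here
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(expected) != len(received):
        ranges.append((common, max(len(expected), len(received))))
    return ranges


# Small in-memory demonstration.
want = bytes(16)
got = bytearray(want)
got[4:6] = b"\xff\xff"
got[15] = 1
print(mismatch_ranges(want, bytes(got)))  # [(4, 6), (15, 16)]

# Optional: compare two dump files given on the command line.
if __name__ == "__main__" and len(sys.argv) == 3:
    a = Path(sys.argv[1]).read_bytes()
    b = Path(sys.argv[2]).read_bytes()
    for lo, hi in mismatch_ranges(a, b):
        print(f"bytes {lo}..{hi - 1} differ")
```

Reporting ranges rather than single offsets makes it easier to tell a flipped bit apart from a whole misplaced block.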

* Re: during fio scenario with verify=meta some jobs fails
From: Samuel Shapiro @ 2015-04-13 8:21 UTC
To: Sitsofe Wheeler; +Cc: fio

Thanks Sitsofe,
I'll try crc32c then. Hopefully the Dell servers support it :)
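
Whether a given server can use the hardware-assisted checksum can be checked before editing the job file. The sketch below assumes a Linux x86 host, where /proc/cpuinfo has a "flags" line and the SSE4.2 CRC32 instruction shows up as `sse4_2`; the function names are invented for the illustration, and the xxhash fallback simply follows the suggestion made earlier in the thread.

```python
from pathlib import Path


def parse_cpu_flags(cpuinfo_text):
    """Return the CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no "flags" line, e.g. not an x86 Linux host


def pick_verify(flags):
    """Choose a fio verify= value based on the available CPU flags."""
    return "crc32c-intel" if "sse4_2" in flags else "xxhash"


def local_cpu_flags(path="/proc/cpuinfo"):
    """Read the flags of the current host, or an empty set if unavailable."""
    try:
        return parse_cpu_flags(Path(path).read_text())
    except OSError:
        return set()


sample = "processor\t: 0\nflags\t\t: fpu sse4_1 sse4_2 popcnt\n"
print(pick_verify(parse_cpu_flags(sample)))  # crc32c-intel
print(pick_verify(set()))                    # xxhash
```

Calling `pick_verify(local_cpu_flags())` on the machine that runs fio gives the value to put after `verify=` in the `[global]` section.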