* during fio scenario with verify=meta some jobs fails
@ 2015-04-08 10:37 Samuel Shapiro
  2015-04-10 18:29 ` Sitsofe Wheeler
  0 siblings, 1 reply; 9+ messages in thread
From: Samuel Shapiro @ 2015-04-08 10:37 UTC (permalink / raw)
  To: fio

Hi All,

I recently started using fio for NAS FS tests, and it's a really great
tool, both for loading the system and for finding various protocol
bugs. Thank you :)
One of the scenarios I'm using is "fsx", slightly modified to run with
16 jobs:

; This job file works pretty similarly to running fsx-linux
; with -r 4096 -w 4096 -Z -N 500000
[global]
verify=meta
verify_dump=1

[fsx-file-job1]
directory=/mnt/FIO-some-mountpoint-to-NAS-01
ioengine=libaio
iodepth=256
rw=randrw
size=256k
bs=4k
norandommap
direct=1
loops=500000

[fsx-file-job2]
directory=/mnt/FIO-some-mountpoint-to-NAS-02
ioengine=libaio
iodepth=256
rw=randrw
size=256k
bs=4k
norandommap
direct=1
loops=500000
.
.
.
[fsx-file-job16]
directory=/mnt/FIO-some-mountpoint-to-NAS-03
etc....

After several minutes of running the scenario, jobs start to fail on
"verify meta", generating ".state" files.
When I ran the same scenario with verify=md5, no more verify failures
were reported.

So I have a couple of questions:
1. What are the possible reasons verify=meta can fail?
2. Should I suspect my file system of data corruption, or is there
some known fio issue?
3. How can I parse a ".state" file so I can compare its data to the
original file on storage?

Thank You,
Samuel

* Re: during fio scenario with verify=meta some jobs fails
  2015-04-08 10:37 during fio scenario with verify=meta some jobs fails Samuel Shapiro
@ 2015-04-10 18:29 ` Sitsofe Wheeler
  2015-04-11  3:04   ` Samuel Shapiro
  0 siblings, 1 reply; 9+ messages in thread
From: Sitsofe Wheeler @ 2015-04-10 18:29 UTC (permalink / raw)
  To: Samuel Shapiro; +Cc: fio

On 8 April 2015 at 11:37, Samuel Shapiro <samuel.sh79@gmail.com> wrote:
> Hi All,
>
> [fsx-file-job1]
> directory=/mnt/FIO-some-mountpoint-to-NAS-01
> ioengine=libaio
> iodepth=256
> rw=randrw
> size=256k
> bs=4k
> norandommap

^^^ This allows the same block to be overwritten multiple times. Are
you sure this is what you want?

-- 
Sitsofe | http://sucs.org/~sits/

* Re: during fio scenario with verify=meta some jobs fails
  2015-04-10 18:29 ` Sitsofe Wheeler
@ 2015-04-11  3:04   ` Samuel Shapiro
  2015-04-11  6:10     ` Sitsofe Wheeler
  0 siblings, 1 reply; 9+ messages in thread
From: Samuel Shapiro @ 2015-04-11  3:04 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio

Well, I guess it could be a good test for data integrity, so why not? :)

* Re: during fio scenario with verify=meta some jobs fails
  2015-04-11  3:04   ` Samuel Shapiro
@ 2015-04-11  6:10     ` Sitsofe Wheeler
  2015-04-11  6:15       ` Sitsofe Wheeler
  0 siblings, 1 reply; 9+ messages in thread
From: Sitsofe Wheeler @ 2015-04-11  6:10 UTC (permalink / raw)
  To: Samuel Shapiro; +Cc: fio

On 11 April 2015 at 04:04, Samuel Shapiro <samuel.sh79@gmail.com> wrote:
> Well i guess it could be a good test for data integrity, so why not? :)
>
> On Fri, Apr 10, 2015 at 9:29 PM, Sitsofe Wheeler <sitsofe@gmail.com> wrote:
>> On 8 April 2015 at 11:37, Samuel Shapiro <samuel.sh79@gmail.com> wrote:
>>> Hi All,
>>>
>>> [fsx-file-job1]
>>> directory=/mnt/FIO-some-mountpoint-to-NAS-01
>>> ioengine=libaio
>>> iodepth=256
>>> rw=randrw
>>> size=256k
>>> bs=4k
>>> norandommap
>>
>> ^^^ This allows the same block to be overwritten multiple times. Are
>> you sure this is what you want?

I could be wide of the mark, but my concern would be that meta tries to
verify the block number (I guess this doesn't need to be saved and can
be calculated by just following a sequence). If I've overwritten a
block, how do I verify that the block number of the earlier write was
correct? The block will in fact contain the block number of the last
write to it...

-- 
Sitsofe | http://sucs.org/~sits/

* Re: during fio scenario with verify=meta some jobs fails
  2015-04-11  6:10     ` Sitsofe Wheeler
@ 2015-04-11  6:15       ` Sitsofe Wheeler
       [not found]         ` <CANt18PzSgzy7zVOXuY26iz9gH=tnTLZiKSE=i=pGxAWrxgVq6g@mail.gmail.com>
  0 siblings, 1 reply; 9+ messages in thread
From: Sitsofe Wheeler @ 2015-04-11  6:15 UTC (permalink / raw)
  To: Samuel Shapiro; +Cc: fio

On 11 April 2015 at 07:10, Sitsofe Wheeler <sitsofe@gmail.com> wrote:
> I could be wide of the mark but my concern would be that meta tries to
> verify block number (I guess this doesn't need to be saved and can be
> calculated by just following a sequence). If I've overwritten a block
> how do I verify that the block number of the earlier write was correct
> - the block will in fact contain the block number of the last write to
> it...

Apologies, where I said block number I should have said "io sequence
number" (because you're using verify=meta):

Assuming I only do my verification at the end of a run, if I've
overwritten a block how do I verify that the io sequence numbers of
the earlier writes to that block were correct? Wouldn't the block
contain only the io sequence number of the last write to it?
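
A toy Python sketch of this concern (not fio's actual code; the
function names and the data model are made up for illustration, and
the reads that rw=randrw would also issue are ignored). Each write
stamps its block with an increasing sequence number, and blocks may
repeat as they can with norandommap:

```python
import random


def simulate_writes(num_blocks, num_writes, seed=0):
    """Toy model: each write stamps its block with an increasing sequence number.

    Blocks are picked at random with repeats allowed, mimicking norandommap.
    Returns (history, on_disk): every write issued, and what survives on disk.
    """
    rng = random.Random(seed)
    history = []  # (sequence number, block index) for every write issued
    on_disk = {}  # block index -> sequence number of the last write to it
    for seq in range(num_writes):
        block = rng.randrange(num_blocks)
        history.append((seq, block))
        on_disk[block] = seq  # an overwrite replaces the earlier number
    return history, on_disk


def lost_sequence_numbers(history, on_disk):
    """Sequence numbers that were written but can no longer be read back."""
    surviving = set(on_disk.values())
    return [seq for seq, _ in history if seq not in surviving]


if __name__ == "__main__":
    # 64 blocks corresponds to size=256k with bs=4k in the job file.
    history, on_disk = simulate_writes(num_blocks=64, num_writes=64)
    lost = lost_sequence_numbers(history, on_disk)
    print(f"{len(lost)} of {len(history)} sequence numbers were overwritten")
```

Any sequence number reported as lost belongs to a write whose header
is no longer on disk, so a verify pass that runs only at the end has
nothing to compare it against.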

-- 
Sitsofe | http://sucs.org/~sits/

* Re: during fio scenario with verify=meta some jobs fails
       [not found]         ` <CANt18PzSgzy7zVOXuY26iz9gH=tnTLZiKSE=i=pGxAWrxgVq6g@mail.gmail.com>
@ 2015-04-11 22:06           ` Sitsofe Wheeler
  2015-04-12 10:12             ` Samuel Shapiro
  0 siblings, 1 reply; 9+ messages in thread
From: Sitsofe Wheeler @ 2015-04-11 22:06 UTC (permalink / raw)
  To: Samuel Shapiro; +Cc: fio

On 11 April 2015 at 17:48, Samuel Shapiro <samuel.sh79@gmail.com> wrote:
> Thanks Sitsofe,
> It makes sense, if the previous block's meta isn't really saved....
> So maybe the right question to ask is: in which scenarios should I
> prefer the "meta" verification method over "md5" etc.?

Without looking at the source I would guess they will catch different
issues and have different overheads.

For example, if I use md5 I write a header to each "block" that tells
me what the data should be. When it comes to verification time I read
the header and check the body matches the checksum. Reading the body
and calculating the checksum take time and CPU. If the error is that
two blocks of the same size (including their headers) have been
switched I won't catch that because the checksum of the body is still
correct with respect to the header (this is speculation on my part)...

If I use meta my header contains things like timestamp, block number,
io sequence number. I don't need to spend time verifying the body and
unless I used verify_pattern I don't even know what the correct data
in the body would look like if I were to check/checksum it. I can
detect identically sized block swaps because the block number and io
sequence number in the header will be wrong. If a problem is detected
I've got a better chance of working out where the problem header came
from (because more meta data is available to me).
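
To make the two cases above concrete, here is a toy Python sketch. The
header layouts are invented for illustration and are not fio's on-disk
format; like the reasoning above, it only models a header that stores
a checksum versus one that stores the offset and io sequence number:

```python
import hashlib

BLOCK_SIZE = 4096
HEADER_SIZE = 16


def make_md5_block(payload):
    """Toy layout: 16-byte MD5 of the payload, followed by the payload."""
    return hashlib.md5(payload).digest() + payload


def verify_md5_block(block):
    """Check only that the payload matches the checksum in its own header."""
    header, payload = block[:HEADER_SIZE], block[HEADER_SIZE:]
    return hashlib.md5(payload).digest() == header


def make_meta_block(offset, seq, payload):
    """Toy layout: 8-byte offset, 8-byte sequence number, then the payload."""
    return offset.to_bytes(8, "little") + seq.to_bytes(8, "little") + payload


def verify_meta_block(block, expected_offset):
    """Check that the block sits at the offset recorded in its header."""
    return int.from_bytes(block[:8], "little") == expected_offset


def swap_is_detected():
    """Swap two blocks on a pretend disk and report which scheme notices."""
    payloads = [bytes([i]) * (BLOCK_SIZE - HEADER_SIZE) for i in range(2)]
    md5_disk = [make_md5_block(p) for p in payloads]
    meta_disk = [make_meta_block(i * BLOCK_SIZE, i, p)
                 for i, p in enumerate(payloads)]
    md5_disk.reverse()   # the "bug": the two blocks trade places
    meta_disk.reverse()
    md5_ok = all(verify_md5_block(b) for b in md5_disk)
    meta_ok = all(verify_meta_block(b, i * BLOCK_SIZE)
                  for i, b in enumerate(meta_disk))
    return {"md5": not md5_ok, "meta": not meta_ok}


if __name__ == "__main__":
    print(swap_is_detected())
```

In this model the checksum-only header passes verification after the
swap because each block is still internally consistent, while the
offset in the meta-style header no longer matches where the block was
read from.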

If I'm using same sized blocks and I'm worried about entire block
swaps, or I want to do as minimal a check as possible, I think you
could use meta. If I'm more worried about sub block sized data being
wrong I would say you are better off with one of the checksum routines
(preferably one that is hardware accelerated). If I'm overwriting my
data within the same pass I probably can't use meta any more. meta
may give me more information about where bad data came from if the
header is intact and I know how to decode it. However, this is all
speculation on my part!

-- 
Sitsofe | http://sucs.org/~sits/

* Re: during fio scenario with verify=meta some jobs fails
  2015-04-11 22:06           ` Sitsofe Wheeler
@ 2015-04-12 10:12             ` Samuel Shapiro
  2015-04-12 10:51               ` Sitsofe Wheeler
  0 siblings, 1 reply; 9+ messages in thread
From: Samuel Shapiro @ 2015-04-12 10:12 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio

Thanks for the great answer Sitsofe :)
Will stick to md5 verification while testing FS data integrity.

Thanks
Samuel

* Re: during fio scenario with verify=meta some jobs fails
  2015-04-12 10:12             ` Samuel Shapiro
@ 2015-04-12 10:51               ` Sitsofe Wheeler
  2015-04-13  8:21                 ` Samuel Shapiro
  0 siblings, 1 reply; 9+ messages in thread
From: Sitsofe Wheeler @ 2015-04-12 10:51 UTC (permalink / raw)
  To: Samuel Shapiro; +Cc: fio

On 12 April 2015 at 11:12, Samuel Shapiro <samuel.sh79@gmail.com> wrote:
> Thanks for the great answer Sitsofe :)
> Will stick to md5 verification while testing FS data integrity.

If fio is in any shape or form CPU bound during writing or
verification, I'd suggest that crc32c-intel might be a better
tradeoff if fio is running on recent Intel hardware. If you
can't use that (because you aren't on that hardware) you could use
xxhash instead. Both crc32c-intel and xxhash will "detect" fewer
corruptions than md5 (their checksums are smaller) but they can be
dramatically faster, perhaps allowing you to get more runs / more
concurrency in, and will still detect common problems.
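
A rough way to get a feel for the size versus speed tradeoff, sketched
in Python. Note the assumptions: zlib.crc32 is the ordinary CRC-32,
used here only as a stand-in for a 32-bit checksum (it is not CRC32C
and is not hardware accelerated the way crc32c-intel is), and timings
from Python say nothing definite about fio's own implementations:

```python
import hashlib
import time
import zlib


def checksum_bits():
    """Width in bits of each checksum; wider means fewer missed corruptions."""
    return {"crc32": 32, "md5": hashlib.md5().digest_size * 8}


def time_checksum(func, data, rounds=200):
    """Seconds taken to checksum `data` `rounds` times with `func`."""
    start = time.perf_counter()
    for _ in range(rounds):
        func(data)
    return time.perf_counter() - start


if __name__ == "__main__":
    block = bytes(range(256)) * 16  # one 4 KiB block, matching bs=4k
    print(checksum_bits())
    print("crc32:", time_checksum(zlib.crc32, block))
    print("md5:  ", time_checksum(lambda d: hashlib.md5(d).digest(), block))
```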

I failed to answer question 3 of your original mail:

> 3. How can I parse ".state" file so I could compare its data to
> original file on storage?

You only need to use this file if the writing pass is interrupted
before it could be fully completed (so you know when to stop
verifying). I don't think it records the contents of the data being
written but rather the starting parameters and how far along a given
data generation sequence you got (so you have enough to replay it
later)...

Generally speaking, and assuming the verification header part of a
block isn't corrupt, verify_dump can be used to make fio dump the
expected and mismatched data into two files for later inspection when
a mismatch is found. If memory serves, unimportant parts of the
header section may not match between the expected and actual dumps
(e.g. padding between the real values might not match up).
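
Once the two dump files exist, a small script can show where they
differ. A minimal Python sketch, assuming only that you have two
binary files to compare; the paths are passed on the command line, no
particular dump file naming is assumed, and the helper name
diff_ranges is made up:

```python
import sys


def diff_ranges(expected, received):
    """Return [(start, end)] byte ranges (end exclusive) where buffers differ."""
    ranges = []
    start = None
    length = min(len(expected), len(received))
    for i in range(length):
        if expected[i] != received[i]:
            if start is None:
                start = i  # a new run of mismatching bytes begins here
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, length))
    if len(expected) != len(received):
        # Anything past the shorter buffer counts as a mismatch too.
        ranges.append((length, max(len(expected), len(received))))
    return ranges


if __name__ == "__main__":
    with open(sys.argv[1], "rb") as f:
        expected = f.read()
    with open(sys.argv[2], "rb") as f:
        received = f.read()
    for start, end in diff_ranges(expected, received):
        print(f"bytes {start}..{end - 1} differ")
```

Differences confined to the first few bytes would point at the header
(possibly just the padding mentioned above), while differences further
in point at the data itself.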

-- 
Sitsofe | http://sucs.org/~sits/

* Re: during fio scenario with verify=meta some jobs fails
  2015-04-12 10:51               ` Sitsofe Wheeler
@ 2015-04-13  8:21                 ` Samuel Shapiro
  0 siblings, 0 replies; 9+ messages in thread
From: Samuel Shapiro @ 2015-04-13  8:21 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio

Thanks Sitsofe,
I'll try crc32c then.
Hopefully Dell servers should support it :)
