* numberio failure with numjobs=1
@ 2017-03-31  2:25 Guruganesh Hegde
  2017-03-31  5:58 ` Sitsofe Wheeler
  0 siblings, 1 reply; 6+ messages in thread
From: Guruganesh Hegde @ 2017-03-31  2:25 UTC (permalink / raw)
  To: fio

Hi,

I am using fio version 2.12 with the following configuration to
perform a data integrity check:
--------------------------------------------------------------------
[write-phase]
group_reporting
rw=randwrite
bs=4k
direct=1
ioengine=libaio

iodepth=32
numjobs=1

size=32G
randseed=40964096
verify=crc32c

verify_dump=1
refill_buffers

filename=/dev/nvme0n1

loops=10
----------------------------------------------------------------------------------

What I am seeing is that fio fails with a numberio mismatch:
"verify: bad header numberio x wanted y"

I have modified fio's verify.c to continue past the numberio mismatch
error. All other header checks, such as the header CRC and data CRCs,
are still verified.

After the above changes, fio reports only a few numberio errors and
there seem to be no other data integrity errors.

Is there anything I am missing in the fio configuration?

Best Regards,
Guru


* Re: numberio failure with numjobs=1
  2017-03-31  2:25 numberio failure with numjobs=1 Guruganesh Hegde
@ 2017-03-31  5:58 ` Sitsofe Wheeler
       [not found]   ` <CACqzM2AmmgacpGd+QVt42=SdqQrDENigb7TUs-x50iuD9KWt6Q@mail.gmail.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Sitsofe Wheeler @ 2017-03-31  5:58 UTC (permalink / raw)
  To: Guruganesh Hegde; +Cc: fio

Hi,

On 31 March 2017 at 03:25, Guruganesh Hegde <guruhegde4u@gmail.com> wrote:
>
> I am using fio version 2.12 with the following configuration to
> perform a data integrity check:
> --------------------------------------------------------------------
> [write-phase]
> group_reporting
> rw=randwrite
> bs=4k
> direct=1
> ioengine=libaio
>
> iodepth=32
> numjobs=1
>
> size=32G
> randseed=40964096
> verify=crc32c
>
> verify_dump=1
> refill_buffers
>
> filename=/dev/nvme0n1
>
> loops=10
> ----------------------------------------------------------------------------------
>
> What I am seeing is that fio fails with a numberio mismatch:
> "verify: bad header numberio x wanted y"
>
> I have modified fio's verify.c to continue past the numberio mismatch
> error. All other header checks, such as the header CRC and data CRCs,
> are still verified.
>
> After the above changes, fio reports only a few numberio errors and
> there seem to be no other data integrity errors.
>
> Is there anything I am missing in the fio configuration?

At first glance the config looks fine. refill_buffers is actually
automatically set when you set verify (see
http://fio.readthedocs.io/en/latest/fio_doc.html#cmdoption-arg-refill-buffers
) so it's redundant but harmless and won't have anything to do with
the error you reported.

A few suggestions:
- Can you reproduce the problem with the latest fio release (2.18)?
- Can you make it happen with size=1G? If so, how about size=256M?
- How far does it get through the total I/O before the numberio problem
  is shown (i.e. what output is displayed when the failure occurs and
  fio exits), and is it roughly the same amount each time?
- Does everything else still verify when you use bsrange=4k-64k instead
  of bs=4k?
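
A minimal sketch of the reduced job, assuming the same device and options
as the original config with only size changed (refill_buffers is left out
because setting verify already implies it):

```ini
[write-phase]
group_reporting
rw=randwrite
bs=4k
direct=1
ioengine=libaio
iodepth=32
numjobs=1
; shrink the region under test: try 1G first, then 256M
size=1G
randseed=40964096
verify=crc32c
verify_dump=1
filename=/dev/nvme0n1
loops=10
```

If the failure still shows up at 1G, the same file with size=256M is the
next step.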

-- 
Sitsofe | http://sucs.org/~sits/


* Re: numberio failure with numjobs=1
       [not found]   ` <CACqzM2AmmgacpGd+QVt42=SdqQrDENigb7TUs-x50iuD9KWt6Q@mail.gmail.com>
@ 2017-03-31  6:11     ` Sitsofe Wheeler
  2017-03-31  6:25       ` Guruganesh Hegde
  0 siblings, 1 reply; 6+ messages in thread
From: Sitsofe Wheeler @ 2017-03-31  6:11 UTC (permalink / raw)
  To: Guruganesh Hegde; +Cc: fio

(Please use reply to all so mails continue going to the mailing list)

On 31 March 2017 at 07:05, Guruganesh Hegde <guruhegde4u@gmail.com> wrote:
> Thanks for the suggestions.
>
> I will try out the suggested tests and share the observations.
>
> The device I am using supports only a 4k block size, so I will
> not be able to try different block sizes; however, I will try with
> different sizes (1G or 256M etc).

Bigger blocks that are still a multiple of 4Kbytes should work
though... Further, by default bsrange will only pick block sizes that
are multiples of the minimum specified blocksize, so bsrange=4k-64k
should only pick 4k-aligned sizes such as 4k, 8k, 16k, 32k, 64k (see
http://fio.readthedocs.io/en/latest/fio_doc.html#cmdoption-arg-blocksize-range
). Is that clearer?
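
For illustration, the change amounts to swapping one line in the job
file. This is only a sketch, and the rest of the job is assumed to stay
as posted:

```ini
; replaces bs=4k: block sizes are chosen between 4k and 64k
bsrange=4k-64k
```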

-- 
Sitsofe | http://sucs.org/~sits/


* Re: numberio failure with numjobs=1
  2017-03-31  6:11     ` Sitsofe Wheeler
@ 2017-03-31  6:25       ` Guruganesh Hegde
  2017-03-31 13:47         ` Guruganesh Hegde
  0 siblings, 1 reply; 6+ messages in thread
From: Guruganesh Hegde @ 2017-03-31  6:25 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio

Thanks.
I will try that out.

Regards,
Guru



* Re: numberio failure with numjobs=1
  2017-03-31  6:25       ` Guruganesh Hegde
@ 2017-03-31 13:47         ` Guruganesh Hegde
  2017-03-31 21:25           ` Sitsofe Wheeler
  0 siblings, 1 reply; 6+ messages in thread
From: Guruganesh Hegde @ 2017-03-31 13:47 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio

I am able to reproduce this issue even with fio 2.18, with size=1G and
size=256M.

I have found a workaround:
if I set loops=1 in the config file and launch fio from a shell script,
the issue is not seen.

However, if I set loops > 1, the issue is most likely seen.

Another interesting observation: if the combination of loops and size
results in a very small amount of data and the fio test finishes in a
few seconds, the issue is not seen.
For example,
if I set size=1M and loops=10, the test runs without failure.

One more observation:
I ran the test with loops=1 size=32 for 100 iterations (launched
from a shell script) and not a single failure was observed.
After that test finished, I repeated it with loops=1 and size=64G.
The test failed in the 1st iteration and the remaining iterations have
been fine. (This test is still running, currently at the 8th iteration.)

Failing log:

Jobs: 1 (f=1): [V(1)][72.1%][r=587MiB/s,w=0KiB/s][r=150k,w=0 IOPS][eta 00m:57s]
fio: pid=16009, err=84/file:io_u.c:1982, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character

write-phase: (groupid=0, jobs=1): err=84 (file:io_u.c:1982, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character): pid=16009: Fri Mar 31 06:23:47 2017
   read: IOPS=165k, BW=643MiB/s (674MB/s)(28.3GiB/45039msec)

Disk stats (read/write):
  nvme0n1: ios=7392130/16777216, merge=0/0, ticks=1261335/792207, in_queue=2052208, util=99.94%



Thanks,
Guru









* Re: numberio failure with numjobs=1
  2017-03-31 13:47         ` Guruganesh Hegde
@ 2017-03-31 21:25           ` Sitsofe Wheeler
  0 siblings, 0 replies; 6+ messages in thread
From: Sitsofe Wheeler @ 2017-03-31 21:25 UTC (permalink / raw)
  To: Guruganesh Hegde; +Cc: fio

OK, good to know. I'm not surprised you can't reproduce it with loops=1:
each time you're laying down the same thing, so if the disk is
"forgetting" to write a block, what you want will be there anyway from
the last pass. If you keep size=256M and keep halving the iodepth
(16, 8, 4, 2), can you still reproduce the issue? What about when you
let fio pick the initial starting seed by not explicitly setting
randseed? Can you make it happen with rw=write? With bsrange=4k-64k?

If it deterministically happens at small sizes and low iodepths it
should be possible for me to reproduce it too. Can you make it happen
on non-NVMe devices with the same small workloads?
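
A sketch of one step of that narrowing-down, assuming the original job
with size=256M, the first halved iodepth, and randseed left out. The
other iodepth values, rw=write, and bsrange=4k-64k would each be tried as
separate one-line changes:

```ini
[write-phase]
group_reporting
rw=randwrite
bs=4k
direct=1
ioengine=libaio
; halve on each attempt: 16, 8, 4, 2
iodepth=16
numjobs=1
size=256M
; randseed deliberately not set
verify=crc32c
verify_dump=1
filename=/dev/nvme0n1
loops=10
```

Changing one option per run makes it easier to tell which setting the
failure depends on.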


-- 
Sitsofe | http://sucs.org/~sits/

