From: Sitsofe Wheeler <sitsofe@gmail.com>
To: Saju Nair <saju.mad.nair@gmail.com>
Cc: "fio@vger.kernel.org" <fio@vger.kernel.org>
Subject: Re: FIO -- A few basic questions on Data Integrity.
Date: Thu, 22 Dec 2016 07:05:40 +0000
Message-ID: <CALjAwxhDo2p=LFG-r6ZcGbiEk26sZgO8BAnVCrUFXRDVQ9DTkw@mail.gmail.com>
In-Reply-To: <CAKV1nBZfDgy_+4nAOYXni0m=Q0WUNrn-_q9J_+XuwWRZR03Enw@mail.gmail.com>

Hi,

Bear in mind I can't see your current test.fio (which must be
different from the one you previously posted because fio appears to be
searching for 0x99 whereas the previous job was searching for 0x33) so
if you have extra options in there they may invalidate any analysis I
make.

On 22 December 2016 at 04:48, Saju Nair <saju.mad.nair@gmail.com> wrote:
> Hi,
> Thanks. Please find the stderr messages snippet.
> We get the offsets where FIO observed failures, during the "verify" phase.
> The nvme**.<num>.received - is reported to contain 0.
>
> However, when we try a "dd" to access the location (offset) specified
> - we do see proper data.
> It might be an issue in our DUT, and we will investigate further.

I'm guessing you didn't use iflag=nocache, or iflag=direct with reads
in aligned block sizes, on your dd. Are you sure your dd wasn't just
reading out of Linux's page cache? For example, for an offset of 12288
you could use

  dd iflag=direct if=/dev/nvme0n1 bs=4k skip=3 count=1 | hexdump

(bear in mind it is better to do this directly after the fio run).
Do you also get this same effect on disks you know to be good?
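
To spell out where skip=3 comes from: the number in the dump file name
is treated here as a byte offset, so with a 4k block size the block to
read is offset / 4096. A rough sketch of that arithmetic follows. The
helper name dd_cmd_for_offset is made up for illustration, and it only
prints the command rather than running it:

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a dd command that reads the
# single block starting at a given byte offset with O_DIRECT, so the
# read does not come from the page cache.
dd_cmd_for_offset() {
    dev=$1          # device to read, e.g. /dev/nvme0n1
    offset=$2       # byte offset, e.g. 12288 from nvme0n1.12288.received
    bs=${3:-4096}   # block size in bytes; assumed to match the job's bs=4k
    # iflag=direct needs aligned reads, so refuse unaligned offsets.
    if [ $((offset % bs)) -ne 0 ]; then
        echo "offset $offset is not a multiple of $bs" >&2
        return 1
    fi
    echo "dd iflag=direct if=$dev bs=$bs skip=$((offset / bs)) count=1 | hexdump"
}

dd_cmd_for_offset /dev/nvme0n1 12288   # prints a command with skip=3
dd_cmd_for_offset /dev/nvme0n1 40960   # prints a command with skip=10
```

Check the printed command before running it, and adjust the device
name and block size to match your job.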

This is all very suspicious. If dd with the change above returns the
same data that fio reported, I think you've got a problem on your
hands. If the problem goes away when you add end_fsync=1 to the fio
job file, that suggests someone has configured the NVMe devices for
speed and has thrown away safety.

> *************************************************************
> /root/demo/fio-2.12/bin/fio test.fio -eta=always --eta-newline=1 1 tee

This line seems garbled (" 1 tee"? Did you have to re-enter all output
by hand?) but assuming the real command line is correct your tee will
only save stdout to a file and not stderr. See
http://stackoverflow.com/questions/692000/how-do-i-write-stderr-to-a-file-while-using-tee-with-a-pipe
for suggestions on also saving stderr while using tee...
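
As a small illustration of the stdout/stderr point, here is a sketch
that uses a stand-in function instead of fio (fake_fio is invented for
the demo, and the log file is a temporary file) to show that 2>&1
placed before the pipe sends both streams through tee:

```shell
#!/bin/sh
# Stand-in for fio: one line on stdout, one on stderr. Not real fio.
fake_fio() {
    echo "starting 1 process"
    echo "fio: got pattern '00', wanted '99'. Bad bits 4" >&2
}

log=$(mktemp)
trap 'rm -f "$log"' EXIT

# 2>&1 merges stderr into stdout before the pipe, so tee sees both.
fake_fio 2>&1 | tee "$log" >/dev/null

grep -c . "$log"   # prints 2: both lines were saved

# With the real tool the same idea would look something like:
#   fio test.fio --eta=always --eta-newline=1 2>&1 | tee fio.log
# (fio.log is just an example name.)
```

Without the 2>&1 only the "starting 1 process" line would reach the
log, which is why the verify errors can go missing from a saved file.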

> fiol.log•rite-and-verify: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K,
> ioengine=libaio, iodepth=16
> •rite-and-verify: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K,
> ioengine=libaio, iodepth=16
> fio-2.12

^^^ Might be worth upgrading to the latest fio release (2.16) so you
don't hit issues that have already been fixed.

> starting 1 process
> fio: got pattern '00', wanted '99'. Bad bits 4
> fio: bad pattern block offset 0

^^ Notice how this says the block offset, what it found and what it
wanted. This should make working out where to look straightforward.

>    received data dumped as nvme0n1.12288.received
>    expected data dumped as nvme0n1.12288.expected
> fio: verify type mismatch (0 media, 14 given)
> fio: got pattern '00', wanted '99'. Bad bits 4
> fio: bad pattern block offset 0
>    received data dumped as nvme0n1.40960.received
>    expected data dumped as nvme0n1.40960.expected
> fio: verify type mismatch (0 media, 14 given)
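
If you save stderr to a file, the failing offsets can be pulled out of
the "dumped as" lines mechanically. This is only a sketch: the sample
lines are modelled on the output quoted above, the function names are
invented, and the 4096 divisor assumes the job's bs=4k:

```shell
#!/bin/sh
# Sample stderr lines modelled on the quoted output (in practice you
# would feed in the log file you saved from the fio run).
sample_log() {
cat <<'EOF'
fio: got pattern '00', wanted '99'. Bad bits 4
fio: bad pattern block offset 0
   received data dumped as nvme0n1.12288.received
   expected data dumped as nvme0n1.12288.expected
fio: got pattern '00', wanted '99'. Bad bits 4
fio: bad pattern block offset 0
   received data dumped as nvme0n1.40960.received
   expected data dumped as nvme0n1.40960.expected
EOF
}

# Print the number embedded in each "<name>.<num>.received" dump name.
failing_offsets() {
    sed -n 's/.*received data dumped as .*\.\([0-9][0-9]*\)\.received$/\1/p'
}

sample_log | failing_offsets | while read -r off; do
    echo "offset=$off block=$((off / 4096))"
done
# prints:
#   offset=12288 block=3
#   offset=40960 block=10
```

The block numbers are what you would pass as skip= to a dd that uses
bs=4k.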
>
>
> On Tue, Dec 20, 2016 at 6:56 PM, Sitsofe Wheeler <sitsofe@gmail.com> wrote:
>> Hi,
>>
>> On 20 December 2016 at 12:26, Saju Nair <saju.mad.nair@gmail.com> wrote:
>>>
>>> Thanks for your clarifications.
>>> We ran with a --continue_on_error=verify,
>>> to let the FIO complete the full compare..
>>>
>>> We tried to do a sequential write and compare, using the FIO config
>>> file as below, and to bring in the complexity of "random" as a 2nd
>>> step.
>>> [write-and-verify]
>>> rw=write
>>> bs=4k
>>> direct=1
>>> ioengine=libaio
>>> iodepth=16
>>> size=2m
>>> verify=pattern
>>> verify_pattern=0x33333333
>>> continue_on_error=verify
>>> verify_dump=1
>>> filename=/dev/XXXX
>>>
>>> FIO reports errors and we see files of the following names created:
>>> <filename>.<num>.received
>>> <filename>.<num>.expected
>>>
>>> Wanted help in interpreting the result.
>>>
>>> We wrote 2MB worth of data, with blocksize = 4K.
>>> So, ideally is it expected to do 2MB/4KB = 512 IO operations
>>>
>>> 1) The received/expected files:
>>> Are they for each 4K offset that failed the comparison ?
>>
>> I bet you can deduce this from the size and names of the files...
>>
>>> Is the <num> to be interpreted as the (num/bs)-th block that failed ?
>>>    For ex: if the num=438272, and bs=4096 => 107th block failed ?
>>>
>>> It would be useful to know this information - so that we can debug further,
>>> FYI, if we try a "dd" command and check the disk, based on the above
>>> calculation - the data is proper (as expected).
>>
>> You never answered my question about what you are doing with stderr.
>> I'll repeat it here:
>>
>>>> Are you doing something like redirecting stdout to a file but not
>>>> doing anything with stderr? It would help if you include the command
>>>> line you are using to run fio in your reply.
>>
>> Can you answer this question and post the full command line you ran
>> fio with? I think it might have relevance to your current question.
>>
>>> 2) What were the locations that were written to..
>>> Tried fio-verify-state <.state_file>, and get the below:
>>> Version:        0x3
>>> Size:           408
>>> CRC:            0x70ca464a
>>> Thread:         0
>>> Name:           write-and-verify
>>> Completions:    16
>>> Depth:          16
>>> Number IOs:     512
>>> Index:          0
>>> Completions:
>>>         (file= 0) 2031616
>>>         (file= 0) 2035712
>>>         (file= 0) 2039808
>>>         (file= 0) 2043904
>>>         (file= 0) 2048000
>>>         (file= 0) 2052096
>>>         (file= 0) 2056192
>>>         (file= 0) 2060288
>>>         (file= 0) 2064384
>>>         (file= 0) 2068480
>>>         (file= 0) 2072576
>>>         (file= 0) 2076672
>>>         (file= 0) 2080768
>>>         (file= 0) 2084864
>>>         (file= 0) 2088960
>>>         (file= 0) 2093056
>>>
>>> How do we interpret the above content to understand the locations of Writes.
>>
>> Perhaps fio tracks how far through the sequence it got rather than
>> individual locations written (this would be necessary to handle things
>> like loops=)? I personally don't know the answer to this but you can
>> always take a look at the source code.
>>
>> --
>> Sitsofe | http://sucs.org/~sits/

-- 
Sitsofe | http://sucs.org/~sits/
