From: Saju Nair <saju.mad.nair@gmail.com>
To: Sitsofe Wheeler <sitsofe@gmail.com>
Cc: "fio@vger.kernel.org" <fio@vger.kernel.org>
Subject: Re: FIO -- A few basic questions on Data Integrity.
Date: Mon, 19 Dec 2016 22:45:43 +0530	[thread overview]
Message-ID: <CAKV1nBaM=Cj4qHcyQSJuBmTB4htAupp4Uyh1TEAF_bjk8XJtqQ@mail.gmail.com> (raw)
In-Reply-To: <CALjAwxhoGyBt9=pzzOijtzu3RqRJnoZ3r2fV5pntOqvOgMYhZw@mail.gmail.com>

Hi Sitsofe,
Thanks.
On the possible data-verify error:
1. Yes, that config file is exactly what I used.
2. I did not get the "verify: bad header" message, but I did get the line below:
write-and-verify: (groupid=0, jobs=1): err=84 (file:io_u.c:1979,
func=io_u_queued_complete, error=Invalid or incomplete multibyte or
wide character): pid=9067: Mon Dec 19 03:47:40 2016
    I wish the error message were more intuitive!
3. The run status shows the message below:

Run status group 0 (all jobs):
   READ: io=264KB, aggrb=XXXXKB/s, minb=XXXXKB/s, maxb=XXXXKB/s,
mint=tmsec, maxt=tmsec
  WRITE: io=4096.0MB, aggrb=YYYYYKB/s, minb=YYYYYKB/s, maxb=YYYYYKB/s,
mint=t2msec, maxt=t2msec

This appears to indicate that 4 GB were written, but the reads covered
only 264 KB before the error occurred.
Is there a way to get additional information, such as the expected
data, the data actually read back, and the sector (address) in error?
Can we set --continue_on_error=verify to collect all the errors?
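For reference, fio does document options along these lines; a sketch of how they might be added to the write-and-verify job quoted later in this thread (exact semantics should be confirmed against the fio HOWTO; /dev/XXXX is the thread's placeholder):

```ini
; Sketch only: the write-and-verify job from this thread, extended with
; the diagnostic options asked about above.
[write-and-verify]
rw=randwrite
bs=4k
direct=1
ioengine=libaio
iodepth=16
size=4g
verify=crc32c
verify_dump=1            ; on mismatch, dump expected and received blocks to files
continue_on_error=verify ; keep running past verification errors
filename=/dev/XXXX
```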

-------------------------------------

On data integrity at performance:
our goal is to ensure that the maximum-performance numbers are also
backed by passing data-integrity checks.
Let me think through the suggestions you have provided.
Many thanks; I really appreciate your valuable support and suggestions.

Regards,
- Saju



On Mon, Dec 19, 2016 at 7:32 PM, Sitsofe Wheeler <sitsofe@gmail.com> wrote:
> Hi,
>
> On 19 December 2016 at 12:29, Saju Nair <saju.mad.nair@gmail.com> wrote:
>>
>> We tried with the sample [write-and-verify] in the link you specified..
>>
>> [write-and-verify]
>> rw=randwrite
>> bs=4k
>> direct=1
>> ioengine=libaio
>> iodepth=16
>> size=4g                  <-- added this line
>> verify=crc32c
>> filename=/dev/XXXX
>>
>> Unfortunately, we get an error from FIO (both 2.12 and 2.15- latest).
>> fio-2.15
>> Starting 1 process
>> Jobs: 1 (f=1)
>> Jobs: 1 (f=1): [w(1)] [30.0% done] [nnnMB/mKB/0KB /s] [xxxK/yyyK/0
>> iops] [eta mm:ss]
>> Jobs: 1 (f=1): [w(1)] [45.5% done] [nnnMB/mKB/0KB /s] [xxxK/yyyK/0
>> iops] [eta mm:ss]
>> Jobs: 1 (f=1): [w(1)] [54.5% done] [nnnMB/mKB/0KB /s] [xxxK/yyyK/0
>> iops] [eta mm:ss]
>> fio: pid=9067, err=84/file:io_u.c:1979, func=io_u_queued_complete,
>> error=Invalid or incomplete multibyte or wide character
>>
>> From a search, this error has been hit by others before, but it
>> appears it was fixed by setting "numjobs=1".
>>
>> We are already using numjobs=1.
>> Are there any pointers on how to get around this issue?
>> We hope that with the above fixed, we will be able to run regular data
>> integrity checks.
>
> Assuming the fio jobfile you posted above was complete (i.e. no global
> section, no other jobs, etc.) it looks like what you've hit is the error
> message you get when a bad verification header is found during the
> verify phase (i.e. there's been a mismatch between the expected and
> read back data). fio normally goes on to print a message about
> "verify: bad header [...]". Did you get that too (if so what did it
> say) and do you get the same error on other disks that you know are
> good (i.e. are you sure the disk isn't suffering a problem)?
>
>> Now, onto the data-integrity checks at performance...
>> Our device under test (DUT) is an SSD disk.
>> Our standalone write and read performance is achieved at a num_jobs >
>> 1, and qdepth > 1.
>> This is validated in standalone "randwrite" and "randread" FIO runs.
>
> Ah I see. I will note that highest possible performance is a bit at
> odds with proving data integrity though because if I only care about
> performance I can write any old junk and just throw the data I read
> away (I've never known benchmark claims to be limited to verified data
> runs)...
>
>> We wanted to develop a strategy to be able to perform data-integrity
>> checks @ performance.
>> Wanted to check if it is feasible to do this check using FIO.
>> Approach#1:
>>  Extend the -do_verify approach, and do a write followed by verify in
>> a single FIO run.
>>  But, as you clarified - this will not be feasible with numjobs > 1.
>>
>> Approach#2:
>> FIO job#1 - do FIO writes, with settings for full performance
>> FIO job#2 - wait for job#1 and then, do FIO reads at performance.
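
As a sketch, Approach#2 could be expressed as a single fio invocation using fio's wait_for= option, which makes one job block until a named earlier job completes (job names are illustrative; how wait_for= interacts with numjobs > 1 should be checked against the HOWTO):

```ini
; Sketch of Approach#2 in one job file: the read job waits for the
; write job to finish. Device and sizes follow the thread's examples.
[global]
filename=/dev/XXXX
direct=1
ioengine=libaio
bs=4k
iodepth=16
size=4g

[full-speed-write]
rw=randwrite

[read-back]
wait_for=full-speed-write
rw=randread
```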
>
> A few ideas spring to mind:
> 1. Try the usual methods that speed up a "normal" single fio job - if
> a single process/thread submits as much I/O as multiple ones, it isn't
> going to look different from the disk's perspective (assuming that it
> is the sheer amount of simultaneous I/O that triggers the problem).
> Things like reducing calls that cost CPU, doing things in bigger
> batches to amortize the cost, etc. should also help verification speed
> (but I'll leave you to find those elsewhere). You can also look at the
> HOWTO information related to the verify_async= option to try and allow
> more parallelism.
> 2. Split the disk into different regions and write/verify each region
> separately from any other region. See offset_increment= in the HOWTO
> for something that might help achieve this if you use numjobs. More
> fiddly but a good exercise in learning how to create fio job files.
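>
> The two ideas above might be combined in a job file along these lines
> (verify_async= and offset_increment= are the options named above; the
> region size and thread count here are illustrative):

```ini
; Sketch of ideas 1 and 2: async verify threads, plus numjobs splitting
; the disk into non-overlapping regions via offset_increment.
[write-and-verify-regions]
rw=randwrite
bs=4k
direct=1
ioengine=libaio
iodepth=16
verify=crc32c
verify_async=4      ; idea 1: offload verification to async threads
numjobs=4           ; idea 2: four jobs, each on its own region
size=1g             ; each job covers 1 GiB...
offset_increment=1g ; ...starting at job-number * 1 GiB, so regions do not overlap
filename=/dev/XXXX
```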
>
>> Is there any inbuilt way to do an at-speed comparison in FIO?
>
> Personally I'd start with 1. from above and after I got that going I'd
> give 2. a go. If 1. can be made to get similar disk I/O numbers to
> using multiple jobs then you might even stop there.
>
>> If not, we wanted to see if we can use FIO to read from our DUT, to
>> the host's memory or any other storage disk, and then do a simple
>> application that compares the data.
>
> fio isn't a copying tool so it won't "move" data for you (and doing so
> would slow things down). However, if you somehow copied the contents
> into a file fio could verify against the file. The problem you'll then
> have to solve is finding a tool that copies the data faster than fio
> does its verifying reads...
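>
> One possible sketch, if the copy route were tried anyway: since the
> crc32c verification headers travel with the data, a byte-for-byte copy
> of the device (made with some external tool) could in principle be
> checked with a verify-only pass against the same job description,
> provided the random sequence is reproducible (randrepeat=1 is the
> default). This is an assumption to validate against the HOWTO, not a
> tested recipe:

```ini
; Sketch: verify-only pass over a byte-for-byte copy of the device,
; reusing the parameters of the original write-and-verify job.
[verify-copy]
rw=randwrite
bs=4k
size=4g
verify=crc32c
verify_only=1                     ; replay the workload as verification reads only
filename=/path/to/device-copy.bin ; hypothetical copy of /dev/XXXX
```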
>
> --
> Sitsofe | http://sucs.org/~sits/

