* Fwd: FIO distinguish data corruption and io errors
From: Udi-Yehuda Tamar @ 2017-09-11 17:24 UTC
  To: fio

---------- Forwarded message ----------
From: Udi-Yehuda Tamar <udi@excelero.com>
Date: Mon, Sep 11, 2017 at 8:23 PM
Subject: FIO distinguish data corruption and io errors
To: fio@vger.kernel.org


Hello Guys,

I guess this question answers itself, but I'll give it a try:
can fio tell for sure that an I/O error is data corruption? Say the
drive is busy and the read failed, but not necessarily because of
data corruption. Can fio tell that?
I guess not, because every I/O failure could be data corruption, but
I'd prefer a pro answer. Thanks in advance.


-- 
Thanks,
Udi-Yehuda Tamar.


* Re: FIO distinguish data corruption and io errors
From: Sitsofe Wheeler @ 2017-09-11 19:43 UTC
  To: Udi-Yehuda Tamar; +Cc: fio

Hi,

On 11 September 2017 at 18:24, Udi-Yehuda Tamar <udi@excelero.com> wrote:
> I guess this question answers itself, but I'll give it a try:
> can fio tell for sure that an I/O error is data corruption? Say the
> drive is busy and the read failed, but not necessarily because of
> data corruption. Can fio tell that?
> I guess not, because every I/O failure could be data corruption, but
> I'd prefer a pro answer. Thanks in advance.

This depends on how the ioengine and the layers beneath it handle the
error. For example, the psync ioengine talks to a filesystem or block
device, and in Linux there is typically some error handling below the
block layer (see
http://events.linuxfoundation.org/sites/events/files/slides/SCSI-EH.pdf
for how this happens for SCSI). If the kernel (or the disk itself!)
handles the error by retrying and the retry succeeds, all userspace
(and thus fio) sees is success, although the latency of the I/Os that
got caught up might look abnormally high. If recovery fails, the I/O
returns with an error like EIO. fio can be made to abort if an I/O
completes with high latency (see the max_latency option -
http://fio.readthedocs.io/en/latest/fio_doc.html#cmdoption-arg-max-latency).
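
As a rough illustration of the latency point, the Python sketch below
scans fio's JSON output (--output-format=json) for jobs whose worst
completion latency exceeds a threshold. The key layout
jobs[].read.clat_ns.max is an assumption based on fio 3.x output (older
versions report "clat" in microseconds instead), and slow_jobs and the
threshold value are hypothetical names chosen for this example.

```python
import json


def slow_jobs(report, threshold_ns, directions=("read", "write")):
    """Return (jobname, direction, max_clat_ns) for I/O slower than threshold_ns.

    Assumes the fio 3.x JSON layout: jobs[].<direction>.clat_ns.max.
    """
    hits = []
    for job in report.get("jobs", []):
        for direction in directions:
            clat = job.get(direction, {}).get("clat_ns", {})
            worst = clat.get("max", 0)
            if worst > threshold_ns:
                hits.append((job.get("jobname", "?"), direction, worst))
    return hits


# Hand-written sample in the assumed layout, not real fio output.
sample = json.loads("""
{"jobs": [{"jobname": "job1",
           "read":  {"clat_ns": {"max": 2500000000}},
           "write": {"clat_ns": {"max": 0}}}]}
""")

# Flag anything that took longer than one second to complete.
print(slow_jobs(sample, threshold_ns=1_000_000_000))
```

A slow but successful I/O is only a hint that error recovery happened
somewhere underneath; it does not prove it.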

Different I/O engines may be able to return different types of error
depending on what they are talking to and what level they operate at.
For example the sg ioengine sets a timeout of 30 seconds but it looks
like this still makes error handling kick in (see
http://sg.danny.cz/sg/p/sg_v3_ho.html#id2495241 ).

I don't know if this answer is "pro" enough for you though...

-- 
Sitsofe | http://sucs.org/~sits/


* Re: FIO distinguish data corruption and io errors
From: Udi-Yehuda Tamar @ 2017-09-12  5:42 UTC
  To: Sitsofe Wheeler; +Cc: fio

Thanks a lot, Sitsofe.

I'm using only libaio and I'm parsing the JSON output file.
I was looking to see if there is a unique message for data corruption,
but it looks like every other I/O error.

--Udi


-- 
Thanks,
Udi-Yehuda Tamar.


* Re: FIO distinguish data corruption and io errors
From: Sitsofe Wheeler @ 2017-09-12 22:03 UTC
  To: Udi-Yehuda Tamar; +Cc: fio

Hi,

Perhaps I answered the wrong question.

If *fio is doing I/O verification for itself* (i.e.
http://fio.readthedocs.io/en/latest/fio_doc.html#cmdoption-arg-verify
is set etc.) then when fio verification fails it will report EILSEQ
(which with glibc will be reported as "Invalid or incomplete multibyte
or wide character" - see
https://www.spinics.net/lists/fio/msg04977.html for details) whereas
other errors will turn up as a different error message (e.g. EIO).

For example:
$ dd if=/dev/zero of=/tmp/fiofile bs=16k count=1
$ ./fio --name=verifymismatch --ioengine=sync --rw=read \
    --verify=crc32c --filename=/tmp/fiofile --verify_fatal=1
verifymismatch: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
fio-3.0-48-g83a9
Starting 1 process
verify: bad magic header 0, wanted acca at file /tmp/fiofile offset 0, length 4096
verifymismatch: No I/O performed by sync, perhaps try --debug=io option for details?
fio: pid=3702, err=84/file:io_u.c:1991, func=io_u_sync_complete, error=Invalid or incomplete multibyte or wide character

verifymismatch: (groupid=0, jobs=1): err=84 (file:io_u.c:1991, func=io_u_sync_complete, error=Invalid or incomplete multibyte or wide character): pid=3702: Tue Sep 12 22:55:51 2017
   read: IOPS=62, BW=250KiB/s (256kB/s)(4096B/16msec)
[...]
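
When parsing fio's JSON output, the same distinction can be made from
the error number rather than the message text. A minimal Python sketch,
assuming each entry in "jobs" carries an integer "error" field holding
the errno (0 on success); classify_job_error is a hypothetical helper,
not part of fio. On Linux EILSEQ is 84, which matches err=84 in the
output above.

```python
import errno
import json
import os


def classify_job_error(err):
    """Map a fio job errno to a coarse category."""
    if err == 0:
        return "ok"
    if err == errno.EILSEQ:
        # fio reports a failed verify as EILSEQ.
        return "verify failure (data mismatch)"
    return "I/O error: " + os.strerror(err)


# Hand-written sample in the assumed layout, not real fio output.
sample = json.loads(
    '{"jobs": [{"jobname": "verifymismatch", "error": %d},'
    ' {"jobname": "other", "error": 0}]}' % errno.EILSEQ
)

for job in sample["jobs"]:
    print(job["jobname"], "->", classify_job_error(job["error"]))
```

This only separates the two cases when fio's own verification is
enabled; without the verify option a mismatch is never checked for, so
only errors returned by the layers below fio would show up.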

Is that what you were after?

On 12 September 2017 at 06:42, Udi-Yehuda Tamar <udi@excelero.com> wrote:
>
> I'm using only libaio and I'm parsing the JSON output file.
> I was looking to see if there is a unique message for data corruption,
> but it looks like every other I/O error.
>
> On Mon, Sep 11, 2017 at 10:43 PM, Sitsofe Wheeler <sitsofe@gmail.com> wrote:
>>
>> Different I/O engines may be able to return different types of error
>> depending on what they are talking to and what level they operate at.
>> For example the sg ioengine sets a timeout of 30 seconds but it looks
>> like this still makes error handling kick in (see
>> http://sg.danny.cz/sg/p/sg_v3_ho.html#id2495241 ).

-- 
Sitsofe | http://sucs.org/~sits/
