From: Dave Chinner <david@fromorbit.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: bugzilla-daemon@bugzilla.kernel.org, linux-xfs@vger.kernel.org
Subject: Re: [Bug 208827] [fio io_uring] io_uring write data crc32c verify failed
Date: Wed, 12 Aug 2020 07:59:13 +1000
Message-ID: <20200811215913.GP2114@dread.disaster.area>
In-Reply-To: <547cde58-26f3-05f1-048c-fa2a94d6e176@kernel.dk>

On Tue, Aug 11, 2020 at 07:10:30AM -0600, Jens Axboe wrote:
> On 8/11/20 1:05 AM, Dave Chinner wrote:
> > On Mon, Aug 10, 2020 at 08:19:57PM -0600, Jens Axboe wrote:
> >> On 8/10/20 8:00 PM, Dave Chinner wrote:
> >>> On Mon, Aug 10, 2020 at 07:08:59PM +1000, Dave Chinner wrote:
> >>>> On Mon, Aug 10, 2020 at 05:08:07PM +1000, Dave Chinner wrote:
> >>>>> [cc Jens]
> >>>>>
> >>>>> [Jens, data corruption w/ io_uring and simple fio reproducer. see
> >>>>> the bz link below.]
> >>>
> >>> Looks like an io_uring/fio bug at this point, Jens. All your go-fast
> >>> bits turn the buffered read into a short read, and neither fio nor
> >>> the io_uring async buffered read path handles short reads. Details below.
> >>
> >> It's a fio issue. The io_uring engine uses a different path for short
> >> IO completions, and that's being ignored by the backend... Hence the
> >> IO just gets completed and not retried for this case, and that'll then
> >> trigger verification as if it did complete. I'm fixing it up.
> > 
> > I just updated fio to:
> > 
> > cb7d7abb (HEAD -> master, origin/master, origin/HEAD) io_u: set io_u->verify_offset in fill_io_u()
> > 
> > The workload still reports corruption almost instantly. Only this
> > time, the trace is not reporting a short read.
> > 
> > File is patterned with:
> > 
> > verify_pattern=0x33333333%o-16
> > 
> > Offset of "bad" data is 0x1240000.
> > 
> > Expected:
> > 
> > 00000000:  33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000010:  33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000020:  33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000030:  33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000040:  33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000050:  33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000060:  33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000070:  33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000080:  33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > .....
> > 0000ffd0:  33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff  3333............
> > 0000ffe0:  33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff  3333............
> > 0000fff0:  33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff  3333............
> > 
> > 
> > Received:
> > 
> > 00000000:  33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000010:  33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000020:  33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000030:  33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000040:  33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000050:  33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000060:  33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000070:  33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000080:  33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > .....
> > 0000ffd0:  33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff  3333............
> > 0000ffe0:  33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff  3333............
> > 0000fff0:  33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff  3333............
> > 
> > 
> > Looks like the data in the expected buffer is wrong - the data
> > pattern in the received buffer is correct according to the defined
> > pattern.
> > 
> > Error is 100% reproducible from the same test case. Same bad byte in
> > the expected buffer dump every single time.
> 
> What job file are you running? It's not impossible that I broke
> something else in fio, the io_u->verify_offset is a bit risky... I'll
> get it fleshed out shortly.

Details are in the bugzilla I pointed you at. I modified the
original config specified there to put per-file and offset
identifiers into the file data rather than using random data. This is
"determining the origin of stale data 101" stuff - the first thing
we _always_ do when trying to diagnose data corruption is identify
where the bad data came from.
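
As an illustration of how the offset identifier pinpoints the bad
data, here is a minimal Python sketch that decodes one 16-byte record
from each of the quoted dumps. The record layout (4-byte constant,
8-byte little-endian offset for %o, 4-byte little-endian -16) is an
assumption inferred from the dumps themselves, not taken from the fio
source, and decode_record is a hypothetical helper name.

```python
import struct

# Assumed layout, inferred from the hex dumps quoted in this message:
# 4-byte constant, 8-byte little-endian offset (%o), 4-byte little-endian -16.
RECORD = struct.Struct("<IQi")


def decode_record(hex_bytes: str):
    """Decode one 16-byte pattern record given as space-separated hex."""
    return RECORD.unpack(bytes.fromhex(hex_bytes))


# First record of each dump, copied from the quoted text.
expected = decode_record("33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff")
received = decode_record("33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff")

io_offset = 0x1240000  # offset of the "bad" data reported above

for name, (tag, offset, tail) in (("expected", expected), ("received", received)):
    verdict = "matches" if offset == io_offset else "does NOT match"
    print(f"{name}: tag={tag:#x} offset={offset:#x} tail={tail} "
          f"({verdict} IO offset {io_offset:#x})")
```

Under that assumed layout, the received record carries the offset of
the IO being verified, while the expected record carries an offset
0x1000 higher, which is consistent with the expected buffer being the
wrong one.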

Entire config file is below.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


[global]
directory=/mnt/scratch
size=256M
iodepth=16
bs=64k
verify=crc32c
verify_fatal=1
verify_dump=1
verify=pattern
thread=1
loops=200
#direct=1
#unlink=1
buffered=1

[uring_w]
rw=randwrite
ioengine=io_uring
hipri=0
fixedbufs=0
registerfiles=0
sqthread_poll=0
verify_pattern=0x11111111%o-16

[uring_sqt_w]
rw=randwrite
ioengine=io_uring
hipri=0
fixedbufs=0
registerfiles=1
sqthread_poll=1
verify_pattern=0x22222222%o-16

[uring_rw]
rw=randrw
ioengine=io_uring
hipri=0
fixedbufs=0
registerfiles=0
sqthread_poll=0
verify_pattern=0x33333333%o-16

[uring_sqt_rw]
rw=randrw
ioengine=io_uring
hipri=0
fixedbufs=0
registerfiles=1
sqthread_poll=1
verify_pattern=0x44444444%o-16


