From: Kevin Wolf <kwolf@redhat.com>
To: Stefan Ring <stefanrin@gmail.com>
Cc: integration@gluster.org, qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: Strange data corruption issue with gluster (libgfapi) and ZFS
Date: Fri, 28 Feb 2020 12:10:46 +0100
Message-ID: <20200228111046.GC5274@linux.fritz.box>
In-Reply-To: <CAAxjCExb8GKP0Y8hwEbv=DETfu1dG3++umYV0n8vX6kxuJW3pQ@mail.gmail.com>

On 27.02.2020 at 23:25, Stefan Ring wrote:
> On Thu, Feb 27, 2020 at 10:12 PM Stefan Ring <stefanrin@gmail.com> wrote:
> > Victory! I have a reproducer in the form of a plain C libgfapi client.
> >
> > However, I have not been able to trigger corruption by just executing
> > the simple pattern in an artificial way. Currently, I need to feed my
> > reproducer 2 GB of data that I streamed out of the qemu block driver.
> > I get two possible end states out of my reproducer: The correct one or
> > a corrupted one, where 48 KB are zeroed out. It takes no more than 10
> > runs to get each of them at least once. The corrupted end state is
> > exactly the same that I got from the real qemu process from where I
> > obtained the streamed trace. This gives me a lot of confidence in the
> > soundness of my reproducer.
> >
> > More details will follow.
> 
> Ok, so the exact sequence of activity around the corruption is this:
> 
> 8700 and so on are the sequential request numbers. All of these
> requests are writes. Blocks are 512 bytes.
> 
> 8700
>   grows the file to a certain size (2134144 blocks)
> 
> <8700 retires, nothing in flight>
> 
> 8701
>   writes 55 blocks inside currently allocated file range, close to the
> end (7 blocks short)
> 
> 8702
>   writes 54 blocks from the end of 8701, growing the file by 47 blocks
> 
> <8702 retires, 8701 remains in flight>
> 
> 8703
>   writes from the end of 8702, growing the file by 81 blocks
> 
> <8703 retires, 8701 remains in flight>
> 
> 8704
>   writes 1623 blocks also from the end of 8702, growing the file by 1542 blocks
> 
> <8701 retires>
> <8704 retires>
> 
> The exact range covered by 8703 ends up zeroed out.
> 
> If 8701 retires earlier (before 8702 is issued), everything is fine.
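
For reference, the block ranges implied by the description above can be worked out as follows. This is only a sketch: it assumes that "7 blocks short" means 8701 ends 7 blocks before the end of file set by 8700, and that 8703 is 81 blocks long because 8702 had already retired when it was issued. The variable names are made up for illustration.

```python
# Block ranges (start, end), end exclusive, in 512-byte blocks.
# Assumption: 8701 ends 7 blocks before the EOF established by 8700.
eof_8700 = 2134144

w8701 = (eof_8700 - 7 - 55, eof_8700 - 7)   # 55 blocks, inside the file
w8702 = (w8701[1], w8701[1] + 54)           # 54 blocks from the end of 8701
w8703 = (w8702[1], w8702[1] + 81)           # grows the file by 81 blocks
w8704 = (w8702[1], w8702[1] + 1623)         # same start as 8703, 1623 blocks

# Cross-check against the growth figures quoted in the trace.
growth_8702 = w8702[1] - eof_8700
growth_8704 = w8704[1] - w8703[1]

for name, (start, end) in [("8701", w8701), ("8702", w8702),
                           ("8703", w8703), ("8704", w8704)]:
    print(name, start, end, end - start)
print("8702 grows file by", growth_8702)
print("8704 grows file by", growth_8704)
```

Under these assumptions the quoted growth figures (47 and 1542 blocks) are consistent with each other, and 8703 and 8704 start at the same offset.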

This sounds almost like two other bugs we got fixed recently (in the
QEMU file-posix driver and in the XFS kernel driver) where two writes
extending the file size were in flight in parallel, but if the shorter
one completed last, instead of extending the file, it would end up
truncating it.
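
To illustrate that failure mode, here is a toy model. It is not the actual file-posix or XFS code; the function names and the block numbers are hypothetical and only show why the completion order matters when a completion handler sets the file size instead of only ever growing it.

```python
# Toy model only: NOT the QEMU file-posix or XFS implementation.

def complete_buggy(file_size, write_end):
    # Hypothetical faulty completion: sets the size to this write's end,
    # even if another write has already grown the file further.
    return write_end

def complete_fixed(file_size, write_end):
    # Extend only; never shrink a file that another write already grew.
    return max(file_size, write_end)

def replay(initial_size, completion_order, complete):
    # Apply the completions in the given order and return the final size.
    size = initial_size
    for write_end in completion_order:
        size = complete(size, write_end)
    return size

# Two extending writes in flight at once (hypothetical end offsets in blocks).
short_end, long_end = 120, 200

# The shorter write completes last.
buggy_size = replay(100, [long_end, short_end], complete_buggy)
fixed_size = replay(100, [long_end, short_end], complete_fixed)
print(buggy_size, fixed_size)  # 120 200
```

With the faulty handler the file ends up at 120 blocks, so everything the longer write put beyond that is cut off; if the shorter write completes first, both variants end at 200.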

I'm not sure, though, why 8701 would try to change the file size because
it's entirely inside the already allocated file range. But maybe adding
the current file size at the start and completion of each request to
your debug output could give us more data points?
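
A rough sketch of the kind of trace I mean, using a local temporary file as a stand-in. This is not libgfapi, and `traced_pwrite` is a made-up helper; the writes here are synchronous, whereas in the real reproducer the size would have to be sampled once when the request is submitted and again in its completion callback.

```python
import os
import tempfile

BLOCK = 512  # block size used in the trace above

def traced_pwrite(fd, req_id, offset_blocks, n_blocks, log):
    # Record the file size (in blocks) before and after the write.
    size_before = os.fstat(fd).st_size
    os.pwrite(fd, b"\xff" * (n_blocks * BLOCK), offset_blocks * BLOCK)
    size_after = os.fstat(fd).st_size
    log.append((req_id, size_before // BLOCK, size_after // BLOCK))

log = []
with tempfile.TemporaryFile() as f:
    fd = f.fileno()
    traced_pwrite(fd, 1, 0, 100, log)    # grows the file to 100 blocks
    traced_pwrite(fd, 2, 40, 10, log)    # entirely inside the file
    traced_pwrite(fd, 3, 100, 20, log)   # extends the file by 20 blocks

for req_id, before, after in log:
    print(f"req {req_id}: size {before} -> {after} blocks")
```

A request that lies entirely inside the allocated range should show the same size at start and completion; if the size ever shrinks across a completion, that would point at the culprit.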

Kevin


