All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ed Swierk <eswierk@skyportsystems.com>
To: Eric Blake <eblake@redhat.com>
Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org,
	Kevin Wolf <kwolf@redhat.com>, "Denis V. Lunev" <den@openvz.org>
Subject: Re: [Qemu-devel] Assertion failure on qcow2 disk with cluster_size != 64k
Date: Mon, 24 Oct 2016 16:06:26 -0700	[thread overview]
Message-ID: <CAO_EM_=N03RGb=HZwnFHwHfwR5z01TT6_BZ3asscePnKZMmWUw@mail.gmail.com> (raw)
In-Reply-To: <8a8853e4-d1ce-66b2-5214-b508fb897386@redhat.com>

On Mon, Oct 24, 2016 at 2:21 PM, Eric Blake <eblake@redhat.com> wrote:
> How are you getting max_transfer == 65536?  I can't reproduce it with
> the following setup:
>
> $ qemu-img create -f qcow2 -o cluster_size=1M file 10M
> $ qemu-io -f qcow2 -c 'w 7m 1k' file
> $ qemu-io -f qcow2 -c 'w -z 8003584 2093056' file
>
> although I did confirm that the above sequence was enough to get the
> -ENOTSUP failure and fall into the code calculating max_transfer.
>
> I'm guessing that you are using something other than a file system as
> the backing protocol for your qcow2 image.  But do you really have a
> protocol that takes AT MOST 64k per transaction, while still trying to a
> cluster size of 1M in the qcow2 format?  That's rather awkward, as it
> means that you are required to do 16 transactions per cluster (the whole
> point of using larger clusters is usually to get fewer transactions).  I
> think we need to get to a root cause of why you are seeing such a small
> max_transfer, before I can propose the right patch, since I haven't been
> able to reproduce it locally yet (although I admit I haven't tried to
> see if blkdebug could reliably introduce artificial limits to simulate
> your setup).  And it may turn out that I just have to fix the
> bdrv_co_do_pwrite_zeroes() code to loop multiple times if the size of
> the unaligned head really does exceed the max_transfer size that the
> underlying protocol is able to support, rather than assuming that the
> unaligned head/tail always fit in a single fallback write.

In this case I'm using a qcow2 image that's stored directly in a raw
dm-crypt/LUKS container, which is itself a loop device on an ext4
filesystem.

It appears loop devices (with or without dm-crypt/LUKS) report a
255-sector maximum per request via the BLKSECTGET ioctl, which qemu
rounds down to 64k in raw_refresh_limits(). However this maximum
appears to be just a hint: bdrv_driver_pwritev() succeeds even with a
385024-byte buffer of zeroes.

As for the 1M cluster size, this is a temporary workaround for another
qemu issue (the default qcow2 L2 table cache size performs well with
random reads covering only up to 8 GB of image data with 64k clusters;
beyond that the L2 table cache thrashes). I agree this is not an
optimal configuration for writes.

> Can you also try this patch? If I'm right, you'll still fail, but the
> assertion will be slightly different.  (Again, I'm passing locally, but
> that's because I'm using the file protocol, and my file system does not
> impose a puny 64k max transfer).
>
> diff --git i/block/io.c w/block/io.c
> index b136c89..8757063 100644
> --- i/block/io.c
> +++ w/block/io.c
> @@ -1179,6 +1179,8 @@ static int coroutine_fn
> bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
>      int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX);
>      int alignment = MAX(bs->bl.pwrite_zeroes_alignment,
>                          bs->bl.request_alignment);
> +    int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
> +                                    MAX_WRITE_ZEROES_BOUNCE_BUFFER);
>
>      assert(alignment % bs->bl.request_alignment == 0);
>      head = offset % alignment;
> @@ -1197,6 +1199,8 @@ static int coroutine_fn
> bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
>              /* Make a small request up to the first aligned sector.  */
>              num = MIN(count, alignment - head);
>              head = 0;
> +            assert(num < max_write_zeroes);
> +            assert(num < max_transfer);
>          } else if (tail && num > alignment) {
>              /* Shorten the request to the last aligned sector.  */
>              num -= tail;
> @@ -1222,8 +1226,6 @@ static int coroutine_fn
> bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
>
>          if (ret == -ENOTSUP) {
>              /* Fall back to bounce buffer if write zeroes is unsupported */
> -            int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
> -
> MAX_WRITE_ZEROES_BOUNCE_BUFFER);
>              BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
>
>              if ((flags & BDRV_REQ_FUA) &&

With this change, the num < max_transfer assertion fails on the first
iteration (with num=385024 and max_transfer=65536).

--Ed

  reply	other threads:[~2016-10-24 23:08 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-21  0:24 [Qemu-devel] Assertion failure on qcow2 disk with cluster_size != 64k Ed Swierk
2016-10-21  1:38 ` Eric Blake
2016-10-21 13:14   ` Ed Swierk
2016-10-24 20:32     ` Eric Blake
2016-11-14  9:50       ` Kevin Wolf
2016-11-14 15:46         ` Eric Blake
2016-11-14 15:54           ` Kevin Wolf
2016-11-14 16:59             ` Eric Blake
2016-10-24 21:21 ` Eric Blake
2016-10-24 23:06   ` Ed Swierk [this message]
2016-10-25  8:39     ` Kevin Wolf
2016-10-25 13:51       ` Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAO_EM_=N03RGb=HZwnFHwHfwR5z01TT6_BZ3asscePnKZMmWUw@mail.gmail.com' \
    --to=eswierk@skyportsystems.com \
    --cc=den@openvz.org \
    --cc=eblake@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.