qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Cc: John Snow <jsnow@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"qemu-block@nongnu.org" <qemu-block@nongnu.org>,
	"fam@euphon.net" <fam@euphon.net>,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	"mreitz@redhat.com" <mreitz@redhat.com>,
	"stefanha@redhat.com" <stefanha@redhat.com>,
	Denis Lunev <den@virtuozzo.com>
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure
Date: Mon, 8 Apr 2019 12:14:49 +0200	[thread overview]
Message-ID: <20190408101449.GB11997@linux.fritz.box> (raw)
In-Reply-To: <20190408100442.GA11997@linux.fritz.box>

Am 08.04.2019 um 12:04 hat Kevin Wolf geschrieben:
> Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben:
> > 
> > 
> > On 06/04/2019 01:50, John Snow wrote:
> > > 
> > > 
> > > On 4/5/19 10:24 AM, Andrey Shinkevich wrote:
> > >> On a file system used by the customer, fallocate() returns an error
> > >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
> > >> fails. We can handle that case the same way as it is done for the
> > >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
> > >> zeroes to an image for the unaligned chunk of the block.
> > >>
> > >> Suggested-by: Denis V. Lunev <den@openvz.org>
> > >> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> > >> ---
> > >>   block/io.c | 2 +-
> > >>   1 file changed, 1 insertion(+), 1 deletion(-)
> > >>
> > >> diff --git a/block/io.c b/block/io.c
> > >> index dfc153b..0412a51 100644
> > >> --- a/block/io.c
> > >> +++ b/block/io.c
> > >> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
> > >>               assert(!bs->supported_zero_flags);
> > >>           }
> > >>   
> > >> -        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
> > >> +        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
> > >>               /* Fall back to bounce buffer if write zeroes is unsupported */
> > >>               BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
> > >>   
> > >>
> > > 
> > > I suppose that if fallocate fails for any reason and we're allowing
> > > fallback, we're either going to succeed ... or fail again very soon
> > > thereafter.
> > > 
> > > Are there any cases where it is vital to not ignore the first fallocate
> > > failure? I'm a little wary of ignoring the return code from
> > > bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real"
> > > failure here that the following bounce writes will also fail "safely."
> > > 
> > > I'm not completely confident, but I have no tangible objections:
> > > Reviewed-by: John Snow <jsnow@redhat.com>
> > > 
> > 
> > Thank you for your review, John!
> > 
> > Let me clarify the circumstances and quote the bug report:
> > "Customer had Win-2012 VM with 50GB system disk which was later resized 
> > to 256GB without resizing the partition inside VM.
> > Now, while trying to resize to 50G, the following error will appear
> > 'Failed to reduce the number of L2 tables: Invalid argument'
> > It was found that it is possible to shrink the disk to 128G and any size 
> > above that number, but size below 128G will bring the mentioned error."
> > 
> > The fallocate() returns no error on that file system if the offset and
> > the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both
> > are aligned to 4K.
> 
> What is the return value you get from this file system?
> 
> Maybe turning that into ENOTSUP in file-posix would be less invasive.
> Just falling back for any error gives me the vague feeling that it could
> cause problems sooner or later.

Also, which file system is this? This seems to be a file system bug.
fallocate() isn't documented to possibly have alignment restrictions for
FALLOC_FL_ZERO_RANGE (if this is the operation you're talking about).
FALLOC_FL_PUNCH_HOLE even explicitly mentions the behaviour for partial
blocks, so there is no doubt that operations for partial blocks are
considered valid. Operations that may impose restrictions are explicitly
documented as such.

So in any case, this would only be a workaround for a buggy file system.
The real bug needs to be fixed there.

Kevin

WARNING: multiple messages have this Message-ID (diff)
From: Kevin Wolf <kwolf@redhat.com>
To: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Cc: "fam@euphon.net" <fam@euphon.net>,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	Denis Lunev <den@virtuozzo.com>,
	"qemu-block@nongnu.org" <qemu-block@nongnu.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"mreitz@redhat.com" <mreitz@redhat.com>,
	"stefanha@redhat.com" <stefanha@redhat.com>,
	John Snow <jsnow@redhat.com>
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure
Date: Mon, 8 Apr 2019 12:14:49 +0200	[thread overview]
Message-ID: <20190408101449.GB11997@linux.fritz.box> (raw)
Message-ID: <20190408101449.gSqeBHJfUXVRVvAKQ3P-xOV3t8oR561Nqsf5xN8U238@z> (raw)
In-Reply-To: <20190408100442.GA11997@linux.fritz.box>

Am 08.04.2019 um 12:04 hat Kevin Wolf geschrieben:
> Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben:
> > 
> > 
> > On 06/04/2019 01:50, John Snow wrote:
> > > 
> > > 
> > > On 4/5/19 10:24 AM, Andrey Shinkevich wrote:
> > >> On a file system used by the customer, fallocate() returns an error
> > >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
> > >> fails. We can handle that case the same way as it is done for the
> > >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
> > >> zeroes to an image for the unaligned chunk of the block.
> > >>
> > >> Suggested-by: Denis V. Lunev <den@openvz.org>
> > >> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
> > >> ---
> > >>   block/io.c | 2 +-
> > >>   1 file changed, 1 insertion(+), 1 deletion(-)
> > >>
> > >> diff --git a/block/io.c b/block/io.c
> > >> index dfc153b..0412a51 100644
> > >> --- a/block/io.c
> > >> +++ b/block/io.c
> > >> @@ -1516,7 +1516,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
> > >>               assert(!bs->supported_zero_flags);
> > >>           }
> > >>   
> > >> -        if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
> > >> +        if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
> > >>               /* Fall back to bounce buffer if write zeroes is unsupported */
> > >>               BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
> > >>   
> > >>
> > > 
> > > I suppose that if fallocate fails for any reason and we're allowing
> > > fallback, we're either going to succeed ... or fail again very soon
> > > thereafter.
> > > 
> > > Are there any cases where it is vital to not ignore the first fallocate
> > > failure? I'm a little wary of ignoring the return code from
> > > bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real"
> > > failure here that the following bounce writes will also fail "safely."
> > > 
> > > I'm not completely confident, but I have no tangible objections:
> > > Reviewed-by: John Snow <jsnow@redhat.com>
> > > 
> > 
> > Thank you for your review, John!
> > 
> > Let me clarify the circumstances and quote the bug report:
> > "Customer had Win-2012 VM with 50GB system disk which was later resized 
> > to 256GB without resizing the partition inside VM.
> > Now, while trying to resize to 50G, the following error will appear
> > 'Failed to reduce the number of L2 tables: Invalid argument'
> > It was found that it is possible to shrink the disk to 128G and any size 
> > above that number, but size below 128G will bring the mentioned error."
> > 
> > The fallocate() returns no error on that file system if the offset and
> > the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both
> > are aligned to 4K.
> 
> What is the return value you get from this file system?
> 
> Maybe turning that into ENOTSUP in file-posix would be less invasive.
> Just falling back for any error gives me the vague feeling that it could
> cause problems sooner or later.

Also, which file system is this? This seems to be a file system bug.
fallocate() isn't documented to possibly have alignment restrictions for
FALLOC_FL_ZERO_RANGE (if this is the operation you're talking about).
FALLOC_FL_PUNCH_HOLE even explicitly mentions the behaviour for partial
blocks, so there is no doubt that operations for partial blocks are
considered valid. Operations that may impose restrictions are explicitly
documented as such.

So in any case, this would only be a workaround for a buggy file system.
The real bug needs to be fixed there.

Kevin


  parent reply	other threads:[~2019-04-08 10:15 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-05 14:24 [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure Andrey Shinkevich
2019-04-05 14:24 ` Andrey Shinkevich
2019-04-05 22:50 ` [Qemu-devel] [Qemu-block] " John Snow
2019-04-05 22:50   ` John Snow
2019-04-08  9:44   ` Andrey Shinkevich
2019-04-08  9:44     ` Andrey Shinkevich
2019-04-08 10:04     ` Kevin Wolf
2019-04-08 10:04       ` Kevin Wolf
2019-04-08 10:14       ` Kevin Wolf [this message]
2019-04-08 10:14         ` Kevin Wolf
2019-04-10 14:54         ` Stefan Hajnoczi
2019-04-10 14:54           ` Stefan Hajnoczi
2019-04-08 11:55       ` Andrey Shinkevich
2019-04-08 11:55         ` Andrey Shinkevich
2019-04-08  9:00 ` [Qemu-devel] " Stefan Hajnoczi
2019-04-08  9:00   ` Stefan Hajnoczi
2019-04-08  9:45   ` Andrey Shinkevich
2019-04-08  9:45     ` Andrey Shinkevich
2019-08-17 14:42 ` Eric Blake
2019-08-17 14:49   ` Eric Blake
2019-08-17 14:56     ` Eric Blake
2019-08-19 19:46       ` Denis V. Lunev
2019-08-19 20:30         ` Eric Blake
2019-08-19 20:53           ` Denis V. Lunev
2019-08-19 21:29             ` Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190408101449.GB11997@linux.fritz.box \
    --to=kwolf@redhat.com \
    --cc=andrey.shinkevich@virtuozzo.com \
    --cc=den@virtuozzo.com \
    --cc=fam@euphon.net \
    --cc=jsnow@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).