From: Kevin Wolf <kwolf@redhat.com>
To: Alberto Garcia <berto@igalia.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org,
	Max Reitz <mreitz@redhat.com>
Subject: Re: [PATCH 0/1] qcow2: Skip copy-on-write when allocating a zero cluster
Date: Mon, 17 Aug 2020 17:53:07 +0200
Message-ID: <20200817155307.GS11402@linux.fritz.box>
In-Reply-To: <w518sedz3td.fsf@maestria.local.igalia.com>

On 17.08.2020 at 17:31, Alberto Garcia wrote:
> On Mon 17 Aug 2020 12:10:19 PM CEST, Kevin Wolf wrote:
> >> Since commit c8bb23cbdbe / QEMU 4.1.0 (and if the storage backend
> >> allows it) writing to an image created with preallocation=metadata
> >> can be slower (20% in my tests) than writing to an image with no
> >> preallocation at all.
> >
> > A while ago we had a case where commit c8bb23cbdbe was actually
> > reported as a major performance regression, so it's a big "it
> > depends".
> >
> > XFS people told me that they consider this code a bad idea. Just
> > because it's a specialised "write zeroes" operation, it's not
> > necessarily fast on filesystems. In particular, on XFS, ZERO_RANGE
> > causes a queue drain with O_DIRECT (probably hurts cases with high
> > queue depths) and additionally even a page cache flush without
> > O_DIRECT.
> >
> > So in a way this whole thing is a two-edged sword.
> 
> I see... on ext4 the improvements are clearly visible. Are we not
> detecting this for xfs? We do have an s->is_xfs flag.

My understanding is that XFS and ext4 behave very similarly in this
respect. It's not a clear loss on XFS either; some cases are improved.
But cases that get a performance regression exist, too. It depends on
the workload, the file system state (e.g. fragmentation of the image
file) and the storage.

So I don't think checking for a specific filesystem is going to improve
things.

> >> a) shall we include a warning in the documentation ("note that this
> >> preallocation mode can result in worse performance")?
> >
> > To be honest, I don't really understand this case yet. With metadata
> > preallocation, the clusters are already marked as allocated, so why
> > would handle_alloc_space() even be called? We're not allocating new
> > clusters after all?
> 
> It's not called, what happens is what you say below:
> 
> > Or are you saying that ZERO_RANGE + pwrite on a sparse file (= cluster
> > allocation) is faster for you than just the pwrite alone (= writing to
> > already allocated cluster)?
> 
> Yes, 20% faster in my tests (4KB random writes), but in the latter case
> the cluster is already allocated only at the qcow2 level, not on the
> filesystem. preallocation=falloc is faster than preallocation=metadata
> (preallocation=off sits in the middle).

Hm, this feels wrong. Doing more operations should never be faster than
doing fewer operations.
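
To make the two cases being compared concrete, here is a minimal
sketch of them at the syscall level. This is not the qcow2 code; the
function names are made up for illustration, and the fallback to
writing explicit zeroes when ZERO_RANGE is unsupported is an assumption
of the sketch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Case "pwrite alone": write exactly len bytes at off, retrying on
 * short writes. Returns 0 on success, -1 on error. */
int full_pwrite(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);
        if (n <= 0) {
            if (n < 0 && errno == EINTR) {
                continue;
            }
            return -1;
        }
        p += n;
        off += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Case "ZERO_RANGE + pwrite": zero the whole cluster in one call, then
 * write only the guest data. Hypothetical helper, not a QEMU function. */
int zero_range_then_write(int fd, off_t cluster_off, size_t cluster_size,
                          const void *data, size_t len, size_t off_in_cluster)
{
    assert(off_in_cluster + len <= cluster_size);

    if (fallocate(fd, FALLOC_FL_ZERO_RANGE, cluster_off,
                  (off_t)cluster_size) < 0) {
        /* Assumed fallback: the filesystem cannot do ZERO_RANGE, so
         * write a buffer of zeroes for the whole cluster instead. */
        void *zeroes = calloc(1, cluster_size);
        int ret;

        if (!zeroes) {
            return -1;
        }
        ret = full_pwrite(fd, zeroes, cluster_size, cluster_off);
        free(zeroes);
        if (ret < 0) {
            return -1;
        }
    }
    return full_pwrite(fd, data, len, cluster_off + (off_t)off_in_cluster);
}
```

With 64k clusters and 4k requests, the first case issues one pwrite per
request, the second one ZERO_RANGE per cluster plus the same pwrite.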

Maybe the difference is in allocating 64k at once instead of doing a
separate allocation for every 4k block? But with the extent size hint
patches to file-posix, we should allocate 1 MB at once by default now
(if your test image was newly created). Can you check whether this is in
effect for your image file?
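
One way to check it, as a rough sketch: query the file attributes with
the FS_IOC_FSGETXATTR ioctl and look at the extent size hint. The
helper names below are hypothetical, and I'm assuming the hint is
exposed through this ioctl the way it is on XFS; other filesystems may
simply fail the call.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fs.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Store the extent size hint of fd in *extsize (0 if the EXTSIZE flag
 * is not set). The value is assumed to be in bytes here; check the
 * ioctl documentation for your kernel before relying on the unit.
 * Returns 0 on success, -1 if the ioctl is not supported or fails. */
int get_extsize_hint(int fd, uint32_t *extsize)
{
    struct fsxattr attr;

    assert(extsize != NULL);

    if (ioctl(fd, FS_IOC_FSGETXATTR, &attr) < 0) {
        return -1;
    }
    *extsize = (attr.fsx_xflags & FS_XFLAG_EXTSIZE) ? attr.fsx_extsize : 0;
    return 0;
}

/* Convenience wrapper (hypothetical): open path read-only and query
 * the hint. Returns 0 on success, -1 on any error. */
int get_extsize_hint_path(const char *path, uint32_t *extsize)
{
    int fd = open(path, O_RDONLY);
    int ret;

    if (fd < 0) {
        return -1;
    }
    ret = get_extsize_hint(fd, extsize);
    close(fd);
    return ret;
}
```

If this returns 0 with a hint of 1 MB for the image file, the default
from the file-posix patches is in effect; a hint of 0 means no hint is
set on that file.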

Kevin


