From: "Denis V. Lunev" <den@openvz.org>
To: Alberto Garcia <berto@igalia.com>, qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	qemu-block@nongnu.org, Max Reitz <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [RFC] Proposed qcow2 extension: subcluster allocation
Date: Thu, 13 Apr 2017 18:17:21 +0300
Message-ID: <e1ae017b-a832-0e52-02d0-eaeb882a2b4d@openvz.org>
In-Reply-To: <w51h91suz53.fsf@maestria.local.igalia.com>

On 04/13/2017 06:04 PM, Alberto Garcia wrote:
> On Thu 13 Apr 2017 03:30:43 PM CEST, Denis V. Lunev wrote:
>> Yes, block size should be increased. I am perfectly in agreement with
>> you.  But I think that we could do that by a plain increase of the
>> cluster size without any further dances. Sub-clusters as sub-clusters
>> will help if we are able to avoid COW. With COW I do not see much
>> difference.
> I'm trying to summarize your position, tell me if I got everything
> correctly:
>
> 1. We should try to reduce data fragmentation on the qcow2 file,
>    because it will have a long term effect on the I/O performance (as
>    opposed to an effect on the initial operations on the empty image).
yes

> 2. The way to do that is to increase the cluster size (to 1MB or
>    more).
yes

> 3. Benefit: increasing the cluster size also decreases the amount of
>    metadata (L2 and refcount).
yes
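
To put rough numbers on point 3, here is a small sketch of the L2 arithmetic. It relies on two properties of the qcow2 format: an L2 entry is 8 bytes, and an L2 table occupies exactly one cluster. The function names are made up for illustration, and refcount metadata is left out.

```python
L2_ENTRY_SIZE = 8  # bytes per L2 entry in qcow2


def l2_metadata_bytes(virtual_size, cluster_size):
    """Total size of all L2 entries needed to map a fully allocated image."""
    clusters = -(-virtual_size // cluster_size)  # ceiling division
    return clusters * L2_ENTRY_SIZE


def bytes_mapped_per_l2_table(cluster_size):
    """Guest data covered by one L2 table (one table fills one cluster)."""
    return (cluster_size // L2_ENTRY_SIZE) * cluster_size


if __name__ == "__main__":
    TiB = 2**40
    for cluster_size in (64 * 1024, 1024 * 1024):
        print(cluster_size,
              l2_metadata_bytes(TiB, cluster_size),
              bytes_mapped_per_l2_table(cluster_size))
```

For a 1 TiB image this gives 128 MiB of L2 entries with 64 KiB clusters and 8 MiB with 1 MiB clusters. The same arithmetic shows the flip side raised in point 4: with 1 MiB clusters a single L2 table is itself 1 MiB.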

> 4. Problem: L2 tables become too big and fill up the cache more
>    easily. To solve this the cache code should do partial reads
>    instead of complete L2 clusters.
yes. We can still read the full cluster, as is done now, if the L2 cache is empty.
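
A rough sketch of what a partial L2 read in point 4 could look like: instead of loading a whole L2 cluster, find the entry for a guest offset and load only the aligned slice of the table that contains it. The 4096-byte slice size and the function name are assumptions for illustration only; this is not how the QEMU cache code is structured.

```python
L2_ENTRY_SIZE = 8  # bytes per L2 entry in qcow2


def l2_slice_to_read(guest_offset, cluster_size, slice_size=4096):
    """Return (l2_index, slice_start, slice_size) for a guest offset.

    slice_start is the byte offset, inside the L2 table, of the aligned
    slice that holds the entry; slice_size is a hypothetical cache unit.
    """
    entries_per_table = cluster_size // L2_ENTRY_SIZE
    l2_index = (guest_offset // cluster_size) % entries_per_table
    byte_in_table = l2_index * L2_ENTRY_SIZE
    slice_start = byte_in_table - byte_in_table % slice_size
    return l2_index, slice_start, slice_size


if __name__ == "__main__":
    MiB = 2**20
    # Guest offset 5 GiB + 3 MiB in an image with 1 MiB clusters.
    print(l2_slice_to_read(5 * 1024 * MiB + 3 * MiB, MiB))
```

With 1 MiB clusters this reads 4 KiB of the table rather than the full 1 MiB, which is the point of the proposal: the cache fills up with the entries actually in use.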

> 5. Problem: larger cluster sizes also mean more data to copy when
>    there's a COW. To solve this the COW code should be modified so it
>    goes from 5 OPs (read head, write head, read tail, write tail,
>    write data) to 2 OPs (read cluster, write modified cluster).
yes, with a small tweak if the head and tail are in different clusters.
In this case we will end up with 3 OPs.
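
The OP counting in point 5 can be modelled in memory as below. This is only a sketch of the idea under simple assumptions: `backing` is a bytes object standing in for the data being copied, `merged_cow` is a hypothetical name, and a "read" or "write" is just one slice operation. It is not QEMU's COW code.

```python
def merged_cow(backing, offset, data, cluster_size):
    """Build the buffer for a single merged write and count the I/O ops.

    Returns (buffer, ops) where buffer covers every cluster touched by
    the write and ops is the number of reads plus the one final write.
    """
    start = offset - offset % cluster_size
    data_end = offset + len(data)
    end = -(-data_end // cluster_size) * cluster_size  # round up
    head_len = offset - start
    tail_len = end - data_end
    ops = 0

    if end - start == cluster_size and (head_len or tail_len):
        # Head and tail share one cluster: a single read fetches both.
        buf = bytearray(backing[start:end])
        ops += 1
    else:
        # Head and tail are in different clusters: read each separately.
        buf = bytearray(end - start)
        if head_len:
            buf[:head_len] = backing[start:offset]
            ops += 1
        if tail_len:
            buf[data_end - start:] = backing[data_end:end]
            ops += 1

    buf[head_len:head_len + len(data)] = data
    ops += 1  # one write of the whole modified region
    return bytes(buf), ops


if __name__ == "__main__":
    backing = b"a" * 64
    print(merged_cow(backing, 4, b"XXXX", 16))   # write inside one cluster
    print(merged_cow(backing, 12, b"X" * 8, 16))  # write crossing a boundary
```

The first call needs 2 OPs (read cluster, write modified cluster); the second needs 3 because the head and tail live in different clusters.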

> 6. Having subclusters adds incompatible changes to the file format,
>    and they offer no benefit after allocation.
yes

> 7. Subclusters are only really useful if they match the guest fs block
>    size (because you would avoid doing COW on allocation). Otherwise
>    the only thing that you get is a faster COW (because you move less
>    data), but the improvement is not dramatic and it's better if we do
>    what's proposed in point 5.
yes

> 8. Even if the subcluster size matches the guest block size, you'll
>    get very fast initial allocation but also more chances to end up
>    with a very fragmented qcow2 image, which is worse in the long run.
yes

> 9. Problem: larger clusters make a less efficient use of disk space,
>    but that's a drawback you're fine with considering all of the
>    above.
yes

> Is that a fair summary of what you're trying to say? Anything else
> missing?
yes.

5a. Problem: initial cluster allocation without COW. This could be made
      cluster-size agnostic with the help of an fallocate() call. Big
      clusters are even better, as the number of such allocations is
      reduced.
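
As an illustration of 5a, the sketch below reserves one cluster in an image file with a single call, so the cost does not depend on writing cluster_size bytes of zeros by hand. It uses Python's `os.posix_fallocate()`, which wraps posix_fallocate(3) rather than the Linux fallocate(2) call itself; that substitution and the helper name `allocate_cluster` are assumptions made to keep the example short and portable across Unix systems.

```python
import os
import tempfile


def allocate_cluster(fd, cluster_index, cluster_size):
    """Reserve disk space for one cluster and return its file offset."""
    offset = cluster_index * cluster_size
    # Ensures the range [offset, offset + cluster_size) is backed by
    # storage and extends the file if needed.
    os.posix_fallocate(fd, offset, cluster_size)
    return offset


if __name__ == "__main__":
    with tempfile.TemporaryFile() as image:
        off = allocate_cluster(image.fileno(), 3, 1024 * 1024)
        print(off, os.fstat(image.fileno()).st_size)
```

A newly allocated range reads back as zeros, so no explicit zero-fill write is needed for the untouched parts of the cluster.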

Thank you very much for this cool summary! I am too tongue-tied.

Den
