All of lore.kernel.org
 help / color / mirror / Atom feed
From: Max Reitz <mreitz@redhat.com>
To: Alberto Garcia <berto@igalia.com>, qemu-devel@nongnu.org
Cc: qemu-block@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>,
	Kevin Wolf <kwolf@redhat.com>
Subject: Re: [Qemu-devel] [RFC] Proposed qcow2 extension: subcluster allocation
Date: Tue, 11 Apr 2017 16:04:53 +0200	[thread overview]
Message-ID: <9d848582-8c76-4d88-2b31-e0e4c63b61d4@redhat.com> (raw)
In-Reply-To: <w5160ibyufr.fsf@maestria.local.igalia.com>

[-- Attachment #1: Type: text/plain, Size: 2694 bytes --]

On 11.04.2017 14:56, Alberto Garcia wrote:
> On Fri 07 Apr 2017 07:10:46 PM CEST, Max Reitz wrote:
>>> === Changes to the on-disk format ===
>>>
>>> The qcow2 on-disk format needs to change so each L2 entry has a bitmap
>>> indicating the allocation status of each subcluster. There are three
>>> possible states (unallocated, allocated, all zeroes), so we need two
>>> bits per subcluster.
>>
>> You don't need two bits, you need log(3) / log(2) = ld(3) ≈ 1.58. You
>> can encode the status of eight subclusters (3^8 = 6561) in 13 bits
>> (ld(6561) ≈ 12.68).
> 
> Right, although that would make the encoding more cumbersome to use and
> to debug. Is it worth it?

Probably not, considering this is probably not the way we want to go anyway.

>> One case I'd be especially interested in are of course 4 kB
>> subclusters for 64 kB clusters (because 4 kB is a usual page size and
>> can be configured to be the block size of a guest device; and because
>> 64 kB simply is the standard cluster size of qcow2 images
>> nowadays[1]...).
> 
> I think that we should have at least that, but ideally larger
> cluster-to-subcluster ratios.
> 
>> (We could even get one more bit if we had a subcluster-flag, because I
>> guess we can always assume subclustered clusters to have OFLAG_COPIED
>> and be uncompressed. But still, three bits missing.)
> 
> Why can we always assume OFLAG_COPIED?

Because partially allocated clusters cannot be used with internal
snapshots, and that is what OFLAG_COPIED is for.

>> If course, if you'd be willing to give up the all-zeroes state for
>> subclusters, it would be enough...
> 
> I still think that it looks like a better idea to allow having more
> subclusters, but giving up the all-zeroes state is a valid
> alternative. Apart from having to overwrite with zeroes when a
> subcluster is discarded, is there anything else that we would miss?

It if it's a real discard you can just discard it (which is what we do
for compat=0.10 images anyway); but zero-writes will then have to be
come real writes, yes.

>> By the way, if you'd only allow multiple of 1s overhead
>> (i.e. multiples of 32 subclusters), I think (3) would be pretty much
>> the same as (2) if you just always write the subcluster information
>> adjacent to the L2 table. Should be just the same caching-wise and
>> performance-wise.
> 
> Then (3) is effectively the same as (2), just that the subcluster
> bitmaps are at the end of the L2 cluster, and not next to each entry.

Exactly. But it's a difference in implementation, as you won't have to
worry about having changed the L2 table layout; maybe that's a benefit.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

  reply	other threads:[~2017-04-11 14:05 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-06 15:01 [Qemu-devel] [RFC] Proposed qcow2 extension: subcluster allocation Alberto Garcia
2017-04-06 16:40 ` Eric Blake
2017-04-07  8:49   ` Alberto Garcia
2017-04-07 12:41   ` Kevin Wolf
2017-04-07 14:24     ` Alberto Garcia
2017-04-21 21:09   ` [Qemu-devel] proposed qcow2 extension: cluster reservations [was: " Eric Blake
2017-04-22 17:56     ` Max Reitz
2017-04-24 11:45       ` Kevin Wolf
2017-04-24 12:46       ` Alberto Garcia
2017-04-07 12:20 ` [Qemu-devel] " Stefan Hajnoczi
2017-04-07 12:24   ` Alberto Garcia
2017-04-07 13:01   ` Kevin Wolf
2017-04-10 15:32     ` Stefan Hajnoczi
2017-04-07 17:10 ` Max Reitz
2017-04-10  8:42   ` Kevin Wolf
2017-04-10 15:03     ` Max Reitz
2017-04-11 12:56   ` Alberto Garcia
2017-04-11 14:04     ` Max Reitz [this message]
2017-04-11 14:31       ` Alberto Garcia
2017-04-11 14:45         ` [Qemu-devel] [Qemu-block] " Eric Blake
2017-04-12 12:41           ` Alberto Garcia
2017-04-12 14:10             ` Max Reitz
2017-04-13  8:05               ` Alberto Garcia
2017-04-13  9:02                 ` Kevin Wolf
2017-04-13  9:05                   ` Alberto Garcia
2017-04-11 14:49         ` [Qemu-devel] " Kevin Wolf
2017-04-11 14:58           ` Eric Blake
2017-04-11 14:59           ` Max Reitz
2017-04-11 15:08             ` Eric Blake
2017-04-11 15:18               ` Max Reitz
2017-04-11 15:29                 ` Kevin Wolf
2017-04-11 15:29                   ` Max Reitz
2017-04-11 15:30                 ` Eric Blake
2017-04-11 15:34                   ` Max Reitz
2017-04-12 12:47           ` Alberto Garcia
2017-04-12 16:54 ` Denis V. Lunev
2017-04-13 11:58   ` Alberto Garcia
2017-04-13 12:44     ` Denis V. Lunev
2017-04-13 13:05       ` Kevin Wolf
2017-04-13 13:09         ` Denis V. Lunev
2017-04-13 13:36           ` Alberto Garcia
2017-04-13 14:06             ` Denis V. Lunev
2017-04-13 13:21       ` Alberto Garcia
2017-04-13 13:30         ` Denis V. Lunev
2017-04-13 13:59           ` Kevin Wolf
2017-04-13 15:04           ` Alberto Garcia
2017-04-13 15:17             ` Denis V. Lunev
2017-04-18 11:52               ` Alberto Garcia
2017-04-18 17:27                 ` Denis V. Lunev
2017-04-13 13:51         ` Kevin Wolf
2017-04-13 14:15           ` Alberto Garcia
2017-04-13 14:27             ` Kevin Wolf
2017-04-13 16:42               ` [Qemu-devel] [Qemu-block] " Roman Kagan
2017-04-13 14:42           ` [Qemu-devel] " Denis V. Lunev
2017-04-12 17:55 ` Denis V. Lunev
2017-04-12 18:20   ` Eric Blake
2017-04-12 19:02     ` Denis V. Lunev
2017-04-13  9:44       ` Kevin Wolf
2017-04-13 10:19         ` Denis V. Lunev
2017-04-14  1:06           ` [Qemu-devel] [Qemu-block] " John Snow
2017-04-14  4:17             ` Denis V. Lunev
2017-04-18 11:22               ` Kevin Wolf
2017-04-18 17:30                 ` Denis V. Lunev
2017-04-14  7:40             ` Roman Kagan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9d848582-8c76-4d88-2b31-e0e4c63b61d4@redhat.com \
    --to=mreitz@redhat.com \
    --cc=berto@igalia.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.