[Qemu-devel] [PATCH v2 0/2] Give the refcount cache the minimum possible size by default

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH v2 0/2] Give the refcount cache the minimum possible size by default
@ 2018-03-14  8:29 Alberto Garcia
  2018-03-14  8:29 ` [Qemu-devel] [PATCH v2 1/2] qcow2: " Alberto Garcia
  2018-03-14  8:29 ` [Qemu-devel] [PATCH v2 2/2] docs: Document the new default sizes of the qcow2 caches Alberto Garcia
  0 siblings, 2 replies; 9+ messages in thread
From: Alberto Garcia @ 2018-03-14  8:29 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alberto Garcia, qemu-block, Kevin Wolf, Max Reitz, Eric Blake

Hi,

we talked about this the other day, so here are the patches to change
the default cache sizes in qcow2.

Without this patch:

 * refcount-cache-size = l2-cache-size / 4

unless otherwise specified by the user. This is wasteful, the refcount
cache is accessed sequentially during normal I/O, so there's no point
in caching more tables. I measured the effect on the refcount cache
size when populating an empty qcow2 image using random writes, and
there's no difference between having the minimum or the maximum
sizes(*).

With this patch:

 * refcount-cache-size is always 4 clusters by default (the minimum)

 * If "cache-size" is set then l2-cache-size is set to the maximum if
   possible (disk_size * 8 / cluster_size) and the remainder is
   assigned to the refcount cache.

Regards,

Berto

(*) there is, actually: having a very large cache can even make the
    I/O slightly slower, because the larger the cache the longer it
    takes longer to find a cached entry. I only noticed this under
    tmpfs anyway.

Changes:
v2:
- s/overriden/overridden/ (in both patches)

v1: https://lists.gnu.org/archive/html/qemu-block/2018-03/msg00709.html
- Initial release

Alberto Garcia (2):
  qcow2: Give the refcount cache the minimum possible size by default
  docs: Document the new default sizes of the qcow2 caches

 block/qcow2.c              | 31 +++++++++++++++++++------------
 block/qcow2.h              |  4 ----
 docs/qcow2-cache.txt       | 31 ++++++++++++++-----------------
 tests/qemu-iotests/137.out |  2 +-
 4 files changed, 34 insertions(+), 34 deletions(-)

-- 
2.11.0

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Qemu-devel] [PATCH v2 1/2] qcow2: Give the refcount cache the minimum possible size by default
  2018-03-14  8:29 [Qemu-devel] [PATCH v2 0/2] Give the refcount cache the minimum possible size by default Alberto Garcia
@ 2018-03-14  8:29 ` Alberto Garcia
  2018-04-13 15:00   ` Max Reitz
  2018-04-16 16:14   ` Max Reitz
  2018-03-14  8:29 ` [Qemu-devel] [PATCH v2 2/2] docs: Document the new default sizes of the qcow2 caches Alberto Garcia
  1 sibling, 2 replies; 9+ messages in thread
From: Alberto Garcia @ 2018-03-14  8:29 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alberto Garcia, qemu-block, Kevin Wolf, Max Reitz, Eric Blake

The L2 and refcount caches have default sizes that can be overridden
using the l2-cache-size and refcount-cache-size (an additional
parameter named cache-size sets the combined size of both caches).

Unless forced by one of the aforementioned parameters, QEMU will set
the unspecified sizes so that the L2 cache is 4 times larger than the
refcount cache.

This is based on the premise that the refcount metadata needs to be
only a fourth of the L2 metadata to cover the same amount of disk
space. This is incorrect for two reasons:

 a) The amount of disk covered by an L2 table depends solely on the
    cluster size, but in the case of a refcount block it depends on
    the cluster size *and* the width of each refcount entry.
    The 4/1 ratio is only valid with 16-bit entries (the default).

 b) When we talk about disk space and L2 tables we are talking about
    guest space (L2 tables map guest clusters to host clusters),
    whereas refcount blocks are used for host clusters (including
    L1/L2 tables and the refcount blocks themselves). On a fully
    populated (and uncompressed) qcow2 file, image size > virtual size
    so there are more refcount entries than L2 entries.

Problem (a) could be fixed by adjusting the algorithm to take into
account the refcount entry width. Problem (b) could be fixed by
increasing a bit the refcount cache size to account for the clusters
used for qcow2 metadata.

However this patch takes a completely different approach and instead
of keeping a ratio between both cache sizes it assigns as much as
possible to the L2 cache and the remainder to the refcount cache.

The reason is that L2 tables are used for every single I/O request
from the guest and the effect of increasing the cache is significant
and clearly measurable. Refcount blocks are however only used for
cluster allocation and internal snapshots and in practice are accessed
sequentially in most cases, so the effect of increasing the cache is
negligible (even when doing random writes from the guest).

So, make the refcount cache as small as possible unless the user
explicitly asks for a larger one.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2.c              | 31 +++++++++++++++++++------------
 block/qcow2.h              |  4 ----
 tests/qemu-iotests/137.out |  2 +-
 3 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 7472af6931..8342b0186f 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -802,23 +802,30 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
         } else if (refcount_cache_size_set) {
             *l2_cache_size = combined_cache_size - *refcount_cache_size;
         } else {
-            *refcount_cache_size = combined_cache_size
-                                 / (DEFAULT_L2_REFCOUNT_SIZE_RATIO + 1);
-            *l2_cache_size = combined_cache_size - *refcount_cache_size;
+            uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
+            uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
+            uint64_t min_refcount_cache =
+                (uint64_t) MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
+
+            /* Assign as much memory as possible to the L2 cache, and
+             * use the remainder for the refcount cache */
+            if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
+                *l2_cache_size = max_l2_cache;
+                *refcount_cache_size = combined_cache_size - *l2_cache_size;
+            } else {
+                *refcount_cache_size =
+                    MIN(combined_cache_size, min_refcount_cache);
+                *l2_cache_size = combined_cache_size - *refcount_cache_size;
+            }
         }
     } else {
-        if (!l2_cache_size_set && !refcount_cache_size_set) {
+        if (!l2_cache_size_set) {
             *l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE,
                                  (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
                                  * s->cluster_size);
-            *refcount_cache_size = *l2_cache_size
-                                 / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
-        } else if (!l2_cache_size_set) {
-            *l2_cache_size = *refcount_cache_size
-                           * DEFAULT_L2_REFCOUNT_SIZE_RATIO;
-        } else if (!refcount_cache_size_set) {
-            *refcount_cache_size = *l2_cache_size
-                                 / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
+        }
+        if (!refcount_cache_size_set) {
+            *refcount_cache_size = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
         }
     }
 
diff --git a/block/qcow2.h b/block/qcow2.h
index ccb92a9696..cdf41055ae 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -77,10 +77,6 @@
 #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */
 #define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
 
-/* The refblock cache needs only a fourth of the L2 cache size to cover as many
- * clusters */
-#define DEFAULT_L2_REFCOUNT_SIZE_RATIO 4
-
 #define DEFAULT_CLUSTER_SIZE 65536
 
 
diff --git a/tests/qemu-iotests/137.out b/tests/qemu-iotests/137.out
index e28e1eadba..96724a6c33 100644
--- a/tests/qemu-iotests/137.out
+++ b/tests/qemu-iotests/137.out
@@ -22,7 +22,7 @@ refcount-cache-size may not exceed cache-size
 L2 cache size too big
 L2 cache entry size must be a power of two between 512 and the cluster size (65536)
 L2 cache entry size must be a power of two between 512 and the cluster size (65536)
-L2 cache size too big
+Refcount cache size too big
 Conflicting values for qcow2 options 'overlap-check' ('constant') and 'overlap-check.template' ('all')
 Unsupported value 'blubb' for qcow2 option 'overlap-check'. Allowed are any of the following: none, constant, cached, all
 Unsupported value 'blubb' for qcow2 option 'overlap-check'. Allowed are any of the following: none, constant, cached, all
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Qemu-devel] [PATCH v2 2/2] docs: Document the new default sizes of the qcow2 caches
  2018-03-14  8:29 [Qemu-devel] [PATCH v2 0/2] Give the refcount cache the minimum possible size by default Alberto Garcia
  2018-03-14  8:29 ` [Qemu-devel] [PATCH v2 1/2] qcow2: " Alberto Garcia
@ 2018-03-14  8:29 ` Alberto Garcia
  2018-04-16 16:15   ` Max Reitz
  1 sibling, 1 reply; 9+ messages in thread
From: Alberto Garcia @ 2018-03-14  8:29 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alberto Garcia, qemu-block, Kevin Wolf, Max Reitz, Eric Blake

We have just reduced the refcount cache size to the minimum unless
the user explicitly requests a larger one, so we have to update the
documentation to reflect this change.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 docs/qcow2-cache.txt | 31 ++++++++++++++-----------------
 1 file changed, 14 insertions(+), 17 deletions(-)

diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 170191a242..c640d45d06 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -116,31 +116,28 @@ There are three options available, and all of them take bytes:
 "refcount-cache-size":   maximum size of the refcount block cache
 "cache-size":            maximum size of both caches combined
 
-There are two things that need to be taken into account:
+There are a few things that need to be taken into account:
 
  - Both caches must have a size that is a multiple of the cluster size
    (or the cache entry size: see "Using smaller cache sizes" below).
 
- - If you only set one of the options above, QEMU will automatically
-   adjust the others so that the L2 cache is 4 times bigger than the
-   refcount cache.
+ - The default L2 cache size is 8 clusters or 1MB (whichever is more),
+   and the minimum is 2 clusters (or 2 cache entries, see below).
 
-This means that these options are equivalent:
+ - The default (and minimum) refcount cache size is 4 clusters.
 
-   -drive file=hd.qcow2,l2-cache-size=2097152
-   -drive file=hd.qcow2,refcount-cache-size=524288
-   -drive file=hd.qcow2,cache-size=2621440
+ - If only "cache-size" is specified then QEMU will assign as much
+   memory as possible to the L2 cache before increasing the refcount
+   cache size.
 
-The reason for this 1/4 ratio is to ensure that both caches cover the
-same amount of disk space. Note however that this is only valid with
-the default value of refcount_bits (16). If you are using a different
-value you might want to calculate both cache sizes yourself since QEMU
-will always use the same 1/4 ratio.
+Unlike L2 tables, refcount blocks are not used during normal I/O but
+only during allocations and internal snapshots. In most cases they are
+accessed sequentially (even during random guest I/O) so increasing the
+refcount cache size won't have any measurable effect in performance.
 
-It's also worth mentioning that there's no strict need for both caches
-to cover the same amount of disk space. The refcount cache is used
-much less often than the L2 cache, so it's perfectly reasonable to
-keep it small.
+Before QEMU 2.12 the refcount cache had a default size of 1/4 of the
+L2 cache size. This resulted in unnecessarily large caches, so now the
+refcount cache is as small as possible unless overridden by the user.
 
 
 Using smaller cache entries
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/2] qcow2: Give the refcount cache the minimum possible size by default
  2018-03-14  8:29 ` [Qemu-devel] [PATCH v2 1/2] qcow2: " Alberto Garcia
@ 2018-04-13 15:00   ` Max Reitz
  2018-04-16 13:56     ` Alberto Garcia
  2018-04-16 16:14   ` Max Reitz
  1 sibling, 1 reply; 9+ messages in thread
From: Max Reitz @ 2018-04-13 15:00 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel; +Cc: qemu-block, Kevin Wolf, Eric Blake

On 2018-03-14 09:29, Alberto Garcia wrote:
> The L2 and refcount caches have default sizes that can be overridden
> using the l2-cache-size and refcount-cache-size (an additional
> parameter named cache-size sets the combined size of both caches).
> 
> Unless forced by one of the aforementioned parameters, QEMU will set
> the unspecified sizes so that the L2 cache is 4 times larger than the
> refcount cache.
> 
> This is based on the premise that the refcount metadata needs to be
> only a fourth of the L2 metadata to cover the same amount of disk
> space. This is incorrect for two reasons:
> 
>  a) The amount of disk covered by an L2 table depends solely on the
>     cluster size, but in the case of a refcount block it depends on
>     the cluster size *and* the width of each refcount entry.
>     The 4/1 ratio is only valid with 16-bit entries (the default).
> 
>  b) When we talk about disk space and L2 tables we are talking about
>     guest space (L2 tables map guest clusters to host clusters),
>     whereas refcount blocks are used for host clusters (including
>     L1/L2 tables and the refcount blocks themselves). On a fully
>     populated (and uncompressed) qcow2 file, image size > virtual size
>     so there are more refcount entries than L2 entries.
> 
> Problem (a) could be fixed by adjusting the algorithm to take into
> account the refcount entry width. Problem (b) could be fixed by
> increasing a bit the refcount cache size to account for the clusters
> used for qcow2 metadata.

Yeah, well, but I wouldn't give too much thought to (b).  In case of
internal snapshots, you won't write to them, so those clusters don't
matter.  So, yes, you get a bit of overhead for the metadata, but come
on, that's not really much.

And (a) would be trivial, as you yourself say.

> However this patch takes a completely different approach and instead
> of keeping a ratio between both cache sizes it assigns as much as
> possible to the L2 cache and the remainder to the refcount cache.

Seems reasonable in a sense, but I'm wary still.  I especially don't
like that it just changes the default when nothing has been specified.

I can see that we want to limit the refcount cache size, so if you want
to give really much RAM to the L2 tables, then you don't want a 25 %
overhead for the refcount cache, but do we really need to limit it to
the minimum size?

OTOH, four clusters doesn't seem extremely limited...  Did you do your
benchmark on an HDD?

Max

> The reason is that L2 tables are used for every single I/O request
> from the guest and the effect of increasing the cache is significant
> and clearly measurable. Refcount blocks are however only used for
> cluster allocation and internal snapshots and in practice are accessed
> sequentially in most cases, so the effect of increasing the cache is
> negligible (even when doing random writes from the guest).
> 
> So, make the refcount cache as small as possible unless the user
> explicitly asks for a larger one.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2.c              | 31 +++++++++++++++++++------------
>  block/qcow2.h              |  4 ----
>  tests/qemu-iotests/137.out |  2 +-
>  3 files changed, 20 insertions(+), 17 deletions(-)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 7472af6931..8342b0186f 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -802,23 +802,30 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
>          } else if (refcount_cache_size_set) {
>              *l2_cache_size = combined_cache_size - *refcount_cache_size;
>          } else {
> -            *refcount_cache_size = combined_cache_size
> -                                 / (DEFAULT_L2_REFCOUNT_SIZE_RATIO + 1);
> -            *l2_cache_size = combined_cache_size - *refcount_cache_size;
> +            uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
> +            uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
> +            uint64_t min_refcount_cache =
> +                (uint64_t) MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
> +
> +            /* Assign as much memory as possible to the L2 cache, and
> +             * use the remainder for the refcount cache */
> +            if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
> +                *l2_cache_size = max_l2_cache;
> +                *refcount_cache_size = combined_cache_size - *l2_cache_size;
> +            } else {
> +                *refcount_cache_size =
> +                    MIN(combined_cache_size, min_refcount_cache);
> +                *l2_cache_size = combined_cache_size - *refcount_cache_size;
> +            }
>          }
>      } else {
> -        if (!l2_cache_size_set && !refcount_cache_size_set) {
> +        if (!l2_cache_size_set) {
>              *l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE,
>                                   (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
>                                   * s->cluster_size);
> -            *refcount_cache_size = *l2_cache_size
> -                                 / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
> -        } else if (!l2_cache_size_set) {
> -            *l2_cache_size = *refcount_cache_size
> -                           * DEFAULT_L2_REFCOUNT_SIZE_RATIO;
> -        } else if (!refcount_cache_size_set) {
> -            *refcount_cache_size = *l2_cache_size
> -                                 / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
> +        }
> +        if (!refcount_cache_size_set) {
> +            *refcount_cache_size = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
>          }
>      }
>  
> diff --git a/block/qcow2.h b/block/qcow2.h
> index ccb92a9696..cdf41055ae 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -77,10 +77,6 @@
>  #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */
>  #define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
>  
> -/* The refblock cache needs only a fourth of the L2 cache size to cover as many
> - * clusters */
> -#define DEFAULT_L2_REFCOUNT_SIZE_RATIO 4
> -
>  #define DEFAULT_CLUSTER_SIZE 65536
>  
>  
> diff --git a/tests/qemu-iotests/137.out b/tests/qemu-iotests/137.out
> index e28e1eadba..96724a6c33 100644
> --- a/tests/qemu-iotests/137.out
> +++ b/tests/qemu-iotests/137.out
> @@ -22,7 +22,7 @@ refcount-cache-size may not exceed cache-size
>  L2 cache size too big
>  L2 cache entry size must be a power of two between 512 and the cluster size (65536)
>  L2 cache entry size must be a power of two between 512 and the cluster size (65536)
> -L2 cache size too big
> +Refcount cache size too big
>  Conflicting values for qcow2 options 'overlap-check' ('constant') and 'overlap-check.template' ('all')
>  Unsupported value 'blubb' for qcow2 option 'overlap-check'. Allowed are any of the following: none, constant, cached, all
>  Unsupported value 'blubb' for qcow2 option 'overlap-check'. Allowed are any of the following: none, constant, cached, all
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/2] qcow2: Give the refcount cache the minimum possible size by default
  2018-04-13 15:00   ` Max Reitz
@ 2018-04-16 13:56     ` Alberto Garcia
  2018-04-16 14:05       ` Max Reitz
  0 siblings, 1 reply; 9+ messages in thread
From: Alberto Garcia @ 2018-04-16 13:56 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: qemu-block, Kevin Wolf, Eric Blake

On Fri 13 Apr 2018 05:00:48 PM CEST, Max Reitz wrote:
> On 2018-03-14 09:29, Alberto Garcia wrote:
>> The L2 and refcount caches have default sizes that can be overridden
>> using the l2-cache-size and refcount-cache-size (an additional
>> parameter named cache-size sets the combined size of both caches).
>> 
>> Unless forced by one of the aforementioned parameters, QEMU will set
>> the unspecified sizes so that the L2 cache is 4 times larger than the
>> refcount cache.
>> 
>> This is based on the premise that the refcount metadata needs to be
>> only a fourth of the L2 metadata to cover the same amount of disk
>> space. This is incorrect for two reasons:
>> 
>>  a) The amount of disk covered by an L2 table depends solely on the
>>     cluster size, but in the case of a refcount block it depends on
>>     the cluster size *and* the width of each refcount entry.
>>     The 4/1 ratio is only valid with 16-bit entries (the default).
>> 
>>  b) When we talk about disk space and L2 tables we are talking about
>>     guest space (L2 tables map guest clusters to host clusters),
>>     whereas refcount blocks are used for host clusters (including
>>     L1/L2 tables and the refcount blocks themselves). On a fully
>>     populated (and uncompressed) qcow2 file, image size > virtual size
>>     so there are more refcount entries than L2 entries.
>> 
>> Problem (a) could be fixed by adjusting the algorithm to take into
>> account the refcount entry width. Problem (b) could be fixed by
>> increasing a bit the refcount cache size to account for the clusters
>> used for qcow2 metadata.
>
> Yeah, well, but I wouldn't give too much thought to (b).  In case of
> internal snapshots, you won't write to them, so those clusters don't
> matter.  So, yes, you get a bit of overhead for the metadata, but come
> on, that's not really much.

With that text I just wanted to point out that the calculation mixes
host clusters and guest clusters. I agree that the ratio we're using is
a good approximation (apart from the refcount width issue).

>> However this patch takes a completely different approach and instead
>> of keeping a ratio between both cache sizes it assigns as much as
>> possible to the L2 cache and the remainder to the refcount cache.
>
> Seems reasonable in a sense, but I'm wary still.  I especially don't
> like that it just changes the default when nothing has been specified.
>
> I can see that we want to limit the refcount cache size, so if you
> want to give really much RAM to the L2 tables, then you don't want a
> 25 % overhead for the refcount cache, but do we really need to limit
> it to the minimum size?
>
> OTOH, four clusters doesn't seem extremely limited...  Did you do your
> benchmark on an HDD?

I tried, I see no difference, and I don't see why there would be any.

Refcount entries are used to reference host clusters, and host clusters
are always(*) allocated sequentially, so caching refcount blocks doesn't
do much. You're always loading the same refcount block until it's full,
then you move on to the next one. As I mentioned in a previous e-mail,
having a very large refcount cache would even be detrimental because it
would make cache hits slower (noticeable under tmpfs).

(*) when a cluster gets deallocated (e.g. when the refcount table needs
to grow) then there's a hole in the file and QEMU starts looking for
free clusters again from the beginning of the file. In that case old
refcount blocks are loaded again, but the cache wouldn't help here
either (unless we would have all refcount blocks cached).

Berto

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/2] qcow2: Give the refcount cache the minimum possible size by default
  2018-04-16 13:56     ` Alberto Garcia
@ 2018-04-16 14:05       ` Max Reitz
  2018-04-16 14:31         ` Alberto Garcia
  0 siblings, 1 reply; 9+ messages in thread
From: Max Reitz @ 2018-04-16 14:05 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel; +Cc: qemu-block, Kevin Wolf, Eric Blake

[-- Attachment #1: Type: text/plain, Size: 4282 bytes --]

On 2018-04-16 15:56, Alberto Garcia wrote:
> On Fri 13 Apr 2018 05:00:48 PM CEST, Max Reitz wrote:
>> On 2018-03-14 09:29, Alberto Garcia wrote:
>>> The L2 and refcount caches have default sizes that can be overridden
>>> using the l2-cache-size and refcount-cache-size (an additional
>>> parameter named cache-size sets the combined size of both caches).
>>>
>>> Unless forced by one of the aforementioned parameters, QEMU will set
>>> the unspecified sizes so that the L2 cache is 4 times larger than the
>>> refcount cache.
>>>
>>> This is based on the premise that the refcount metadata needs to be
>>> only a fourth of the L2 metadata to cover the same amount of disk
>>> space. This is incorrect for two reasons:
>>>
>>>  a) The amount of disk covered by an L2 table depends solely on the
>>>     cluster size, but in the case of a refcount block it depends on
>>>     the cluster size *and* the width of each refcount entry.
>>>     The 4/1 ratio is only valid with 16-bit entries (the default).
>>>
>>>  b) When we talk about disk space and L2 tables we are talking about
>>>     guest space (L2 tables map guest clusters to host clusters),
>>>     whereas refcount blocks are used for host clusters (including
>>>     L1/L2 tables and the refcount blocks themselves). On a fully
>>>     populated (and uncompressed) qcow2 file, image size > virtual size
>>>     so there are more refcount entries than L2 entries.
>>>
>>> Problem (a) could be fixed by adjusting the algorithm to take into
>>> account the refcount entry width. Problem (b) could be fixed by
>>> increasing a bit the refcount cache size to account for the clusters
>>> used for qcow2 metadata.
>>
>> Yeah, well, but I wouldn't give too much thought to (b).  In case of
>> internal snapshots, you won't write to them, so those clusters don't
>> matter.  So, yes, you get a bit of overhead for the metadata, but come
>> on, that's not really much.
> 
> With that text I just wanted to point out that the calculation mixes
> host clusters and guest clusters. I agree that the ratio we're using is
> a good approximation (apart from the refcount width issue).
> 
>>> However this patch takes a completely different approach and instead
>>> of keeping a ratio between both cache sizes it assigns as much as
>>> possible to the L2 cache and the remainder to the refcount cache.
>>
>> Seems reasonable in a sense, but I'm wary still.  I especially don't
>> like that it just changes the default when nothing has been specified.
>>
>> I can see that we want to limit the refcount cache size, so if you
>> want to give really much RAM to the L2 tables, then you don't want a
>> 25 % overhead for the refcount cache, but do we really need to limit
>> it to the minimum size?
>>
>> OTOH, four clusters doesn't seem extremely limited...  Did you do your
>> benchmark on an HDD?
> 
> I tried, I see no difference, and I don't see why there would be any.
> 
> Refcount entries are used to reference host clusters, and host clusters
> are always(*) allocated sequentially, so caching refcount blocks doesn't
> do much. You're always loading the same refcount block until it's full,
> then you move on to the next one. As I mentioned in a previous e-mail,
> having a very large refcount cache would even be detrimental because it
> would make cache hits slower (noticeable under tmpfs).

Ah, right.  I'm not sure whether I want to ask you whether you have
tested internal snapshots.  I suppose it can be detrimental for them
because when taking or deleting a snapshot you need to update the
refcounts of a whole bunch of clusters that might be spread randomly
across the image.  But I suspect that it would be rather time-consuming
to produce an image with such a configuration; and I suppose if people
want to make heavy use of internal snapshots they can adapt the cache
size themselves?

Max

> (*) when a cluster gets deallocated (e.g. when the refcount table needs
> to grow) then there's a hole in the file and QEMU starts looking for
> free clusters again from the beginning of the file. In that case old
> refcount blocks are loaded again, but the cache wouldn't help here
> either (unless we would have all refcount blocks cached).


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/2] qcow2: Give the refcount cache the minimum possible size by default
  2018-04-16 14:05       ` Max Reitz
@ 2018-04-16 14:31         ` Alberto Garcia
  0 siblings, 0 replies; 9+ messages in thread
From: Alberto Garcia @ 2018-04-16 14:31 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: qemu-block, Kevin Wolf, Eric Blake

On Mon 16 Apr 2018 04:05:21 PM CEST, Max Reitz wrote:
>> Refcount entries are used to reference host clusters, and host
>> clusters are always(*) allocated sequentially, so caching refcount
>> blocks doesn't do much. You're always loading the same refcount block
>> until it's full, then you move on to the next one. As I mentioned in
>> a previous e-mail, having a very large refcount cache would even be
>> detrimental because it would make cache hits slower (noticeable under
>> tmpfs).
>
> Ah, right.  I'm not sure whether I want to ask you whether you have
> tested internal snapshots.  I suppose it can be detrimental for them
> because when taking or deleting a snapshot you need to update the
> refcounts of a whole bunch of clusters that might be spread randomly
> across the image.  But I suspect that it would be rather
> time-consuming to produce an image with such a configuration; and I
> suppose if people want to make heavy use of internal snapshots they
> can adapt the cache size themselves?

Exactly.

Berto

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/2] qcow2: Give the refcount cache the minimum possible size by default
  2018-03-14  8:29 ` [Qemu-devel] [PATCH v2 1/2] qcow2: " Alberto Garcia
  2018-04-13 15:00   ` Max Reitz
@ 2018-04-16 16:14   ` Max Reitz
  1 sibling, 0 replies; 9+ messages in thread
From: Max Reitz @ 2018-04-16 16:14 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel; +Cc: qemu-block, Kevin Wolf, Eric Blake

[-- Attachment #1: Type: text/plain, Size: 2651 bytes --]

On 2018-03-14 09:29, Alberto Garcia wrote:
> The L2 and refcount caches have default sizes that can be overridden
> using the l2-cache-size and refcount-cache-size (an additional
> parameter named cache-size sets the combined size of both caches).
> 
> Unless forced by one of the aforementioned parameters, QEMU will set
> the unspecified sizes so that the L2 cache is 4 times larger than the
> refcount cache.
> 
> This is based on the premise that the refcount metadata needs to be
> only a fourth of the L2 metadata to cover the same amount of disk
> space. This is incorrect for two reasons:
> 
>  a) The amount of disk covered by an L2 table depends solely on the
>     cluster size, but in the case of a refcount block it depends on
>     the cluster size *and* the width of each refcount entry.
>     The 4/1 ratio is only valid with 16-bit entries (the default).
> 
>  b) When we talk about disk space and L2 tables we are talking about
>     guest space (L2 tables map guest clusters to host clusters),
>     whereas refcount blocks are used for host clusters (including
>     L1/L2 tables and the refcount blocks themselves). On a fully
>     populated (and uncompressed) qcow2 file, image size > virtual size
>     so there are more refcount entries than L2 entries.
> 
> Problem (a) could be fixed by adjusting the algorithm to take into
> account the refcount entry width. Problem (b) could be fixed by
> increasing a bit the refcount cache size to account for the clusters
> used for qcow2 metadata.
> 
> However this patch takes a completely different approach and instead
> of keeping a ratio between both cache sizes it assigns as much as
> possible to the L2 cache and the remainder to the refcount cache.
> 
> The reason is that L2 tables are used for every single I/O request
> from the guest and the effect of increasing the cache is significant
> and clearly measurable. Refcount blocks are however only used for
> cluster allocation and internal snapshots and in practice are accessed
> sequentially in most cases, so the effect of increasing the cache is
> negligible (even when doing random writes from the guest).
> 
> So, make the refcount cache as small as possible unless the user
> explicitly asks for a larger one.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2.c              | 31 +++++++++++++++++++------------
>  block/qcow2.h              |  4 ----
>  tests/qemu-iotests/137.out |  2 +-
>  3 files changed, 20 insertions(+), 17 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/2] docs: Document the new default sizes of the qcow2 caches
  2018-03-14  8:29 ` [Qemu-devel] [PATCH v2 2/2] docs: Document the new default sizes of the qcow2 caches Alberto Garcia
@ 2018-04-16 16:15   ` Max Reitz
  0 siblings, 0 replies; 9+ messages in thread
From: Max Reitz @ 2018-04-16 16:15 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel; +Cc: qemu-block, Kevin Wolf, Eric Blake

[-- Attachment #1: Type: text/plain, Size: 3218 bytes --]

On 2018-03-14 09:29, Alberto Garcia wrote:
> We have just reduced the refcount cache size to the minimum unless
> the user explicitly requests a larger one, so we have to update the
> documentation to reflect this change.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  docs/qcow2-cache.txt | 31 ++++++++++++++-----------------
>  1 file changed, 14 insertions(+), 17 deletions(-)
> 
> diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
> index 170191a242..c640d45d06 100644
> --- a/docs/qcow2-cache.txt
> +++ b/docs/qcow2-cache.txt
> @@ -116,31 +116,28 @@ There are three options available, and all of them take bytes:
>  "refcount-cache-size":   maximum size of the refcount block cache
>  "cache-size":            maximum size of both caches combined
>  
> -There are two things that need to be taken into account:
> +There are a few things that need to be taken into account:
>  
>   - Both caches must have a size that is a multiple of the cluster size
>     (or the cache entry size: see "Using smaller cache sizes" below).
>  
> - - If you only set one of the options above, QEMU will automatically
> -   adjust the others so that the L2 cache is 4 times bigger than the
> -   refcount cache.
> + - The default L2 cache size is 8 clusters or 1MB (whichever is more),
> +   and the minimum is 2 clusters (or 2 cache entries, see below).
>  
> -This means that these options are equivalent:
> + - The default (and minimum) refcount cache size is 4 clusters.
>  
> -   -drive file=hd.qcow2,l2-cache-size=2097152
> -   -drive file=hd.qcow2,refcount-cache-size=524288
> -   -drive file=hd.qcow2,cache-size=2621440
> + - If only "cache-size" is specified then QEMU will assign as much
> +   memory as possible to the L2 cache before increasing the refcount
> +   cache size.
>  
> -The reason for this 1/4 ratio is to ensure that both caches cover the
> -same amount of disk space. Note however that this is only valid with
> -the default value of refcount_bits (16). If you are using a different
> -value you might want to calculate both cache sizes yourself since QEMU
> -will always use the same 1/4 ratio.
> +Unlike L2 tables, refcount blocks are not used during normal I/O but
> +only during allocations and internal snapshots. In most cases they are
> +accessed sequentially (even during random guest I/O) so increasing the
> +refcount cache size won't have any measurable effect in performance.

Could you add a note that it can have a measurable effect when using
internal snapshots and the user may want to think about increasing its
size when making heavy use of them?

Max

> -It's also worth mentioning that there's no strict need for both caches
> -to cover the same amount of disk space. The refcount cache is used
> -much less often than the L2 cache, so it's perfectly reasonable to
> -keep it small.
> +Before QEMU 2.12 the refcount cache had a default size of 1/4 of the
> +L2 cache size. This resulted in unnecessarily large caches, so now the
> +refcount cache is as small as possible unless overridden by the user.
>  
>  
>  Using smaller cache entries
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-04-16 16:15 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-14  8:29 [Qemu-devel] [PATCH v2 0/2] Give the refcount cache the minimum possible size by default Alberto Garcia
2018-03-14  8:29 ` [Qemu-devel] [PATCH v2 1/2] qcow2: " Alberto Garcia
2018-04-13 15:00   ` Max Reitz
2018-04-16 13:56     ` Alberto Garcia
2018-04-16 14:05       ` Max Reitz
2018-04-16 14:31         ` Alberto Garcia
2018-04-16 16:14   ` Max Reitz
2018-03-14  8:29 ` [Qemu-devel] [PATCH v2 2/2] docs: Document the new default sizes of the qcow2 caches Alberto Garcia
2018-04-16 16:15   ` Max Reitz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.