[Qemu-devel] [PATCH v7 0/9] Take the image size into account when allocating the L2 cache

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH v7 0/9] Take the image size into account when allocating the L2 cache
@ 2018-08-10  6:26 Leonid Bloch
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 1/9] qcow2: Options' documentation fixes Leonid Bloch
                   ` (8 more replies)
  0 siblings, 9 replies; 35+ messages in thread
From: Leonid Bloch @ 2018-08-10  6:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake, Alberto Garcia,
	Leonid Bloch

This series makes the qcow2 L2 cache assignment aware of the image size,
with the intention for it to cover the entire image. The importance of
this change is in noticeable performance improvement, especially with
heavy random I/O. The memory overhead is not big in most cases, as only
1 MB of cache for every 8 GB of image size is used. For cases with very
large images and/or small cluster sizes, or systems with limited RAM
resources, there is an upper limit on the default L2 cache: 32 MB. To
modify this limit one can use the already existing 'l2-cache-size' and
'cache-size' options. Moreover, this fixes the behavior of 'l2-cache-size',
as it was documented as the *maximum* L2 cache size, but in practice
behaved as the absolute size.

To compensate the memory overhead which may be increased following this
behavior, the default cache-clean-interval is set to 10 minutes by default
(was disabled by default before).

The L2 cache is also resized accordingly, by default, if the image is
resized.

Additionally, few minor changes are made (refactoring and documentation
fixes).

Differences from v1:
* .gitignore modification patch removed (unneeded).
* The grammar fix in conflicting cache sizing patch removed (merged).
* The update total_sectors when resizing patch squashed with the
  resizing patch.
* L2 cache is now capped at 32 MB.
* The default cache-clean-interval is set to 30 seconds.

Differences from v2:
* Made it clear in the documentation that setting cache-clean-interval
  to 0 disables this feature.

Differences from v3:
* The default cache-clean-interval is set to 10 minutes instead of 30
  seconds before.
* Commit message changes to better explain the patches.
* Some refactoring.

Differences from v4:
* Refactoring.
* Documentation and commit message fixes.

Differences from v5:
* A patch with cosmetic changes is split from the main patch
* A patch for avoiding duplication in refcount cache size assignment is
  split from the main patch.
* In the main patch the cap on the L2 cache size is set to only 1 MB (in
  order to be close to the previous behavior) and a separate patch sets
  it to 32 MB.
* Changes in the documentation fixes patch [1/8].

Differences from v6:
* A patch is added to make the defined sizes more humanly readable

Leonid Bloch (9):
  qcow2: Options' documentation fixes
  qcow2: Cosmetic changes
  qcow2: Make sizes more humanly readable
  qcow2: Avoid duplication in setting the refcount cache size
  qcow2: Assign the L2 cache relatively to the image size
  qcow2: Increase the default upper limit on the L2 cache size
  qcow2: Resize the cache upon image resizing
  qcow2: Set the default cache-clean-interval to 10 minutes
  qcow2: Explicit number replaced by a constant

 block/qcow2.c              | 54 +++++++++++++++++++++-----------------
 block/qcow2.h              | 12 ++++-----
 docs/qcow2-cache.txt       | 33 ++++++++++++++---------
 qapi/block-core.json       |  3 ++-
 qemu-options.hx            | 11 +++++---
 tests/qemu-iotests/137     |  1 -
 tests/qemu-iotests/137.out |  1 -
 7 files changed, 66 insertions(+), 49 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v7 1/9] qcow2: Options' documentation fixes
  2018-08-10  6:26 [Qemu-devel] [PATCH v7 0/9] Take the image size into account when allocating the L2 cache Leonid Bloch
@ 2018-08-10  6:26 ` Leonid Bloch
  2018-08-10 11:50   ` Alberto Garcia
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 2/9] qcow2: Cosmetic changes Leonid Bloch
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 35+ messages in thread
From: Leonid Bloch @ 2018-08-10  6:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake, Alberto Garcia,
	Leonid Bloch

Signed-off-by: Leonid Bloch <lbloch@janustech.com>
---
 docs/qcow2-cache.txt | 16 +++++++++++-----
 qemu-options.hx      |  9 ++++++---
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 8a09a5cc5f..0f157d859a 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -97,12 +97,15 @@ need:
    l2_cache_size = disk_size_GB * 131072
    refcount_cache_size = disk_size_GB * 32768
 
-QEMU has a default L2 cache of 1MB (1048576 bytes) and a refcount
-cache of 256KB (262144 bytes), so using the formulas we've just seen
-we have
+With the default cluster size, to cover each 8 GB of the virtual image
+size, 1MB of L2 cache is needed:
 
-   1048576 / 131072 = 8 GB of virtual disk covered by that cache
-    262144 /  32768 = 8 GB
+   65536 / 8 = 8192 = 8 GB / 1 MB
+
+A default refcount cache is 4 times the cluster size, which defaults to
+256 KB (262144 bytes). This is sufficient for 8 GB of image size:
+
+   262144 / 32768 = 8 GB
 
 
 How to configure the cache sizes
@@ -130,6 +133,9 @@ There are a few things that need to be taken into account:
    memory as possible to the L2 cache before increasing the refcount
    cache size.
 
+ - At most two of "l2-cache-size", "refcount-cache-size", and "cache-size"
+   can be set simultaneously.
+
 Unlike L2 tables, refcount blocks are not used during normal I/O but
 only during allocations and internal snapshots. In most cases they are
 accessed sequentially (even during random guest I/O) so increasing the
diff --git a/qemu-options.hx b/qemu-options.hx
index b1bf0f485f..f6804758d3 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -752,15 +752,18 @@ image file)
 
 @item cache-size
 The maximum total size of the L2 table and refcount block caches in bytes
-(default: 1048576 bytes or 8 clusters, whichever is larger)
+(default: the sum of l2-cache-size and refcount-cache-size)
 
 @item l2-cache-size
 The maximum size of the L2 table cache in bytes
-(default: 4/5 of the total cache size)
+(default: if cache-size is not defined - 1048576 bytes or 8 clusters, whichever
+is larger; otherwise, as large as possible or needed within the cache-size,
+while permitting the requested or the minimal refcount cache size)
 
 @item refcount-cache-size
 The maximum size of the refcount block cache in bytes
-(default: 1/5 of the total cache size)
+(default: 4 times the cluster size; or if cache-size is specified, the part of
+it which is not used for the L2 cache)
 
 @item cache-clean-interval
 Clean unused entries in the L2 and refcount caches. The interval is in seconds.
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v7 2/9] qcow2: Cosmetic changes
  2018-08-10  6:26 [Qemu-devel] [PATCH v7 0/9] Take the image size into account when allocating the L2 cache Leonid Bloch
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 1/9] qcow2: Options' documentation fixes Leonid Bloch
@ 2018-08-10  6:26 ` Leonid Bloch
  2018-08-10 12:54   ` Alberto Garcia
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 3/9] qcow2: Make sizes more humanly readable Leonid Bloch
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 35+ messages in thread
From: Leonid Bloch @ 2018-08-10  6:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake, Alberto Garcia,
	Leonid Bloch

Some refactoring for better readability is done here.

Signed-off-by: Leonid Bloch <lbloch@janustech.com>
---
 block/qcow2.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index ec9e6238a0..3f4abc394e 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -790,8 +790,9 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
     *refcount_cache_size = qemu_opt_get_size(opts,
                                              QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0);
 
-    *l2_cache_entry_size = qemu_opt_get_size(
-        opts, QCOW2_OPT_L2_CACHE_ENTRY_SIZE, s->cluster_size);
+    *l2_cache_entry_size = qemu_opt_get_size(opts,
+                                             QCOW2_OPT_L2_CACHE_ENTRY_SIZE,
+                                             s->cluster_size);
 
     if (combined_cache_size_set) {
         if (l2_cache_size_set && refcount_cache_size_set) {
@@ -823,8 +824,8 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
                 *l2_cache_size = max_l2_cache;
                 *refcount_cache_size = combined_cache_size - *l2_cache_size;
             } else {
-                *refcount_cache_size =
-                    MIN(combined_cache_size, min_refcount_cache);
+                *refcount_cache_size = MIN(combined_cache_size,
+                                           min_refcount_cache);
                 *l2_cache_size = combined_cache_size - *refcount_cache_size;
             }
         }
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v7 3/9] qcow2: Make sizes more humanly readable
  2018-08-10  6:26 [Qemu-devel] [PATCH v7 0/9] Take the image size into account when allocating the L2 cache Leonid Bloch
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 1/9] qcow2: Options' documentation fixes Leonid Bloch
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 2/9] qcow2: Cosmetic changes Leonid Bloch
@ 2018-08-10  6:26 ` Leonid Bloch
  2018-08-10  8:33   ` Alberto Garcia
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 4/9] qcow2: Avoid duplication in setting the refcount cache size Leonid Bloch
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 35+ messages in thread
From: Leonid Bloch @ 2018-08-10  6:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake, Alberto Garcia,
	Leonid Bloch

Signed-off-by: Leonid Bloch <lbloch@janustech.com>
---
 block/qcow2.c | 2 +-
 block/qcow2.h | 9 +++++----
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 3f4abc394e..7a2d7a1d48 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -831,7 +831,7 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
         }
     } else {
         if (!l2_cache_size_set) {
-            *l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE,
+            *l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
                                  (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
                                  * s->cluster_size);
         }
diff --git a/block/qcow2.h b/block/qcow2.h
index 81b844e936..39e1b279f8 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -27,6 +27,7 @@
 
 #include "crypto/block.h"
 #include "qemu/coroutine.h"
+#include "qemu/units.h"
 
 //#define DEBUG_ALLOC
 //#define DEBUG_ALLOC2
@@ -43,11 +44,11 @@
 
 /* 8 MB refcount table is enough for 2 PB images at 64k cluster size
  * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
-#define QCOW_MAX_REFTABLE_SIZE 0x800000
+#define QCOW_MAX_REFTABLE_SIZE (8 * MiB)
 
 /* 32 MB L1 table is enough for 2 PB images at 64k cluster size
  * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
-#define QCOW_MAX_L1_SIZE 0x2000000
+#define QCOW_MAX_L1_SIZE (32 * MiB)
 
 /* Allow for an average of 1k per snapshot table entry, should be plenty of
  * space for snapshot names and IDs */
@@ -75,9 +76,9 @@
 
 /* Whichever is more */
 #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */
-#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
+#define DEFAULT_L2_CACHE_SIZE (1 * MiB)
 
-#define DEFAULT_CLUSTER_SIZE 65536
+#define DEFAULT_CLUSTER_SIZE (64 * KiB)
 
 
 #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts"
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v7 4/9] qcow2: Avoid duplication in setting the refcount cache size
  2018-08-10  6:26 [Qemu-devel] [PATCH v7 0/9] Take the image size into account when allocating the L2 cache Leonid Bloch
                   ` (2 preceding siblings ...)
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 3/9] qcow2: Make sizes more humanly readable Leonid Bloch
@ 2018-08-10  6:26 ` Leonid Bloch
  2018-08-10 13:14   ` Alberto Garcia
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 5/9] qcow2: Assign the L2 cache relatively to the image size Leonid Bloch
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 35+ messages in thread
From: Leonid Bloch @ 2018-08-10  6:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake, Alberto Garcia,
	Leonid Bloch

The refcount cache size does not need to be set to its minimum value in
read_cache_sizes(), as it is set to at least its minimum value in
qcow2_update_options_prepare().

Signed-off-by: Leonid Bloch <lbloch@janustech.com>
---
 block/qcow2.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 7a2d7a1d48..053242f94e 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -829,16 +829,13 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
                 *l2_cache_size = combined_cache_size - *refcount_cache_size;
             }
         }
-    } else {
-        if (!l2_cache_size_set) {
-            *l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
-                                 (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
-                                 * s->cluster_size);
-        }
-        if (!refcount_cache_size_set) {
-            *refcount_cache_size = min_refcount_cache;
-        }
+    } else if (!l2_cache_size_set) {
+        *l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
+                             (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
+                             * s->cluster_size);
     }
+    /* If refcount-cache-size is not specified, it will be set to minimum
+     * in qcow2_update_options_prepare(). No need to set it here. */
 
     if (*l2_cache_entry_size < (1 << MIN_CLUSTER_BITS) ||
         *l2_cache_entry_size > s->cluster_size ||
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v7 5/9] qcow2: Assign the L2 cache relatively to the image size
  2018-08-10  6:26 [Qemu-devel] [PATCH v7 0/9] Take the image size into account when allocating the L2 cache Leonid Bloch
                   ` (3 preceding siblings ...)
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 4/9] qcow2: Avoid duplication in setting the refcount cache size Leonid Bloch
@ 2018-08-10  6:26 ` Leonid Bloch
  2018-08-10 14:39   ` Alberto Garcia
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size Leonid Bloch
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 35+ messages in thread
From: Leonid Bloch @ 2018-08-10  6:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake, Alberto Garcia,
	Leonid Bloch

Sufficient L2 cache can noticeably improve the performance when using
large images with frequent I/O.

Previously, the L2 cache was allocated without considering the image
size, and an option existed to manually determine its size. Thus to
achieve a full coverage of an image by the L2 cache (i.e. use more than
the default value of MAX(1 MB, 8 clusters)), a user needed to calculate
the required size manually, or with a script, and pass this value to the
'l2-cache-size' option.

Now, the L2 cache is assigned taking the virtual image size into account,
and will cover the entire image, unless the size needed for that is
larger than a certain maximum. This maximum is set to 1 MB by default
(enough to cover an 8 GB image with the default cluster size) but can
be increased or decreased using the 'l2-cache-size' option. This option
was previously documented as the *maximum* L2 cache size, and this patch
makes it behave as such, instead of as a constant size. Also, the
existing option 'cache-size' can limit the sum of both L2 and refcount
caches, as previously.

Signed-off-by: Leonid Bloch <lbloch@janustech.com>
---
 block/qcow2.c              | 22 ++++++++++------------
 block/qcow2.h              |  4 +---
 docs/qcow2-cache.txt       | 13 ++++++++-----
 qemu-options.hx            |  6 +++---
 tests/qemu-iotests/137     |  1 -
 tests/qemu-iotests/137.out |  1 -
 6 files changed, 22 insertions(+), 25 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 053242f94e..434fb89076 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -777,16 +777,19 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
                              uint64_t *refcount_cache_size, Error **errp)
 {
     BDRVQcow2State *s = bs->opaque;
-    uint64_t combined_cache_size;
+    uint64_t combined_cache_size, l2_cache_max_setting;
     bool l2_cache_size_set, refcount_cache_size_set, combined_cache_size_set;
     int min_refcount_cache = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
+    uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
+    uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
 
     combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE);
     l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE);
     refcount_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_REFCOUNT_CACHE_SIZE);
 
     combined_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_CACHE_SIZE, 0);
-    *l2_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, 0);
+    l2_cache_max_setting = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE,
+                                             DEFAULT_L2_CACHE_MAX_SIZE);
     *refcount_cache_size = qemu_opt_get_size(opts,
                                              QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0);
 
@@ -794,13 +797,16 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
                                              QCOW2_OPT_L2_CACHE_ENTRY_SIZE,
                                              s->cluster_size);
 
+    *l2_cache_size = MIN(max_l2_cache, l2_cache_max_setting);
+
     if (combined_cache_size_set) {
         if (l2_cache_size_set && refcount_cache_size_set) {
             error_setg(errp, QCOW2_OPT_CACHE_SIZE ", " QCOW2_OPT_L2_CACHE_SIZE
                        " and " QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not be set "
                        "at the same time");
             return;
-        } else if (*l2_cache_size > combined_cache_size) {
+        } else if (l2_cache_size_set &&
+                   (l2_cache_max_setting > combined_cache_size)) {
             error_setg(errp, QCOW2_OPT_L2_CACHE_SIZE " may not exceed "
                        QCOW2_OPT_CACHE_SIZE);
             return;
@@ -815,13 +821,9 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
         } else if (refcount_cache_size_set) {
             *l2_cache_size = combined_cache_size - *refcount_cache_size;
         } else {
-            uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
-            uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
-
             /* Assign as much memory as possible to the L2 cache, and
              * use the remainder for the refcount cache */
-            if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
-                *l2_cache_size = max_l2_cache;
+            if (combined_cache_size >= *l2_cache_size + min_refcount_cache) {
                 *refcount_cache_size = combined_cache_size - *l2_cache_size;
             } else {
                 *refcount_cache_size = MIN(combined_cache_size,
@@ -829,10 +831,6 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
                 *l2_cache_size = combined_cache_size - *refcount_cache_size;
             }
         }
-    } else if (!l2_cache_size_set) {
-        *l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
-                             (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
-                             * s->cluster_size);
     }
     /* If refcount-cache-size is not specified, it will be set to minimum
      * in qcow2_update_options_prepare(). No need to set it here. */
diff --git a/block/qcow2.h b/block/qcow2.h
index 39e1b279f8..d917b5f577 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -74,9 +74,7 @@
 /* Must be at least 4 to cover all cases of refcount table growth */
 #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
 
-/* Whichever is more */
-#define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */
-#define DEFAULT_L2_CACHE_SIZE (1 * MiB)
+#define DEFAULT_L2_CACHE_MAX_SIZE (1 * MiB)
 
 #define DEFAULT_CLUSTER_SIZE (64 * KiB)
 
diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 0f157d859a..69af306267 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -124,8 +124,11 @@ There are a few things that need to be taken into account:
  - Both caches must have a size that is a multiple of the cluster size
    (or the cache entry size: see "Using smaller cache sizes" below).
 
- - The default L2 cache size is 8 clusters or 1MB (whichever is more),
-   and the minimum is 2 clusters (or 2 cache entries, see below).
+ - The default L2 cache size will cover the entire virtual size of an
+   image, up to a certain maximum. This maximum is 1 MB by default
+   (enough for image sizes of up to 8 GB with the default cluster size)
+   and it can be reduced or enlarged using the "l2-cache-size" option.
+   The minimum is 2 clusters (or 2 cache entries, see below).
 
  - The default (and minimum) refcount cache size is 4 clusters.
 
@@ -183,9 +186,9 @@ Some things to take into account:
    always uses the cluster size as the entry size.
 
  - If the L2 cache is big enough to hold all of the image's L2 tables
-   (as explained in the "Choosing the right cache sizes" section
-   earlier in this document) then none of this is necessary and you
-   can omit the "l2-cache-entry-size" parameter altogether.
+   (the default behavior for images of up to 8 GB in size) then none
+   of this is necessary and you can omit the "l2-cache-entry-size"
+   parameter altogether.
 
 
 Reducing the memory usage
diff --git a/qemu-options.hx b/qemu-options.hx
index f6804758d3..22e8e2d113 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -756,9 +756,9 @@ The maximum total size of the L2 table and refcount block caches in bytes
 
 @item l2-cache-size
 The maximum size of the L2 table cache in bytes
-(default: if cache-size is not defined - 1048576 bytes or 8 clusters, whichever
-is larger; otherwise, as large as possible or needed within the cache-size,
-while permitting the requested or the minimal refcount cache size)
+(default: if cache-size is not specified - 1M; otherwise, as large as possible
+within the cache-size, while permitting the requested or the minimal refcount
+cache size)
 
 @item refcount-cache-size
 The maximum size of the refcount block cache in bytes
diff --git a/tests/qemu-iotests/137 b/tests/qemu-iotests/137
index 87965625d8..e3fb078588 100755
--- a/tests/qemu-iotests/137
+++ b/tests/qemu-iotests/137
@@ -109,7 +109,6 @@ $QEMU_IO \
     -c "reopen -o cache-size=1M,l2-cache-size=64k,refcount-cache-size=64k" \
     -c "reopen -o cache-size=1M,l2-cache-size=2M" \
     -c "reopen -o cache-size=1M,refcount-cache-size=2M" \
-    -c "reopen -o l2-cache-size=256T" \
     -c "reopen -o l2-cache-entry-size=33k" \
     -c "reopen -o l2-cache-entry-size=128k" \
     -c "reopen -o refcount-cache-size=256T" \
diff --git a/tests/qemu-iotests/137.out b/tests/qemu-iotests/137.out
index 6a2ffc71fd..70f245ae7a 100644
--- a/tests/qemu-iotests/137.out
+++ b/tests/qemu-iotests/137.out
@@ -19,7 +19,6 @@ Parameter 'lazy-refcounts' expects 'on' or 'off'
 cache-size, l2-cache-size and refcount-cache-size may not be set at the same time
 l2-cache-size may not exceed cache-size
 refcount-cache-size may not exceed cache-size
-L2 cache size too big
 L2 cache entry size must be a power of two between 512 and the cluster size (65536)
 L2 cache entry size must be a power of two between 512 and the cluster size (65536)
 Refcount cache size too big
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
  2018-08-10  6:26 [Qemu-devel] [PATCH v7 0/9] Take the image size into account when allocating the L2 cache Leonid Bloch
                   ` (4 preceding siblings ...)
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 5/9] qcow2: Assign the L2 cache relatively to the image size Leonid Bloch
@ 2018-08-10  6:26 ` Leonid Bloch
  2018-08-10 12:00   ` Alberto Garcia
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 7/9] qcow2: Resize the cache upon image resizing Leonid Bloch
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 35+ messages in thread
From: Leonid Bloch @ 2018-08-10  6:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake, Alberto Garcia,
	Leonid Bloch

The upper limit on the L2 cache size is increased from 1 MB to 32 MB.
This is done in order to allow default full coverage with the L2 cache
for images of up to 256 GB in size (was 8 GB). Note, that only the
needed amount to cover the full image is allocated. The value which is
changed here is just the upper limit on the L2 cache size, beyond which
it will not grow, even if the size of the image will require it to.

Signed-off-by: Leonid Bloch <lbloch@janustech.com>
---
 block/qcow2.h        | 2 +-
 docs/qcow2-cache.txt | 6 +++---
 qemu-options.hx      | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index d917b5f577..e699a55d02 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -74,7 +74,7 @@
 /* Must be at least 4 to cover all cases of refcount table growth */
 #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
 
-#define DEFAULT_L2_CACHE_MAX_SIZE (1 * MiB)
+#define DEFAULT_L2_CACHE_MAX_SIZE (32 * MiB)
 
 #define DEFAULT_CLUSTER_SIZE (64 * KiB)
 
diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 69af306267..6ad1081d1a 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -125,8 +125,8 @@ There are a few things that need to be taken into account:
    (or the cache entry size: see "Using smaller cache sizes" below).
 
  - The default L2 cache size will cover the entire virtual size of an
-   image, up to a certain maximum. This maximum is 1 MB by default
-   (enough for image sizes of up to 8 GB with the default cluster size)
+   image, up to a certain maximum. This maximum is 32 MB by default
+   (enough for image sizes of up to 256 GB with the default cluster size)
    and it can be reduced or enlarged using the "l2-cache-size" option.
    The minimum is 2 clusters (or 2 cache entries, see below).
 
@@ -186,7 +186,7 @@ Some things to take into account:
    always uses the cluster size as the entry size.
 
  - If the L2 cache is big enough to hold all of the image's L2 tables
-   (the default behavior for images of up to 8 GB in size) then none
+   (the default behavior for images of up to 256 GB in size) then none
    of this is necessary and you can omit the "l2-cache-entry-size"
    parameter altogether.
 
diff --git a/qemu-options.hx b/qemu-options.hx
index 22e8e2d113..4c44cdbc23 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -756,7 +756,7 @@ The maximum total size of the L2 table and refcount block caches in bytes
 
 @item l2-cache-size
 The maximum size of the L2 table cache in bytes
-(default: if cache-size is not specified - 1M; otherwise, as large as possible
+(default: if cache-size is not specified - 32M; otherwise, as large as possible
 within the cache-size, while permitting the requested or the minimal refcount
 cache size)
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v7 7/9] qcow2: Resize the cache upon image resizing
  2018-08-10  6:26 [Qemu-devel] [PATCH v7 0/9] Take the image size into account when allocating the L2 cache Leonid Bloch
                   ` (5 preceding siblings ...)
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size Leonid Bloch
@ 2018-08-10  6:26 ` Leonid Bloch
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 8/9] qcow2: Set the default cache-clean-interval to 10 minutes Leonid Bloch
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 9/9] qcow2: Explicit number replaced by a constant Leonid Bloch
  8 siblings, 0 replies; 35+ messages in thread
From: Leonid Bloch @ 2018-08-10  6:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake, Alberto Garcia,
	Leonid Bloch

The caches are now recalculated upon image resizing. This is done
because the new default behavior of assigning L2 cache relatively to
the image size, implies that the cache will be adapted accordingly
after an image resize.

Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 434fb89076..ba4dfae735 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3418,6 +3418,7 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
     uint64_t old_length;
     int64_t new_l1_size;
     int ret;
+    QDict *options;
 
     if (prealloc != PREALLOC_MODE_OFF && prealloc != PREALLOC_MODE_METADATA &&
         prealloc != PREALLOC_MODE_FALLOC && prealloc != PREALLOC_MODE_FULL)
@@ -3642,6 +3643,8 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
         }
     }
 
+    bs->total_sectors = offset / BDRV_SECTOR_SIZE;
+
     /* write updated header.size */
     offset = cpu_to_be64(offset);
     ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, size),
@@ -3652,6 +3655,13 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
     }
 
     s->l1_vm_state_index = new_l1_size;
+
+    /* Update cache sizes */
+    options = qdict_clone_shallow(bs->options);
+    ret = qcow2_update_options(bs, options, s->flags, errp);
+    if (ret < 0) {
+        goto fail;
+    }
     ret = 0;
 fail:
     qemu_co_mutex_unlock(&s->lock);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v7 8/9] qcow2: Set the default cache-clean-interval to 10 minutes
  2018-08-10  6:26 [Qemu-devel] [PATCH v7 0/9] Take the image size into account when allocating the L2 cache Leonid Bloch
                   ` (6 preceding siblings ...)
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 7/9] qcow2: Resize the cache upon image resizing Leonid Bloch
@ 2018-08-10  6:26 ` Leonid Bloch
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 9/9] qcow2: Explicit number replaced by a constant Leonid Bloch
  8 siblings, 0 replies; 35+ messages in thread
From: Leonid Bloch @ 2018-08-10  6:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake, Alberto Garcia,
	Leonid Bloch

The default cache-clean-interval is set to 10 minutes, in order to lower
the overhead of the qcow2 caches (before the default was 0, i.e.
disabled).

Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2.c        | 2 +-
 block/qcow2.h        | 1 +
 docs/qcow2-cache.txt | 4 ++--
 qapi/block-core.json | 3 ++-
 qemu-options.hx      | 2 +-
 5 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index ba4dfae735..b4f291765b 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -944,7 +944,7 @@ static int qcow2_update_options_prepare(BlockDriverState *bs,
     /* New interval for cache cleanup timer */
     r->cache_clean_interval =
         qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL,
-                            s->cache_clean_interval);
+                            DEFAULT_CACHE_CLEAN_INTERVAL);
 #ifndef CONFIG_LINUX
     if (r->cache_clean_interval != 0) {
         error_setg(errp, QCOW2_OPT_CACHE_CLEAN_INTERVAL
diff --git a/block/qcow2.h b/block/qcow2.h
index e699a55d02..5e94f7ffc4 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -78,6 +78,7 @@
 
 #define DEFAULT_CLUSTER_SIZE (64 * KiB)
 
+#define DEFAULT_CACHE_CLEAN_INTERVAL 600  /* seconds */
 
 #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts"
 #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request"
diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 6ad1081d1a..684147ad45 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -204,8 +204,8 @@ This example removes all unused cache entries every 15 minutes:
 
    -drive file=hd.qcow2,cache-clean-interval=900
 
-If unset, the default value for this parameter is 0 and it disables
-this feature.
+If unset, the default value for this parameter is 600. Setting it to 0
+disables this feature.
 
 Note that this functionality currently relies on the MADV_DONTNEED
 argument for madvise() to actually free the memory. This is a
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 5b9084a394..9a6a708a37 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2830,7 +2830,8 @@
 #
 # @cache-clean-interval:  clean unused entries in the L2 and refcount
 #                         caches. The interval is in seconds. The default value
-#                         is 0 and it disables this feature (since 2.5)
+#                         is 600, and 0 disables this feature. (since 2.5)
+#
 # @encrypt:               Image decryption options. Mandatory for
 #                         encrypted images, except when doing a metadata-only
 #                         probe of the image. (since 2.10)
diff --git a/qemu-options.hx b/qemu-options.hx
index 4c44cdbc23..6abf3631ec 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -767,7 +767,7 @@ it which is not used for the L2 cache)
 
 @item cache-clean-interval
 Clean unused entries in the L2 and refcount caches. The interval is in seconds.
-The default value is 0 and it disables this feature.
+The default value is 600. Setting it to 0 disables this feature.
 
 @item pass-discard-request
 Whether discard requests to the qcow2 device should be forwarded to the data
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v7 9/9] qcow2: Explicit number replaced by a constant
  2018-08-10  6:26 [Qemu-devel] [PATCH v7 0/9] Take the image size into account when allocating the L2 cache Leonid Bloch
                   ` (7 preceding siblings ...)
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 8/9] qcow2: Set the default cache-clean-interval to 10 minutes Leonid Bloch
@ 2018-08-10  6:26 ` Leonid Bloch
  8 siblings, 0 replies; 35+ messages in thread
From: Leonid Bloch @ 2018-08-10  6:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake, Alberto Garcia,
	Leonid Bloch

Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index b4f291765b..b0e20aeffc 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1324,7 +1324,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
     /* 2^(s->refcount_order - 3) is the refcount width in bytes */
     s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3);
     s->refcount_block_size = 1 << s->refcount_block_bits;
-    bs->total_sectors = header.size / 512;
+    bs->total_sectors = header.size / BDRV_SECTOR_SIZE;
     s->csize_shift = (62 - (s->cluster_bits - 8));
     s->csize_mask = (1 << (s->cluster_bits - 8)) - 1;
     s->cluster_offset_mask = (1LL << s->csize_shift) - 1;
@@ -3450,7 +3450,7 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
         goto fail;
     }
 
-    old_length = bs->total_sectors * 512;
+    old_length = bs->total_sectors * BDRV_SECTOR_SIZE;
     new_l1_size = size_to_l1(s, offset);
 
     if (offset < old_length) {
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 3/9] qcow2: Make sizes more humanly readable
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 3/9] qcow2: Make sizes more humanly readable Leonid Bloch
@ 2018-08-10  8:33   ` Alberto Garcia
  0 siblings, 0 replies; 35+ messages in thread
From: Alberto Garcia @ 2018-08-10  8:33 UTC (permalink / raw)
  To: Leonid Bloch, qemu-devel; +Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake

On Fri 10 Aug 2018 08:26:41 AM CEST, Leonid Bloch wrote:
> Signed-off-by: Leonid Bloch <lbloch@janustech.com>

Reviewed-by: Alberto Garcia <berto@igalia.com>

Berto

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 1/9] qcow2: Options' documentation fixes
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 1/9] qcow2: Options' documentation fixes Leonid Bloch
@ 2018-08-10 11:50   ` Alberto Garcia
  2018-08-11 18:01     ` Leonid Bloch
  0 siblings, 1 reply; 35+ messages in thread
From: Alberto Garcia @ 2018-08-10 11:50 UTC (permalink / raw)
  To: Leonid Bloch, qemu-devel; +Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake

On Fri 10 Aug 2018 08:26:39 AM CEST, Leonid Bloch wrote:
> Signed-off-by: Leonid Bloch <lbloch@janustech.com>
> ---
>  docs/qcow2-cache.txt | 16 +++++++++++-----
>  qemu-options.hx      |  9 ++++++---
>  2 files changed, 17 insertions(+), 8 deletions(-)
>
> diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
> index 8a09a5cc5f..0f157d859a 100644
> --- a/docs/qcow2-cache.txt
> +++ b/docs/qcow2-cache.txt
> @@ -97,12 +97,15 @@ need:
>     l2_cache_size = disk_size_GB * 131072
>     refcount_cache_size = disk_size_GB * 32768
>  
> -QEMU has a default L2 cache of 1MB (1048576 bytes) and a refcount
> -cache of 256KB (262144 bytes), so using the formulas we've just seen
> -we have
> +With the default cluster size, to cover each 8 GB of the virtual image
> +size, 1MB of L2 cache is needed:
>  
> -   1048576 / 131072 = 8 GB of virtual disk covered by that cache
> -    262144 /  32768 = 8 GB
> +   65536 / 8 = 8192 = 8 GB / 1 MB

Where does this 65536 / 8 come from? The line that you changed follows
directly from the formula immediately before that paragraph:

     l2_cache_size = disk_size_GB * 131072

that is

     l2_cache_size / 131072 = disk_size_GB
           1048576 / 131072 = 8 GB

Berto

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size Leonid Bloch
@ 2018-08-10 12:00   ` Alberto Garcia
  2018-08-13  1:39     ` Max Reitz
  0 siblings, 1 reply; 35+ messages in thread
From: Alberto Garcia @ 2018-08-10 12:00 UTC (permalink / raw)
  To: Leonid Bloch, qemu-devel; +Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake

On Fri 10 Aug 2018 08:26:44 AM CEST, Leonid Bloch wrote:
> The upper limit on the L2 cache size is increased from 1 MB to 32 MB.
> This is done in order to allow default full coverage with the L2 cache
> for images of up to 256 GB in size (was 8 GB). Note, that only the
> needed amount to cover the full image is allocated. The value which is
> changed here is just the upper limit on the L2 cache size, beyond which
> it will not grow, even if the size of the image will require it to.
>
> Signed-off-by: Leonid Bloch <lbloch@janustech.com>

Reviewed-by: Alberto Garcia <berto@igalia.com>

> -#define DEFAULT_L2_CACHE_MAX_SIZE (1 * MiB)
> +#define DEFAULT_L2_CACHE_MAX_SIZE (32 * MiB)

The patch looks perfect to me now and I'm fine with this change, but
this is quite an increase from the previous default value. If anyone
thinks that this is too aggressive (or too little :)) I'm all ears.

Berto

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 2/9] qcow2: Cosmetic changes
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 2/9] qcow2: Cosmetic changes Leonid Bloch
@ 2018-08-10 12:54   ` Alberto Garcia
  0 siblings, 0 replies; 35+ messages in thread
From: Alberto Garcia @ 2018-08-10 12:54 UTC (permalink / raw)
  To: Leonid Bloch, qemu-devel; +Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake

On Fri 10 Aug 2018 08:26:40 AM CEST, Leonid Bloch wrote:
> Some refactoring for better readability is done here.
>
> Signed-off-by: Leonid Bloch <lbloch@janustech.com>

> -    *l2_cache_entry_size = qemu_opt_get_size(
> -        opts, QCOW2_OPT_L2_CACHE_ENTRY_SIZE, s->cluster_size);
> +    *l2_cache_entry_size = qemu_opt_get_size(opts,
> +                                             QCOW2_OPT_L2_CACHE_ENTRY_SIZE,
> +                                             s->cluster_size);

> -                *refcount_cache_size =
> -                    MIN(combined_cache_size, min_refcount_cache);
> +                *refcount_cache_size = MIN(combined_cache_size,
> +                                           min_refcount_cache);

I won't oppose these changes if more people think that they improve
readability, but it seems to me that this is just a matter of taste and
there's no big difference either way.

I'm personally fine with the way the code is now, and this coding style
is used in many other parts of QEMU:

$ git grep -A 1 '.* = .*($'
$ git grep -A 1 '.* =$'

Berto

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 4/9] qcow2: Avoid duplication in setting the refcount cache size
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 4/9] qcow2: Avoid duplication in setting the refcount cache size Leonid Bloch
@ 2018-08-10 13:14   ` Alberto Garcia
  2018-08-11 18:40     ` Leonid Bloch
  0 siblings, 1 reply; 35+ messages in thread
From: Alberto Garcia @ 2018-08-10 13:14 UTC (permalink / raw)
  To: Leonid Bloch, qemu-devel; +Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake

On Fri 10 Aug 2018 08:26:42 AM CEST, Leonid Bloch wrote:
> The refcount cache size does not need to be set to its minimum value in
> read_cache_sizes(), as it is set to at least its minimum value in
> qcow2_update_options_prepare().
>
> Signed-off-by: Leonid Bloch <lbloch@janustech.com>

> -    } else {
> -        if (!l2_cache_size_set) {
> -            *l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
> -                                 (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
> -                                 * s->cluster_size);
> -        }
> -        if (!refcount_cache_size_set) {
> -            *refcount_cache_size = min_refcount_cache;
> -        }

Since you're getting rid of the rest of the code later anyway, I think
it's cleaner to only remove these last three lines here and leave the
rest untouched. It makes the patch shorter and easier to read.

> +    /* If refcount-cache-size is not specified, it will be set to minimum
> +     * in qcow2_update_options_prepare(). No need to set it here. */

This is not quite correct, because apart from the "not specified" case,
refcount_cache_size is also overridden in other circumstances: (a) the
value is specified but is too low, or (b) the value is set indirectly
via "cache-size" but the end result is too low[*].

Also, the same thing happens to l2-cache-size: if you set it manually
but it's too low then it will be overridden.

I'd personally remove the comment altogether, I think the explanation in
the commit message is enough.

Berto

[*] Now that I think of it: if you set e.g. cache-size = 16M and
    l2-cache-size = 16MB then you'll end up with refcount-cache-size =
    16M - 16M = 0, which will be overridden and set to the minimum.

    But then refcount-cache-size + l2-cache-size > cache-size

    So perhaps this should produce an error, and it may make sense to
    take the opportunity to move all the logic that ensures that
    MIN_SIZE <= size <= INT_MAX to read_cache_sizes(). But that's a
    possible task for the future and I wouldn't worry about it in this
    series.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 5/9] qcow2: Assign the L2 cache relatively to the image size
  2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 5/9] qcow2: Assign the L2 cache relatively to the image size Leonid Bloch
@ 2018-08-10 14:39   ` Alberto Garcia
  2018-08-11 19:19     ` Leonid Bloch
  0 siblings, 1 reply; 35+ messages in thread
From: Alberto Garcia @ 2018-08-10 14:39 UTC (permalink / raw)
  To: Leonid Bloch, qemu-devel; +Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake

On Fri 10 Aug 2018 08:26:43 AM CEST, Leonid Bloch wrote:
> Previously, the L2 cache was allocated without considering the image
> size, and an option existed to manually determine its size.

This is not quite correct: the L2 cache was set to the maximum needed
for the image when "cache-size" was large enough:

        if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
                *l2_cache_size = max_l2_cache;
        } ...

See below for an example.

>          } else {
> -            uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
> -            uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
> -
>              /* Assign as much memory as possible to the L2 cache, and
>               * use the remainder for the refcount cache */
> -            if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
> -                *l2_cache_size = max_l2_cache;
> +            if (combined_cache_size >= *l2_cache_size + min_refcount_cache) {

This has an (unintended?) side effect:

If you have a 1TB qcow2 image and open it with e.g. cache-size=200M,
with the previous code you would get a 128MB L2 cache (max_l2_cache).
With the new code you only get 32MB.

The rest of the function looks good to me now. We're getting close :)

> - - The default L2 cache size is 8 clusters or 1MB (whichever is more),
> -   and the minimum is 2 clusters (or 2 cache entries, see below).
> + - The default L2 cache size will cover the entire virtual size of an
> +   image, up to a certain maximum. This maximum is 1 MB by default
> +   (enough for image sizes of up to 8 GB with the default cluster size)
> +   and it can be reduced or enlarged using the "l2-cache-size" option.
> +   The minimum is 2 clusters (or 2 cache entries, see below).

It's not clear with this wording if this "certain maximum" refers to the
cache size or the image size.

I'm thinking that perhaps it's clearer if you leave that item unchanged
and add a new one below, something like:

- The L2 cache size setting (regardless of whether it takes the default
  value or it's set by the user) refers to the maximum size of the L2
  cache. If QEMU needs less memory to hold all of the image's L2 tables,
  then that's the actual size of the cache that will be allocated.

>   - If the L2 cache is big enough to hold all of the image's L2 tables
> -   (as explained in the "Choosing the right cache sizes" section
> -   earlier in this document) then none of this is necessary and you
> -   can omit the "l2-cache-entry-size" parameter altogether.
> +   (the default behavior for images of up to 8 GB in size) then none
> +   of this is necessary and you can omit the "l2-cache-entry-size"
> +   parameter altogether.

I think this change is unnecessary :-?

>  @item refcount-cache-size
>  The maximum size of the refcount block cache in bytes
> diff --git a/tests/qemu-iotests/137 b/tests/qemu-iotests/137
> index 87965625d8..e3fb078588 100755
> --- a/tests/qemu-iotests/137
> +++ b/tests/qemu-iotests/137
> @@ -109,7 +109,6 @@ $QEMU_IO \
>      -c "reopen -o cache-size=1M,l2-cache-size=64k,refcount-cache-size=64k" \
>      -c "reopen -o cache-size=1M,l2-cache-size=2M" \
>      -c "reopen -o cache-size=1M,refcount-cache-size=2M" \
> -    -c "reopen -o l2-cache-size=256T" \

The "L2 cache size too big" error can still be tested, but you will need
to create an image large enough to allow such a big cache.

$ qemu-img create -f qcow2 -o cluster_size=256k hd.qcow2 32P
$ $QEMU -drive file=hd.qcow2,l2-cache-entry-size=512,l2-cache-size=1T

Berto

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 1/9] qcow2: Options' documentation fixes
  2018-08-10 11:50   ` Alberto Garcia
@ 2018-08-11 18:01     ` Leonid Bloch
  0 siblings, 0 replies; 35+ messages in thread
From: Leonid Bloch @ 2018-08-11 18:01 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel; +Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake

On 8/10/18 2:50 PM, Alberto Garcia wrote:
> On Fri 10 Aug 2018 08:26:39 AM CEST, Leonid Bloch wrote:
>> Signed-off-by: Leonid Bloch <lbloch@janustech.com>
>> ---
>>   docs/qcow2-cache.txt | 16 +++++++++++-----
>>   qemu-options.hx      |  9 ++++++---
>>   2 files changed, 17 insertions(+), 8 deletions(-)
>>
>> diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
>> index 8a09a5cc5f..0f157d859a 100644
>> --- a/docs/qcow2-cache.txt
>> +++ b/docs/qcow2-cache.txt
>> @@ -97,12 +97,15 @@ need:
>>      l2_cache_size = disk_size_GB * 131072
>>      refcount_cache_size = disk_size_GB * 32768
>>   
>> -QEMU has a default L2 cache of 1MB (1048576 bytes) and a refcount
>> -cache of 256KB (262144 bytes), so using the formulas we've just seen
>> -we have
>> +With the default cluster size, to cover each 8 GB of the virtual image
>> +size, 1MB of L2 cache is needed:
>>   
>> -   1048576 / 131072 = 8 GB of virtual disk covered by that cache
>> -    262144 /  32768 = 8 GB
>> +   65536 / 8 = 8192 = 8 GB / 1 MB
> 
> Where does this 65536 / 8 come from? The line that you changed follows
> directly from the formula immediately before that paragraph:
> 
>       l2_cache_size = disk_size_GB * 131072
> 
> that is
> 
>       l2_cache_size / 131072 = disk_size_GB
>             1048576 / 131072 = 8 GB
> 
> Berto
> 

How about the following (all the relevant section reproduced below, for 
a continuous readability):

""""""""""
Choosing the right cache sizes
------------------------------
In order to choose the cache sizes we need to know how they relate to
the amount of the allocated space.

The amount of virtual disk that can be mapped by the L2 and refcount
caches (in bytes) is:

    disk_size = l2_cache_size * cluster_size / 8
    disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits

With the default values for cluster_size (64KB) and refcount_bits
(16), this becomes:

    disk_size = l2_cache_size * 8192
    disk_size = refcount_cache_size * 32768

So in order to cover n GB of disk space with the default values we
need:

    l2_cache_size = disk_size_GB * 131072
    refcount_cache_size = disk_size_GB * 32768

For example, 1MB of L2 cache is needed to cover each 8 GB of the virtual
image size (given that the default cluster size is used):

    8 * 131072 = 1 MB

A default refcount cache is 4 times the cluster size, which defaults to
256 KB (262144 bytes). This is sufficient for 8 GB of image size:

    262144 / 32768 = 8 GB

""""""""""

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 4/9] qcow2: Avoid duplication in setting the refcount cache size
  2018-08-10 13:14   ` Alberto Garcia
@ 2018-08-11 18:40     ` Leonid Bloch
  0 siblings, 0 replies; 35+ messages in thread
From: Leonid Bloch @ 2018-08-11 18:40 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel; +Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake

On 8/10/18 4:14 PM, Alberto Garcia wrote:
> On Fri 10 Aug 2018 08:26:42 AM CEST, Leonid Bloch wrote:
>> The refcount cache size does not need to be set to its minimum value in
>> read_cache_sizes(), as it is set to at least its minimum value in
>> qcow2_update_options_prepare().
>>
>> Signed-off-by: Leonid Bloch <lbloch@janustech.com>
> 
>> -    } else {
>> -        if (!l2_cache_size_set) {
>> -            *l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
>> -                                 (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
>> -                                 * s->cluster_size);
>> -        }
>> -        if (!refcount_cache_size_set) {
>> -            *refcount_cache_size = min_refcount_cache;
>> -        }
> 
> Since you're getting rid of the rest of the code later anyway, I think
> it's cleaner to only remove these last three lines here and leave the
> rest untouched. It makes the patch shorter and easier to read.

But a correct formatting is important, I think. In every patch individually.

> 
>> +    /* If refcount-cache-size is not specified, it will be set to minimum
>> +     * in qcow2_update_options_prepare(). No need to set it here. */
> 
> This is not quite correct, because apart from the "not specified" case,
> refcount_cache_size is also overridden in other circumstances: (a) the
> value is specified but is too low, or (b) the value is set indirectly
> via "cache-size" but the end result is too low[*].

Yes, I'll fix the comment. But I think that it's important that it 
remains, because someone who looks at the code does not necessarily 
looks at the commit message, and it may be puzzling that a minimal size 
is not enforced there.

How about the following:
"l2_cache_size and refcount_cache_size are ensured to have at least 
their minimum values in qcow2_update_options_prepare()"

> 
> Also, the same thing happens to l2-cache-size: if you set it manually
> but it's too low then it will be overridden.
> 
> I'd personally remove the comment altogether, I think the explanation in
> the commit message is enough.
> 
> Berto
> 
> [*] Now that I think of it: if you set e.g. cache-size = 16M and
>      l2-cache-size = 16MB then you'll end up with refcount-cache-size =
>      16M - 16M = 0, which will be overridden and set to the minimum.
> 
>      But then refcount-cache-size + l2-cache-size > cache-size

Yes, I have noticed this behavior, and I think that it is OK: the errors 
should be a response to a wrong input of the user, who does not 
necessarily understands the internals. So if a user inputs l2-cache-size 
which is larger than cache-size, that's obviously a mistake. But if they 
are equal, it's totally OK if QEMU will use some 256 or 128 KB for the 
minimal caches transparently. This is really negligible relatively to 
the total QEMU overhead.

Now the error message is: "l2-cache-size may not exceed cache-size".
Is it reasonable to make it: "l2-cache-size may not exceed the size of 
cache-size minus the minimal refcount cache size" (or similar)? :)

> 
>      So perhaps this should produce an error, and it may make sense to
>      take the opportunity to move all the logic that ensures that
>      MIN_SIZE <= size <= INT_MAX to read_cache_sizes(). But that's a
>      possible task for the future and I wouldn't worry about it in this
>      series.
> 

Yeah, but it would be a good idea to move *all* the size-determining 
logic in one function.

Leonid.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 5/9] qcow2: Assign the L2 cache relatively to the image size
  2018-08-10 14:39   ` Alberto Garcia
@ 2018-08-11 19:19     ` Leonid Bloch
  2018-08-13 11:33       ` Kevin Wolf
  0 siblings, 1 reply; 35+ messages in thread
From: Leonid Bloch @ 2018-08-11 19:19 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel; +Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake

On 8/10/18 5:39 PM, Alberto Garcia wrote:
> On Fri 10 Aug 2018 08:26:43 AM CEST, Leonid Bloch wrote:
>> Previously, the L2 cache was allocated without considering the image
>> size, and an option existed to manually determine its size.
> 
> This is not quite correct: the L2 cache was set to the maximum needed
> for the image when "cache-size" was large enough:
> 
>          if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
>                  *l2_cache_size = max_l2_cache;
>          } ...
> 
> See below for an example.

You're right. I'll fix that.

> 
>>           } else {
>> -            uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
>> -            uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
>> -
>>               /* Assign as much memory as possible to the L2 cache, and
>>                * use the remainder for the refcount cache */
>> -            if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
>> -                *l2_cache_size = max_l2_cache;
>> +            if (combined_cache_size >= *l2_cache_size + min_refcount_cache) {
> 
> This has an (unintended?) side effect:
> 
> If you have a 1TB qcow2 image and open it with e.g. cache-size=200M,
> with the previous code you would get a 128MB L2 cache (max_l2_cache).
> With the new code you only get 32MB.

Good catch!! Thanks!

> 
> The rest of the function looks good to me now. We're getting close :)
> 
>> - - The default L2 cache size is 8 clusters or 1MB (whichever is more),
>> -   and the minimum is 2 clusters (or 2 cache entries, see below).
>> + - The default L2 cache size will cover the entire virtual size of an
>> +   image, up to a certain maximum. This maximum is 1 MB by default
>> +   (enough for image sizes of up to 8 GB with the default cluster size)
>> +   and it can be reduced or enlarged using the "l2-cache-size" option.
>> +   The minimum is 2 clusters (or 2 cache entries, see below).
> 
> It's not clear with this wording if this "certain maximum" refers to the
> cache size or the image size.

"This maximum is 1 MB by default (enough for image sizes of up to 8 GB 
..." - to me it's quite clear that it speaks of the image size. But I 
guess I can change the wording.

> 
> I'm thinking that perhaps it's clearer if you leave that item unchanged
> and add a new one below, something like:

I can't leave it unchanged for two reasons: (1) "8 clusters or 1MB 
(whichever is more)" (2) "default" - now it is not default, but maximal.
OK, I'll change it to be more clear, incorporating your suggestion from 
below.

> 
> - The L2 cache size setting (regardless of whether it takes the default
>    value or it's set by the user) refers to the maximum size of the L2
>    cache. If QEMU needs less memory to hold all of the image's L2 tables,
>    then that's the actual size of the cache that will be allocated. >
>>    - If the L2 cache is big enough to hold all of the image's L2 tables
>> -   (as explained in the "Choosing the right cache sizes" section
>> -   earlier in this document) then none of this is necessary and you
>> -   can omit the "l2-cache-entry-size" parameter altogether.
>> +   (the default behavior for images of up to 8 GB in size) then none
>> +   of this is necessary and you can omit the "l2-cache-entry-size"
>> +   parameter altogether.
> 
> I think this change is unnecessary :-?

Maybe I'll just mention the default here? Or refer to the section which 
speaks of the defaults ('as explained in the "Choosing the right cache 
sizes" and "How to configure the cache sizes" sections ...")?

> 
>>   @item refcount-cache-size
>>   The maximum size of the refcount block cache in bytes
>> diff --git a/tests/qemu-iotests/137 b/tests/qemu-iotests/137
>> index 87965625d8..e3fb078588 100755
>> --- a/tests/qemu-iotests/137
>> +++ b/tests/qemu-iotests/137
>> @@ -109,7 +109,6 @@ $QEMU_IO \
>>       -c "reopen -o cache-size=1M,l2-cache-size=64k,refcount-cache-size=64k" \
>>       -c "reopen -o cache-size=1M,l2-cache-size=2M" \
>>       -c "reopen -o cache-size=1M,refcount-cache-size=2M" \
>> -    -c "reopen -o l2-cache-size=256T" \
> 
> The "L2 cache size too big" error can still be tested, but you will need
> to create an image large enough to allow such a big cache.
> 
> $ qemu-img create -f qcow2 -o cluster_size=256k hd.qcow2 32P
> $ $QEMU -drive file=hd.qcow2,l2-cache-entry-size=512,l2-cache-size=1T

* 32P qcow2 will take 33M - is it OK to create it just for a test?
* Is it worth to create a special test scenario, with a separate image 
creation, just for that case?

Leonid.

> 
> Berto
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
  2018-08-10 12:00   ` Alberto Garcia
@ 2018-08-13  1:39     ` Max Reitz
  2018-08-13  6:09       ` Leonid Bloch
  2018-08-13 11:23       ` Kevin Wolf
  0 siblings, 2 replies; 35+ messages in thread
From: Max Reitz @ 2018-08-13  1:39 UTC (permalink / raw)
  To: Alberto Garcia, Leonid Bloch, qemu-devel
  Cc: qemu-block, Kevin Wolf, Eric Blake

[-- Attachment #1: Type: text/plain, Size: 5427 bytes --]

On 2018-08-10 14:00, Alberto Garcia wrote:
> On Fri 10 Aug 2018 08:26:44 AM CEST, Leonid Bloch wrote:
>> The upper limit on the L2 cache size is increased from 1 MB to 32 MB.
>> This is done in order to allow default full coverage with the L2 cache
>> for images of up to 256 GB in size (was 8 GB). Note, that only the
>> needed amount to cover the full image is allocated. The value which is
>> changed here is just the upper limit on the L2 cache size, beyond which
>> it will not grow, even if the size of the image will require it to.
>>
>> Signed-off-by: Leonid Bloch <lbloch@janustech.com>
> 
> Reviewed-by: Alberto Garcia <berto@igalia.com>
> 
>> -#define DEFAULT_L2_CACHE_MAX_SIZE (1 * MiB)
>> +#define DEFAULT_L2_CACHE_MAX_SIZE (32 * MiB)
> 
> The patch looks perfect to me now and I'm fine with this change, but
> this is quite an increase from the previous default value. If anyone
> thinks that this is too aggressive (or too little :)) I'm all ears.

This is just noise from the sidelines (so nothing too serious), but
anyway, I don't like it very much.

My first point is that the old limit doesn't mean you can only use 8 GB
qcow2 images.  You can use more, you just can't access more than 8 GB
randomly.  I know I'm naive, but I think that the number of use cases
where you need random IOPS spread out over more than 8 GB of an image
are limited.

My second point is that qemu still allocated 128 MB of RAM by default.
Using 1/4th of that for every qcow2 image you attach to the VM seems a
bit much.

Now it gets a bit complicated.  This series makes cache-clean-interval
default to 10 minutes, so it shouldn't be an issue in practice.  But one
thing to note is that this is a Linux-specific feature, so on every
other system, this really means 32 MB per image.  (Also, 10 minutes
means that whenever I boot up my VM with a couple of disks with random
accesses all over the images during boot, I might end up using 32 MB per
image again (for 10 min), even though I don't really need that performance.)

Now if we really rely on that cache-clean-interval, why not make it
always cover the whole image by default?  I don't really see why we
should now say "256 GB seems reasonable, and 32 MB doesn't sound like
too much, let's go there".  (Well, OK, I do see how you end up using 32
MB as basically a safety margin, where you'd say that anything above it
is just unreasonable.)

Do we update the limit in a couple of years again because people have
more RAM and larger disks then?  (Maybe we do?)

My personal opinion is this: Most users should be fine with 8 GB of
randomly accessible image space (this may be wrong).  Whenever a user
does have an application that uses more than 8 GB, they are probably in
an area where they want to do some performance tuning anyway.  Requiring
them to set l2-cache-full in that case seems reasonable to me.  Pushing
the default to 256 GB to me looks a bit like just letting them run into
the problem later.  It doesn't solve the issue that you need to do some
performance tuning if you have a bit of a special use case (but maybe
I'm wrong and accessing more than 8 GB randomly is what everybody does
with their VMs).

(Maybe it's even a good thing to limit it to a smaller number so users
run into the issue sooner than later...)

OTOH, this change means that everyone on a non-Linux system will have to
use 32 MB of their RAM per qcow2 image, and everyone on a Linux system
will potentially use it e.g. during boot when you do access a lot
randomly (even though the performance usually is not of utmost
importance then (important, but not extremely so)).  But then again,
this will probably only affect a single disk (the one with the OS on
it), so it won't be too bad.

So my stance is:

(1) Is it really worth pushing the default to 256 GB if you probably
have to do a bit of performance tuning anyway when you get past 8 GB
random IOPS?  I think it's reasonable to ask users to use l2-cache-full
or adjust the cache to their needs.

(2) For non-Linux systems, this seems to really mean 32 MB of RAM per
qcow2 image.  That's 1/4th of default VM RAM.  Is that worth it?

(3) For Linux, I don't like it much either, but that's because I'm
stupid.  The fact that if you don't need this much random I/O only your
boot disk may cause a RAM usage spike, and even then it's going to go
down after 10 minutes, is probably enough to justify this change.

I suppose my moaning would subside if we only increased the default on
systems that actually support cache-clean-interval...?

Max

PS: I also don't quite like how you got to the default of 10 minutes of
the cache-clean-interval.  You can't justify using 32 MB as the default
cache size by virtue of "We have a cache-clean-interval now", and then
justify a CCI of 10 min by "It's just for VMs which sit idle".

No.  If you rely on CCI to be there to make the cache size reasonable by
default for whatever the user is doing with their images, you have to
consider that fact when choosing a CCI.

Ideally we'd probably want a soft and a hard cache limit, but I don't
know...

(Like, a soft cache limit of 1 MB with a CCI of 10 min, and a hard cache
limit of 32 MB with a CCI of 1 min by default.  So whenever your cache
uses more than 1 MB of RAM, your CCI is 1 min, and whenever it's below,
your CCI is 10 min.)

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
  2018-08-13  1:39     ` Max Reitz
@ 2018-08-13  6:09       ` Leonid Bloch
  2018-08-13 15:16         ` Max Reitz
  2018-08-13 11:23       ` Kevin Wolf
  1 sibling, 1 reply; 35+ messages in thread
From: Leonid Bloch @ 2018-08-13  6:09 UTC (permalink / raw)
  To: Max Reitz, Alberto Garcia, qemu-devel; +Cc: qemu-block, Kevin Wolf, Eric Blake



On August 13, 2018 4:39:35 AM EEST, Max Reitz <mreitz@redhat.com> wrote:
>On 2018-08-10 14:00, Alberto Garcia wrote:
>> On Fri 10 Aug 2018 08:26:44 AM CEST, Leonid Bloch wrote:
>>> The upper limit on the L2 cache size is increased from 1 MB to 32
>MB.
>>> This is done in order to allow default full coverage with the L2
>cache
>>> for images of up to 256 GB in size (was 8 GB). Note, that only the
>>> needed amount to cover the full image is allocated. The value which
>is
>>> changed here is just the upper limit on the L2 cache size, beyond
>which
>>> it will not grow, even if the size of the image will require it to.
>>>
>>> Signed-off-by: Leonid Bloch <lbloch@janustech.com>
>> 
>> Reviewed-by: Alberto Garcia <berto@igalia.com>
>> 
>>> -#define DEFAULT_L2_CACHE_MAX_SIZE (1 * MiB)
>>> +#define DEFAULT_L2_CACHE_MAX_SIZE (32 * MiB)
>> 
>> The patch looks perfect to me now and I'm fine with this change, but
>> this is quite an increase from the previous default value. If anyone
>> thinks that this is too aggressive (or too little :)) I'm all ears.
>
>This is just noise from the sidelines (so nothing too serious), but
>anyway, I don't like it very much.
>
>My first point is that the old limit doesn't mean you can only use 8 GB
>qcow2 images.  You can use more, you just can't access more than 8 GB
>randomly.  I know I'm naive, but I think that the number of use cases
>where you need random IOPS spread out over more than 8 GB of an image
>are limited.
>
>My second point is that qemu still allocated 128 MB of RAM by default.
>Using 1/4th of that for every qcow2 image you attach to the VM seems a
>bit much.
>
>Now it gets a bit complicated.  This series makes cache-clean-interval
>default to 10 minutes, so it shouldn't be an issue in practice.  But
>one
>thing to note is that this is a Linux-specific feature, so on every
>other system, this really means 32 MB per image.  (Also, 10 minutes
>means that whenever I boot up my VM with a couple of disks with random
>accesses all over the images during boot, I might end up using 32 MB
>per
>image again (for 10 min), even though I don't really need that
>performance.)
>
>Now if we really rely on that cache-clean-interval, why not make it
>always cover the whole image by default?  I don't really see why we
>should now say "256 GB seems reasonable, and 32 MB doesn't sound like
>too much, let's go there".  (Well, OK, I do see how you end up using 32
>MB as basically a safety margin, where you'd say that anything above it
>is just unreasonable.)
>
>Do we update the limit in a couple of years again because people have
>more RAM and larger disks then?  (Maybe we do?)
>
>My personal opinion is this: Most users should be fine with 8 GB of
>randomly accessible image space (this may be wrong).  Whenever a user
>does have an application that uses more than 8 GB, they are probably in
>an area where they want to do some performance tuning anyway. 
>Requiring
>them to set l2-cache-full in that case seems reasonable to me.  Pushing
>the default to 256 GB to me looks a bit like just letting them run into
>the problem later.  It doesn't solve the issue that you need to do some
>performance tuning if you have a bit of a special use case (but maybe
>I'm wrong and accessing more than 8 GB randomly is what everybody does
>with their VMs).
>
>(Maybe it's even a good thing to limit it to a smaller number so users
>run into the issue sooner than later...)
>
>OTOH, this change means that everyone on a non-Linux system will have
>to
>use 32 MB of their RAM per qcow2 image, and everyone on a Linux system
>will potentially use it e.g. during boot when you do access a lot
>randomly (even though the performance usually is not of utmost
>importance then (important, but not extremely so)).  But then again,
>this will probably only affect a single disk (the one with the OS on
>it), so it won't be too bad.
>
>So my stance is:
>
>(1) Is it really worth pushing the default to 256 GB if you probably
>have to do a bit of performance tuning anyway when you get past 8 GB
>random IOPS?  I think it's reasonable to ask users to use l2-cache-full
>or adjust the cache to their needs.
>
>(2) For non-Linux systems, this seems to really mean 32 MB of RAM per
>qcow2 image.  That's 1/4th of default VM RAM.  Is that worth it?
>
>(3) For Linux, I don't like it much either, but that's because I'm
>stupid.  The fact that if you don't need this much random I/O only your
>boot disk may cause a RAM usage spike, and even then it's going to go
>down after 10 minutes, is probably enough to justify this change.
>
>
>I suppose my moaning would subside if we only increased the default on
>systems that actually support cache-clean-interval...?
>
>Max
>
>
>PS: I also don't quite like how you got to the default of 10 minutes of
>the cache-clean-interval.  You can't justify using 32 MB as the default
>cache size by virtue of "We have a cache-clean-interval now", and then
>justify a CCI of 10 min by "It's just for VMs which sit idle".
>
>No.  If you rely on CCI to be there to make the cache size reasonable
>by
>default for whatever the user is doing with their images, you have to
>consider that fact when choosing a CCI.
>
>Ideally we'd probably want a soft and a hard cache limit, but I don't
>know...
>
>(Like, a soft cache limit of 1 MB with a CCI of 10 min, and a hard
>cache
>limit of 32 MB with a CCI of 1 min by default.  So whenever your cache
>uses more than 1 MB of RAM, your CCI is 1 min, and whenever it's below,
>your CCI is 10 min.)

Max, thanks for your insight. Indeed some good points.
Considering this, I'm thinking to set the limit to 16 MB, and the CCI to 5 min. What do you think?
Modern Windows installations should gain performance from being able to random I/O to >8 GB chunks, and data processing tasks where each data set is 8+ GB for sure do (did benchmarks). And the maximum is only ever used if (a) the image is large enough and (b) it is indeed used.
While taking 256 GB images as the "limit" can be considered an overshoot, 128 GB is quite reasonable, I think.

Your idea with "soft" and "hard" limits is great! I'm tempted to implement this. Say 4 MB with 10 min., and 16 MB with 5 min?

Leonid.

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
  2018-08-13  1:39     ` Max Reitz
  2018-08-13  6:09       ` Leonid Bloch
@ 2018-08-13 11:23       ` Kevin Wolf
  2018-08-13 15:11         ` Max Reitz
  1 sibling, 1 reply; 35+ messages in thread
From: Kevin Wolf @ 2018-08-13 11:23 UTC (permalink / raw)
  To: Max Reitz
  Cc: Alberto Garcia, Leonid Bloch, qemu-devel, qemu-block, Eric Blake

[-- Attachment #1: Type: text/plain, Size: 8483 bytes --]

Am 13.08.2018 um 03:39 hat Max Reitz geschrieben:
> On 2018-08-10 14:00, Alberto Garcia wrote:
> > On Fri 10 Aug 2018 08:26:44 AM CEST, Leonid Bloch wrote:
> >> The upper limit on the L2 cache size is increased from 1 MB to 32 MB.
> >> This is done in order to allow default full coverage with the L2 cache
> >> for images of up to 256 GB in size (was 8 GB). Note, that only the
> >> needed amount to cover the full image is allocated. The value which is
> >> changed here is just the upper limit on the L2 cache size, beyond which
> >> it will not grow, even if the size of the image will require it to.
> >>
> >> Signed-off-by: Leonid Bloch <lbloch@janustech.com>
> > 
> > Reviewed-by: Alberto Garcia <berto@igalia.com>
> > 
> >> -#define DEFAULT_L2_CACHE_MAX_SIZE (1 * MiB)
> >> +#define DEFAULT_L2_CACHE_MAX_SIZE (32 * MiB)
> > 
> > The patch looks perfect to me now and I'm fine with this change, but
> > this is quite an increase from the previous default value. If anyone
> > thinks that this is too aggressive (or too little :)) I'm all ears.
> 
> This is just noise from the sidelines (so nothing too serious), but
> anyway, I don't like it very much.
> 
> My first point is that the old limit doesn't mean you can only use 8 GB
> qcow2 images.  You can use more, you just can't access more than 8 GB
> randomly.  I know I'm naive, but I think that the number of use cases
> where you need random IOPS spread out over more than 8 GB of an image
> are limited.

I think I can see use cases for databases that are spead across more
than 8 GB. But you're right, it's a tradeoff and users can always
increase the cache size in theory if they need more performance. But
then, they can also decrease the cache size if they need more memory.

Let's be honest: While qcow2 does have some room for functional
improvements, it mostly has an image problem, which comes from the fact
that there are cases where performance drops drastically. Benchmarks are
a very important use case and they do random I/O over more than 8 GB.

Not properly supporting such cases out-of-the-box is the reason why
people are requesting that we add features to raw images even if they
require on-disk metadata. If we want to avoid this kind of nonsense, we
need to improve the out-of-the-box experience with qcow2.

> My second point is that qemu still allocated 128 MB of RAM by default.
> Using 1/4th of that for every qcow2 image you attach to the VM seems a
> bit much.

Well, that's more because 128 MB is ridiculously low today and you won't
be able to run any recent guest without overriding the default.

> Now it gets a bit complicated.  This series makes cache-clean-interval
> default to 10 minutes, so it shouldn't be an issue in practice.  But one
> thing to note is that this is a Linux-specific feature, so on every
> other system, this really means 32 MB per image.

That's a bit inaccurate in this generality: On non-Linux, it means 32 MB
per fully accessed image with a size >= 256 GB.

> (Also, 10 minutes means that whenever I boot up my VM with a couple of
> disks with random accesses all over the images during boot, I might
> end up using 32 MB per image again (for 10 min), even though I don't
> really need that performance.)

If your system files are fragmented in a way that a boot will access
every 512 MB chunk in a 256 GB disk, you should seriously think about
fixing that...

This is a pathological case that shouldn't define our defaults. Random
I/O over 256 GB is really a pathological case, too, but people are
likely to actually test it. They aren't going to systematically test a
horribly fragmented system that wouldn't happen in reality.

> Now if we really rely on that cache-clean-interval, why not make it
> always cover the whole image by default?  I don't really see why we
> should now say "256 GB seems reasonable, and 32 MB doesn't sound like
> too much, let's go there".  (Well, OK, I do see how you end up using 32
> MB as basically a safety margin, where you'd say that anything above it
> is just unreasonable.)
> 
> Do we update the limit in a couple of years again because people have
> more RAM and larger disks then?  (Maybe we do?)

Possibly. I see those defaults as values that we can adjust to reality
whenever we think the old values don't reflect the important cases well
enough any more.

> My personal opinion is this: Most users should be fine with 8 GB of
> randomly accessible image space (this may be wrong).  Whenever a user
> does have an application that uses more than 8 GB, they are probably in
> an area where they want to do some performance tuning anyway.  Requiring
> them to set l2-cache-full in that case seems reasonable to me.

In principle, I'd agree. I'd even say that management tools should
always explicitly set those options instead of relying on our defaults.
But management tools have been ignoring these options for a long time
and keep doing so.

And honestly, if you can't spend a few megabytes for the caches, it's
just as reasonable that you should set l2-cache to a lower value. You'll
need some more tweaking anyway to reduce the memory footprint.

> Pushing the default to 256 GB to me looks a bit like just letting them
> run into the problem later.  It doesn't solve the issue that you need
> to do some performance tuning if you have a bit of a special use case
> (but maybe I'm wrong and accessing more than 8 GB randomly is what
> everybody does with their VMs).
> 
> (Maybe it's even a good thing to limit it to a smaller number so users
> run into the issue sooner than later...)

Definitely not when their management tool doesn't give them the option
of changing the value.

Being slow makes qcow2 look really bad. In contrast, I don't think I've
ever heard anyone complain about memory usage of qcow2. Our choice of a
default should reflect that, especially considering that we only use
the memory on demand. If your image is only 32 GB, you'll never use more
than 4 MB of cache. And if your image is huge, but only access part of
it, we also won't use the full 32 MB.

> OTOH, this change means that everyone on a non-Linux system will have to
> use 32 MB of their RAM per qcow2 image, and everyone on a Linux system
> will potentially use it e.g. during boot when you do access a lot
> randomly (even though the performance usually is not of utmost
> importance then (important, but not extremely so)).  But then again,
> this will probably only affect a single disk (the one with the OS on
> it), so it won't be too bad.
> 
> So my stance is:
> 
> (1) Is it really worth pushing the default to 256 GB if you probably
> have to do a bit of performance tuning anyway when you get past 8 GB
> random IOPS?  I think it's reasonable to ask users to use l2-cache-full
> or adjust the cache to their needs.
> 
> (2) For non-Linux systems, this seems to really mean 32 MB of RAM per
> qcow2 image.  That's 1/4th of default VM RAM.  Is that worth it?
> 
> (3) For Linux, I don't like it much either, but that's because I'm
> stupid.  The fact that if you don't need this much random I/O only your
> boot disk may cause a RAM usage spike, and even then it's going to go
> down after 10 minutes, is probably enough to justify this change.
> 
> 
> I suppose my moaning would subside if we only increased the default on
> systems that actually support cache-clean-interval...?
> 
> Max
> 
> 
> PS: I also don't quite like how you got to the default of 10 minutes of
> the cache-clean-interval.  You can't justify using 32 MB as the default
> cache size by virtue of "We have a cache-clean-interval now", and then
> justify a CCI of 10 min by "It's just for VMs which sit idle".
> 
> No.  If you rely on CCI to be there to make the cache size reasonable by
> default for whatever the user is doing with their images, you have to
> consider that fact when choosing a CCI.
> 
> Ideally we'd probably want a soft and a hard cache limit, but I don't
> know...
> 
> (Like, a soft cache limit of 1 MB with a CCI of 10 min, and a hard cache
> limit of 32 MB with a CCI of 1 min by default.  So whenever your cache
> uses more than 1 MB of RAM, your CCI is 1 min, and whenever it's below,
> your CCI is 10 min.)

I've actually thought of something like this before, too. Maybe we
should do that. But that can be done on top of this series.

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 5/9] qcow2: Assign the L2 cache relatively to the image size
  2018-08-11 19:19     ` Leonid Bloch
@ 2018-08-13 11:33       ` Kevin Wolf
  2018-08-13 11:48         ` Leonid Bloch
  0 siblings, 1 reply; 35+ messages in thread
From: Kevin Wolf @ 2018-08-13 11:33 UTC (permalink / raw)
  To: Leonid Bloch
  Cc: Alberto Garcia, qemu-devel, qemu-block, Max Reitz, Eric Blake

Am 11.08.2018 um 21:19 hat Leonid Bloch geschrieben:
> > >   @item refcount-cache-size
> > >   The maximum size of the refcount block cache in bytes
> > > diff --git a/tests/qemu-iotests/137 b/tests/qemu-iotests/137
> > > index 87965625d8..e3fb078588 100755
> > > --- a/tests/qemu-iotests/137
> > > +++ b/tests/qemu-iotests/137
> > > @@ -109,7 +109,6 @@ $QEMU_IO \
> > >       -c "reopen -o cache-size=1M,l2-cache-size=64k,refcount-cache-size=64k" \
> > >       -c "reopen -o cache-size=1M,l2-cache-size=2M" \
> > >       -c "reopen -o cache-size=1M,refcount-cache-size=2M" \
> > > -    -c "reopen -o l2-cache-size=256T" \
> > 
> > The "L2 cache size too big" error can still be tested, but you will need
> > to create an image large enough to allow such a big cache.
> > 
> > $ qemu-img create -f qcow2 -o cluster_size=256k hd.qcow2 32P
> > $ $QEMU -drive file=hd.qcow2,l2-cache-entry-size=512,l2-cache-size=1T
> 
> * 32P qcow2 will take 33M - is it OK to create it just for a test?
> * Is it worth to create a special test scenario, with a separate image
> creation, just for that case?

We're creating larger images than that during tests, so I think this is
fine. You don't have to create a new separate test file or anything,
just increase the size of the used test image from 64M to 32P or
whatever is necessary.

Kevin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 5/9] qcow2: Assign the L2 cache relatively to the image size
  2018-08-13 11:33       ` Kevin Wolf
@ 2018-08-13 11:48         ` Leonid Bloch
  0 siblings, 0 replies; 35+ messages in thread
From: Leonid Bloch @ 2018-08-13 11:48 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Alberto Garcia, qemu-devel, qemu-block, Max Reitz, Eric Blake

On 8/13/18 2:33 PM, Kevin Wolf wrote:
> Am 11.08.2018 um 21:19 hat Leonid Bloch geschrieben:
>>>>    @item refcount-cache-size
>>>>    The maximum size of the refcount block cache in bytes
>>>> diff --git a/tests/qemu-iotests/137 b/tests/qemu-iotests/137
>>>> index 87965625d8..e3fb078588 100755
>>>> --- a/tests/qemu-iotests/137
>>>> +++ b/tests/qemu-iotests/137
>>>> @@ -109,7 +109,6 @@ $QEMU_IO \
>>>>        -c "reopen -o cache-size=1M,l2-cache-size=64k,refcount-cache-size=64k" \
>>>>        -c "reopen -o cache-size=1M,l2-cache-size=2M" \
>>>>        -c "reopen -o cache-size=1M,refcount-cache-size=2M" \
>>>> -    -c "reopen -o l2-cache-size=256T" \
>>>
>>> The "L2 cache size too big" error can still be tested, but you will need
>>> to create an image large enough to allow such a big cache.
>>>
>>> $ qemu-img create -f qcow2 -o cluster_size=256k hd.qcow2 32P
>>> $ $QEMU -drive file=hd.qcow2,l2-cache-entry-size=512,l2-cache-size=1T
>>
>> * 32P qcow2 will take 33M - is it OK to create it just for a test?
>> * Is it worth to create a special test scenario, with a separate image
>> creation, just for that case?
> 
> We're creating larger images than that during tests, so I think this is
> fine. You don't have to create a new separate test file or anything,
> just increase the size of the used test image from 64M to 32P or
> whatever is necessary.

OK, sure, will do in v9.

Leonid.

> 
> Kevin
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
  2018-08-13 11:23       ` Kevin Wolf
@ 2018-08-13 15:11         ` Max Reitz
  2018-08-13 15:58           ` Kevin Wolf
  0 siblings, 1 reply; 35+ messages in thread
From: Max Reitz @ 2018-08-13 15:11 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Alberto Garcia, Leonid Bloch, qemu-devel, qemu-block, Eric Blake

[-- Attachment #1: Type: text/plain, Size: 10986 bytes --]

On 2018-08-13 13:23, Kevin Wolf wrote:
> Am 13.08.2018 um 03:39 hat Max Reitz geschrieben:
>> On 2018-08-10 14:00, Alberto Garcia wrote:
>>> On Fri 10 Aug 2018 08:26:44 AM CEST, Leonid Bloch wrote:
>>>> The upper limit on the L2 cache size is increased from 1 MB to 32 MB.
>>>> This is done in order to allow default full coverage with the L2 cache
>>>> for images of up to 256 GB in size (was 8 GB). Note, that only the
>>>> needed amount to cover the full image is allocated. The value which is
>>>> changed here is just the upper limit on the L2 cache size, beyond which
>>>> it will not grow, even if the size of the image will require it to.
>>>>
>>>> Signed-off-by: Leonid Bloch <lbloch@janustech.com>
>>>
>>> Reviewed-by: Alberto Garcia <berto@igalia.com>
>>>
>>>> -#define DEFAULT_L2_CACHE_MAX_SIZE (1 * MiB)
>>>> +#define DEFAULT_L2_CACHE_MAX_SIZE (32 * MiB)
>>>
>>> The patch looks perfect to me now and I'm fine with this change, but
>>> this is quite an increase from the previous default value. If anyone
>>> thinks that this is too aggressive (or too little :)) I'm all ears.
>>
>> This is just noise from the sidelines (so nothing too serious), but
>> anyway, I don't like it very much.
>>
>> My first point is that the old limit doesn't mean you can only use 8 GB
>> qcow2 images.  You can use more, you just can't access more than 8 GB
>> randomly.  I know I'm naive, but I think that the number of use cases
>> where you need random IOPS spread out over more than 8 GB of an image
>> are limited.
> 
> I think I can see use cases for databases that are spead across more
> than 8 GB.

Sure, there are use cases.  But that's not quite the general case, that
was my point.

>            But you're right, it's a tradeoff and users can always
> increase the cache size in theory if they need more performance. But
> then, they can also decrease the cache size if they need more memory.

True.  But the issue here is: When your disk performance drops, you are
likely to look into what causes your disk to be slow.  Maybe you're lazy
and switch to raw.  Maybe you aren't and discover that the cache may be
an issue, so you adjust those options to your needs.

When your RAM runs low, at least I would never think of some disk image
cache, to be honest.  So I probably would either not use qemu or
increase my swap size.

> Let's be honest: While qcow2 does have some room for functional
> improvements, it mostly has an image problem, which comes from the fact
> that there are cases where performance drops drastically. Benchmarks are
> a very important use case and they do random I/O over more than 8 GB.

As long as it's our benchmarks, setting the right options is easy. O:-)

> Not properly supporting such cases out-of-the-box is the reason why
> people are requesting that we add features to raw images even if they
> require on-disk metadata. If we want to avoid this kind of nonsense, we
> need to improve the out-of-the-box experience with qcow2.

Reasonable indeed.

>> My second point is that qemu still allocated 128 MB of RAM by default.
>> Using 1/4th of that for every qcow2 image you attach to the VM seems a
>> bit much.
> 
> Well, that's more because 128 MB is ridiculously low today and you won't
> be able to run any recent guest without overriding the default.

I'm running my L4Linux just fine over here! O:-)

My point here was -- if the default RAM size is as low as it is (and
nobody seems to want to increase it), does it make sense if we try to
increase our defaults?

I suppose you could say that not adjusting the RAM default is a bad
decision, but it's not our decision, so there's nothing we can do about
that.

I suppose you could also say that adjusting the RAM size is easier than
adjusting the qcow2 cache size.

So, yeah.  True.

>> Now it gets a bit complicated.  This series makes cache-clean-interval
>> default to 10 minutes, so it shouldn't be an issue in practice.  But one
>> thing to note is that this is a Linux-specific feature, so on every
>> other system, this really means 32 MB per image.
> 
> That's a bit inaccurate in this generality: On non-Linux, it means 32 MB
> per fully accessed image with a size >= 256 GB.
> 
>> (Also, 10 minutes means that whenever I boot up my VM with a couple of
>> disks with random accesses all over the images during boot, I might
>> end up using 32 MB per image again (for 10 min), even though I don't
>> really need that performance.)
> 
> If your system files are fragmented in a way that a boot will access
> every 512 MB chunk in a 256 GB disk, you should seriously think about
> fixing that...
> 
> This is a pathological case that shouldn't define our defaults. Random
> I/O over 256 GB is really a pathological case, too, but people are
> likely to actually test it. They aren't going to systematically test a
> horribly fragmented system that wouldn't happen in reality.

True.

>> Now if we really rely on that cache-clean-interval, why not make it
>> always cover the whole image by default?  I don't really see why we
>> should now say "256 GB seems reasonable, and 32 MB doesn't sound like
>> too much, let's go there".  (Well, OK, I do see how you end up using 32
>> MB as basically a safety margin, where you'd say that anything above it
>> is just unreasonable.)
>>
>> Do we update the limit in a couple of years again because people have
>> more RAM and larger disks then?  (Maybe we do?)
> 
> Possibly. I see those defaults as values that we can adjust to reality
> whenever we think the old values don't reflect the important cases well
> enough any more.

OK.

>> My personal opinion is this: Most users should be fine with 8 GB of
>> randomly accessible image space (this may be wrong).  Whenever a user
>> does have an application that uses more than 8 GB, they are probably in
>> an area where they want to do some performance tuning anyway.  Requiring
>> them to set l2-cache-full in that case seems reasonable to me.
> 
> In principle, I'd agree. I'd even say that management tools should
> always explicitly set those options instead of relying on our defaults.
> But management tools have been ignoring these options for a long time
> and keep doing so.
> 
> And honestly, if you can't spend a few megabytes for the caches, it's
> just as reasonable that you should set l2-cache to a lower value. You'll
> need some more tweaking anyway to reduce the memory footprint.

It isn't, because as I explained above, it is more reasonable to expect
people to find out about disk options because their disk performance is
abysmal than because their RAM is exhausted.

I would like to say "but it is nearly as reasonable", but I really don't
think so.

>> Pushing the default to 256 GB to me looks a bit like just letting them
>> run into the problem later.  It doesn't solve the issue that you need
>> to do some performance tuning if you have a bit of a special use case
>> (but maybe I'm wrong and accessing more than 8 GB randomly is what
>> everybody does with their VMs).
>>
>> (Maybe it's even a good thing to limit it to a smaller number so users
>> run into the issue sooner than later...)
> 
> Definitely not when their management tool doesn't give them the option
> of changing the value.

That is true.

> Being slow makes qcow2 look really bad. In contrast, I don't think I've
> ever heard anyone complain about memory usage of qcow2.

Yeah, because it never was an issue.  It might (in theory) become now.

Also note again that people might just not realize the memory usage is
due to qcow2.

>                                                         Our choice of a
> default should reflect that, especially considering that we only use
> the memory on demand. If your image is only 32 GB, you'll never use more
> than 4 MB of cache.

Well, OK, yes.  This is an especially important point when it really is
about hosts that have limited memory.  In those cases, users probably
won't run huge images anyway.

>                     And if your image is huge, but only access part of
> it, we also won't use the full 32 MB.

On Linux. O:-)

>> OTOH, this change means that everyone on a non-Linux system will have to
>> use 32 MB of their RAM per qcow2 image, and everyone on a Linux system
>> will potentially use it e.g. during boot when you do access a lot
>> randomly (even though the performance usually is not of utmost
>> importance then (important, but not extremely so)).  But then again,
>> this will probably only affect a single disk (the one with the OS on
>> it), so it won't be too bad.
>>
>> So my stance is:
>>
>> (1) Is it really worth pushing the default to 256 GB if you probably
>> have to do a bit of performance tuning anyway when you get past 8 GB
>> random IOPS?  I think it's reasonable to ask users to use l2-cache-full
>> or adjust the cache to their needs.
>>
>> (2) For non-Linux systems, this seems to really mean 32 MB of RAM per
>> qcow2 image.  That's 1/4th of default VM RAM.  Is that worth it?
>>
>> (3) For Linux, I don't like it much either, but that's because I'm
>> stupid.  The fact that if you don't need this much random I/O only your
>> boot disk may cause a RAM usage spike, and even then it's going to go
>> down after 10 minutes, is probably enough to justify this change.
>>
>>
>> I suppose my moaning would subside if we only increased the default on
>> systems that actually support cache-clean-interval...?

So it's good that you have calmed my nerves about how this might be
problematic on Linux systems (it isn't in practice, although I disagree
that people will find qcow2 to be the fault when their memory runs out),
but you haven't said anything about non-Linux systems.  I understand
that you don't care, but as I said here, this was my only substantial
concern anyway.

>> Max
>>
>>
>> PS: I also don't quite like how you got to the default of 10 minutes of
>> the cache-clean-interval.  You can't justify using 32 MB as the default
>> cache size by virtue of "We have a cache-clean-interval now", and then
>> justify a CCI of 10 min by "It's just for VMs which sit idle".
>>
>> No.  If you rely on CCI to be there to make the cache size reasonable by
>> default for whatever the user is doing with their images, you have to
>> consider that fact when choosing a CCI.
>>
>> Ideally we'd probably want a soft and a hard cache limit, but I don't
>> know...
>>
>> (Like, a soft cache limit of 1 MB with a CCI of 10 min, and a hard cache
>> limit of 32 MB with a CCI of 1 min by default.  So whenever your cache
>> uses more than 1 MB of RAM, your CCI is 1 min, and whenever it's below,
>> your CCI is 10 min.)
> 
> I've actually thought of something like this before, too. Maybe we
> should do that. But that can be done on top of this series.

Sure.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
  2018-08-13  6:09       ` Leonid Bloch
@ 2018-08-13 15:16         ` Max Reitz
  2018-08-13 16:00           ` Kevin Wolf
  0 siblings, 1 reply; 35+ messages in thread
From: Max Reitz @ 2018-08-13 15:16 UTC (permalink / raw)
  To: Leonid Bloch, Alberto Garcia, qemu-devel
  Cc: qemu-block, Kevin Wolf, Eric Blake

[-- Attachment #1: Type: text/plain, Size: 1495 bytes --]

On 2018-08-13 08:09, Leonid Bloch wrote:
> On August 13, 2018 4:39:35 AM EEST, Max Reitz <mreitz@redhat.com> wrote:

[...]

>> Ideally we'd probably want a soft and a hard cache limit, but I don't
>> know...
>>
>> (Like, a soft cache limit of 1 MB with a CCI of 10 min, and a hard
>> cache
>> limit of 32 MB with a CCI of 1 min by default.  So whenever your cache
>> uses more than 1 MB of RAM, your CCI is 1 min, and whenever it's below,
>> your CCI is 10 min.)
> 
> Max, thanks for your insight. Indeed some good points.
> Considering this, I'm thinking to set the limit to 16 MB, and the CCI to 5 min. What do you think?

I think it's good for a preliminary solution, and then later increase
the limit with the soft and hard limits.

OTOH, if we implement the soft/hard limits, it doesn't really matter
what default you choose now...

> Modern Windows installations should gain performance from being able to random I/O to >8 GB chunks, and data processing tasks where each data set is 8+ GB for sure do (did benchmarks). And the maximum is only ever used if (a) the image is large enough and (b) it is indeed used.
> While taking 256 GB images as the "limit" can be considered an overshoot, 128 GB is quite reasonable, I think.
> 
> Your idea with "soft" and "hard" limits is great! I'm tempted to implement this. Say 4 MB with 10 min., and 16 MB with 5 min?

32 MB and 2 or 3 min? :-)

If you do that, I'm fine with a plain default of 32 MB for now.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
  2018-08-13 15:11         ` Max Reitz
@ 2018-08-13 15:58           ` Kevin Wolf
  2018-08-13 16:08             ` Max Reitz
  2018-08-13 16:42             ` Leonid Bloch
  0 siblings, 2 replies; 35+ messages in thread
From: Kevin Wolf @ 2018-08-13 15:58 UTC (permalink / raw)
  To: Max Reitz
  Cc: Alberto Garcia, Leonid Bloch, qemu-devel, qemu-block, Eric Blake

[-- Attachment #1: Type: text/plain, Size: 3294 bytes --]

Am 13.08.2018 um 17:11 hat Max Reitz geschrieben:
> >> My personal opinion is this: Most users should be fine with 8 GB of
> >> randomly accessible image space (this may be wrong).  Whenever a user
> >> does have an application that uses more than 8 GB, they are probably in
> >> an area where they want to do some performance tuning anyway.  Requiring
> >> them to set l2-cache-full in that case seems reasonable to me.
> > 
> > In principle, I'd agree. I'd even say that management tools should
> > always explicitly set those options instead of relying on our defaults.
> > But management tools have been ignoring these options for a long time
> > and keep doing so.
> > 
> > And honestly, if you can't spend a few megabytes for the caches, it's
> > just as reasonable that you should set l2-cache to a lower value. You'll
> > need some more tweaking anyway to reduce the memory footprint.
> 
> It isn't, because as I explained above, it is more reasonable to expect
> people to find out about disk options because their disk performance is
> abysmal than because their RAM is exhausted.
> 
> I would like to say "but it is nearly as reasonable", but I really don't
> think so.

Maybe in a perfect world, finding the option when their disk performance
is abysmal is what users would do. In this world, they either just use
raw and scream for backing files and dirty bitmaps and whatnot for raw,
or they just directly go to some other hypervisor.

Realistically, the cache options don't exist. They are hard to discover
in the QEMU command line and management tools don't support them.

Conclusion: We're doomed to find a one-size-fits-all default that works
well in all common use cases, including benchmarks. We can try and make
it adapt to the situation, but we can't reasonably expect users to
manually override it.

> >                                                         Our choice of a
> > default should reflect that, especially considering that we only use
> > the memory on demand. If your image is only 32 GB, you'll never use more
> > than 4 MB of cache.
> 
> Well, OK, yes.  This is an especially important point when it really is
> about hosts that have limited memory.  In those cases, users probably
> won't run huge images anyway.
> 
> >                     And if your image is huge, but only access part of
> > it, we also won't use the full 32 MB.
> 
> On Linux. O:-)

No, on any system where qemu_try_blockalign() results in a COW zero
page.

The Linux-only addition is returning memory even after an access.

> So it's good that you have calmed my nerves about how this might be
> problematic on Linux systems (it isn't in practice, although I disagree
> that people will find qcow2 to be the fault when their memory runs out),
> but you haven't said anything about non-Linux systems.  I understand
> that you don't care, but as I said here, this was my only substantial
> concern anyway.

I don't actually think it's so bad to keep the cache permanently
allocated, but I wouldn't object to a lower default for non-Linux hosts
either. 1 MB may still be a little too low, 4 MB (covers up to 32 GB)
might be more adequate. My typical desktop VMs are larger than 8 GB, but
smaller than 32 GB.

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
  2018-08-13 15:16         ` Max Reitz
@ 2018-08-13 16:00           ` Kevin Wolf
  0 siblings, 0 replies; 35+ messages in thread
From: Kevin Wolf @ 2018-08-13 16:00 UTC (permalink / raw)
  To: Max Reitz
  Cc: Leonid Bloch, Alberto Garcia, qemu-devel, qemu-block, Eric Blake

[-- Attachment #1: Type: text/plain, Size: 1639 bytes --]

Am 13.08.2018 um 17:16 hat Max Reitz geschrieben:
> On 2018-08-13 08:09, Leonid Bloch wrote:
> > On August 13, 2018 4:39:35 AM EEST, Max Reitz <mreitz@redhat.com> wrote:
> 
> [...]
> 
> >> Ideally we'd probably want a soft and a hard cache limit, but I don't
> >> know...
> >>
> >> (Like, a soft cache limit of 1 MB with a CCI of 10 min, and a hard
> >> cache
> >> limit of 32 MB with a CCI of 1 min by default.  So whenever your cache
> >> uses more than 1 MB of RAM, your CCI is 1 min, and whenever it's below,
> >> your CCI is 10 min.)
> > 
> > Max, thanks for your insight. Indeed some good points.
> > Considering this, I'm thinking to set the limit to 16 MB, and the CCI to 5 min. What do you think?
> 
> I think it's good for a preliminary solution, and then later increase
> the limit with the soft and hard limits.
> 
> OTOH, if we implement the soft/hard limits, it doesn't really matter
> what default you choose now...
> 
> > Modern Windows installations should gain performance from being able to random I/O to >8 GB chunks, and data processing tasks where each data set is 8+ GB for sure do (did benchmarks). And the maximum is only ever used if (a) the image is large enough and (b) it is indeed used.
> > While taking 256 GB images as the "limit" can be considered an overshoot, 128 GB is quite reasonable, I think.
> > 
> > Your idea with "soft" and "hard" limits is great! I'm tempted to implement this. Say 4 MB with 10 min., and 16 MB with 5 min?
> 
> 32 MB and 2 or 3 min? :-)
> 
> If you do that, I'm fine with a plain default of 32 MB for now.

I would be happy with that.

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
  2018-08-13 15:58           ` Kevin Wolf
@ 2018-08-13 16:08             ` Max Reitz
  2018-08-13 16:24               ` Kevin Wolf
  2018-08-13 16:42             ` Leonid Bloch
  1 sibling, 1 reply; 35+ messages in thread
From: Max Reitz @ 2018-08-13 16:08 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Alberto Garcia, Leonid Bloch, qemu-devel, qemu-block, Eric Blake

[-- Attachment #1: Type: text/plain, Size: 4203 bytes --]

On 2018-08-13 17:58, Kevin Wolf wrote:
> Am 13.08.2018 um 17:11 hat Max Reitz geschrieben:
>>>> My personal opinion is this: Most users should be fine with 8 GB of
>>>> randomly accessible image space (this may be wrong).  Whenever a user
>>>> does have an application that uses more than 8 GB, they are probably in
>>>> an area where they want to do some performance tuning anyway.  Requiring
>>>> them to set l2-cache-full in that case seems reasonable to me.
>>>
>>> In principle, I'd agree. I'd even say that management tools should
>>> always explicitly set those options instead of relying on our defaults.
>>> But management tools have been ignoring these options for a long time
>>> and keep doing so.
>>>
>>> And honestly, if you can't spend a few megabytes for the caches, it's
>>> just as reasonable that you should set l2-cache to a lower value. You'll
>>> need some more tweaking anyway to reduce the memory footprint.
>>
>> It isn't, because as I explained above, it is more reasonable to expect
>> people to find out about disk options because their disk performance is
>> abysmal than because their RAM is exhausted.
>>
>> I would like to say "but it is nearly as reasonable", but I really don't
>> think so.
> 
> Maybe in a perfect world, finding the option when their disk performance
> is abysmal is what users would do. In this world, they either just use
> raw and scream for backing files and dirty bitmaps and whatnot for raw,
> or they just directly go to some other hypervisor.
> 
> Realistically, the cache options don't exist. They are hard to discover
> in the QEMU command line and management tools don't support them.
> 
> Conclusion: We're doomed to find a one-size-fits-all default that works
> well in all common use cases, including benchmarks. We can try and make
> it adapt to the situation, but we can't reasonably expect users to
> manually override it.

OK, saying both is unreasonable is something I can get behind.

>>>                                                         Our choice of a
>>> default should reflect that, especially considering that we only use
>>> the memory on demand. If your image is only 32 GB, you'll never use more
>>> than 4 MB of cache.
>>
>> Well, OK, yes.  This is an especially important point when it really is
>> about hosts that have limited memory.  In those cases, users probably
>> won't run huge images anyway.
>>
>>>                     And if your image is huge, but only access part of
>>> it, we also won't use the full 32 MB.
>>
>> On Linux. O:-)
> 
> No, on any system where qemu_try_blockalign() results in a COW zero
> page.

OK, yes, but why would you only ever access part of it?  Then you might
just as well have created a smaller disk from the beginning.

> The Linux-only addition is returning memory even after an access.
> 
>> So it's good that you have calmed my nerves about how this might be
>> problematic on Linux systems (it isn't in practice, although I disagree
>> that people will find qcow2 to be the fault when their memory runs out),
>> but you haven't said anything about non-Linux systems.  I understand
>> that you don't care, but as I said here, this was my only substantial
>> concern anyway.
> 
> I don't actually think it's so bad to keep the cache permanently
> allocated, but I wouldn't object to a lower default for non-Linux hosts
> either. 1 MB may still be a little too low, 4 MB (covers up to 32 GB)
> might be more adequate. My typical desktop VMs are larger than 8 GB, but
> smaller than 32 GB.

Will your typical desktop VMs gain anything from the cache covering more
than 8 GB?

Anyway, I certainly won't complain about 4 MB.

(My point here is that on non-Linux systems, qemu probably does not have
users who have use cases where they need to access 256 GB of disk
simultaneously.  Probably not even more than 8 GB.  If you want to
increase the cache size there to 4 MB, fine, I think that won't hurt.
But 32 MB might hurt, and I don't think on non-Linux systems there are
users who would benefit from it -- specifically because your "typical
desktop VM" wouldn't benefit from it.)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
  2018-08-13 16:08             ` Max Reitz
@ 2018-08-13 16:24               ` Kevin Wolf
  0 siblings, 0 replies; 35+ messages in thread
From: Kevin Wolf @ 2018-08-13 16:24 UTC (permalink / raw)
  To: Max Reitz
  Cc: Alberto Garcia, Leonid Bloch, qemu-devel, qemu-block, Eric Blake

[-- Attachment #1: Type: text/plain, Size: 2818 bytes --]

Am 13.08.2018 um 18:08 hat Max Reitz geschrieben:
> >>> default should reflect that, especially considering that we only use
> >>> the memory on demand. If your image is only 32 GB, you'll never use more
> >>> than 4 MB of cache.
> >>
> >> Well, OK, yes.  This is an especially important point when it really is
> >> about hosts that have limited memory.  In those cases, users probably
> >> won't run huge images anyway.
> >>
> >>>                     And if your image is huge, but only access part of
> >>> it, we also won't use the full 32 MB.
> >>
> >> On Linux. O:-)
> > 
> > No, on any system where qemu_try_blockalign() results in a COW zero
> > page.
> 
> OK, yes, but why would you only ever access part of it?  Then you might
> just as well have created a smaller disk from the beginning.

I always create my qcow2 images larger than I actually need them. It
costs basically nothing and avoids the need to resize my partitions
inside the guest later.

And anyway, a disk with 100% usage is not the common case, but the point
where the user will either delete stuff or resize the image. For
long-running VMs, deleting stuff doesn't get rid of the large cache on
non-Linux, but I think we agree that long-running guests aren't what we
expect on those hosts?

> > The Linux-only addition is returning memory even after an access.
> > 
> >> So it's good that you have calmed my nerves about how this might be
> >> problematic on Linux systems (it isn't in practice, although I disagree
> >> that people will find qcow2 to be the fault when their memory runs out),
> >> but you haven't said anything about non-Linux systems.  I understand
> >> that you don't care, but as I said here, this was my only substantial
> >> concern anyway.
> > 
> > I don't actually think it's so bad to keep the cache permanently
> > allocated, but I wouldn't object to a lower default for non-Linux hosts
> > either. 1 MB may still be a little too low, 4 MB (covers up to 32 GB)
> > might be more adequate. My typical desktop VMs are larger than 8 GB, but
> > smaller than 32 GB.
> 
> Will your typical desktop VMs gain anything from the cache covering
> more than 8 GB?

Good point. Probably not.

> Anyway, I certainly won't complain about 4 MB.
> 
> (My point here is that on non-Linux systems, qemu probably does not have
> users who have use cases where they need to access 256 GB of disk
> simultaneously.  Probably not even more than 8 GB.  If you want to
> increase the cache size there to 4 MB, fine, I think that won't hurt.
> But 32 MB might hurt, and I don't think on non-Linux systems there are
> users who would benefit from it -- specifically because your "typical
> desktop VM" wouldn't benefit from it.)

Maybe 1 MB is fine for them, after all.

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
  2018-08-13 15:58           ` Kevin Wolf
  2018-08-13 16:08             ` Max Reitz
@ 2018-08-13 16:42             ` Leonid Bloch
  2018-08-14  8:18               ` Kevin Wolf
  1 sibling, 1 reply; 35+ messages in thread
From: Leonid Bloch @ 2018-08-13 16:42 UTC (permalink / raw)
  To: Kevin Wolf, Max Reitz; +Cc: Alberto Garcia, qemu-devel, qemu-block, Eric Blake

> I don't actually think it's so bad to keep the cache permanently
> allocated, but I wouldn't object to a lower default for non-Linux hosts
> either. 1 MB may still be a little too low, 4 MB (covers up to 32 GB)
> might be more adequate. My typical desktop VMs are larger than 8 GB, but
> smaller than 32 GB.
> 
> Kevin
> 

And for a Windows VM just the OS installation takes above 40 GB. While 
we probably are not running Windows VMs for our own needs, it is very 
common that a customer of, for example, some cloud service uses QEMU 
(unknowingly) for a full-blown Windows. So 100 GB+ images which are 
quite heavily used is not a rare scenario. 256 GB - yeah, that would be 
on the higher end.

So 16 MB would indeed be a reasonable default for the *max.* L2 cache 
now, although below that would be too little, I think. 32 MB - if we 
want some future-proofing.

Leonid.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
  2018-08-13 16:42             ` Leonid Bloch
@ 2018-08-14  8:18               ` Kevin Wolf
  2018-08-14 11:34                 ` Leonid Bloch
  0 siblings, 1 reply; 35+ messages in thread
From: Kevin Wolf @ 2018-08-14  8:18 UTC (permalink / raw)
  To: Leonid Bloch
  Cc: Max Reitz, Alberto Garcia, qemu-devel, qemu-block, Eric Blake

Am 13.08.2018 um 18:42 hat Leonid Bloch geschrieben:
> > I don't actually think it's so bad to keep the cache permanently
> > allocated, but I wouldn't object to a lower default for non-Linux hosts
> > either. 1 MB may still be a little too low, 4 MB (covers up to 32 GB)
> > might be more adequate. My typical desktop VMs are larger than 8 GB, but
> > smaller than 32 GB.
> 
> And for a Windows VM just the OS installation takes above 40 GB. While we
> probably are not running Windows VMs for our own needs, it is very common
> that a customer of, for example, some cloud service uses QEMU (unknowingly)
> for a full-blown Windows. So 100 GB+ images which are quite heavily used is
> not a rare scenario. 256 GB - yeah, that would be on the higher end.

The OS installation is mostly sequential access, though. You only need
that much cache when you have completely random I/O across the whole
image. Otherwise the LRU based approach of the cache is good enough to
keep those tables cached that are actually in use.

The maximum cache size is maybe for huge databases or indeed random I/O
benchmarks, both of which are important to support (on Linux at least),
but probably not the most common use case.

> So 16 MB would indeed be a reasonable default for the *max.* L2 cache now,
> although below that would be too little, I think. 32 MB - if we want some
> future-proofing.

I think we all agree that 32 MB + cache-clean-interval is okay.

It's just that for non-Linux guests, cache-clean-interval doesn't work.
However, we probably care less about those large random I/O cases there,
so a smaller cache size like 1 MB can do on non-Linux.

Kevin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
  2018-08-14  8:18               ` Kevin Wolf
@ 2018-08-14 11:34                 ` Leonid Bloch
  2018-08-14 11:44                   ` Kevin Wolf
  0 siblings, 1 reply; 35+ messages in thread
From: Leonid Bloch @ 2018-08-14 11:34 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Max Reitz, Alberto Garcia, qemu-devel, qemu-block, Eric Blake

On 8/14/18 11:18 AM, Kevin Wolf wrote:
> Am 13.08.2018 um 18:42 hat Leonid Bloch geschrieben:
>>> I don't actually think it's so bad to keep the cache permanently
>>> allocated, but I wouldn't object to a lower default for non-Linux hosts
>>> either. 1 MB may still be a little too low, 4 MB (covers up to 32 GB)
>>> might be more adequate. My typical desktop VMs are larger than 8 GB, but
>>> smaller than 32 GB.
>>
>> And for a Windows VM just the OS installation takes above 40 GB. While we
>> probably are not running Windows VMs for our own needs, it is very common
>> that a customer of, for example, some cloud service uses QEMU (unknowingly)
>> for a full-blown Windows. So 100 GB+ images which are quite heavily used is
>> not a rare scenario. 256 GB - yeah, that would be on the higher end.
> 
> The OS installation is mostly sequential access, though. You only need
> that much cache when you have completely random I/O across the whole
> image. Otherwise the LRU based approach of the cache is good enough to
> keep those tables cached that are actually in use.

Sorry, by "OS installation" I meant the installed size of the OS, which 
should be available for fast and frequent access, not the installation 
process itself. Obviously for one-time tasks like the installation 
process it's not worth it, unless one installs all the time, instead of 
using ready images, for some reason. :)

> 
> The maximum cache size is maybe for huge databases or indeed random I/O
> benchmarks, both of which are important to support (on Linux at least),
> but probably not the most common use case.
> 
>> So 16 MB would indeed be a reasonable default for the *max.* L2 cache now,
>> although below that would be too little, I think. 32 MB - if we want some
>> future-proofing.
> 
> I think we all agree that 32 MB + cache-clean-interval is okay.
> 
> It's just that for non-Linux guests, cache-clean-interval doesn't work.
> However, we probably care less about those large random I/O cases there,
> so a smaller cache size like 1 MB can do on non-Linux.
> 
> Kevin
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
  2018-08-14 11:34                 ` Leonid Bloch
@ 2018-08-14 11:44                   ` Kevin Wolf
  2018-08-14 12:29                     ` Leonid Bloch
  0 siblings, 1 reply; 35+ messages in thread
From: Kevin Wolf @ 2018-08-14 11:44 UTC (permalink / raw)
  To: Leonid Bloch
  Cc: Max Reitz, Alberto Garcia, qemu-devel, qemu-block, Eric Blake

Am 14.08.2018 um 13:34 hat Leonid Bloch geschrieben:
> On 8/14/18 11:18 AM, Kevin Wolf wrote:
> > Am 13.08.2018 um 18:42 hat Leonid Bloch geschrieben:
> > > > I don't actually think it's so bad to keep the cache permanently
> > > > allocated, but I wouldn't object to a lower default for non-Linux hosts
> > > > either. 1 MB may still be a little too low, 4 MB (covers up to 32 GB)
> > > > might be more adequate. My typical desktop VMs are larger than 8 GB, but
> > > > smaller than 32 GB.
> > > 
> > > And for a Windows VM just the OS installation takes above 40 GB. While we
> > > probably are not running Windows VMs for our own needs, it is very common
> > > that a customer of, for example, some cloud service uses QEMU (unknowingly)
> > > for a full-blown Windows. So 100 GB+ images which are quite heavily used is
> > > not a rare scenario. 256 GB - yeah, that would be on the higher end.
> > 
> > The OS installation is mostly sequential access, though. You only need
> > that much cache when you have completely random I/O across the whole
> > image. Otherwise the LRU based approach of the cache is good enough to
> > keep those tables cached that are actually in use.
> 
> Sorry, by "OS installation" I meant the installed size of the OS, which
> should be available for fast and frequent access, not the installation
> process itself. Obviously for one-time tasks like the installation process
> it's not worth it, unless one installs all the time, instead of using ready
> images, for some reason. :)

But you never use everything that is present in an OS installation of
40 GB (is it really _that_ huge these days?), and you don't read OS
files non-stop. The most frequently used parts of the OS are actually in
the guest RAM.

I don't think you'll really notice the difference in qcow2 unless you
have a really I/O intensive workload - and that is not usually for OS
files, but for user data. For only occasional accesses, the additional
64k for the metadata table wouldn't play a big role.

Kevin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
  2018-08-14 11:44                   ` Kevin Wolf
@ 2018-08-14 12:29                     ` Leonid Bloch
  0 siblings, 0 replies; 35+ messages in thread
From: Leonid Bloch @ 2018-08-14 12:29 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Max Reitz, Alberto Garcia, qemu-devel, qemu-block, Eric Blake

On 8/14/18 2:44 PM, Kevin Wolf wrote:
> Am 14.08.2018 um 13:34 hat Leonid Bloch geschrieben:
>> On 8/14/18 11:18 AM, Kevin Wolf wrote:
>>> Am 13.08.2018 um 18:42 hat Leonid Bloch geschrieben:
>>>>> I don't actually think it's so bad to keep the cache permanently
>>>>> allocated, but I wouldn't object to a lower default for non-Linux hosts
>>>>> either. 1 MB may still be a little too low, 4 MB (covers up to 32 GB)
>>>>> might be more adequate. My typical desktop VMs are larger than 8 GB, but
>>>>> smaller than 32 GB.
>>>>
>>>> And for a Windows VM just the OS installation takes above 40 GB. While we
>>>> probably are not running Windows VMs for our own needs, it is very common
>>>> that a customer of, for example, some cloud service uses QEMU (unknowingly)
>>>> for a full-blown Windows. So 100 GB+ images which are quite heavily used is
>>>> not a rare scenario. 256 GB - yeah, that would be on the higher end.
>>>
>>> The OS installation is mostly sequential access, though. You only need
>>> that much cache when you have completely random I/O across the whole
>>> image. Otherwise the LRU based approach of the cache is good enough to
>>> keep those tables cached that are actually in use.
>>
>> Sorry, by "OS installation" I meant the installed size of the OS, which
>> should be available for fast and frequent access, not the installation
>> process itself. Obviously for one-time tasks like the installation process
>> it's not worth it, unless one installs all the time, instead of using ready
>> images, for some reason. :)
> 
> But you never use everything that is present in an OS installation of
> 40 GB (is it really _that_ huge these days?), and you don't read OS
> files non-stop. The most frequently used parts of the OS are actually in
> the guest RAM.

Yes, Windows 8.1, with all the desktop bloat - just above 40 GB. :]
I did a proper benchmarking indeed only on heavy I/O load, where full 
cache did show above 50% improvement, although just regular usage felt 
faster as well, but maybe it's just psychosomatic. :)

Leonid.

> 
> I don't think you'll really notice the difference in qcow2 unless you
> have a really I/O intensive workload - and that is not usually for OS
> files, but for user data. For only occasional accesses, the additional
> 64k for the metadata table wouldn't play a big role.
> 
> Kevin
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2018-08-14 12:29 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-10  6:26 [Qemu-devel] [PATCH v7 0/9] Take the image size into account when allocating the L2 cache Leonid Bloch
2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 1/9] qcow2: Options' documentation fixes Leonid Bloch
2018-08-10 11:50   ` Alberto Garcia
2018-08-11 18:01     ` Leonid Bloch
2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 2/9] qcow2: Cosmetic changes Leonid Bloch
2018-08-10 12:54   ` Alberto Garcia
2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 3/9] qcow2: Make sizes more humanly readable Leonid Bloch
2018-08-10  8:33   ` Alberto Garcia
2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 4/9] qcow2: Avoid duplication in setting the refcount cache size Leonid Bloch
2018-08-10 13:14   ` Alberto Garcia
2018-08-11 18:40     ` Leonid Bloch
2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 5/9] qcow2: Assign the L2 cache relatively to the image size Leonid Bloch
2018-08-10 14:39   ` Alberto Garcia
2018-08-11 19:19     ` Leonid Bloch
2018-08-13 11:33       ` Kevin Wolf
2018-08-13 11:48         ` Leonid Bloch
2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size Leonid Bloch
2018-08-10 12:00   ` Alberto Garcia
2018-08-13  1:39     ` Max Reitz
2018-08-13  6:09       ` Leonid Bloch
2018-08-13 15:16         ` Max Reitz
2018-08-13 16:00           ` Kevin Wolf
2018-08-13 11:23       ` Kevin Wolf
2018-08-13 15:11         ` Max Reitz
2018-08-13 15:58           ` Kevin Wolf
2018-08-13 16:08             ` Max Reitz
2018-08-13 16:24               ` Kevin Wolf
2018-08-13 16:42             ` Leonid Bloch
2018-08-14  8:18               ` Kevin Wolf
2018-08-14 11:34                 ` Leonid Bloch
2018-08-14 11:44                   ` Kevin Wolf
2018-08-14 12:29                     ` Leonid Bloch
2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 7/9] qcow2: Resize the cache upon image resizing Leonid Bloch
2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 8/9] qcow2: Set the default cache-clean-interval to 10 minutes Leonid Bloch
2018-08-10  6:26 ` [Qemu-devel] [PATCH v7 9/9] qcow2: Explicit number replaced by a constant Leonid Bloch

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.