All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v4] block: add event when disk usage exceeds threshold
@ 2014-12-04  8:59 Francesco Romani
  2014-12-04  9:00 ` Francesco Romani
  2015-01-09 16:54 ` Eric Blake
  0 siblings, 2 replies; 6+ messages in thread
From: Francesco Romani @ 2014-12-04  8:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, mdroth, lcapitulino, Francesco Romani, stefanha

Managing applications, like oVirt (http://www.ovirt.org), make extensive
use of thin-provisioned disk images.
To let the guest run smoothly and be not unnecessarily paused, oVirt sets
a disk usage threshold (so called 'high water mark') based on the occupation
of the device,  and automatically extends the image once the threshold
is reached or exceeded.

In order to detect the crossing of the threshold, oVirt has no choice but
aggressively polling the QEMU monitor using the query-blockstats command.
This lead to unnecessary system load, and is made even worse under scale:
deployments with hundreds of VMs are no longer rare.

To fix this, this patch adds:
* A new monitor command to set a write threshold for a given block device.
* A new event to report if a block device usage exceeds the threshold.

This will allow the managing application to use smarter and more
efficient monitoring, greatly reducing the need of polling.

A followup patch is planned to allow to add the write threshold at
device creation.

Signed-off-by: Francesco Romani <fromani@redhat.com>
---
 block/Makefile.objs             |   1 +
 block/qapi.c                    |   3 +
 block/write-threshold.c         | 125 ++++++++++++++++++++++++++++++++++++++++
 include/block/block_int.h       |   4 ++
 include/block/write-threshold.h |  64 ++++++++++++++++++++
 qapi/block-core.json            |  50 +++++++++++++++-
 qmp-commands.hx                 |  32 ++++++++++
 tests/Makefile                  |   3 +
 tests/test-write-threshold.c    | 119 ++++++++++++++++++++++++++++++++++++++
 9 files changed, 400 insertions(+), 1 deletion(-)
 create mode 100644 block/write-threshold.c
 create mode 100644 include/block/write-threshold.h
 create mode 100644 tests/test-write-threshold.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 04b0e43..010afad 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -20,6 +20,7 @@ block-obj-$(CONFIG_GLUSTERFS) += gluster.o
 block-obj-$(CONFIG_ARCHIPELAGO) += archipelago.o
 block-obj-$(CONFIG_LIBSSH2) += ssh.o
 block-obj-y += accounting.o
+block-obj-y += write-threshold.o
 
 common-obj-y += stream.o
 common-obj-y += commit.o
diff --git a/block/qapi.c b/block/qapi.c
index a87a34a..d38f7a6 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -24,6 +24,7 @@
 
 #include "block/qapi.h"
 #include "block/block_int.h"
+#include "block/write-threshold.h"
 #include "qmp-commands.h"
 #include "qapi-visit.h"
 #include "qapi/qmp-output-visitor.h"
@@ -82,6 +83,8 @@ BlockDeviceInfo *bdrv_block_device_info(BlockDriverState *bs)
         info->iops_size = cfg.op_size;
     }
 
+    info->write_threshold = bdrv_write_threshold_get(bs);
+
     return info;
 }
 
diff --git a/block/write-threshold.c b/block/write-threshold.c
new file mode 100644
index 0000000..b7aa20e
--- /dev/null
+++ b/block/write-threshold.c
@@ -0,0 +1,125 @@
+/*
+ * QEMU System Emulator block write threshold notification
+ *
+ * Copyright Red Hat, Inc. 2014
+ *
+ * Authors:
+ *  Francesco Romani <fromani@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+
+#include "block/block_int.h"
+#include "block/coroutine.h"
+#include "block/write-threshold.h"
+#include "qemu/notify.h"
+#include "qapi-event.h"
+#include "qmp-commands.h"
+
+
+uint64_t bdrv_write_threshold_get(const BlockDriverState *bs)
+{
+    return bs->write_threshold_offset;
+}
+
+bool bdrv_write_threshold_is_set(const BlockDriverState *bs)
+{
+    return !!(bs->write_threshold_offset > 0);
+}
+
+static void write_threshold_disable(BlockDriverState *bs)
+{
+    if (bdrv_write_threshold_is_set(bs)) {
+        notifier_with_return_remove(&bs->write_threshold_notifier);
+        bs->write_threshold_offset = 0;
+    }
+}
+
+uint64_t bdrv_write_threshold_exceeded(const BlockDriverState *bs,
+                                       const BdrvTrackedRequest *req)
+{
+    if (bdrv_write_threshold_is_set(bs)) {
+        if (req->offset > bs->write_threshold_offset) {
+            return (req->offset - bs->write_threshold_offset) + req->bytes;
+        }
+        if ((req->offset + req->bytes) > bs->write_threshold_offset) {
+            return (req->offset + req->bytes) - bs->write_threshold_offset;
+        }
+    }
+    return 0;
+}
+
+static int coroutine_fn before_write_notify(NotifierWithReturn *notifier,
+                                            void *opaque)
+{
+    BdrvTrackedRequest *req = opaque;
+    BlockDriverState *bs = req->bs;
+    uint64_t amount = 0;
+
+    amount = bdrv_write_threshold_exceeded(bs, req);
+    if (amount > 0) {
+        qapi_event_send_block_write_threshold(
+            bs->node_name,
+            amount,
+            bs->write_threshold_offset,
+            &error_abort);
+
+        /* autodisable to avoid flooding the monitor */
+        write_threshold_disable(bs);
+    }
+
+    return 0; /* should always let other notifiers run */
+}
+
+static void write_threshold_register_notifier(BlockDriverState *bs)
+{
+    bs->write_threshold_notifier.notify = before_write_notify;
+    notifier_with_return_list_add(&bs->before_write_notifiers,
+                                  &bs->write_threshold_notifier);
+}
+
+static void write_threshold_update(BlockDriverState *bs,
+                                   int64_t threshold_bytes)
+{
+    bs->write_threshold_offset = threshold_bytes;
+}
+
+void bdrv_write_threshold_set(BlockDriverState *bs, uint64_t threshold_bytes)
+{
+    if (bdrv_write_threshold_is_set(bs)) {
+        if (threshold_bytes > 0) {
+            write_threshold_update(bs, threshold_bytes);
+        } else {
+            write_threshold_disable(bs);
+        }
+    } else {
+        if (threshold_bytes > 0) {
+            /* avoid multiple registration */
+            write_threshold_register_notifier(bs);
+            write_threshold_update(bs, threshold_bytes);
+        }
+        /* discard bogus disable request */
+    }
+}
+
+void qmp_block_set_write_threshold(const char *node_name,
+                                   uint64_t threshold_bytes,
+                                   Error **errp)
+{
+    BlockDriverState *bs;
+    AioContext *aio_context;
+
+    bs = bdrv_find_node(node_name);
+    if (!bs) {
+        error_set(errp, QERR_DEVICE_NOT_FOUND, node_name);
+        return;
+    }
+
+    aio_context = bdrv_get_aio_context(bs);
+    aio_context_acquire(aio_context);
+
+    bdrv_write_threshold_set(bs, threshold_bytes);
+
+    aio_context_release(aio_context);
+}
diff --git a/include/block/block_int.h b/include/block/block_int.h
index a1c17b9..3791d9b 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -409,6 +409,10 @@ struct BlockDriverState {
 
     /* The error object in use for blocking operations on backing_hd */
     Error *backing_blocker;
+
+    /* threshold limit for writes, in bytes. "High water mark". */
+    uint64_t write_threshold_offset;
+    NotifierWithReturn write_threshold_notifier;
 };
 
 int get_tmp_filename(char *filename, int size);
diff --git a/include/block/write-threshold.h b/include/block/write-threshold.h
new file mode 100644
index 0000000..f1b899c
--- /dev/null
+++ b/include/block/write-threshold.h
@@ -0,0 +1,64 @@
+/*
+ * QEMU System Emulator block write threshold notification
+ *
+ * Copyright Red Hat, Inc. 2014
+ *
+ * Authors:
+ *  Francesco Romani <fromani@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+#ifndef BLOCK_WRITE_THRESHOLD_H
+#define BLOCK_WRITE_THRESHOLD_H
+
+#include <stdint.h>
+
+#include "qemu/typedefs.h"
+#include "qemu-common.h"
+
+/*
+ * bdrv_write_threshold_set:
+ *
+ * Set the write threshold for block devices, in bytes.
+ * Notify when a write exceeds the threshold, meaning the device
+ * is becoming full, so it can be transparently resized.
+ * To be used with thin-provisioned block devices.
+ *
+ * Use threshold_bytes == 0 to disable.
+ */
+void bdrv_write_threshold_set(BlockDriverState *bs, uint64_t threshold_bytes);
+
+/*
+ * bdrv_write_threshold_get
+ *
+ * Get the configured write threshold, in bytes.
+ * Zero means no threshold configured.
+ */
+uint64_t bdrv_write_threshold_get(const BlockDriverState *bs);
+
+/*
+ * bdrv_write_threshold_is_set
+ *
+ * Tell if a write threshold is set for a given BDS.
+ */
+bool bdrv_write_threshold_is_set(const BlockDriverState *bs);
+
+/*
+ * bdrv_write_threshold_exceeded
+ *
+ * Return the extent of a write request that exceeded the threshold,
+ * or zero if the request is below the threshold.
+ * Return zero also if the threshold was not set.
+ *
+ * NOTE: here we assume the following holds for each request this code
+ * deals with:
+ *
+ * assert((req->offset + req->bytes) <= UINT64_MAX)
+ *
+ * Please not there is *not* an actual C assert().
+ */
+uint64_t bdrv_write_threshold_exceeded(const BlockDriverState *bs,
+                                       const BdrvTrackedRequest *req);
+
+#endif
diff --git a/qapi/block-core.json b/qapi/block-core.json
index a14e6ab..5c78387 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -239,6 +239,9 @@
 #
 # @iops_size: #optional an I/O size in bytes (Since 1.7)
 #
+# @write_threshold: configured write threshold for the device.
+#                   0 if disabled. (Since 2.3)
+#
 # Since: 0.14.0
 #
 ##
@@ -253,7 +256,7 @@
             '*bps_max': 'int', '*bps_rd_max': 'int',
             '*bps_wr_max': 'int', '*iops_max': 'int',
             '*iops_rd_max': 'int', '*iops_wr_max': 'int',
-            '*iops_size': 'int' } }
+            '*iops_size': 'int', 'write_threshold': 'uint64' } }
 
 ##
 # @BlockDeviceIoStatus:
@@ -1826,3 +1829,48 @@
 ##
 { 'enum': 'PreallocMode',
   'data': [ 'off', 'metadata', 'falloc', 'full' ] }
+
+##
+# @BLOCK_WRITE_THRESHOLD
+#
+# Emitted when writes on block device reaches or exceeds the
+# configured write threshold. For thin-provisioned devices, this
+# means the device should be extended to avoid pausing for
+# disk exhaustion.
+# The event is one shot. Once triggered, it needs to be
+# re-registered with another block-set-threshold command.
+#
+# @node-name: graph node name on which the threshold was exceeded.
+#
+# @amount-exceeded: amount of data which exceeded the threshold, in bytes.
+#
+# @write-threshold: last configured threshold, in bytes.
+#
+# Since: 2.3
+##
+{ 'event': 'BLOCK_WRITE_THRESHOLD',
+  'data': { 'node-name': 'str',
+            'amount-exceeded': 'uint64',
+            'write-threshold': 'uint64' } }
+
+##
+# @block-set-write-threshold
+#
+# Change the write threshold for a block drive. An event will be delivered
+# if a write to this block drive crosses the configured threshold.
+# This is useful to transparently resize thin-provisioned drives without
+# the guest OS noticing.
+#
+# @node-name: graph node name on which the threshold must be set.
+#
+# @write-threshold: configured threshold for the block device, bytes.
+#                   Use 0 to disable the threshold.
+#
+# Returns: Nothing on success
+#          If @node name is not found on the block device graph,
+#          DeviceNotFound
+#
+# Since: 2.3
+##
+{ 'command': 'block-set-write-threshold',
+  'data': { 'node-name': 'str', 'write-threshold': 'uint64' } }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 718dd92..6172dc1 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -2104,6 +2104,8 @@ Each json-object contain the following:
          - "iops_size": I/O size when limiting by iops (json-int)
          - "detect_zeroes": detect and optimize zero writing (json-string)
              - Possible values: "off", "on", "unmap"
+         - "write_threshold": write offset threshold in bytes, a event will be
+                              emitted if crossed. Zero if disabled (json-int)
          - "image": the detail of the image, it is a json-object containing
             the following:
              - "filename": image file name (json-string)
@@ -2181,6 +2183,7 @@ Example:
                "iops_wr_max": 0,
                "iops_size": 0,
                "detect_zeroes": "on",
+               "write_threshold": 0,
                "image":{
                   "filename":"disks/test.qcow2",
                   "format":"qcow2",
@@ -3618,6 +3621,7 @@ Example:
                    "iops_rd_max": 0,
                    "iops_wr_max": 0,
                    "iops_size": 0,
+                   "write_threshold": 0,
                    "image":{
                       "filename":"disks/test.qcow2",
                       "format":"qcow2",
@@ -3854,3 +3858,31 @@ Move mouse pointer to absolute coordinates (20000, 400).
 <- { "return": {} }
 
 EQMP
+
+    {
+        .name       = "block-set-write-threshold",
+        .args_type  = "node-name:s,write-threshold:l",
+        .mhandler.cmd_new = qmp_marshal_input_block_set_write_threshold,
+    },
+
+SQMP
+block-set-write-threshold
+------------
+
+Change the write threshold for a block drive. The threshold is an offset,
+thus must be non-negative. Default is no write threshold.
+Setting the threshold to zero disables it.
+
+Arguments:
+
+- "node-name": the node name in the block driver state graph (json-string)
+- "write-threshold": the write threshold in bytes (json-int)
+
+Example:
+
+-> { "execute": "block-set-write-threshold",
+  "arguments": { "node-name": "mydev",
+                 "write-threshold": 17179869184 } }
+<- { "return": {} }
+
+EQMP
diff --git a/tests/Makefile b/tests/Makefile
index 16f0e4c..120b4e5 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -64,6 +64,8 @@ gcov-files-check-qom-interface-y = qom/object.c
 check-unit-$(CONFIG_POSIX) += tests/test-vmstate$(EXESUF)
 check-unit-y += tests/test-qemu-opts$(EXESUF)
 gcov-files-test-qemu-opts-y = qom/test-qemu-opts.c
+check-unit-y += tests/test-write-threshold$(EXESUF)
+gcov-files-test-write-threshold-y = block/write-threshold.c
 
 check-block-$(CONFIG_POSIX) += tests/qemu-iotests-quick.sh
 
@@ -351,6 +353,7 @@ tests/usb-hcd-xhci-test$(EXESUF): tests/usb-hcd-xhci-test.o $(libqos-usb-obj-y)
 tests/vhost-user-test$(EXESUF): tests/vhost-user-test.o qemu-char.o qemu-timer.o $(qtest-obj-y)
 tests/qemu-iotests/socket_scm_helper$(EXESUF): tests/qemu-iotests/socket_scm_helper.o
 tests/test-qemu-opts$(EXESUF): tests/test-qemu-opts.o libqemuutil.a libqemustub.a
+tests/test-write-threshold$(EXESUF): tests/test-write-threshold.o $(block-obj-y) libqemuutil.a libqemustub.a
 
 ifeq ($(CONFIG_POSIX),y)
 LIBS += -lutil
diff --git a/tests/test-write-threshold.c b/tests/test-write-threshold.c
new file mode 100644
index 0000000..d24dc17
--- /dev/null
+++ b/tests/test-write-threshold.c
@@ -0,0 +1,119 @@
+/*
+ * Test block device write threshold
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include <glib.h>
+#include <stdint.h>
+#include "block/block_int.h"
+#include "block/write-threshold.h"
+
+
+static void test_threshold_not_set_on_init(void)
+{
+    uint64_t res;
+    BlockDriverState bs;
+    memset(&bs, 0, sizeof(bs));
+
+    g_assert_false(bdrv_write_threshold_is_set(&bs));
+
+    res = bdrv_write_threshold_get(&bs);
+    g_assert_cmpint(res, ==, 0);
+}
+
+static void test_threshold_set_get(void)
+{
+    uint64_t threshold = 4 * 1024 * 1024;
+    uint64_t res;
+    BlockDriverState bs;
+    memset(&bs, 0, sizeof(bs));
+
+    bdrv_write_threshold_set(&bs, threshold);
+
+    g_assert(bdrv_write_threshold_is_set(&bs));
+
+    res = bdrv_write_threshold_get(&bs);
+    g_assert_cmpint(res, ==, threshold);
+}
+
+static void test_threshold_multi_set_get(void)
+{
+    uint64_t threshold1 = 4 * 1024 * 1024;
+    uint64_t threshold2 = 15 * 1024 * 1024;
+    uint64_t res;
+    BlockDriverState bs;
+    memset(&bs, 0, sizeof(bs));
+
+    bdrv_write_threshold_set(&bs, threshold1);
+    bdrv_write_threshold_set(&bs, threshold2);
+    res = bdrv_write_threshold_get(&bs);
+    g_assert_cmpint(res, ==, threshold2);
+}
+
+static void test_threshold_not_trigger(void)
+{
+    uint64_t amount = 0;
+    uint64_t threshold = 4 * 1024 * 1024;
+    BlockDriverState bs;
+    BdrvTrackedRequest req;
+
+    memset(&bs, 0, sizeof(bs));
+    memset(&req, 0, sizeof(req));
+    req.offset = 1024;
+    req.bytes = 1024;
+
+    bdrv_write_threshold_set(&bs, threshold);
+    amount = bdrv_write_threshold_exceeded(&bs, &req);
+    g_assert_cmpuint(amount, ==, 0);
+}
+
+
+static void test_threshold_trigger(void)
+{
+    uint64_t amount = 0;
+    uint64_t threshold = 4 * 1024 * 1024;
+    BlockDriverState bs;
+    BdrvTrackedRequest req;
+
+    memset(&bs, 0, sizeof(bs));
+    memset(&req, 0, sizeof(req));
+    req.offset = (4 * 1024 * 1024) - 1024;
+    req.bytes = 2 * 1024;
+
+    bdrv_write_threshold_set(&bs, threshold);
+    amount = bdrv_write_threshold_exceeded(&bs, &req);
+    g_assert_cmpuint(amount, >=, 1024);
+}
+
+typedef struct TestStruct {
+    const char *name;
+    void (*func)(void);
+} TestStruct;
+
+
+int main(int argc, char **argv)
+{
+    size_t i;
+    TestStruct tests[] = {
+        { "/write-threshold/not-set-on-init",
+          test_threshold_not_set_on_init },
+        { "/write-threshold/set-get",
+          test_threshold_set_get },
+        { "/write-threshold/multi-set-get",
+          test_threshold_multi_set_get },
+        { "/write-threshold/not-trigger",
+          test_threshold_not_trigger },
+        { "/write-threshold/trigger",
+          test_threshold_trigger },
+        { NULL, NULL }
+    };
+
+    g_test_init(&argc, &argv, NULL);
+    for (i = 0; tests[i].name != NULL; i++) {
+        g_test_add_func(tests[i].name, tests[i].func);
+    }
+    return g_test_run();
+}
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH v4] block: add event when disk usage exceeds threshold
  2014-12-04  8:59 [Qemu-devel] [PATCH v4] block: add event when disk usage exceeds threshold Francesco Romani
@ 2014-12-04  9:00 ` Francesco Romani
  2014-12-15 11:19   ` Francesco Romani
  2015-01-09 16:54 ` Eric Blake
  1 sibling, 1 reply; 6+ messages in thread
From: Francesco Romani @ 2014-12-04  9:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, mdroth, stefanha, lcapitulino

----- Original Message -----
> From: "Francesco Romani" <fromani@redhat.com>
> To: qemu-devel@nongnu.org
> Cc: kwolf@redhat.com, stefanha@redhat.com, lcapitulino@redhat.com, mdroth@linux.vnet.ibm.com, eblake@redhat.com,
> "Francesco Romani" <fromani@redhat.com>
> Sent: Thursday, December 4, 2014 9:59:08 AM
> Subject: [PATCH v4] block: add event when disk usage exceeds threshold
[...]

Changes since v3:
-----------------

addressed reviewers comments:
- dropped redundant cover letter
- improved commit message to reflect real intended use
  (polling is not going away completely)
- spelling fixes
- API consistency fixes
- kept 'uint64' in QAPI because it is mapped to C type  uint64_t,
  where 'int' is mapped to 'int64_t'
- renamed for consistency everywhere:
  'usage_threshold' => 'write_threshold'

Thanks and best regards

-- 
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH v4] block: add event when disk usage exceeds threshold
  2014-12-04  9:00 ` Francesco Romani
@ 2014-12-15 11:19   ` Francesco Romani
  2015-01-09 10:36     ` Francesco Romani
  0 siblings, 1 reply; 6+ messages in thread
From: Francesco Romani @ 2014-12-15 11:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, mdroth, stefanha, lcapitulino

ping?

----- Original Message -----
> From: "Francesco Romani" <fromani@redhat.com>
> To: qemu-devel@nongnu.org
> Cc: kwolf@redhat.com, mdroth@linux.vnet.ibm.com, stefanha@redhat.com, lcapitulino@redhat.com
> Sent: Thursday, December 4, 2014 10:00:18 AM
> Subject: Re: [Qemu-devel] [PATCH v4] block: add event when disk usage	exceeds threshold
> 
> ----- Original Message -----
> > From: "Francesco Romani" <fromani@redhat.com>
> > To: qemu-devel@nongnu.org
> > Cc: kwolf@redhat.com, stefanha@redhat.com, lcapitulino@redhat.com,
> > mdroth@linux.vnet.ibm.com, eblake@redhat.com,
> > "Francesco Romani" <fromani@redhat.com>
> > Sent: Thursday, December 4, 2014 9:59:08 AM
> > Subject: [PATCH v4] block: add event when disk usage exceeds threshold
> [...]
> 
> Changes since v3:
> -----------------
> 
> addressed reviewers comments:
> - dropped redundant cover letter
> - improved commit message to reflect real intended use
>   (polling is not going away completely)
> - spelling fixes
> - API consistency fixes
> - kept 'uint64' in QAPI because it is mapped to C type  uint64_t,
>   where 'int' is mapped to 'int64_t'
> - renamed for consistency everywhere:
>   'usage_threshold' => 'write_threshold'
> 
> Thanks and best regards
> 
> --
> Francesco Romani
> RedHat Engineering Virtualization R & D
> Phone: 8261328
> IRC: fromani
> 
> 

-- 
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH v4] block: add event when disk usage exceeds threshold
  2014-12-15 11:19   ` Francesco Romani
@ 2015-01-09 10:36     ` Francesco Romani
  0 siblings, 0 replies; 6+ messages in thread
From: Francesco Romani @ 2015-01-09 10:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, mdroth, stefanha, lcapitulino

ping

----- Original Message -----
> From: "Francesco Romani" <fromani@redhat.com>
> To: qemu-devel@nongnu.org
> Cc: kwolf@redhat.com, mdroth@linux.vnet.ibm.com, stefanha@redhat.com, lcapitulino@redhat.com
> Sent: Monday, December 15, 2014 12:19:44 PM
> Subject: Re: [Qemu-devel] [PATCH v4] block: add event when disk	usage	exceeds threshold
> 
> ping?
> 
> ----- Original Message -----
> > From: "Francesco Romani" <fromani@redhat.com>
> > To: qemu-devel@nongnu.org
> > Cc: kwolf@redhat.com, mdroth@linux.vnet.ibm.com, stefanha@redhat.com,
> > lcapitulino@redhat.com
> > Sent: Thursday, December 4, 2014 10:00:18 AM
> > Subject: Re: [Qemu-devel] [PATCH v4] block: add event when disk usage
> > 	exceeds threshold
> > 
> > ----- Original Message -----
> > > From: "Francesco Romani" <fromani@redhat.com>
> > > To: qemu-devel@nongnu.org
> > > Cc: kwolf@redhat.com, stefanha@redhat.com, lcapitulino@redhat.com,
> > > mdroth@linux.vnet.ibm.com, eblake@redhat.com,
> > > "Francesco Romani" <fromani@redhat.com>
> > > Sent: Thursday, December 4, 2014 9:59:08 AM
> > > Subject: [PATCH v4] block: add event when disk usage exceeds threshold
> > [...]
> > 
> > Changes since v3:
> > -----------------
> > 
> > addressed reviewers comments:
> > - dropped redundant cover letter
> > - improved commit message to reflect real intended use
> >   (polling is not going away completely)
> > - spelling fixes
> > - API consistency fixes
> > - kept 'uint64' in QAPI because it is mapped to C type  uint64_t,
> >   where 'int' is mapped to 'int64_t'
> > - renamed for consistency everywhere:
> >   'usage_threshold' => 'write_threshold'
> > 
> > Thanks and best regards
> > 
> > --
> > Francesco Romani
> > RedHat Engineering Virtualization R & D
> > Phone: 8261328
> > IRC: fromani
> > 
> > 
> 
> --
> Francesco Romani
> RedHat Engineering Virtualization R & D
> Phone: 8261328
> IRC: fromani
> 
> 

-- 
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH v4] block: add event when disk usage exceeds threshold
  2014-12-04  8:59 [Qemu-devel] [PATCH v4] block: add event when disk usage exceeds threshold Francesco Romani
  2014-12-04  9:00 ` Francesco Romani
@ 2015-01-09 16:54 ` Eric Blake
  2015-01-12 10:54   ` Francesco Romani
  1 sibling, 1 reply; 6+ messages in thread
From: Eric Blake @ 2015-01-09 16:54 UTC (permalink / raw)
  To: Francesco Romani, qemu-devel; +Cc: kwolf, mdroth, stefanha, lcapitulino

[-- Attachment #1: Type: text/plain, Size: 2775 bytes --]

On 12/04/2014 01:59 AM, Francesco Romani wrote:
> Managing applications, like oVirt (http://www.ovirt.org), make extensive
> use of thin-provisioned disk images.
> To let the guest run smoothly and be not unnecessarily paused, oVirt sets
> a disk usage threshold (so called 'high water mark') based on the occupation
> of the device,  and automatically extends the image once the threshold
> is reached or exceeded.
> 
> In order to detect the crossing of the threshold, oVirt has no choice but
> aggressively polling the QEMU monitor using the query-blockstats command.
> This lead to unnecessary system load, and is made even worse under scale:
> deployments with hundreds of VMs are no longer rare.
> 
> To fix this, this patch adds:
> * A new monitor command to set a write threshold for a given block device.
> * A new event to report if a block device usage exceeds the threshold.

Please also mention the names of those two things in the commit message,
to make it easier to find them when doing 'git log' archaeology.

> 
> This will allow the managing application to use smarter and more
> efficient monitoring, greatly reducing the need of polling.
> 
> A followup patch is planned to allow to add the write threshold at
> device creation.
> 
> Signed-off-by: Francesco Romani <fromani@redhat.com>
> ---

> --- /dev/null
> +++ b/block/write-threshold.c
> @@ -0,0 +1,125 @@
> +/*
> + * QEMU System Emulator block write threshold notification
> + *
> + * Copyright Red Hat, Inc. 2014

I've been so slow on the review that you may want to add 2015.


> +bool bdrv_write_threshold_is_set(const BlockDriverState *bs)
> +{
> +    return !!(bs->write_threshold_offset > 0);

The !! is spurious; use of > already guarantees a bool result.

> +++ b/qapi/block-core.json
> @@ -239,6 +239,9 @@
>  #
>  # @iops_size: #optional an I/O size in bytes (Since 1.7)
>  #
> +# @write_threshold: configured write threshold for the device.
> +#                   0 if disabled. (Since 2.3)
> +#
>  # Since: 0.14.0
>  #
>  ##
> @@ -253,7 +256,7 @@
>              '*bps_max': 'int', '*bps_rd_max': 'int',
>              '*bps_wr_max': 'int', '*iops_max': 'int',
>              '*iops_rd_max': 'int', '*iops_wr_max': 'int',
> -            '*iops_size': 'int' } }
> +            '*iops_size': 'int', 'write_threshold': 'uint64' } }

'int' works as well as 'uint64'; since this is an output parameter, we
aren't gaining any stricter input parsing by using a more-specific type.

My findings are minor, so I'm okay if you post a v5 that addresses them
and includes:
Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH v4] block: add event when disk usage exceeds threshold
  2015-01-09 16:54 ` Eric Blake
@ 2015-01-12 10:54   ` Francesco Romani
  0 siblings, 0 replies; 6+ messages in thread
From: Francesco Romani @ 2015-01-12 10:54 UTC (permalink / raw)
  To: Eric Blake; +Cc: kwolf, lcapitulino, qemu-devel, stefanha, mdroth

Hi,

thanks for the review!

----- Original Message -----
> From: "Eric Blake" <eblake@redhat.com>
> To: "Francesco Romani" <fromani@redhat.com>, qemu-devel@nongnu.org
> Cc: kwolf@redhat.com, mdroth@linux.vnet.ibm.com, stefanha@redhat.com, lcapitulino@redhat.com
> Sent: Friday, January 9, 2015 5:54:40 PM
> Subject: Re: [Qemu-devel] [PATCH v4] block: add event when disk usage	exceeds threshold
> 
> On 12/04/2014 01:59 AM, Francesco Romani wrote:
> > Managing applications, like oVirt (http://www.ovirt.org), make extensive
> > use of thin-provisioned disk images.
> > To let the guest run smoothly and be not unnecessarily paused, oVirt sets
> > a disk usage threshold (so called 'high water mark') based on the
> > occupation
> > of the device,  and automatically extends the image once the threshold
> > is reached or exceeded.
> > 
> > In order to detect the crossing of the threshold, oVirt has no choice but
> > aggressively polling the QEMU monitor using the query-blockstats command.
> > This lead to unnecessary system load, and is made even worse under scale:
> > deployments with hundreds of VMs are no longer rare.
> > 
> > To fix this, this patch adds:
> > * A new monitor command to set a write threshold for a given block device.
> > * A new event to report if a block device usage exceeds the threshold.
> 
> Please also mention the names of those two things in the commit message,
> to make it easier to find them when doing 'git log' archaeology.

Sure, will do.

> > This will allow the managing application to use smarter and more
> > efficient monitoring, greatly reducing the need of polling.
> > 
> > A followup patch is planned to allow to add the write threshold at
> > device creation.
> > 
> > Signed-off-by: Francesco Romani <fromani@redhat.com>
> > ---
> 
> > --- /dev/null
> > +++ b/block/write-threshold.c
> > @@ -0,0 +1,125 @@
> > +/*
> > + * QEMU System Emulator block write threshold notification
> > + *
> > + * Copyright Red Hat, Inc. 2014
> 
> I've been so slow on the review that you may want to add 2015.

IANAL, but since most (~99%) of code was written in 2014, I'll just leave as that.


> > +bool bdrv_write_threshold_is_set(const BlockDriverState *bs)
> > +{
> > +    return !!(bs->write_threshold_offset > 0);
> 
> The !! is spurious; use of > already guarantees a bool result.

Will remove.
 
> > +++ b/qapi/block-core.json
> > @@ -239,6 +239,9 @@
> >  #
> >  # @iops_size: #optional an I/O size in bytes (Since 1.7)
> >  #
> > +# @write_threshold: configured write threshold for the device.
> > +#                   0 if disabled. (Since 2.3)
> > +#
> >  # Since: 0.14.0
> >  #
> >  ##
> > @@ -253,7 +256,7 @@
> >              '*bps_max': 'int', '*bps_rd_max': 'int',
> >              '*bps_wr_max': 'int', '*iops_max': 'int',
> >              '*iops_rd_max': 'int', '*iops_wr_max': 'int',
> > -            '*iops_size': 'int' } }
> > +            '*iops_size': 'int', 'write_threshold': 'uint64' } }
> 
> 'int' works as well as 'uint64'; since this is an output parameter, we
> aren't gaining any stricter input parsing by using a more-specific type.

I found one case on which the usage 'int' vs 'uint64' made the code generator
emit different code - and the one using uint64 was more correct.
Can't recall if that is the case; I'll retry, and add a comment here
to document the behaviour if I stumble on this again.


> My findings are minor, so I'm okay if you post a v5 that addresses them
> and includes:
> Reviewed-by: Eric Blake <eblake@redhat.com>

Yep, will post ASAP.

Thanks and best regards.

-- 
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2015-01-12 10:55 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-04  8:59 [Qemu-devel] [PATCH v4] block: add event when disk usage exceeds threshold Francesco Romani
2014-12-04  9:00 ` Francesco Romani
2014-12-15 11:19   ` Francesco Romani
2015-01-09 10:36     ` Francesco Romani
2015-01-09 16:54 ` Eric Blake
2015-01-12 10:54   ` Francesco Romani

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.