All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [RFC 0/3] QMP: extend BLOCK_IO_ERROR event
@ 2014-07-23 13:17 Luiz Capitulino
  2014-07-23 13:17 ` [Qemu-devel] [RFC 1/3] qapi: block-core.json: improve query-block doc Luiz Capitulino
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Luiz Capitulino @ 2014-07-23 13:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, armbru


Management software, such as OpenStack and RHEV's vdsm, wants to be able
to allocate VM disk space on demand. The basic use case is to start a VM
with a small disk and then the disk is enlarged when QEMU hits a ENOSPC
condition.

To this end, the management software has to be notified when QEMU
encounters ENOSPC. The most straightforward solution is to extend QMP's
BLOCK_IO_ERROR event with that information.

This series does exactly that. The approach taken is the simplest possible:
the BLOCK_IO_ERROR event is extended to contain a "nospace" key, which
will be true whenever the guest runs out of space *and* werror=stop|enospc.
Here's an example:

{ "event": "BLOCK_IO_ERROR",
    "data": { "device": "ide0-hd1",
	          "operation": "write",
			  "action": "stop",
			  "nospace": true },
    "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }

There are three important things to observe:

 1. query-block already supports querying the event by means of the
    "io-status" key. Actually, "nospace" and "io-status" keys share
    the same semantics. This is a big advantage of this approach, no
    further extension of query-block is needed

 2. The event could also contain an error message key for debugging,
    But if we add it to the event, should we add it to query-block too?

 3. I'm not extending BLOCK_JOB_ERROR. The reason is that it seems
    that BLOCK_IO_ERROR is also emitted on BLOCK_JOB_ERROR

Now, this series is an RFC because there's an alternative solution for
this problem: instead of extending the BLOCK_IO_ERROR event with no-space
indicator, we could have a stringfied errno. This way management apps
would also be able to distinguish among other errors.

For example, we could have a "error-details" dict containing a
"reason" and a "message" key:

{ "event": "BLOCK_IO_ERROR",
    "data": { "device": "ide0-hd1",
              "operation": "write",
              "action": "stop",
			  "error-details": { "reason": "eio", "message": "I/O error" },
    "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }

And then query-block would have to be extended to contain the same
information.

IMO, this series implementation is good enough for the requirement we
currently have but I'm open to go complex if needed.

Luiz Capitulino (3):
  qapi: block-core.json: improve query-block doc
  QMP: rate limit BLOCK_IO_ERROR
  QMP: extend BLOCK_IO_ERROR event with no-space indicator

 block.c              | 22 ++++++++++++++--------
 monitor.c            |  1 +
 qapi/block-core.json |  8 +++++++-
 3 files changed, 22 insertions(+), 9 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Qemu-devel] [RFC 1/3] qapi: block-core.json: improve query-block doc
  2014-07-23 13:17 [Qemu-devel] [RFC 0/3] QMP: extend BLOCK_IO_ERROR event Luiz Capitulino
@ 2014-07-23 13:17 ` Luiz Capitulino
  2014-08-05 12:13   ` Eric Blake
  2014-07-23 13:17 ` [Qemu-devel] [RFC 2/3] QMP: rate limit BLOCK_IO_ERROR Luiz Capitulino
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Luiz Capitulino @ 2014-07-23 13:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, armbru

List device models supporting the io-status key.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
 qapi/block-core.json | 1 +
 1 file changed, 1 insertion(+)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index e378653..1069679 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -332,6 +332,7 @@
 #
 # @io-status: #optional @BlockDeviceIoStatus. Only present if the device
 #             supports it and the VM is configured to stop on errors
+#             (supported device models: virtio-blk, ide, scsi-disk)
 #
 # @inserted: #optional @BlockDeviceInfo describing the device if media is
 #            present
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [RFC 2/3] QMP: rate limit BLOCK_IO_ERROR
  2014-07-23 13:17 [Qemu-devel] [RFC 0/3] QMP: extend BLOCK_IO_ERROR event Luiz Capitulino
  2014-07-23 13:17 ` [Qemu-devel] [RFC 1/3] qapi: block-core.json: improve query-block doc Luiz Capitulino
@ 2014-07-23 13:17 ` Luiz Capitulino
  2014-08-05 12:14   ` Eric Blake
  2014-08-11  8:17   ` Daniel P. Berrange
  2014-07-23 13:17 ` [Qemu-devel] [RFC 3/3] QMP: extend BLOCK_IO_ERROR event with no-space indicator Luiz Capitulino
  2014-08-05  9:13 ` [Qemu-devel] [RFC 0/3] QMP: extend BLOCK_IO_ERROR event Kevin Wolf
  3 siblings, 2 replies; 15+ messages in thread
From: Luiz Capitulino @ 2014-07-23 13:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, armbru

This event has the same characteristics of the other rate-limited
events, mainly we can emit dozens of it. Rate limit it then.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
 monitor.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/monitor.c b/monitor.c
index 5bc70a6..33abe6c 100644
--- a/monitor.c
+++ b/monitor.c
@@ -589,6 +589,7 @@ static void monitor_qapi_event_init(void)
     monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_REPORT_BAD, 1000);
     monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_FAILURE, 1000);
     monitor_qapi_event_throttle(QAPI_EVENT_VSERPORT_CHANGE, 1000);
+    monitor_qapi_event_throttle(QAPI_EVENT_BLOCK_IO_ERROR, 1000);
 
     qmp_event_set_func_emit(monitor_qapi_event_queue);
 }
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [RFC 3/3] QMP: extend BLOCK_IO_ERROR event with no-space indicator
  2014-07-23 13:17 [Qemu-devel] [RFC 0/3] QMP: extend BLOCK_IO_ERROR event Luiz Capitulino
  2014-07-23 13:17 ` [Qemu-devel] [RFC 1/3] qapi: block-core.json: improve query-block doc Luiz Capitulino
  2014-07-23 13:17 ` [Qemu-devel] [RFC 2/3] QMP: rate limit BLOCK_IO_ERROR Luiz Capitulino
@ 2014-07-23 13:17 ` Luiz Capitulino
  2014-08-05 12:19   ` Eric Blake
  2014-08-05  9:13 ` [Qemu-devel] [RFC 0/3] QMP: extend BLOCK_IO_ERROR event Kevin Wolf
  3 siblings, 1 reply; 15+ messages in thread
From: Luiz Capitulino @ 2014-07-23 13:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, armbru

Management software, such as OpenStack and RHEV's vdsm, want to be able
to allocate disk space on demand. The basic use case is to start a VM
with a small disk and then the disk is enlarged when QEMU hits a ENOSPC
condition.

To this end, the management software has to be notified when QEMU
encounters ENOSPC. The solution implemented by this commit is simple:
it extends the BLOCK_IO_ERROR with a 'nospace' key, which is true
when QEMU is stopped due to ENOSPC.

Note that support for quering this event is already present in
query-block by means of the 'io-status' key and that the new 'nospace'
BLOCK_IO_ERROR field shares the same semantics with 'io-status',
which basically means that werror= has to be set to either
'stop' or 'enospc'.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
 block.c              | 22 ++++++++++++++--------
 qapi/block-core.json |  7 ++++++-
 2 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/block.c b/block.c
index 8cf519b..566ef56 100644
--- a/block.c
+++ b/block.c
@@ -3596,6 +3596,18 @@ BlockErrorAction bdrv_get_error_action(BlockDriverState *bs, bool is_read, int e
     }
 }
 
+static void send_qmp_error_event(BlockDriverState *bs,
+                                 BlockErrorAction action,
+                                 bool is_read, int error)
+{
+    BlockErrorAction ac;
+
+    ac = is_read ? IO_OPERATION_TYPE_READ : IO_OPERATION_TYPE_WRITE;
+    qapi_event_send_block_io_error(bdrv_get_device_name(bs), ac, action,
+                                   bdrv_iostatus_is_enabled(bs),
+                                   error == ENOSPC, &error_abort);
+}
+
 /* This is done by device models because, while the block layer knows
  * about the error, it does not know whether an operation comes from
  * the device or the block layer (from a job, for example).
@@ -3621,16 +3633,10 @@ void bdrv_error_action(BlockDriverState *bs, BlockErrorAction action,
          * also ensures that the STOP/RESUME pair of events is emitted.
          */
         qemu_system_vmstop_request_prepare();
-        qapi_event_send_block_io_error(bdrv_get_device_name(bs),
-                                       is_read ? IO_OPERATION_TYPE_READ :
-                                       IO_OPERATION_TYPE_WRITE,
-                                       action, &error_abort);
+        send_qmp_error_event(bs, action, is_read, error);
         qemu_system_vmstop_request(RUN_STATE_IO_ERROR);
     } else {
-        qapi_event_send_block_io_error(bdrv_get_device_name(bs),
-                                       is_read ? IO_OPERATION_TYPE_READ :
-                                       IO_OPERATION_TYPE_WRITE,
-                                       action, &error_abort);
+        send_qmp_error_event(bs, action, is_read, error);
     }
 }
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 1069679..d659165 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1534,6 +1534,11 @@
 #
 # @action: action that has been taken
 #
+# @nospace: #optional true if I/O error was caused due to a no-space
+#           condition. This key is only present if query-block's
+#           io-status is present, please see query-block documentation
+#           for more information (since: 2.2)
+#
 # Note: If action is "stop", a STOP event will eventually follow the
 # BLOCK_IO_ERROR event
 #
@@ -1541,7 +1546,7 @@
 ##
 { 'event': 'BLOCK_IO_ERROR',
   'data': { 'device': 'str', 'operation': 'IoOperationType',
-            'action': 'BlockErrorAction' } }
+            'action': 'BlockErrorAction', '*nospace': 'bool' } }
 
 ##
 # @BLOCK_JOB_COMPLETED
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [RFC 0/3] QMP: extend BLOCK_IO_ERROR event
  2014-07-23 13:17 [Qemu-devel] [RFC 0/3] QMP: extend BLOCK_IO_ERROR event Luiz Capitulino
                   ` (2 preceding siblings ...)
  2014-07-23 13:17 ` [Qemu-devel] [RFC 3/3] QMP: extend BLOCK_IO_ERROR event with no-space indicator Luiz Capitulino
@ 2014-08-05  9:13 ` Kevin Wolf
  2014-08-14 13:05   ` Luiz Capitulino
  3 siblings, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2014-08-05  9:13 UTC (permalink / raw)
  To: Luiz Capitulino; +Cc: pbonzini, qemu-devel, armbru

Am 23.07.2014 um 15:17 hat Luiz Capitulino geschrieben:
> 
> Management software, such as OpenStack and RHEV's vdsm, wants to be able
> to allocate VM disk space on demand. The basic use case is to start a VM
> with a small disk and then the disk is enlarged when QEMU hits a ENOSPC
> condition.
> 
> To this end, the management software has to be notified when QEMU
> encounters ENOSPC. The most straightforward solution is to extend QMP's
> BLOCK_IO_ERROR event with that information.
> 
> This series does exactly that. The approach taken is the simplest possible:
> the BLOCK_IO_ERROR event is extended to contain a "nospace" key, which
> will be true whenever the guest runs out of space *and* werror=stop|enospc.
> Here's an example:
> 
> { "event": "BLOCK_IO_ERROR",
>     "data": { "device": "ide0-hd1",
> 	          "operation": "write",
> 			  "action": "stop",
> 			  "nospace": true },
>     "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
> 
> There are three important things to observe:
> 
>  1. query-block already supports querying the event by means of the
>     "io-status" key. Actually, "nospace" and "io-status" keys share
>     the same semantics. This is a big advantage of this approach, no
>     further extension of query-block is needed
> 
>  2. The event could also contain an error message key for debugging,
>     But if we add it to the event, should we add it to query-block too?

I don't think it's strictly necessary, but I can imagine that it would
be a very nice feature for debugging if you could check after that fact
what caused the VM stop even if you don't have a QMP log with the event.

>  3. I'm not extending BLOCK_JOB_ERROR. The reason is that it seems
>     that BLOCK_IO_ERROR is also emitted on BLOCK_JOB_ERROR

Hm, I can't see this in the code. Where do I need to look?

Or did you get both a BLOCK_JOB_ERROR and a BLOCK_IO_ERROR because the
guest tried to access the image, too, and caused a separate error?

> Now, this series is an RFC because there's an alternative solution for
> this problem: instead of extending the BLOCK_IO_ERROR event with no-space
> indicator, we could have a stringfied errno. This way management apps
> would also be able to distinguish among other errors.

I don't think sending errnos is a good approach (but if we took it,
we should use an enum rather than strings) and prefer exposing the
exact information that is actually needed.

> For example, we could have a "error-details" dict containing a
> "reason" and a "message" key:
> 
> { "event": "BLOCK_IO_ERROR",
>     "data": { "device": "ide0-hd1",
>               "operation": "write",
>               "action": "stop",
> 			  "error-details": { "reason": "eio", "message": "I/O error" },
>     "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
> 
> And then query-block would have to be extended to contain the same
> information.
> 
> IMO, this series implementation is good enough for the requirement we
> currently have but I'm open to go complex if needed.

Agreed. I would like to see the human-readable strerror() string added,
but that doesn't make this series any worse as a first step:

Acked-by: Kevin Wolf <kwolf@redhat.com>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [RFC 1/3] qapi: block-core.json: improve query-block doc
  2014-07-23 13:17 ` [Qemu-devel] [RFC 1/3] qapi: block-core.json: improve query-block doc Luiz Capitulino
@ 2014-08-05 12:13   ` Eric Blake
  0 siblings, 0 replies; 15+ messages in thread
From: Eric Blake @ 2014-08-05 12:13 UTC (permalink / raw)
  To: Luiz Capitulino, qemu-devel; +Cc: kwolf, pbonzini, armbru

[-- Attachment #1: Type: text/plain, Size: 848 bytes --]

On 07/23/2014 07:17 AM, Luiz Capitulino wrote:
> List device models supporting the io-status key.
> 
> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> ---
>  qapi/block-core.json | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index e378653..1069679 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -332,6 +332,7 @@
>  #
>  # @io-status: #optional @BlockDeviceIoStatus. Only present if the device
>  #             supports it and the VM is configured to stop on errors
> +#             (supported device models: virtio-blk, ide, scsi-disk)

Hopefully this list gets larger over time.

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [RFC 2/3] QMP: rate limit BLOCK_IO_ERROR
  2014-07-23 13:17 ` [Qemu-devel] [RFC 2/3] QMP: rate limit BLOCK_IO_ERROR Luiz Capitulino
@ 2014-08-05 12:14   ` Eric Blake
  2014-08-11  8:17   ` Daniel P. Berrange
  1 sibling, 0 replies; 15+ messages in thread
From: Eric Blake @ 2014-08-05 12:14 UTC (permalink / raw)
  To: Luiz Capitulino, qemu-devel; +Cc: kwolf, pbonzini, armbru

[-- Attachment #1: Type: text/plain, Size: 1072 bytes --]

On 07/23/2014 07:17 AM, Luiz Capitulino wrote:
> This event has the same characteristics of the other rate-limited
> events, mainly we can emit dozens of it. Rate limit it then.
> 
> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> ---
>  monitor.c | 1 +
>  1 file changed, 1 insertion(+)

It's not guest-triggered, per se, but can indeed flood management.

Reviewed-by: Eric Blake <eblake@redhat.com>

> 
> diff --git a/monitor.c b/monitor.c
> index 5bc70a6..33abe6c 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -589,6 +589,7 @@ static void monitor_qapi_event_init(void)
>      monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_REPORT_BAD, 1000);
>      monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_FAILURE, 1000);
>      monitor_qapi_event_throttle(QAPI_EVENT_VSERPORT_CHANGE, 1000);
> +    monitor_qapi_event_throttle(QAPI_EVENT_BLOCK_IO_ERROR, 1000);
>  
>      qmp_event_set_func_emit(monitor_qapi_event_queue);
>  }
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [RFC 3/3] QMP: extend BLOCK_IO_ERROR event with no-space indicator
  2014-07-23 13:17 ` [Qemu-devel] [RFC 3/3] QMP: extend BLOCK_IO_ERROR event with no-space indicator Luiz Capitulino
@ 2014-08-05 12:19   ` Eric Blake
  2014-08-14 13:07     ` Luiz Capitulino
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Blake @ 2014-08-05 12:19 UTC (permalink / raw)
  To: Luiz Capitulino, qemu-devel; +Cc: kwolf, pbonzini, armbru

[-- Attachment #1: Type: text/plain, Size: 2168 bytes --]

On 07/23/2014 07:17 AM, Luiz Capitulino wrote:
> Management software, such as OpenStack and RHEV's vdsm, want to be able
> to allocate disk space on demand. The basic use case is to start a VM
> with a small disk and then the disk is enlarged when QEMU hits a ENOSPC
> condition.

I'd still like feedback from OpenStack or vdsm folks stating what they
do with this information, if a bool for ENOSPC is good enough.  In RHEV,
qemu was patched to provide a downstream-only enum of ENOSPC, EIO,
EPERM, and a catch-all for all other errors; if anyone cared about the
distinction between EIO/EPERM and all others, providing a bool for
ENOSPC is a loss of information.  On the other hand, as far as I can
tell, management above libvirt really cared only about ENOSPC (time to
grow the underlying disk) vs. all other errors (tell the user their
guest is hosed), in which case this simpler proposal seems fine to me.
The problem is that since I don't know the upper layer use case, I don't
know if we are painting ourselves into a corner by providing a bool that
we'd have to maintain forever, if we also end up having to provide
EIO/EPERM distinctions.

> 
> To this end, the management software has to be notified when QEMU
> encounters ENOSPC. The solution implemented by this commit is simple:
> it extends the BLOCK_IO_ERROR with a 'nospace' key, which is true
> when QEMU is stopped due to ENOSPC.
> 
> Note that support for quering this event is already present in
> query-block by means of the 'io-status' key and that the new 'nospace'
> BLOCK_IO_ERROR field shares the same semantics with 'io-status',
> which basically means that werror= has to be set to either
> 'stop' or 'enospc'.
> 
> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> ---
>  block.c              | 22 ++++++++++++++--------
>  qapi/block-core.json |  7 ++++++-
>  2 files changed, 20 insertions(+), 9 deletions(-)

Codewise, this looks correct.  Design-wise, I'd still like more input,
so I'm reluctant to give R-b without discussion.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [RFC 2/3] QMP: rate limit BLOCK_IO_ERROR
  2014-07-23 13:17 ` [Qemu-devel] [RFC 2/3] QMP: rate limit BLOCK_IO_ERROR Luiz Capitulino
  2014-08-05 12:14   ` Eric Blake
@ 2014-08-11  8:17   ` Daniel P. Berrange
  2014-08-11 11:07     ` Markus Armbruster
  2014-08-14 13:13     ` Luiz Capitulino
  1 sibling, 2 replies; 15+ messages in thread
From: Daniel P. Berrange @ 2014-08-11  8:17 UTC (permalink / raw)
  To: Luiz Capitulino; +Cc: kwolf, pbonzini, qemu-devel, armbru

On Wed, Jul 23, 2014 at 09:17:17AM -0400, Luiz Capitulino wrote:
> This event has the same characteristics of the other rate-limited
> events, mainly we can emit dozens of it. Rate limit it then.
> 
> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> ---
>  monitor.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/monitor.c b/monitor.c
> index 5bc70a6..33abe6c 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -589,6 +589,7 @@ static void monitor_qapi_event_init(void)
>      monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_REPORT_BAD, 1000);
>      monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_FAILURE, 1000);
>      monitor_qapi_event_throttle(QAPI_EVENT_VSERPORT_CHANGE, 1000);
> +    monitor_qapi_event_throttle(QAPI_EVENT_BLOCK_IO_ERROR, 1000);


The rate limiting code only rate limits at the granularity of
individual event types. If there is context sensitive data associated
with events then the rate limiting will cause problems for applications
tracking the events.

eg consider with the simpler RTC CHANGE events if we get

   QAPI_EVENT_RTC_CHANGE offset=30
   QAPI_EVENT_RTC_CHANGE offset=700
   QAPI_EVENT_RTC_CHANGE offset=340

then rate limiting will mean that the application only receives

   QAPI_EVENT_RTC_CHANGE offset=340

This is fine because the application will always end up with a correct
view of the current system state.


For the BLOCK IO ERROR events this does not work because the events are
device and operation specific.

  QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=read action=stop
  QAPI_EVENT_BLOCK_IO_ERROR dev=scsi1-hd2 op=write action=stop
  QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=write action=stop

with throttling the app wll only receive

  QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=write action=stop

which means it will have an *incorrect* view of the system state because
the info about  scsi1-hd2 is irretrievably lost, likewise info about the
read operation of ide0-hd1.

If you want to throttle BLOCK IO ERROR events, then you need to make the
monitor throttling more intelligent, so that it hashes on all the contextual
state. In this case you'd have to throttle based on (event, dev, op) to get
correct application behaviour.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [RFC 2/3] QMP: rate limit BLOCK_IO_ERROR
  2014-08-11  8:17   ` Daniel P. Berrange
@ 2014-08-11 11:07     ` Markus Armbruster
  2014-08-11 11:15       ` Daniel P. Berrange
  2014-08-14 13:13     ` Luiz Capitulino
  1 sibling, 1 reply; 15+ messages in thread
From: Markus Armbruster @ 2014-08-11 11:07 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: kwolf, pbonzini, qemu-devel, Luiz Capitulino

"Daniel P. Berrange" <berrange@redhat.com> writes:

> On Wed, Jul 23, 2014 at 09:17:17AM -0400, Luiz Capitulino wrote:
>> This event has the same characteristics of the other rate-limited
>> events, mainly we can emit dozens of it. Rate limit it then.
>> 
>> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
>> ---
>>  monitor.c | 1 +
>>  1 file changed, 1 insertion(+)
>> 
>> diff --git a/monitor.c b/monitor.c
>> index 5bc70a6..33abe6c 100644
>> --- a/monitor.c
>> +++ b/monitor.c
>> @@ -589,6 +589,7 @@ static void monitor_qapi_event_init(void)
>>      monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_REPORT_BAD, 1000);
>>      monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_FAILURE, 1000);
>>      monitor_qapi_event_throttle(QAPI_EVENT_VSERPORT_CHANGE, 1000);
>> +    monitor_qapi_event_throttle(QAPI_EVENT_BLOCK_IO_ERROR, 1000);
>
>
> The rate limiting code only rate limits at the granularity of
> individual event types. If there is context sensitive data associated
> with events then the rate limiting will cause problems for applications
> tracking the events.
>
> eg consider with the simpler RTC CHANGE events if we get
>
>    QAPI_EVENT_RTC_CHANGE offset=30
>    QAPI_EVENT_RTC_CHANGE offset=700
>    QAPI_EVENT_RTC_CHANGE offset=340
>
> then rate limiting will mean that the application only receives
>
>    QAPI_EVENT_RTC_CHANGE offset=340
>
> This is fine because the application will always end up with a correct
> view of the current system state.

... since the intermediate states no longer matter.

> For the BLOCK IO ERROR events this does not work because the events are
> device and operation specific.
>
>   QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=read action=stop
>   QAPI_EVENT_BLOCK_IO_ERROR dev=scsi1-hd2 op=write action=stop
>   QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=write action=stop
>
> with throttling the app wll only receive
>
>   QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=write action=stop
>
> which means it will have an *incorrect* view of the system state because
> the info about  scsi1-hd2 is irretrievably lost, likewise info about the
> read operation of ide0-hd1.

Even when the event is lost, the information should not be lost.  There
should be a way to poll for it (libvirt needs that anyway, to cope with
possible event loss during a libvirt restart).

> If you want to throttle BLOCK IO ERROR events, then you need to make the
> monitor throttling more intelligent, so that it hashes on all the contextual
> state. In this case you'd have to throttle based on (event, dev, op) to get
> correct application behaviour.

I think there's more than one  to skin this cat:

1. Don't throttle.  Client can rely on events as long as it keeps the
   QMP connection alive.  Client should poll after establishing the QMP
   connection.

2. Throttle more smartly, so that events only get dropped when they're
   semantically superseded.  I figure that's what you proposed in your
   last paragraph.

3. Throttle, but accumulate the information carried by the event, i.e.
   any dropped events' data is sent with the next non-dropped event.

4. Throttle without smarts or accumulation.

   a. The event's additional information may be incomplete, thus
      worthless.  Client needs too poll after getting an event.

   b. Add a flag "throttling has dropped some events".  The additional
      information is incomplete when the flag is set.  Client needs to
      poll then.

Backward compatibility considerations may narrow our choice.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [RFC 2/3] QMP: rate limit BLOCK_IO_ERROR
  2014-08-11 11:07     ` Markus Armbruster
@ 2014-08-11 11:15       ` Daniel P. Berrange
  2014-08-17  6:08         ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel P. Berrange @ 2014-08-11 11:15 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: kwolf, pbonzini, qemu-devel, Luiz Capitulino

On Mon, Aug 11, 2014 at 01:07:51PM +0200, Markus Armbruster wrote:
> "Daniel P. Berrange" <berrange@redhat.com> writes:
> > For the BLOCK IO ERROR events this does not work because the events are
> > device and operation specific.
> >
> >   QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=read action=stop
> >   QAPI_EVENT_BLOCK_IO_ERROR dev=scsi1-hd2 op=write action=stop
> >   QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=write action=stop
> >
> > with throttling the app wll only receive
> >
> >   QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=write action=stop
> >
> > which means it will have an *incorrect* view of the system state because
> > the info about  scsi1-hd2 is irretrievably lost, likewise info about the
> > read operation of ide0-hd1.
> 
> Even when the event is lost, the information should not be lost.  There
> should be a way to poll for it (libvirt needs that anyway, to cope with
> possible event loss during a libvirt restart).

Yes, that's true.

> > If you want to throttle BLOCK IO ERROR events, then you need to make the
> > monitor throttling more intelligent, so that it hashes on all the contextual
> > state. In this case you'd have to throttle based on (event, dev, op) to get
> > correct application behaviour.
> 
> I think there's more than one  to skin this cat:
> 
> 1. Don't throttle.  Client can rely on events as long as it keeps the
>    QMP connection alive.  Client should poll after establishing the QMP
>    connection.

A malicious guest OS can flood libvirt with events in this way. Of course
even if we throttle, a compromised QEMU can still flood libvirt. The only
fail-safe protection is for libvirt to detect flooding and throttle the
rate at which it talks to the (malicious) QEMU.

> 2. Throttle more smartly, so that events only get dropped when they're
>    semantically superseded.  I figure that's what you proposed in your
>    last paragraph.

Yep, that's what I was suggesting.

> 3. Throttle, but accumulate the information carried by the event, i.e.
>    any dropped events' data is sent with the next non-dropped event.

I fear this could get rather ugly - fields which are currently scalar
quantities would need to become lists or hashes.

> 4. Throttle without smarts or accumulation.
> 
>    a. The event's additional information may be incomplete, thus
>       worthless.  Client needs too poll after getting an event.

Might as well just not bother sending any additionl info in events
if we took this path.

> 
>    b. Add a flag "throttling has dropped some events".  The additional
>       information is incomplete when the flag is set.  Client needs to
>       poll then.

This is a reasonable idea too.

> Backward compatibility considerations may narrow our choice.

I think  1, 2 or 4b are viable from a general design POV, but only 1 or 2
are viable from a back-compat POV, unless there was an explicit command
that client apps issued to turn on the throttling in 4b instead of it
being on by default.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [RFC 0/3] QMP: extend BLOCK_IO_ERROR event
  2014-08-05  9:13 ` [Qemu-devel] [RFC 0/3] QMP: extend BLOCK_IO_ERROR event Kevin Wolf
@ 2014-08-14 13:05   ` Luiz Capitulino
  0 siblings, 0 replies; 15+ messages in thread
From: Luiz Capitulino @ 2014-08-14 13:05 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: pbonzini, qemu-devel, armbru

On Tue, 5 Aug 2014 11:13:39 +0200
Kevin Wolf <kwolf@redhat.com> wrote:

> Am 23.07.2014 um 15:17 hat Luiz Capitulino geschrieben:
> > 
> > Management software, such as OpenStack and RHEV's vdsm, wants to be able
> > to allocate VM disk space on demand. The basic use case is to start a VM
> > with a small disk and then the disk is enlarged when QEMU hits a ENOSPC
> > condition.
> > 
> > To this end, the management software has to be notified when QEMU
> > encounters ENOSPC. The most straightforward solution is to extend QMP's
> > BLOCK_IO_ERROR event with that information.
> > 
> > This series does exactly that. The approach taken is the simplest possible:
> > the BLOCK_IO_ERROR event is extended to contain a "nospace" key, which
> > will be true whenever the guest runs out of space *and* werror=stop|enospc.
> > Here's an example:
> > 
> > { "event": "BLOCK_IO_ERROR",
> >     "data": { "device": "ide0-hd1",
> > 	          "operation": "write",
> > 			  "action": "stop",
> > 			  "nospace": true },
> >     "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
> > 
> > There are three important things to observe:
> > 
> >  1. query-block already supports querying the event by means of the
> >     "io-status" key. Actually, "nospace" and "io-status" keys share
> >     the same semantics. This is a big advantage of this approach, no
> >     further extension of query-block is needed
> > 
> >  2. The event could also contain an error message key for debugging,
> >     But if we add it to the event, should we add it to query-block too?
> 
> I don't think it's strictly necessary, but I can imagine that it would
> be a very nice feature for debugging if you could check after that fact
> what caused the VM stop even if you don't have a QMP log with the event.
> 
> >  3. I'm not extending BLOCK_JOB_ERROR. The reason is that it seems
> >     that BLOCK_IO_ERROR is also emitted on BLOCK_JOB_ERROR
> 
> Hm, I can't see this in the code. Where do I need to look?
> 
> Or did you get both a BLOCK_JOB_ERROR and a BLOCK_IO_ERROR because the
> guest tried to access the image, too, and caused a separate error?

Yes. I saw this behavior while testing my code where BLOCK_IO_ERROR always
followed BLOCK_JOB_ERROR. I assumed this was not a coincidence. If I'm
wrong I can just extend BLOCK_JOB_ERROR too.

> 
> > Now, this series is an RFC because there's an alternative solution for
> > this problem: instead of extending the BLOCK_IO_ERROR event with no-space
> > indicator, we could have a stringfied errno. This way management apps
> > would also be able to distinguish among other errors.
> 
> I don't think sending errnos is a good approach (but if we took it,
> we should use an enum rather than strings) and prefer exposing the
> exact information that is actually needed.
> 
> > For example, we could have a "error-details" dict containing a
> > "reason" and a "message" key:
> > 
> > { "event": "BLOCK_IO_ERROR",
> >     "data": { "device": "ide0-hd1",
> >               "operation": "write",
> >               "action": "stop",
> > 			  "error-details": { "reason": "eio", "message": "I/O error" },
> >     "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
> > 
> > And then query-block would have to be extended to contain the same
> > information.
> > 
> > IMO, this series implementation is good enough for the requirement we
> > currently have but I'm open to go complex if needed.
> 
> Agreed. I would like to see the human-readable strerror() string added,
> but that doesn't make this series any worse as a first step:
> 
> Acked-by: Kevin Wolf <kwolf@redhat.com>
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [RFC 3/3] QMP: extend BLOCK_IO_ERROR event with no-space indicator
  2014-08-05 12:19   ` Eric Blake
@ 2014-08-14 13:07     ` Luiz Capitulino
  0 siblings, 0 replies; 15+ messages in thread
From: Luiz Capitulino @ 2014-08-14 13:07 UTC (permalink / raw)
  To: Eric Blake; +Cc: kwolf, pbonzini, qemu-devel, armbru

On Tue, 05 Aug 2014 06:19:27 -0600
Eric Blake <eblake@redhat.com> wrote:

> On 07/23/2014 07:17 AM, Luiz Capitulino wrote:
> > Management software, such as OpenStack and RHEV's vdsm, want to be able
> > to allocate disk space on demand. The basic use case is to start a VM
> > with a small disk and then the disk is enlarged when QEMU hits a ENOSPC
> > condition.
> 
> I'd still like feedback from OpenStack or vdsm folks stating what they
> do with this information, if a bool for ENOSPC is good enough.

You're right, that's the most important item for this series. Do you know
whom I should contact from OpenStack?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [RFC 2/3] QMP: rate limit BLOCK_IO_ERROR
  2014-08-11  8:17   ` Daniel P. Berrange
  2014-08-11 11:07     ` Markus Armbruster
@ 2014-08-14 13:13     ` Luiz Capitulino
  1 sibling, 0 replies; 15+ messages in thread
From: Luiz Capitulino @ 2014-08-14 13:13 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: kwolf, pbonzini, qemu-devel, armbru

On Mon, 11 Aug 2014 09:17:19 +0100
"Daniel P. Berrange" <berrange@redhat.com> wrote:

> On Wed, Jul 23, 2014 at 09:17:17AM -0400, Luiz Capitulino wrote:
> > This event has the same characteristics of the other rate-limited
> > events, mainly we can emit dozens of it. Rate limit it then.
> > 
> > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> > ---
> >  monitor.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/monitor.c b/monitor.c
> > index 5bc70a6..33abe6c 100644
> > --- a/monitor.c
> > +++ b/monitor.c
> > @@ -589,6 +589,7 @@ static void monitor_qapi_event_init(void)
> >      monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_REPORT_BAD, 1000);
> >      monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_FAILURE, 1000);
> >      monitor_qapi_event_throttle(QAPI_EVENT_VSERPORT_CHANGE, 1000);
> > +    monitor_qapi_event_throttle(QAPI_EVENT_BLOCK_IO_ERROR, 1000);
> 
> 
> The rate limiting code only rate limits at the granularity of
> individual event types. If there is context sensitive data associated
> with events then the rate limiting will cause problems for applications
> tracking the events.
> 
> eg consider with the simpler RTC CHANGE events if we get
> 
>    QAPI_EVENT_RTC_CHANGE offset=30
>    QAPI_EVENT_RTC_CHANGE offset=700
>    QAPI_EVENT_RTC_CHANGE offset=340
> 
> then rate limiting will mean that the application only receives
> 
>    QAPI_EVENT_RTC_CHANGE offset=340
> 
> This is fine because the application will always end up with a correct
> view of the current system state.
> 
> 
> For the BLOCK IO ERROR events this does not work because the events are
> device and operation specific.
> 
>   QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=read action=stop
>   QAPI_EVENT_BLOCK_IO_ERROR dev=scsi1-hd2 op=write action=stop
>   QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=write action=stop
> 
> with throttling the app wll only receive
> 
>   QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=write action=stop
> 
> which means it will have an *incorrect* view of the system state because
> the info about  scsi1-hd2 is irretrievably lost, likewise info about the
> read operation of ide0-hd1.

You're completely right, of course. Thanks for reviewing!

I think I'll just drop this patch for now.

> 
> If you want to throttle BLOCK IO ERROR events, then you need to make the
> monitor throttling more intelligent, so that it hashes on all the contextual
> state. In this case you'd have to throttle based on (event, dev, op) to get
> correct application behaviour.
> 
> Regards,
> Daniel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [RFC 2/3] QMP: rate limit BLOCK_IO_ERROR
  2014-08-11 11:15       ` Daniel P. Berrange
@ 2014-08-17  6:08         ` Paolo Bonzini
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2014-08-17  6:08 UTC (permalink / raw)
  To: Daniel P. Berrange, Markus Armbruster; +Cc: kwolf, qemu-devel, Luiz Capitulino

Il 11/08/2014 13:15, Daniel P. Berrange ha scritto:
>> > 1. Don't throttle.  Client can rely on events as long as it keeps the
>> >    QMP connection alive.  Client should poll after establishing the QMP
>> >    connection.
> A malicious guest OS can flood libvirt with events in this way. Of course
> even if we throttle, a compromised QEMU can still flood libvirt. The only
> fail-safe protection is for libvirt to detect flooding and throttle the
> rate at which it talks to the (malicious) QEMU.
> 

If you use rerror=stop,werror=stop, only a limited error can be passed
down to libvirt before libvirt invokes the "cont" command and there's no
need to do any throttling.

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2014-08-17  6:08 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-23 13:17 [Qemu-devel] [RFC 0/3] QMP: extend BLOCK_IO_ERROR event Luiz Capitulino
2014-07-23 13:17 ` [Qemu-devel] [RFC 1/3] qapi: block-core.json: improve query-block doc Luiz Capitulino
2014-08-05 12:13   ` Eric Blake
2014-07-23 13:17 ` [Qemu-devel] [RFC 2/3] QMP: rate limit BLOCK_IO_ERROR Luiz Capitulino
2014-08-05 12:14   ` Eric Blake
2014-08-11  8:17   ` Daniel P. Berrange
2014-08-11 11:07     ` Markus Armbruster
2014-08-11 11:15       ` Daniel P. Berrange
2014-08-17  6:08         ` Paolo Bonzini
2014-08-14 13:13     ` Luiz Capitulino
2014-07-23 13:17 ` [Qemu-devel] [RFC 3/3] QMP: extend BLOCK_IO_ERROR event with no-space indicator Luiz Capitulino
2014-08-05 12:19   ` Eric Blake
2014-08-14 13:07     ` Luiz Capitulino
2014-08-05  9:13 ` [Qemu-devel] [RFC 0/3] QMP: extend BLOCK_IO_ERROR event Kevin Wolf
2014-08-14 13:05   ` Luiz Capitulino

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.