All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [RFC 0/2]: QMP DISK_ERROR event
@ 2010-02-01 18:07 Luiz Capitulino
  2010-02-01 18:07 ` [Qemu-devel] [PATCH 1/2] block: Introduce drive_get_err_action() Luiz Capitulino
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Luiz Capitulino @ 2010-02-01 18:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf

 Hi there,

 I've been requested by libvirt guys to add a QMP event for disk I/O errors,
this is what this series is about.

 It's a RFC because I need feedback on the following:

1. drive_get_on_error() is called on all disk errors, right?
2. I've tested only ENOSPC errors, is there a way to test other errors? Like
read ones?
3. Is this the right approach at all? :)

 Thanks.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Qemu-devel] [PATCH 1/2] block: Introduce drive_get_err_action()
  2010-02-01 18:07 [Qemu-devel] [RFC 0/2]: QMP DISK_ERROR event Luiz Capitulino
@ 2010-02-01 18:07 ` Luiz Capitulino
  2010-02-01 18:07 ` [Qemu-devel] [PATCH 2/2] QMP: Introduce DISK_ERROR event Luiz Capitulino
  2010-02-02  9:25 ` [Qemu-devel] Re: [RFC 0/2]: QMP " Kevin Wolf
  2 siblings, 0 replies; 7+ messages in thread
From: Luiz Capitulino @ 2010-02-01 18:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf

It's really drive_get_on_error()'s current body. This is needed
so that the disk error event code can also use the returned action.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
 vl.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/vl.c b/vl.c
index 6f1e1ab..57c439d 100644
--- a/vl.c
+++ b/vl.c
@@ -1843,7 +1843,7 @@ const char *drive_get_serial(BlockDriverState *bdrv)
     return "\0";
 }
 
-BlockInterfaceErrorAction drive_get_on_error(
+static BlockInterfaceErrorAction drive_get_err_action(
     BlockDriverState *bdrv, int is_read)
 {
     DriveInfo *dinfo;
@@ -1856,6 +1856,12 @@ BlockInterfaceErrorAction drive_get_on_error(
     return is_read ? BLOCK_ERR_REPORT : BLOCK_ERR_STOP_ENOSPC;
 }
 
+BlockInterfaceErrorAction drive_get_on_error(
+    BlockDriverState *bdrv, int is_read)
+{
+    return drive_get_err_action(bdrv, is_read);
+}
+
 static void bdrv_format_print(void *opaque, const char *name)
 {
     fprintf(stderr, " %s", name);
-- 
1.6.6

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [Qemu-devel] [PATCH 2/2] QMP: Introduce DISK_ERROR event
  2010-02-01 18:07 [Qemu-devel] [RFC 0/2]: QMP DISK_ERROR event Luiz Capitulino
  2010-02-01 18:07 ` [Qemu-devel] [PATCH 1/2] block: Introduce drive_get_err_action() Luiz Capitulino
@ 2010-02-01 18:07 ` Luiz Capitulino
  2010-02-02  9:30   ` [Qemu-devel] " Kevin Wolf
  2010-02-02  9:25 ` [Qemu-devel] Re: [RFC 0/2]: QMP " Kevin Wolf
  2 siblings, 1 reply; 7+ messages in thread
From: Luiz Capitulino @ 2010-02-01 18:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf

It's emitted when a disk write or read fails, some device information
is provided. We can also provide error details in the future.

Example:

{ "event": "DISK_ERROR",
    "data": { "device": "ide0-hd1",
              "operation": "write",
              "action": "stop" }
    "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }

NOTE: Adding a small reference in QMP/qmp-events.txt, but this file is
wrong and will be replaced by proper documentation shortly.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
 QMP/qmp-events.txt |    7 +++++++
 monitor.c          |    3 +++
 monitor.h          |    1 +
 vl.c               |   34 +++++++++++++++++++++++++++++++++-
 4 files changed, 44 insertions(+), 1 deletions(-)

diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt
index dc48ccc..e968ef5 100644
--- a/QMP/qmp-events.txt
+++ b/QMP/qmp-events.txt
@@ -43,3 +43,10 @@ Data: 'server' and 'client' keys with the same keys as 'query-vnc'.
 
 Description: Issued when the VNC session is made active.
 Data: 'server' and 'client' keys with the same keys as 'query-vnc'.
+
+7 DISK_ERROR
+------------
+
+Description: Issued when a disk I/O error occurs
+Data: 'device' (device name), 'action' (action to be taken),
+      'operation' ("read" or "write")
diff --git a/monitor.c b/monitor.c
index fb7c572..82edd79 100644
--- a/monitor.c
+++ b/monitor.c
@@ -378,6 +378,9 @@ void monitor_protocol_event(MonitorEvent event, QObject *data)
         case QEVENT_VNC_DISCONNECTED:
             event_name = "VNC_DISCONNECTED";
             break;
+        case QEVENT_DISK_ERROR:
+            event_name = "DISK_ERROR";
+            break;
         default:
             abort();
             break;
diff --git a/monitor.h b/monitor.h
index b0f9270..beaddaf 100644
--- a/monitor.h
+++ b/monitor.h
@@ -23,6 +23,7 @@ typedef enum MonitorEvent {
     QEVENT_VNC_CONNECTED,
     QEVENT_VNC_INITIALIZED,
     QEVENT_VNC_DISCONNECTED,
+    QEVENT_DISK_ERROR,
     QEVENT_MAX,
 } MonitorEvent;
 
diff --git a/vl.c b/vl.c
index 57c439d..1f69f56 100644
--- a/vl.c
+++ b/vl.c
@@ -1856,10 +1856,42 @@ static BlockInterfaceErrorAction drive_get_err_action(
     return is_read ? BLOCK_ERR_REPORT : BLOCK_ERR_STOP_ENOSPC;
 }
 
+static void driver_err_event(
+    BlockInterfaceErrorAction action, int is_read, const char *device)
+{
+    QObject *data;
+    const char *action_str;
+
+    switch (action) {
+    case BLOCK_ERR_REPORT:
+        action_str = "report";
+        break;
+    case BLOCK_ERR_IGNORE:
+        action_str = "ignore";
+        break;
+    case BLOCK_ERR_STOP_ANY:
+    case BLOCK_ERR_STOP_ENOSPC:
+        action_str = "stop";
+        break;
+    default:
+        abort();
+    }
+
+    data = qobject_from_jsonf("{'device': %s, 'action': %s, 'operation': %s}",
+                              device, action_str, is_read ? "read" : "write");
+    monitor_protocol_event(QEVENT_DISK_ERROR, data);
+    qobject_decref(data);
+}
+
 BlockInterfaceErrorAction drive_get_on_error(
     BlockDriverState *bdrv, int is_read)
 {
-    return drive_get_err_action(bdrv, is_read);
+    BlockInterfaceErrorAction action;
+
+    action = drive_get_err_action(bdrv, is_read);
+    driver_err_event(action, is_read, bdrv->device_name);
+
+    return action;
 }
 
 static void bdrv_format_print(void *opaque, const char *name)
-- 
1.6.6

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [Qemu-devel] Re: [RFC 0/2]: QMP DISK_ERROR event
  2010-02-01 18:07 [Qemu-devel] [RFC 0/2]: QMP DISK_ERROR event Luiz Capitulino
  2010-02-01 18:07 ` [Qemu-devel] [PATCH 1/2] block: Introduce drive_get_err_action() Luiz Capitulino
  2010-02-01 18:07 ` [Qemu-devel] [PATCH 2/2] QMP: Introduce DISK_ERROR event Luiz Capitulino
@ 2010-02-02  9:25 ` Kevin Wolf
  2010-02-02 12:17   ` Luiz Capitulino
  2 siblings, 1 reply; 7+ messages in thread
From: Kevin Wolf @ 2010-02-02  9:25 UTC (permalink / raw)
  To: Luiz Capitulino; +Cc: qemu-devel

Hi Luiz,

Am 01.02.2010 19:07, schrieb Luiz Capitulino:
>  Hi there,
> 
>  I've been requested by libvirt guys to add a QMP event for disk I/O errors,
> this is what this series is about.
> 
>  It's a RFC because I need feedback on the following:
> 
> 1. drive_get_on_error() is called on all disk errors, right?

Well, yes, it is for all devices that support rerror/werror. But it also
might be called in other situations. Look at the "get" in the function
name, it's really a getter function and not a event handler.

> 2. I've tested only ENOSPC errors, is there a way to test other errors? Like
> read ones?

So you'll probably want some EIO. Some recent bugs I've been handling
were a about images on NFS when the NFS server want away. It's a
reliable way to get EIO (mount with -osoft and small timeouts). I guess
qemu-nbd and the nbd: protocol might work, too.

Or maybe copy the start of a qcow2 image to a too small device.

> 3. Is this the right approach at all? :)

Yes and no. As I said above, drive_get_on_error() is not the right place
to do it. Unfortunately it looks like there isn't a single generic place
where it can be done, but the call to the event handler must be added to
every device.

Kevin

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Qemu-devel] Re: [PATCH 2/2] QMP: Introduce DISK_ERROR event
  2010-02-01 18:07 ` [Qemu-devel] [PATCH 2/2] QMP: Introduce DISK_ERROR event Luiz Capitulino
@ 2010-02-02  9:30   ` Kevin Wolf
  0 siblings, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2010-02-02  9:30 UTC (permalink / raw)
  To: Luiz Capitulino; +Cc: qemu-devel

Am 01.02.2010 19:07, schrieb Luiz Capitulino:
> It's emitted when a disk write or read fails, some device information
> is provided. We can also provide error details in the future.
> 
> Example:
> 
> { "event": "DISK_ERROR",
>     "data": { "device": "ide0-hd1",
>               "operation": "write",
>               "action": "stop" }
>     "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
> 
> NOTE: Adding a small reference in QMP/qmp-events.txt, but this file is
> wrong and will be replaced by proper documentation shortly.
> 
> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> ---
>  QMP/qmp-events.txt |    7 +++++++
>  monitor.c          |    3 +++
>  monitor.h          |    1 +
>  vl.c               |   34 +++++++++++++++++++++++++++++++++-
>  4 files changed, 44 insertions(+), 1 deletions(-)
> 
> diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt
> index dc48ccc..e968ef5 100644
> --- a/QMP/qmp-events.txt
> +++ b/QMP/qmp-events.txt
> @@ -43,3 +43,10 @@ Data: 'server' and 'client' keys with the same keys as 'query-vnc'.
>  
>  Description: Issued when the VNC session is made active.
>  Data: 'server' and 'client' keys with the same keys as 'query-vnc'.
> +
> +7 DISK_ERROR
> +------------
> +
> +Description: Issued when a disk I/O error occurs
> +Data: 'device' (device name), 'action' (action to be taken),
> +      'operation' ("read" or "write")
> diff --git a/monitor.c b/monitor.c
> index fb7c572..82edd79 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -378,6 +378,9 @@ void monitor_protocol_event(MonitorEvent event, QObject *data)
>          case QEVENT_VNC_DISCONNECTED:
>              event_name = "VNC_DISCONNECTED";
>              break;
> +        case QEVENT_DISK_ERROR:
> +            event_name = "DISK_ERROR";
> +            break;
>          default:
>              abort();
>              break;
> diff --git a/monitor.h b/monitor.h
> index b0f9270..beaddaf 100644
> --- a/monitor.h
> +++ b/monitor.h
> @@ -23,6 +23,7 @@ typedef enum MonitorEvent {
>      QEVENT_VNC_CONNECTED,
>      QEVENT_VNC_INITIALIZED,
>      QEVENT_VNC_DISCONNECTED,
> +    QEVENT_DISK_ERROR,
>      QEVENT_MAX,
>  } MonitorEvent;
>  
> diff --git a/vl.c b/vl.c
> index 57c439d..1f69f56 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1856,10 +1856,42 @@ static BlockInterfaceErrorAction drive_get_err_action(
>      return is_read ? BLOCK_ERR_REPORT : BLOCK_ERR_STOP_ENOSPC;
>  }
>  
> +static void driver_err_event(
> +    BlockInterfaceErrorAction action, int is_read, const char *device)
> +{
> +    QObject *data;
> +    const char *action_str;
> +
> +    switch (action) {
> +    case BLOCK_ERR_REPORT:
> +        action_str = "report";
> +        break;
> +    case BLOCK_ERR_IGNORE:
> +        action_str = "ignore";
> +        break;
> +    case BLOCK_ERR_STOP_ANY:
> +    case BLOCK_ERR_STOP_ENOSPC:
> +        action_str = "stop";

This is wrong. If it's BLOCK_ERR_STOP_ENOSPC, the action taken depends
on the error code. It might as well be a "report" instead of "stop" if
it was an EIO, for example.

But the problem is probably going to go away when you stop abusing a
getter function and add some calls that are explicitly made for your
requirements.

Kevin

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Qemu-devel] Re: [RFC 0/2]: QMP DISK_ERROR event
  2010-02-02  9:25 ` [Qemu-devel] Re: [RFC 0/2]: QMP " Kevin Wolf
@ 2010-02-02 12:17   ` Luiz Capitulino
  2010-02-02 12:19     ` Kevin Wolf
  0 siblings, 1 reply; 7+ messages in thread
From: Luiz Capitulino @ 2010-02-02 12:17 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel

On Tue, 02 Feb 2010 10:25:19 +0100
Kevin Wolf <kwolf@redhat.com> wrote:

> Hi Luiz,
> 
> Am 01.02.2010 19:07, schrieb Luiz Capitulino:
> >  Hi there,
> > 
> >  I've been requested by libvirt guys to add a QMP event for disk I/O errors,
> > this is what this series is about.
> > 
> >  It's a RFC because I need feedback on the following:
> > 
> > 1. drive_get_on_error() is called on all disk errors, right?
> 
> Well, yes, it is for all devices that support rerror/werror. But it also
> might be called in other situations. Look at the "get" in the function
> name, it's really a getter function and not a event handler.
> 
> > 2. I've tested only ENOSPC errors, is there a way to test other errors? Like
> > read ones?
> 
> So you'll probably want some EIO. Some recent bugs I've been handling
> were a about images on NFS when the NFS server want away. It's a
> reliable way to get EIO (mount with -osoft and small timeouts). I guess
> qemu-nbd and the nbd: protocol might work, too.
> 
> Or maybe copy the start of a qcow2 image to a too small device.

 Thanks!

> > 3. Is this the right approach at all? :)
> 
> Yes and no. As I said above, drive_get_on_error() is not the right place
> to do it. Unfortunately it looks like there isn't a single generic place
> where it can be done, but the call to the event handler must be added to
> every device.

 Can't it be added to subsystems? Like ide, virtio etc?

 Maybe in the same function that calls driver_get_on_error()?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Qemu-devel] Re: [RFC 0/2]: QMP DISK_ERROR event
  2010-02-02 12:17   ` Luiz Capitulino
@ 2010-02-02 12:19     ` Kevin Wolf
  0 siblings, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2010-02-02 12:19 UTC (permalink / raw)
  To: Luiz Capitulino; +Cc: qemu-devel

Am 02.02.2010 13:17, schrieb Luiz Capitulino:
>>> 3. Is this the right approach at all? :)
>>
>> Yes and no. As I said above, drive_get_on_error() is not the right place
>> to do it. Unfortunately it looks like there isn't a single generic place
>> where it can be done, but the call to the event handler must be added to
>> every device.
> 
>  Can't it be added to subsystems? Like ide, virtio etc?
> 
>  Maybe in the same function that calls driver_get_on_error()?

This is what I meant by devices, yes. Putting it into the same function
sounds good, too.

Kevin

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2010-02-02 12:20 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-02-01 18:07 [Qemu-devel] [RFC 0/2]: QMP DISK_ERROR event Luiz Capitulino
2010-02-01 18:07 ` [Qemu-devel] [PATCH 1/2] block: Introduce drive_get_err_action() Luiz Capitulino
2010-02-01 18:07 ` [Qemu-devel] [PATCH 2/2] QMP: Introduce DISK_ERROR event Luiz Capitulino
2010-02-02  9:30   ` [Qemu-devel] " Kevin Wolf
2010-02-02  9:25 ` [Qemu-devel] Re: [RFC 0/2]: QMP " Kevin Wolf
2010-02-02 12:17   ` Luiz Capitulino
2010-02-02 12:19     ` Kevin Wolf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.