* [Qemu-devel] [PATCH v4 00/10] sheepdog: reconnect server after connection failure
@ 2013-07-26  6:10 MORITA Kazutaka
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 01/10] ignore SIGPIPE in qemu-img and qemu-io MORITA Kazutaka
                   ` (10 more replies)
  0 siblings, 11 replies; 19+ messages in thread
From: MORITA Kazutaka @ 2013-07-26  6:10 UTC (permalink / raw)
  To: Kevin Wolf, Stefan Hajnoczi, qemu-devel; +Cc: Paolo Bonzini, sheepdog, Liu Yuan

Currently, if a sheepdog server exits, all the connected VMs need to
be restarted.  This series implements a feature to reconnect to the
server, which enables online sheepdog upgrades and avoids restarting
VMs when sheepdog servers crash unexpectedly.

v4:
 - Added comment to explain why we need a failed queue.
 - Fixed a return value of sd_acb_cancelable().

v3:
 - Check return values of qemu_co_recv/send more strictly.
 - Move inflight requests to the failed list after reconnection
   completes.  This is necessary to resend I/Os while the connection is
   lost.
 - Check simultaneous create in resend_aioreq().

v2:
 - Dropped nonblocking connect patches.

MORITA Kazutaka (10):
  ignore SIGPIPE in qemu-img and qemu-io
  iov: handle EOF in iov_send_recv
  sheepdog: check return values of qemu_co_recv/send correctly
  sheepdog: handle vdi objects in resend_aio_req
  sheepdog: reload inode outside of resend_aioreq
  coroutine: add co_aio_sleep_ns() to allow sleep in block drivers
  sheepdog: try to reconnect to sheepdog after network error
  sheepdog: make add_aio_request and send_aioreq void functions
  sheepdog: cancel aio requests if possible
  sheepdog: check simultaneous create in resend_aioreq

 block/sheepdog.c          | 320 +++++++++++++++++++++++++++++-----------------
 include/block/coroutine.h |   8 ++
 qemu-coroutine-sleep.c    |  47 +++++++
 qemu-img.c                |   4 +
 qemu-io.c                 |   4 +
 util/iov.c                |   6 +
 6 files changed, 269 insertions(+), 120 deletions(-)

-- 
1.8.1.3.566.gaa39828

* [Qemu-devel] [PATCH v4 01/10] ignore SIGPIPE in qemu-img and qemu-io
  2013-07-26  6:10 [Qemu-devel] [PATCH v4 00/10] sheepdog: reconnect server after connection failure MORITA Kazutaka
@ 2013-07-26  6:10 ` MORITA Kazutaka
  2013-07-30 13:41   ` Stefan Hajnoczi
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 02/10] iov: handle EOF in iov_send_recv MORITA Kazutaka
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 19+ messages in thread
From: MORITA Kazutaka @ 2013-07-26  6:10 UTC (permalink / raw)
  To: Kevin Wolf, Stefan Hajnoczi, qemu-devel; +Cc: Paolo Bonzini, sheepdog, Liu Yuan

This prevents the tools from being killed by SIGPIPE when they write
data to a connection that has been closed on the other side.
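
As a standalone illustration (not part of this patch), the difference
SIG_IGN makes can be seen in a few lines of C: without it, writing to a
pipe or socket whose read side is closed kills the process with
SIGPIPE; with it, the write simply fails with EPIPE:

    #include <errno.h>
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        int fds[2];
        char c = 0;

        signal(SIGPIPE, SIG_IGN);  /* comment this out and the write is fatal */
        if (pipe(fds) < 0) {
            return 1;
        }
        close(fds[0]);             /* simulate the peer closing its end */
        if (write(fds[1], &c, 1) < 0) {
            printf("write failed: %s\n", strerror(errno));  /* EPIPE */
        }
        return 0;
    }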

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
---
 qemu-img.c | 4 ++++
 qemu-io.c  | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/qemu-img.c b/qemu-img.c
index c55ca5c..919d464 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2319,6 +2319,10 @@ int main(int argc, char **argv)
     const img_cmd_t *cmd;
     const char *cmdname;
 
+#ifdef CONFIG_POSIX
+    signal(SIGPIPE, SIG_IGN);
+#endif
+
     error_set_progname(argv[0]);
 
     qemu_init_main_loop();
diff --git a/qemu-io.c b/qemu-io.c
index cb9def5..d54dc86 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -335,6 +335,10 @@ int main(int argc, char **argv)
     int opt_index = 0;
     int flags = BDRV_O_UNMAP;
 
+#ifdef CONFIG_POSIX
+    signal(SIGPIPE, SIG_IGN);
+#endif
+
     progname = basename(argv[0]);
 
     while ((c = getopt_long(argc, argv, sopt, lopt, &opt_index)) != -1) {
-- 
1.8.1.3.566.gaa39828

* [Qemu-devel] [PATCH v4 02/10] iov: handle EOF in iov_send_recv
  2013-07-26  6:10 [Qemu-devel] [PATCH v4 00/10] sheepdog: reconnect server after connection failure MORITA Kazutaka
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 01/10] ignore SIGPIPE in qemu-img and qemu-io MORITA Kazutaka
@ 2013-07-26  6:10 ` MORITA Kazutaka
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 03/10] sheepdog: check return values of qemu_co_recv/send correctly MORITA Kazutaka
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: MORITA Kazutaka @ 2013-07-26  6:10 UTC (permalink / raw)
  To: Kevin Wolf, Stefan Hajnoczi, qemu-devel; +Cc: Paolo Bonzini, sheepdog, Liu Yuan

Without this patch, iov_send_recv() never returns when do_send_recv()
returns zero.
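
To see the hazard in isolation, here is a simplified sketch of this
kind of receive loop (not the actual util/iov.c code): if recv()
returning 0 on an orderly shutdown is not treated as EOF, the offset
never advances and the loop spins forever:

    #include <sys/socket.h>
    #include <sys/types.h>

    /* Simplified sketch, not the actual iov_send_recv() implementation. */
    static ssize_t recv_all(int sockfd, char *buf, size_t bytes)
    {
        size_t offset = 0;

        while (offset < bytes) {
            ssize_t ret = recv(sockfd, buf + offset, bytes - offset, 0);
            if (ret < 0) {
                return -1;  /* a real error */
            }
            if (ret == 0) {
                break;      /* orderly shutdown by the peer; without this
                             * check the loop never terminates */
            }
            offset += ret;
        }
        return offset;
    }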

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
---
 util/iov.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/util/iov.c b/util/iov.c
index cc6e837..f705586 100644
--- a/util/iov.c
+++ b/util/iov.c
@@ -202,6 +202,12 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
             return -1;
         }
 
+        if (ret == 0 && !do_send) {
+            /* recv returns 0 when the peer has performed an orderly
+             * shutdown. */
+            break;
+        }
+
         /* Prepare for the next iteration */
         offset += ret;
         total += ret;
-- 
1.8.1.3.566.gaa39828

* [Qemu-devel] [PATCH v4 03/10] sheepdog: check return values of qemu_co_recv/send correctly
  2013-07-26  6:10 [Qemu-devel] [PATCH v4 00/10] sheepdog: reconnect server after connection failure MORITA Kazutaka
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 01/10] ignore SIGPIPE in qemu-img and qemu-io MORITA Kazutaka
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 02/10] iov: handle EOF in iov_send_recv MORITA Kazutaka
@ 2013-07-26  6:10 ` MORITA Kazutaka
  2013-07-30 13:48   ` Stefan Hajnoczi
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 04/10] sheepdog: handle vdi objects in resend_aio_req MORITA Kazutaka
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 19+ messages in thread
From: MORITA Kazutaka @ 2013-07-26  6:10 UTC (permalink / raw)
  To: Kevin Wolf, Stefan Hajnoczi, qemu-devel; +Cc: Paolo Bonzini, sheepdog, Liu Yuan

If qemu_co_recv/send doesn't return the requested length, it means
that an error occurred.

Tested-and-reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
---
 block/sheepdog.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 6a41ad9..c6e9b89 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -489,13 +489,13 @@ static coroutine_fn int send_co_req(int sockfd, SheepdogReq *hdr, void *data,
     int ret;
 
     ret = qemu_co_send(sockfd, hdr, sizeof(*hdr));
-    if (ret < sizeof(*hdr)) {
+    if (ret != sizeof(*hdr)) {
         error_report("failed to send a req, %s", strerror(errno));
         return ret;
     }
 
     ret = qemu_co_send(sockfd, data, *wlen);
-    if (ret < *wlen) {
+    if (ret != *wlen) {
         error_report("failed to send a req, %s", strerror(errno));
     }
 
@@ -548,7 +548,7 @@ static coroutine_fn void do_co_req(void *opaque)
     qemu_aio_set_fd_handler(sockfd, restart_co_req, NULL, have_co_req, co);
 
     ret = qemu_co_recv(sockfd, hdr, sizeof(*hdr));
-    if (ret < sizeof(*hdr)) {
+    if (ret != sizeof(*hdr)) {
         error_report("failed to get a rsp, %s", strerror(errno));
         ret = -errno;
         goto out;
@@ -560,7 +560,7 @@ static coroutine_fn void do_co_req(void *opaque)
 
     if (*rlen) {
         ret = qemu_co_recv(sockfd, data, *rlen);
-        if (ret < *rlen) {
+        if (ret != *rlen) {
             error_report("failed to get the data, %s", strerror(errno));
             ret = -errno;
             goto out;
@@ -671,7 +671,7 @@ static void coroutine_fn aio_read_response(void *opaque)
 
     /* read a header */
     ret = qemu_co_recv(fd, &rsp, sizeof(rsp));
-    if (ret < 0) {
+    if (ret != sizeof(rsp)) {
         error_report("failed to get the header, %s", strerror(errno));
         goto out;
     }
@@ -722,7 +722,7 @@ static void coroutine_fn aio_read_response(void *opaque)
     case AIOCB_READ_UDATA:
         ret = qemu_co_recvv(fd, acb->qiov->iov, acb->qiov->niov,
                             aio_req->iov_offset, rsp.data_length);
-        if (ret < 0) {
+        if (ret != rsp.data_length) {
             error_report("failed to get the data, %s", strerror(errno));
             goto out;
         }
@@ -1075,7 +1075,7 @@ static int coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
 
     /* send a header */
     ret = qemu_co_send(s->fd, &hdr, sizeof(hdr));
-    if (ret < 0) {
+    if (ret != sizeof(hdr)) {
         qemu_co_mutex_unlock(&s->lock);
         error_report("failed to send a req, %s", strerror(errno));
         return -errno;
@@ -1083,7 +1083,7 @@ static int coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
 
     if (wlen) {
         ret = qemu_co_sendv(s->fd, iov, niov, aio_req->iov_offset, wlen);
-        if (ret < 0) {
+        if (ret != wlen) {
             qemu_co_mutex_unlock(&s->lock);
             error_report("failed to send a data, %s", strerror(errno));
             return -errno;
-- 
1.8.1.3.566.gaa39828

* [Qemu-devel] [PATCH v4 04/10] sheepdog: handle vdi objects in resend_aio_req
  2013-07-26  6:10 [Qemu-devel] [PATCH v4 00/10] sheepdog: reconnect server after connection failure MORITA Kazutaka
                   ` (2 preceding siblings ...)
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 03/10] sheepdog: check return values of qemu_co_recv/send correctly MORITA Kazutaka
@ 2013-07-26  6:10 ` MORITA Kazutaka
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 05/10] sheepdog: reload inode outside of resend_aioreq MORITA Kazutaka
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: MORITA Kazutaka @ 2013-07-26  6:10 UTC (permalink / raw)
  To: Kevin Wolf, Stefan Hajnoczi, qemu-devel; +Cc: Paolo Bonzini, sheepdog, Liu Yuan

The current resend_aio_req() doesn't work when the request is against
a vdi object.  This patch fixes the problem.
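
For context, a sheepdog object ID packs a vdi/data flag, the vdi id,
and a data object index into one 64-bit value, which is why the two
kinds of objects must be recomputed differently after an inode reload.
A standalone sketch of the encoding follows; the constant values are
assumptions mirroring block/sheepdog.c:

    #include <inttypes.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define VDI_SPACE_SHIFT 32                   /* assumed, see sheepdog.c */
    #define VDI_BIT         (UINT64_C(1) << 63)  /* assumed, see sheepdog.c */

    static bool is_data_obj(uint64_t oid)
    {
        return !(oid & VDI_BIT);
    }

    int main(void)
    {
        uint32_t vid = 0x123456;
        uint64_t data_oid = ((uint64_t)vid << VDI_SPACE_SHIFT) | 7; /* idx 7 */
        uint64_t vdi_oid  = VDI_BIT | ((uint64_t)vid << VDI_SPACE_SHIFT);

        /* resend_aioreq() must handle these two cases separately: */
        printf("%" PRIx64 " is a %s object\n", data_oid,
               is_data_obj(data_oid) ? "data" : "vdi");
        printf("%" PRIx64 " is a %s object\n", vdi_oid,
               is_data_obj(vdi_oid) ? "data" : "vdi");
        return 0;
    }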

Tested-and-reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
---
 block/sheepdog.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index c6e9b89..fae17ac 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -1209,11 +1209,15 @@ static int coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req)
         return ret;
     }
 
-    aio_req->oid = vid_to_data_oid(s->inode.vdi_id,
-                                   data_oid_to_idx(aio_req->oid));
+    if (is_data_obj(aio_req->oid)) {
+        aio_req->oid = vid_to_data_oid(s->inode.vdi_id,
+                                       data_oid_to_idx(aio_req->oid));
+    } else {
+        aio_req->oid = vid_to_vdi_oid(s->inode.vdi_id);
+    }
 
     /* check whether this request becomes a CoW one */
-    if (acb->aiocb_type == AIOCB_WRITE_UDATA) {
+    if (acb->aiocb_type == AIOCB_WRITE_UDATA && is_data_obj(aio_req->oid)) {
         int idx = data_oid_to_idx(aio_req->oid);
         AIOReq *areq;
 
@@ -1241,8 +1245,15 @@ static int coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req)
         create = true;
     }
 out:
-    return add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov,
-                           create, acb->aiocb_type);
+    if (is_data_obj(aio_req->oid)) {
+        return add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov,
+                               create, acb->aiocb_type);
+    } else {
+        struct iovec iov;
+        iov.iov_base = &s->inode;
+        iov.iov_len = sizeof(s->inode);
+        return add_aio_request(s, aio_req, &iov, 1, false, AIOCB_WRITE_UDATA);
+    }
 }
 
 /* TODO Convert to fine grained options */
-- 
1.8.1.3.566.gaa39828

* [Qemu-devel] [PATCH v4 05/10] sheepdog: reload inode outside of resend_aioreq
  2013-07-26  6:10 [Qemu-devel] [PATCH v4 00/10] sheepdog: reconnect server after connection failure MORITA Kazutaka
                   ` (3 preceding siblings ...)
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 04/10] sheepdog: handle vdi objects in resend_aio_req MORITA Kazutaka
@ 2013-07-26  6:10 ` MORITA Kazutaka
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 06/10] coroutine: add co_aio_sleep_ns() to allow sleep in block drivers MORITA Kazutaka
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: MORITA Kazutaka @ 2013-07-26  6:10 UTC (permalink / raw)
  To: Kevin Wolf, Stefan Hajnoczi, qemu-devel; +Cc: Paolo Bonzini, sheepdog, Liu Yuan

This prepares for using resend_aioreq() after reconnecting to the
sheepdog server.

Tested-and-reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
---
 block/sheepdog.c | 33 +++++++++++++++++++--------------
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index fae17ac..7b22816 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -222,6 +222,11 @@ static inline uint64_t data_oid_to_idx(uint64_t oid)
     return oid & (MAX_DATA_OBJS - 1);
 }
 
+static inline uint32_t oid_to_vid(uint64_t oid)
+{
+    return (oid & ~VDI_BIT) >> VDI_SPACE_SHIFT;
+}
+
 static inline uint64_t vid_to_vdi_oid(uint32_t vid)
 {
     return VDI_BIT | ((uint64_t)vid << VDI_SPACE_SHIFT);
@@ -607,7 +612,7 @@ static int coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
                            struct iovec *iov, int niov, bool create,
                            enum AIOCBState aiocb_type);
 static int coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req);
-
+static int reload_inode(BDRVSheepdogState *s, uint32_t snapid, const char *tag);
 
 static AIOReq *find_pending_req(BDRVSheepdogState *s, uint64_t oid)
 {
@@ -755,6 +760,19 @@ static void coroutine_fn aio_read_response(void *opaque)
     case SD_RES_SUCCESS:
         break;
     case SD_RES_READONLY:
+        if (s->inode.vdi_id == oid_to_vid(aio_req->oid)) {
+            ret = reload_inode(s, 0, "");
+            if (ret < 0) {
+                goto out;
+            }
+        }
+
+        if (is_data_obj(aio_req->oid)) {
+            aio_req->oid = vid_to_data_oid(s->inode.vdi_id,
+                                           data_oid_to_idx(aio_req->oid));
+        } else {
+            aio_req->oid = vid_to_vdi_oid(s->inode.vdi_id);
+        }
         ret = resend_aioreq(s, aio_req);
         if (ret == SD_RES_SUCCESS) {
             goto out;
@@ -1202,19 +1220,6 @@ static int coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req)
 {
     SheepdogAIOCB *acb = aio_req->aiocb;
     bool create = false;
-    int ret;
-
-    ret = reload_inode(s, 0, "");
-    if (ret < 0) {
-        return ret;
-    }
-
-    if (is_data_obj(aio_req->oid)) {
-        aio_req->oid = vid_to_data_oid(s->inode.vdi_id,
-                                       data_oid_to_idx(aio_req->oid));
-    } else {
-        aio_req->oid = vid_to_vdi_oid(s->inode.vdi_id);
-    }
 
     /* check whether this request becomes a CoW one */
     if (acb->aiocb_type == AIOCB_WRITE_UDATA && is_data_obj(aio_req->oid)) {
-- 
1.8.1.3.566.gaa39828

* [Qemu-devel] [PATCH v4 06/10] coroutine: add co_aio_sleep_ns() to allow sleep in block drivers
  2013-07-26  6:10 [Qemu-devel] [PATCH v4 00/10] sheepdog: reconnect server after connection failure MORITA Kazutaka
                   ` (4 preceding siblings ...)
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 05/10] sheepdog: reload inode outside of resend_aioreq MORITA Kazutaka
@ 2013-07-26  6:10 ` MORITA Kazutaka
  2013-07-30 13:58   ` Stefan Hajnoczi
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 07/10] sheepdog: try to reconnect to sheepdog after network error MORITA Kazutaka
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 19+ messages in thread
From: MORITA Kazutaka @ 2013-07-26  6:10 UTC (permalink / raw)
  To: Kevin Wolf, Stefan Hajnoczi, qemu-devel; +Cc: Paolo Bonzini, sheepdog, Liu Yuan

This helper function behaves similarly to co_sleep_ns(), but the
sleeping coroutine will be resumed when using qemu_aio_wait().
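
A usage sketch (mirroring the reconnect loop that a later patch in this
series adds to the sheepdog driver): from coroutine context, retry an
operation once per second without blocking callers of qemu_aio_wait():

    /* Sketch, using names from this series: retry connecting once per
     * second from coroutine context. */
    while (s->fd < 0) {
        s->fd = get_sheep_fd(s);
        if (s->fd < 0) {
            co_aio_sleep_ns(1000000000ULL);  /* yield for one second */
        }
    }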

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
---
 include/block/coroutine.h |  8 ++++++++
 qemu-coroutine-sleep.c    | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 55 insertions(+)

diff --git a/include/block/coroutine.h b/include/block/coroutine.h
index 377805a..23ea6e9 100644
--- a/include/block/coroutine.h
+++ b/include/block/coroutine.h
@@ -210,6 +210,14 @@ void qemu_co_rwlock_unlock(CoRwlock *lock);
 void coroutine_fn co_sleep_ns(QEMUClock *clock, int64_t ns);
 
 /**
+ * Yield the coroutine for a given duration
+ *
+ * Behaves similarly to co_sleep_ns(), but the sleeping coroutine will be
+ * resumed when using qemu_aio_wait().
+ */
+void coroutine_fn co_aio_sleep_ns(int64_t ns);
+
+/**
  * Yield until a file descriptor becomes readable
  *
  * Note that this function clobbers the handlers for the file descriptor.
diff --git a/qemu-coroutine-sleep.c b/qemu-coroutine-sleep.c
index 169ce5c..3955347 100644
--- a/qemu-coroutine-sleep.c
+++ b/qemu-coroutine-sleep.c
@@ -13,6 +13,7 @@
 
 #include "block/coroutine.h"
 #include "qemu/timer.h"
+#include "qemu/thread.h"
 
 typedef struct CoSleepCB {
     QEMUTimer *ts;
@@ -37,3 +38,49 @@ void coroutine_fn co_sleep_ns(QEMUClock *clock, int64_t ns)
     qemu_del_timer(sleep_cb.ts);
     qemu_free_timer(sleep_cb.ts);
 }
+
+typedef struct CoAioSleepCB {
+    QEMUBH *bh;
+    int64_t ns;
+    Coroutine *co;
+} CoAioSleepCB;
+
+static void co_aio_sleep_cb(void *opaque)
+{
+    CoAioSleepCB *aio_sleep_cb = opaque;
+
+    qemu_coroutine_enter(aio_sleep_cb->co, NULL);
+}
+
+static void *sleep_thread(void *opaque)
+{
+    CoAioSleepCB *aio_sleep_cb = opaque;
+    struct timespec req = {
+        .tv_sec = aio_sleep_cb->ns / 1000000000,
+        .tv_nsec = aio_sleep_cb->ns % 1000000000,
+    };
+    struct timespec rem;
+
+    while (nanosleep(&req, &rem) < 0 && errno == EINTR) {
+        req = rem;
+    }
+
+    qemu_bh_schedule(aio_sleep_cb->bh);
+
+    return NULL;
+}
+
+void coroutine_fn co_aio_sleep_ns(int64_t ns)
+{
+    CoAioSleepCB aio_sleep_cb = {
+        .ns = ns,
+        .co = qemu_coroutine_self(),
+    };
+    QemuThread thread;
+
+    aio_sleep_cb.bh = qemu_bh_new(co_aio_sleep_cb, &aio_sleep_cb);
+    qemu_thread_create(&thread, sleep_thread, &aio_sleep_cb,
+                       QEMU_THREAD_DETACHED);
+    qemu_coroutine_yield();
+    qemu_bh_delete(aio_sleep_cb.bh);
+}
-- 
1.8.1.3.566.gaa39828

* [Qemu-devel] [PATCH v4 07/10] sheepdog: try to reconnect to sheepdog after network error
  2013-07-26  6:10 [Qemu-devel] [PATCH v4 00/10] sheepdog: reconnect server after connection failure MORITA Kazutaka
                   ` (5 preceding siblings ...)
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 06/10] coroutine: add co_aio_sleep_ns() to allow sleep in block drivers MORITA Kazutaka
@ 2013-07-26  6:10 ` MORITA Kazutaka
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 08/10] sheepdog: make add_aio_request and send_aioreq void functions MORITA Kazutaka
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: MORITA Kazutaka @ 2013-07-26  6:10 UTC (permalink / raw)
  To: Kevin Wolf, Stefan Hajnoczi, qemu-devel; +Cc: Paolo Bonzini, sheepdog, Liu Yuan

This introduces a failed request queue and links all the inflight
requests to it after a network error happens.  After QEMU reconnects
to the sheepdog server successfully, the sheepdog block driver retries
all the requests in the failed queue.

Tested-and-reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
---
 block/sheepdog.c | 78 +++++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 63 insertions(+), 15 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 7b22816..3860611 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -318,8 +318,11 @@ typedef struct BDRVSheepdogState {
     Coroutine *co_recv;
 
     uint32_t aioreq_seq_num;
+
+    /* Every aio request must be linked to either of these queues. */
     QLIST_HEAD(inflight_aio_head, AIOReq) inflight_aio_head;
     QLIST_HEAD(pending_aio_head, AIOReq) pending_aio_head;
+    QLIST_HEAD(failed_aio_head, AIOReq) failed_aio_head;
 } BDRVSheepdogState;
 
 static const char * sd_strerror(int err)
@@ -613,6 +616,8 @@ static int coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
                            enum AIOCBState aiocb_type);
 static int coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req);
 static int reload_inode(BDRVSheepdogState *s, uint32_t snapid, const char *tag);
+static int get_sheep_fd(BDRVSheepdogState *s);
+static void co_write_request(void *opaque);
 
 static AIOReq *find_pending_req(BDRVSheepdogState *s, uint64_t oid)
 {
@@ -654,6 +659,50 @@ static void coroutine_fn send_pending_req(BDRVSheepdogState *s, uint64_t oid)
     }
 }
 
+static coroutine_fn void reconnect_to_sdog(void *opaque)
+{
+    BDRVSheepdogState *s = opaque;
+    AIOReq *aio_req, *next;
+
+    qemu_aio_set_fd_handler(s->fd, NULL, NULL, NULL, NULL);
+    close(s->fd);
+    s->fd = -1;
+
+    /* Wait for outstanding write requests to be completed. */
+    while (s->co_send != NULL) {
+        co_write_request(opaque);
+    }
+
+    /* Try to reconnect the sheepdog server every one second. */
+    while (s->fd < 0) {
+        s->fd = get_sheep_fd(s);
+        if (s->fd < 0) {
+            dprintf("Wait for connection to be established\n");
+            co_aio_sleep_ns(1000000000ULL);
+        }
+    };
+
+    /*
+     * Now we have to resend all the request in the inflight queue.  However,
+     * resend_aioreq() can yield and newly created requests can be added to the
+     * inflight queue before the coroutine is resumed.  To avoid mixing them, we
+     * have to move all the inflight requests to the failed queue before
+     * resend_aioreq() is called.
+     */
+    QLIST_FOREACH_SAFE(aio_req, &s->inflight_aio_head, aio_siblings, next) {
+        QLIST_REMOVE(aio_req, aio_siblings);
+        QLIST_INSERT_HEAD(&s->failed_aio_head, aio_req, aio_siblings);
+    }
+
+    /* Resend all the failed aio requests. */
+    while (!QLIST_EMPTY(&s->failed_aio_head)) {
+        aio_req = QLIST_FIRST(&s->failed_aio_head);
+        QLIST_REMOVE(aio_req, aio_siblings);
+        QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
+        resend_aioreq(s, aio_req);
+    }
+}
+
 /*
  * Receive responses of the I/O requests.
  *
@@ -670,15 +719,11 @@ static void coroutine_fn aio_read_response(void *opaque)
     SheepdogAIOCB *acb;
     uint64_t idx;
 
-    if (QLIST_EMPTY(&s->inflight_aio_head)) {
-        goto out;
-    }
-
     /* read a header */
     ret = qemu_co_recv(fd, &rsp, sizeof(rsp));
     if (ret != sizeof(rsp)) {
         error_report("failed to get the header, %s", strerror(errno));
-        goto out;
+        goto err;
     }
 
     /* find the right aio_req from the inflight aio list */
@@ -689,7 +734,7 @@ static void coroutine_fn aio_read_response(void *opaque)
     }
     if (!aio_req) {
         error_report("cannot find aio_req %x", rsp.id);
-        goto out;
+        goto err;
     }
 
     acb = aio_req->aiocb;
@@ -729,7 +774,7 @@ static void coroutine_fn aio_read_response(void *opaque)
                             aio_req->iov_offset, rsp.data_length);
         if (ret != rsp.data_length) {
             error_report("failed to get the data, %s", strerror(errno));
-            goto out;
+            goto err;
         }
         break;
     case AIOCB_FLUSH_CACHE:
@@ -763,10 +808,9 @@ static void coroutine_fn aio_read_response(void *opaque)
         if (s->inode.vdi_id == oid_to_vid(aio_req->oid)) {
             ret = reload_inode(s, 0, "");
             if (ret < 0) {
-                goto out;
+                goto err;
             }
         }
-
         if (is_data_obj(aio_req->oid)) {
             aio_req->oid = vid_to_data_oid(s->inode.vdi_id,
                                            data_oid_to_idx(aio_req->oid));
@@ -794,6 +838,10 @@ static void coroutine_fn aio_read_response(void *opaque)
     }
 out:
     s->co_recv = NULL;
+    return;
+err:
+    s->co_recv = NULL;
+    reconnect_to_sdog(opaque);
 }
 
 static void co_read_response(void *opaque)
@@ -819,7 +867,8 @@ static int aio_flush_request(void *opaque)
     BDRVSheepdogState *s = opaque;
 
     return !QLIST_EMPTY(&s->inflight_aio_head) ||
-        !QLIST_EMPTY(&s->pending_aio_head);
+        !QLIST_EMPTY(&s->pending_aio_head) ||
+        !QLIST_EMPTY(&s->failed_aio_head);
 }
 
 /*
@@ -1094,23 +1143,21 @@ static int coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
     /* send a header */
     ret = qemu_co_send(s->fd, &hdr, sizeof(hdr));
     if (ret != sizeof(hdr)) {
-        qemu_co_mutex_unlock(&s->lock);
         error_report("failed to send a req, %s", strerror(errno));
-        return -errno;
+        goto out;
     }
 
     if (wlen) {
         ret = qemu_co_sendv(s->fd, iov, niov, aio_req->iov_offset, wlen);
         if (ret != wlen) {
-            qemu_co_mutex_unlock(&s->lock);
             error_report("failed to send a data, %s", strerror(errno));
-            return -errno;
         }
     }
-
+out:
     socket_set_cork(s->fd, 0);
     qemu_aio_set_fd_handler(s->fd, co_read_response, NULL,
                             aio_flush_request, s);
+    s->co_send = NULL;
     qemu_co_mutex_unlock(&s->lock);
 
     return 0;
@@ -1300,6 +1347,7 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags)
 
     QLIST_INIT(&s->inflight_aio_head);
     QLIST_INIT(&s->pending_aio_head);
+    QLIST_INIT(&s->failed_aio_head);
     s->fd = -1;
 
     memset(vdi, 0, sizeof(vdi));
-- 
1.8.1.3.566.gaa39828

* [Qemu-devel] [PATCH v4 08/10] sheepdog: make add_aio_request and send_aioreq void functions
  2013-07-26  6:10 [Qemu-devel] [PATCH v4 00/10] sheepdog: reconnect server after connection failure MORITA Kazutaka
                   ` (6 preceding siblings ...)
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 07/10] sheepdog: try to reconnect to sheepdog after network error MORITA Kazutaka
@ 2013-07-26  6:10 ` MORITA Kazutaka
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 09/10] sheepdog: cancel aio requests if possible MORITA Kazutaka
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: MORITA Kazutaka @ 2013-07-26  6:10 UTC (permalink / raw)
  To: Kevin Wolf, Stefan Hajnoczi, qemu-devel; +Cc: Paolo Bonzini, sheepdog, Liu Yuan

These functions no longer return errors.  We can make them void
functions and simplify the code.

Tested-and-reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
---
 block/sheepdog.c | 66 +++++++++++++++-----------------------------------------
 1 file changed, 17 insertions(+), 49 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 3860611..17c7941 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -611,10 +611,10 @@ static int do_req(int sockfd, SheepdogReq *hdr, void *data,
     return srco.ret;
 }
 
-static int coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
+static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
                            struct iovec *iov, int niov, bool create,
                            enum AIOCBState aiocb_type);
-static int coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req);
+static void coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req);
 static int reload_inode(BDRVSheepdogState *s, uint32_t snapid, const char *tag);
 static int get_sheep_fd(BDRVSheepdogState *s);
 static void co_write_request(void *opaque);
@@ -640,22 +640,14 @@ static void coroutine_fn send_pending_req(BDRVSheepdogState *s, uint64_t oid)
 {
     AIOReq *aio_req;
     SheepdogAIOCB *acb;
-    int ret;
 
     while ((aio_req = find_pending_req(s, oid)) != NULL) {
         acb = aio_req->aiocb;
         /* move aio_req from pending list to inflight one */
         QLIST_REMOVE(aio_req, aio_siblings);
         QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
-        ret = add_aio_request(s, aio_req, acb->qiov->iov,
-                              acb->qiov->niov, false, acb->aiocb_type);
-        if (ret < 0) {
-            error_report("add_aio_request is failed");
-            free_aio_req(s, aio_req);
-            if (!acb->nr_pending) {
-                sd_finish_aiocb(acb);
-            }
-        }
+        add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov, false,
+                        acb->aiocb_type);
     }
 }
 
@@ -817,11 +809,8 @@ static void coroutine_fn aio_read_response(void *opaque)
         } else {
             aio_req->oid = vid_to_vdi_oid(s->inode.vdi_id);
         }
-        ret = resend_aioreq(s, aio_req);
-        if (ret == SD_RES_SUCCESS) {
-            goto out;
-        }
-        /* fall through */
+        resend_aioreq(s, aio_req);
+        goto out;
     default:
         acb->ret = -EIO;
         error_report("%s", sd_strerror(rsp.result));
@@ -1079,7 +1068,7 @@ out:
     return ret;
 }
 
-static int coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
+static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
                            struct iovec *iov, int niov, bool create,
                            enum AIOCBState aiocb_type)
 {
@@ -1159,8 +1148,6 @@ out:
                             aio_flush_request, s);
     s->co_send = NULL;
     qemu_co_mutex_unlock(&s->lock);
-
-    return 0;
 }
 
 static int read_write_object(int fd, char *buf, uint64_t oid, int copies,
@@ -1263,7 +1250,7 @@ out:
     return ret;
 }
 
-static int coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req)
+static void coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req)
 {
     SheepdogAIOCB *acb = aio_req->aiocb;
     bool create = false;
@@ -1288,7 +1275,7 @@ static int coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req)
                 dprintf("simultaneous CoW to %" PRIx64 "\n", aio_req->oid);
                 QLIST_REMOVE(aio_req, aio_siblings);
                 QLIST_INSERT_HEAD(&s->pending_aio_head, aio_req, aio_siblings);
-                return SD_RES_SUCCESS;
+                return;
             }
         }
 
@@ -1298,13 +1285,13 @@ static int coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req)
     }
 out:
     if (is_data_obj(aio_req->oid)) {
-        return add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov,
-                               create, acb->aiocb_type);
+        add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov, create,
+                        acb->aiocb_type);
     } else {
         struct iovec iov;
         iov.iov_base = &s->inode;
         iov.iov_len = sizeof(s->inode);
-        return add_aio_request(s, aio_req, &iov, 1, false, AIOCB_WRITE_UDATA);
+        add_aio_request(s, aio_req, &iov, 1, false, AIOCB_WRITE_UDATA);
     }
 }
 
@@ -1694,7 +1681,6 @@ static int sd_truncate(BlockDriverState *bs, int64_t offset)
  */
 static void coroutine_fn sd_write_done(SheepdogAIOCB *acb)
 {
-    int ret;
     BDRVSheepdogState *s = acb->common.bs->opaque;
     struct iovec iov;
     AIOReq *aio_req;
@@ -1716,18 +1702,13 @@ static void coroutine_fn sd_write_done(SheepdogAIOCB *acb)
         aio_req = alloc_aio_req(s, acb, vid_to_vdi_oid(s->inode.vdi_id),
                                 data_len, offset, 0, 0, offset);
         QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
-        ret = add_aio_request(s, aio_req, &iov, 1, false, AIOCB_WRITE_UDATA);
-        if (ret) {
-            free_aio_req(s, aio_req);
-            acb->ret = -EIO;
-            goto out;
-        }
+        add_aio_request(s, aio_req, &iov, 1, false, AIOCB_WRITE_UDATA);
 
         acb->aio_done_func = sd_finish_aiocb;
         acb->aiocb_type = AIOCB_WRITE_UDATA;
         return;
     }
-out:
+
     sd_finish_aiocb(acb);
 }
 
@@ -1934,14 +1915,8 @@ static int coroutine_fn sd_co_rw_vector(void *p)
         }
 
         QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
-        ret = add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov,
-                              create, acb->aiocb_type);
-        if (ret < 0) {
-            error_report("add_aio_request is failed");
-            free_aio_req(s, aio_req);
-            acb->ret = -EIO;
-            goto out;
-        }
+        add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov, create,
+                        acb->aiocb_type);
     done:
         offset = 0;
         idx++;
@@ -2009,7 +1984,6 @@ static int coroutine_fn sd_co_flush_to_disk(BlockDriverState *bs)
     BDRVSheepdogState *s = bs->opaque;
     SheepdogAIOCB *acb;
     AIOReq *aio_req;
-    int ret;
 
     if (s->cache_flags != SD_FLAG_CMD_CACHE) {
         return 0;
@@ -2022,13 +1996,7 @@ static int coroutine_fn sd_co_flush_to_disk(BlockDriverState *bs)
     aio_req = alloc_aio_req(s, acb, vid_to_vdi_oid(s->inode.vdi_id),
                             0, 0, 0, 0, 0);
     QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
-    ret = add_aio_request(s, aio_req, NULL, 0, false, acb->aiocb_type);
-    if (ret < 0) {
-        error_report("add_aio_request is failed");
-        free_aio_req(s, aio_req);
-        qemu_aio_release(acb);
-        return ret;
-    }
+    add_aio_request(s, aio_req, NULL, 0, false, acb->aiocb_type);
 
     qemu_coroutine_yield();
     return acb->ret;
-- 
1.8.1.3.566.gaa39828

* [Qemu-devel] [PATCH v4 09/10] sheepdog: cancel aio requests if possible
  2013-07-26  6:10 [Qemu-devel] [PATCH v4 00/10] sheepdog: reconnect server after connection failure MORITA Kazutaka
                   ` (7 preceding siblings ...)
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 08/10] sheepdog: make add_aio_request and send_aioreq void functions MORITA Kazutaka
@ 2013-07-26  6:10 ` MORITA Kazutaka
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 10/10] sheepdog: check simultaneous create in resend_aioreq MORITA Kazutaka
  2013-07-30 14:13 ` [Qemu-devel] [PATCH v4 00/10] sheepdog: reconnect server after connection failure Stefan Hajnoczi
  10 siblings, 0 replies; 19+ messages in thread
From: MORITA Kazutaka @ 2013-07-26  6:10 UTC (permalink / raw)
  To: Kevin Wolf, Stefan Hajnoczi, qemu-devel; +Cc: Paolo Bonzini, sheepdog, Liu Yuan

This patch tries to cancel aio requests in the pending and failed
queues.  When the sheepdog driver cannot cancel a request, it waits
for it to complete.

Tested-and-reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
---
 block/sheepdog.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 59 insertions(+), 11 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 17c7941..c8739ae 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -294,7 +294,8 @@ struct SheepdogAIOCB {
     Coroutine *coroutine;
     void (*aio_done_func)(SheepdogAIOCB *);
 
-    bool canceled;
+    bool cancelable;
+    bool *finished;
     int nr_pending;
 };
 
@@ -411,6 +412,7 @@ static inline void free_aio_req(BDRVSheepdogState *s, AIOReq *aio_req)
 {
     SheepdogAIOCB *acb = aio_req->aiocb;
 
+    acb->cancelable = false;
     QLIST_REMOVE(aio_req, aio_siblings);
     g_free(aio_req);
 
@@ -419,23 +421,68 @@ static inline void free_aio_req(BDRVSheepdogState *s, AIOReq *aio_req)
 
 static void coroutine_fn sd_finish_aiocb(SheepdogAIOCB *acb)
 {
-    if (!acb->canceled) {
-        qemu_coroutine_enter(acb->coroutine, NULL);
+    qemu_coroutine_enter(acb->coroutine, NULL);
+    if (acb->finished) {
+        *acb->finished = true;
     }
     qemu_aio_release(acb);
 }
 
+/*
+ * Check whether the specified acb can be canceled
+ *
+ * We can cancel aio when any request belonging to the acb is:
+ *  - Not processed by the sheepdog server.
+ *  - Not linked to the inflight queue.
+ */
+static bool sd_acb_cancelable(const SheepdogAIOCB *acb)
+{
+    BDRVSheepdogState *s = acb->common.bs->opaque;
+    AIOReq *aioreq;
+
+    if (!acb->cancelable) {
+        return false;
+    }
+
+    QLIST_FOREACH(aioreq, &s->inflight_aio_head, aio_siblings) {
+        if (aioreq->aiocb == acb) {
+            return false;
+        }
+    }
+
+    return true;
+}
+
 static void sd_aio_cancel(BlockDriverAIOCB *blockacb)
 {
     SheepdogAIOCB *acb = (SheepdogAIOCB *)blockacb;
+    BDRVSheepdogState *s = acb->common.bs->opaque;
+    AIOReq *aioreq, *next;
+    bool finished = false;
+
+    acb->finished = &finished;
+    while (!finished) {
+        if (sd_acb_cancelable(acb)) {
+            /* Remove outstanding requests from pending and failed queues.  */
+            QLIST_FOREACH_SAFE(aioreq, &s->pending_aio_head, aio_siblings,
+                               next) {
+                if (aioreq->aiocb == acb) {
+                    free_aio_req(s, aioreq);
+                }
+            }
+            QLIST_FOREACH_SAFE(aioreq, &s->failed_aio_head, aio_siblings,
+                               next) {
+                if (aioreq->aiocb == acb) {
+                    free_aio_req(s, aioreq);
+                }
+            }
 
-    /*
-     * Sheepdog cannot cancel the requests which are already sent to
-     * the servers, so we just complete the request with -EIO here.
-     */
-    acb->ret = -EIO;
-    qemu_coroutine_enter(acb->coroutine, NULL);
-    acb->canceled = true;
+            assert(acb->nr_pending == 0);
+            sd_finish_aiocb(acb);
+            return;
+        }
+        qemu_aio_wait();
+    }
 }
 
 static const AIOCBInfo sd_aiocb_info = {
@@ -456,7 +503,8 @@ static SheepdogAIOCB *sd_aio_setup(BlockDriverState *bs, QEMUIOVector *qiov,
     acb->nb_sectors = nb_sectors;
 
     acb->aio_done_func = NULL;
-    acb->canceled = false;
+    acb->cancelable = true;
+    acb->finished = NULL;
     acb->coroutine = qemu_coroutine_self();
     acb->ret = 0;
     acb->nr_pending = 0;
-- 
1.8.1.3.566.gaa39828

* [Qemu-devel] [PATCH v4 10/10] sheepdog: check simultaneous create in resend_aioreq
  2013-07-26  6:10 [Qemu-devel] [PATCH v4 00/10] sheepdog: reconnect server after connection failure MORITA Kazutaka
                   ` (8 preceding siblings ...)
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 09/10] sheepdog: cancel aio requests if possible MORITA Kazutaka
@ 2013-07-26  6:10 ` MORITA Kazutaka
  2013-07-30 14:13 ` [Qemu-devel] [PATCH v4 00/10] sheepdog: reconnect server after connection failure Stefan Hajnoczi
  10 siblings, 0 replies; 19+ messages in thread
From: MORITA Kazutaka @ 2013-07-26  6:10 UTC (permalink / raw)
  To: Kevin Wolf, Stefan Hajnoczi, qemu-devel; +Cc: Paolo Bonzini, sheepdog, Liu Yuan

After reconnection happens, all the inflight requests are moved to the
failed request list.  As a result, sd_co_rw_vector() can send another
create request before resend_aioreq() resends a create request from
the failed list.

This patch adds a helper function check_simultaneous_create() and
checks simultaneous create requests more strictly in resend_aioreq().

Tested-and-reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
---
 block/sheepdog.c | 64 ++++++++++++++++++++++++++++----------------------------
 1 file changed, 32 insertions(+), 32 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index c8739ae..800ebf4 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -1298,6 +1298,29 @@ out:
     return ret;
 }
 
+/* Return true if the specified request is linked to the pending list. */
+static bool check_simultaneous_create(BDRVSheepdogState *s, AIOReq *aio_req)
+{
+    AIOReq *areq;
+    QLIST_FOREACH(areq, &s->inflight_aio_head, aio_siblings) {
+        if (areq != aio_req && areq->oid == aio_req->oid) {
+            /*
+             * Sheepdog cannot handle simultaneous create requests to the same
+             * object, so we cannot send the request until the previous request
+             * finishes.
+             */
+            dprintf("simultaneous create to %" PRIx64 "\n", aio_req->oid);
+            aio_req->flags = 0;
+            aio_req->base_oid = 0;
+            QLIST_REMOVE(aio_req, aio_siblings);
+            QLIST_INSERT_HEAD(&s->pending_aio_head, aio_req, aio_siblings);
+            return true;
+        }
+    }
+
+    return false;
+}
+
 static void coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req)
 {
     SheepdogAIOCB *acb = aio_req->aiocb;
@@ -1306,29 +1329,19 @@ static void coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req)
     /* check whether this request becomes a CoW one */
     if (acb->aiocb_type == AIOCB_WRITE_UDATA && is_data_obj(aio_req->oid)) {
         int idx = data_oid_to_idx(aio_req->oid);
-        AIOReq *areq;
 
-        if (s->inode.data_vdi_id[idx] == 0) {
-            create = true;
-            goto out;
-        }
         if (is_data_obj_writable(&s->inode, idx)) {
             goto out;
         }
 
-        /* link to the pending list if there is another CoW request to
-         * the same object */
-        QLIST_FOREACH(areq, &s->inflight_aio_head, aio_siblings) {
-            if (areq != aio_req && areq->oid == aio_req->oid) {
-                dprintf("simultaneous CoW to %" PRIx64 "\n", aio_req->oid);
-                QLIST_REMOVE(aio_req, aio_siblings);
-                QLIST_INSERT_HEAD(&s->pending_aio_head, aio_req, aio_siblings);
-                return;
-            }
+        if (check_simultaneous_create(s, aio_req)) {
+            return;
         }
 
-        aio_req->base_oid = vid_to_data_oid(s->inode.data_vdi_id[idx], idx);
-        aio_req->flags |= SD_FLAG_CMD_COW;
+        if (s->inode.data_vdi_id[idx]) {
+            aio_req->base_oid = vid_to_data_oid(s->inode.data_vdi_id[idx], idx);
+            aio_req->flags |= SD_FLAG_CMD_COW;
+        }
         create = true;
     }
 out:
@@ -1942,27 +1955,14 @@ static int coroutine_fn sd_co_rw_vector(void *p)
         }
 
         aio_req = alloc_aio_req(s, acb, oid, len, offset, flags, old_oid, done);
+        QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
 
         if (create) {
-            AIOReq *areq;
-            QLIST_FOREACH(areq, &s->inflight_aio_head, aio_siblings) {
-                if (areq->oid == oid) {
-                    /*
-                     * Sheepdog cannot handle simultaneous create
-                     * requests to the same object.  So we cannot send
-                     * the request until the previous request
-                     * finishes.
-                     */
-                    aio_req->flags = 0;
-                    aio_req->base_oid = 0;
-                    QLIST_INSERT_HEAD(&s->pending_aio_head, aio_req,
-                                      aio_siblings);
-                    goto done;
-                }
+            if (check_simultaneous_create(s, aio_req)) {
+                goto done;
             }
         }
 
-        QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
         add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov, create,
                         acb->aiocb_type);
     done:
-- 
1.8.1.3.566.gaa39828

* Re: [Qemu-devel] [PATCH v4 01/10] ignore SIGPIPE in qemu-img and qemu-io
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 01/10] ignore SIGPIPE in qemu-img and qemu-io MORITA Kazutaka
@ 2013-07-30 13:41   ` Stefan Hajnoczi
  0 siblings, 0 replies; 19+ messages in thread
From: Stefan Hajnoczi @ 2013-07-30 13:41 UTC (permalink / raw)
  To: MORITA Kazutaka
  Cc: Kevin Wolf, sheepdog, qemu-devel, Stefan Hajnoczi, Liu Yuan,
	Paolo Bonzini

On Fri, Jul 26, 2013 at 03:10:43PM +0900, MORITA Kazutaka wrote:
> This prevents the tools from being killed by SIGPIPE when they write
> data to a connection that has been closed on the other side.
> 
> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
> ---
>  qemu-img.c | 4 ++++
>  qemu-io.c  | 4 ++++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/qemu-img.c b/qemu-img.c
> index c55ca5c..919d464 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -2319,6 +2319,10 @@ int main(int argc, char **argv)
>      const img_cmd_t *cmd;
>      const char *cmdname;
>  
> +#ifdef CONFIG_POSIX
> +    signal(SIGPIPE, SIG_IGN);
> +#endif

Reusing os_setup_early_signal_handling() would be messy; this seems fine
for now.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

* Re: [Qemu-devel] [PATCH v4 03/10] sheepdog: check return values of qemu_co_recv/send correctly
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 03/10] sheepdog: check return values of qemu_co_recv/send correctly MORITA Kazutaka
@ 2013-07-30 13:48   ` Stefan Hajnoczi
  2013-08-02  6:09     ` [Qemu-devel] [sheepdog] " MORITA Kazutaka
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Hajnoczi @ 2013-07-30 13:48 UTC (permalink / raw)
  To: MORITA Kazutaka
  Cc: Kevin Wolf, sheepdog, qemu-devel, Stefan Hajnoczi, Liu Yuan,
	Paolo Bonzini

On Fri, Jul 26, 2013 at 03:10:45PM +0900, MORITA Kazutaka wrote:
> If qemu_co_recv/send doesn't return the requested length, it means
> that an error occurred.
> 
> Tested-and-reviewed-by: Liu Yuan <namei.unix@gmail.com>
> Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
> ---
>  block/sheepdog.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/block/sheepdog.c b/block/sheepdog.c
> index 6a41ad9..c6e9b89 100644
> --- a/block/sheepdog.c
> +++ b/block/sheepdog.c
> @@ -489,13 +489,13 @@ static coroutine_fn int send_co_req(int sockfd, SheepdogReq *hdr, void *data,
>      int ret;
>  
>      ret = qemu_co_send(sockfd, hdr, sizeof(*hdr));
> -    if (ret < sizeof(*hdr)) {
> +    if (ret != sizeof(*hdr)) {
>          error_report("failed to send a req, %s", strerror(errno));

Does this rely on qemu_co_send_recv() getting ret=-1 errno=EPIPE from
iov_send_recv()?  I want to check that I understand what happens when
the socket is closed by the other side.

Stefan

* Re: [Qemu-devel] [PATCH v4 06/10] coroutine: add co_aio_sleep_ns() to allow sleep in block drivers
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 06/10] coroutine: add co_aio_sleep_ns() to allow sleep in block drivers MORITA Kazutaka
@ 2013-07-30 13:58   ` Stefan Hajnoczi
  2013-08-02  6:20     ` [Qemu-devel] [sheepdog] " MORITA Kazutaka
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Hajnoczi @ 2013-07-30 13:58 UTC (permalink / raw)
  To: MORITA Kazutaka
  Cc: Kevin Wolf, pingfank, sheepdog, alex, qemu-devel,
	Stefan Hajnoczi, Liu Yuan, Paolo Bonzini

On Fri, Jul 26, 2013 at 03:10:48PM +0900, MORITA Kazutaka wrote:
> This helper function behaves similarly to co_sleep_ns(), but the
> sleeping coroutine will be resumed when using qemu_aio_wait().
> 
> Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
> ---
>  include/block/coroutine.h |  8 ++++++++
>  qemu-coroutine-sleep.c    | 47 +++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 55 insertions(+)
> 
> diff --git a/include/block/coroutine.h b/include/block/coroutine.h
> index 377805a..23ea6e9 100644
> --- a/include/block/coroutine.h
> +++ b/include/block/coroutine.h
> @@ -210,6 +210,14 @@ void qemu_co_rwlock_unlock(CoRwlock *lock);
>  void coroutine_fn co_sleep_ns(QEMUClock *clock, int64_t ns);
>  
>  /**
> + * Yield the coroutine for a given duration
> + *
> + * Behaves similarly to co_sleep_ns(), but the sleeping coroutine will be
> + * resumed when using qemu_aio_wait().
> + */
> +void coroutine_fn co_aio_sleep_ns(int64_t ns);
> +
> +/**
>   * Yield until a file descriptor becomes readable
>   *
>   * Note that this function clobbers the handlers for the file descriptor.
> diff --git a/qemu-coroutine-sleep.c b/qemu-coroutine-sleep.c
> index 169ce5c..3955347 100644
> --- a/qemu-coroutine-sleep.c
> +++ b/qemu-coroutine-sleep.c
> @@ -13,6 +13,7 @@
>  
>  #include "block/coroutine.h"
>  #include "qemu/timer.h"
> +#include "qemu/thread.h"
>  
>  typedef struct CoSleepCB {
>      QEMUTimer *ts;
> @@ -37,3 +38,49 @@ void coroutine_fn co_sleep_ns(QEMUClock *clock, int64_t ns)
>      qemu_del_timer(sleep_cb.ts);
>      qemu_free_timer(sleep_cb.ts);
>  }
> +
> +typedef struct CoAioSleepCB {
> +    QEMUBH *bh;
> +    int64_t ns;
> +    Coroutine *co;
> +} CoAioSleepCB;
> +
> +static void co_aio_sleep_cb(void *opaque)
> +{
> +    CoAioSleepCB *aio_sleep_cb = opaque;
> +
> +    qemu_coroutine_enter(aio_sleep_cb->co, NULL);
> +}
> +
> +static void *sleep_thread(void *opaque)
> +{
> +    CoAioSleepCB *aio_sleep_cb = opaque;
> +    struct timespec req = {
> +        .tv_sec = aio_sleep_cb->ns / 1000000000,
> +        .tv_nsec = aio_sleep_cb->ns % 1000000000,
> +    };
> +    struct timespec rem;
> +
> +    while (nanosleep(&req, &rem) < 0 && errno == EINTR) {

This breaks the Windows build and makes qemu_aio_wait() spin instead of
blocking (wastes CPU).

I think Alex Bligh and Ping Fan's QEMUTimer in AioContext work is needed
here.  I have CCed them.  Their patches would allow you to use
co_sleep_ns() in qemu_aio_wait().

> +        req = rem;
> +    }
> +
> +    qemu_bh_schedule(aio_sleep_cb->bh);
> +
> +    return NULL;
> +}
> +
> +void coroutine_fn co_aio_sleep_ns(int64_t ns)
> +{
> +    CoAioSleepCB aio_sleep_cb = {
> +        .ns = ns,
> +        .co = qemu_coroutine_self(),
> +    };
> +    QemuThread thread;
> +
> +    aio_sleep_cb.bh = qemu_bh_new(co_aio_sleep_cb, &aio_sleep_cb);
> +    qemu_thread_create(&thread, sleep_thread, &aio_sleep_cb,
> +                       QEMU_THREAD_DETACHED);
> +    qemu_coroutine_yield();
> +    qemu_bh_delete(aio_sleep_cb.bh);
> +}
> -- 
> 1.8.1.3.566.gaa39828
> 
> 

* Re: [Qemu-devel] [PATCH v4 00/10] sheepdog: reconnect server after connection failure
  2013-07-26  6:10 [Qemu-devel] [PATCH v4 00/10] sheepdog: reconnect server after connection failure MORITA Kazutaka
                   ` (9 preceding siblings ...)
  2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 10/10] sheepdog: check simultaneous create in resend_aioreq MORITA Kazutaka
@ 2013-07-30 14:13 ` Stefan Hajnoczi
  10 siblings, 0 replies; 19+ messages in thread
From: Stefan Hajnoczi @ 2013-07-30 14:13 UTC (permalink / raw)
  To: MORITA Kazutaka
  Cc: Kevin Wolf, sheepdog, nick, qemu-devel, Stefan Hajnoczi,
	Liu Yuan, Paolo Bonzini

On Fri, Jul 26, 2013 at 03:10:42PM +0900, MORITA Kazutaka wrote:
> Currently, if a sheepdog server exits, all the connected VMs need to
> be restarted.  This series implements a feature to reconnect to the
> server, which enables online sheepdog upgrades and avoids restarting
> VMs when sheepdog servers crash unexpectedly.
> 
> v4:
>  - Added comment to explain why we need a failed queue.
>  - Fixed a return value of sd_acb_cancelable().
> 
> v3:
>  - Check return values of qemu_co_recv/send more strictly.
>  - Move inflight requests to the failed list after reconnection
>    completes.  This is necessary to resend I/Os while the connection is
>    lost.
>  - Check simultaneous create in resend_aioreq().
> 
> v2:
>  - Dropped nonblocking connect patches.
> 
> MORITA Kazutaka (10):
>   ignore SIGPIPE in qemu-img and qemu-io
>   iov: handle EOF in iov_send_recv
>   sheepdog: check return values of qemu_co_recv/send correctly
>   sheepdog: handle vdi objects in resend_aio_req
>   sheepdog: reload inode outside of resend_aioreq
>   coroutine: add co_aio_sleep_ns() to allow sleep in block drivers
>   sheepdog: try to reconnect to sheepdog after network error
>   sheepdog: make add_aio_request and send_aioreq void functions
>   sheepdog: cancel aio requests if possible
>   sheepdog: check simultaneous create in resend_aioreq
> 
>  block/sheepdog.c          | 320 +++++++++++++++++++++++++++++-----------------
>  include/block/coroutine.h |   8 ++
>  qemu-coroutine-sleep.c    |  47 +++++++
>  qemu-img.c                |   4 +
>  qemu-io.c                 |   4 +
>  util/iov.c                |   6 +
>  6 files changed, 269 insertions(+), 120 deletions(-)

I have done a brief review.  The biggest change that I suggest using the
new AioContext timer support that Alex Bligh and Ping Fan are working on
(see qemu-devel for the latest patches).  It provides a way to use a
timer during qemu_aio_wait() without spinning.

CCed Nick Thomas who worked on NBD reconnect.  Maybe your series will
motivate him to push his patches again, or he might have some review
suggestions for you.

* Re: [Qemu-devel] [sheepdog] [PATCH v4 03/10] sheepdog: check return values of qemu_co_recv/send correctly
  2013-07-30 13:48   ` Stefan Hajnoczi
@ 2013-08-02  6:09     ` MORITA Kazutaka
  0 siblings, 0 replies; 19+ messages in thread
From: MORITA Kazutaka @ 2013-08-02  6:09 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, sheepdog, qemu-devel, Stefan Hajnoczi, Paolo Bonzini,
	MORITA Kazutaka

At Tue, 30 Jul 2013 15:48:02 +0200,
Stefan Hajnoczi wrote:
> 
> On Fri, Jul 26, 2013 at 03:10:45PM +0900, MORITA Kazutaka wrote:
> > If qemu_co_recv/send doesn't return the requested length, it means
> > that an error occurred.
> > 
> > Tested-and-reviewed-by: Liu Yuan <namei.unix@gmail.com>
> > Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
> > ---
> >  block/sheepdog.c | 16 ++++++++--------
> >  1 file changed, 8 insertions(+), 8 deletions(-)
> > 
> > diff --git a/block/sheepdog.c b/block/sheepdog.c
> > index 6a41ad9..c6e9b89 100644
> > --- a/block/sheepdog.c
> > +++ b/block/sheepdog.c
> > @@ -489,13 +489,13 @@ static coroutine_fn int send_co_req(int sockfd, SheepdogReq *hdr, void *data,
> >      int ret;
> >  
> >      ret = qemu_co_send(sockfd, hdr, sizeof(*hdr));
> > -    if (ret < sizeof(*hdr)) {
> > +    if (ret != sizeof(*hdr)) {
> >          error_report("failed to send a req, %s", strerror(errno));
> 
> Does this rely on qemu_co_send_recv() getting ret=-1 errno=EPIPE from
> iov_send_recv()?  I want to check that I understand what happens when
> the socket is closed by the other side.

Yes, when the socket is closed by the peer, qemu_co_send_recv()
returns a short write (if some bytes were already sent) or -1 (if no
data was sent).  The current sheepdog driver doesn't work correctly in
the latter case because it compares -1 with an unsigned value.
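
The pitfall is easy to reproduce in isolation: sizeof yields a size_t,
so in a comparison like "ret < sizeof(*hdr)" a signed ret of -1 is
first converted to a huge unsigned value (a hypothetical standalone
demo, not QEMU code):

    #include <stdio.h>
    #include <sys/types.h>

    int main(void)
    {
        ssize_t ret = -1;  /* error return from a send/recv call */
        size_t want = 48;  /* stands in for sizeof(*hdr) */

        if (ret < want) {              /* -1 converts to SIZE_MAX: false */
            printf("error detected\n");
        } else {
            printf("error missed\n");  /* this branch runs */
        }
        return 0;
    }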

This doesn't happen for the current qemu-io and qemu-img because they
terminate with SIGPIPE when the connection is closed by the peer.

Thanks,

Kazutaka

* Re: [Qemu-devel] [sheepdog] [PATCH v4 06/10] coroutine: add co_aio_sleep_ns() to allow sleep in block drivers
  2013-07-30 13:58   ` Stefan Hajnoczi
@ 2013-08-02  6:20     ` MORITA Kazutaka
  2013-08-02  8:19       ` liu ping fan
  0 siblings, 1 reply; 19+ messages in thread
From: MORITA Kazutaka @ 2013-08-02  6:20 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, pingfank, sheepdog, Stefan Hajnoczi, qemu-devel,
	alex, Paolo Bonzini, MORITA Kazutaka

At Tue, 30 Jul 2013 15:58:58 +0200,
Stefan Hajnoczi wrote:
> 
> On Fri, Jul 26, 2013 at 03:10:48PM +0900, MORITA Kazutaka wrote:
> > This helper function behaves similarly to co_sleep_ns(), but the
> > sleeping coroutine will be resumed when using qemu_aio_wait().
> > 
> > Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
> > ---
> >  include/block/coroutine.h |  8 ++++++++
> >  qemu-coroutine-sleep.c    | 47 +++++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 55 insertions(+)
> > 
> > diff --git a/include/block/coroutine.h b/include/block/coroutine.h
> > index 377805a..23ea6e9 100644
> > --- a/include/block/coroutine.h
> > +++ b/include/block/coroutine.h
> > @@ -210,6 +210,14 @@ void qemu_co_rwlock_unlock(CoRwlock *lock);
> >  void coroutine_fn co_sleep_ns(QEMUClock *clock, int64_t ns);
> >  
> >  /**
> > + * Yield the coroutine for a given duration
> > + *
> > + * Behaves similarly to co_sleep_ns(), but the sleeping coroutine will be
> > + * resumed when using qemu_aio_wait().
> > + */
> > +void coroutine_fn co_aio_sleep_ns(int64_t ns);
> > +
> > +/**
> >   * Yield until a file descriptor becomes readable
> >   *
> >   * Note that this function clobbers the handlers for the file descriptor.
> > diff --git a/qemu-coroutine-sleep.c b/qemu-coroutine-sleep.c
> > index 169ce5c..3955347 100644
> > --- a/qemu-coroutine-sleep.c
> > +++ b/qemu-coroutine-sleep.c
> > @@ -13,6 +13,7 @@
> >  
> >  #include "block/coroutine.h"
> >  #include "qemu/timer.h"
> > +#include "qemu/thread.h"
> >  
> >  typedef struct CoSleepCB {
> >      QEMUTimer *ts;
> > @@ -37,3 +38,49 @@ void coroutine_fn co_sleep_ns(QEMUClock *clock, int64_t ns)
> >      qemu_del_timer(sleep_cb.ts);
> >      qemu_free_timer(sleep_cb.ts);
> >  }
> > +
> > +typedef struct CoAioSleepCB {
> > +    QEMUBH *bh;
> > +    int64_t ns;
> > +    Coroutine *co;
> > +} CoAioSleepCB;
> > +
> > +static void co_aio_sleep_cb(void *opaque)
> > +{
> > +    CoAioSleepCB *aio_sleep_cb = opaque;
> > +
> > +    qemu_coroutine_enter(aio_sleep_cb->co, NULL);
> > +}
> > +
> > +static void *sleep_thread(void *opaque)
> > +{
> > +    CoAioSleepCB *aio_sleep_cb = opaque;
> > +    struct timespec req = {
> > +        .tv_sec = aio_sleep_cb->ns / 1000000000,
> > +        .tv_nsec = aio_sleep_cb->ns % 1000000000,
> > +    };
> > +    struct timespec rem;
> > +
> > +    while (nanosleep(&req, &rem) < 0 && errno == EINTR) {
> 
> This breaks the Windows build and makes qemu_aio_wait() spin instead of
> blocking (wastes CPU).
> 
> I think Alex Bligh and Ping Fan's QEMUTimer in AioContext work is needed
> here.  I have CCed them.  Their patches would allow you to use
> co_sleep_ns() in qemu_aio_wait().

Okay, I'll update this patch based on the AioContext timer.  I'm also
happy to help Alex and Pingfan to finish the implementation.
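
For reference, here is a rough sketch of how this could look on top
of the AioContext timer work (assuming the aio_timer_new()/timer_mod()
interface from Alex's series, and with AioContext and clock-type
arguments added to the signature; illustrative only, not the final
patch):

    #include "block/aio.h"
    #include "block/coroutine.h"
    #include "qemu/timer.h"

    typedef struct CoAioSleepCB {
        QEMUTimer *ts;
        Coroutine *co;
    } CoAioSleepCB;

    static void co_aio_sleep_cb(void *opaque)
    {
        CoAioSleepCB *sleep_cb = opaque;

        qemu_coroutine_enter(sleep_cb->co, NULL);
    }

    void coroutine_fn co_aio_sleep_ns(AioContext *ctx, QEMUClockType type,
                                      int64_t ns)
    {
        CoAioSleepCB sleep_cb = {
            .co = qemu_coroutine_self(),
        };

        /* The timer fires from the AioContext event loop, so the
         * coroutine is resumed inside qemu_aio_wait() without a
         * helper thread and without spinning. */
        sleep_cb.ts = aio_timer_new(ctx, type, SCALE_NS, co_aio_sleep_cb,
                                    &sleep_cb);
        timer_mod(sleep_cb.ts, qemu_clock_get_ns(type) + ns);
        qemu_coroutine_yield();
        timer_del(sleep_cb.ts);
        timer_free(sleep_cb.ts);
    }

Callers would pass their AioContext explicitly, so the coroutine is
resumed from the right event loop.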

Thanks,

Kazutaka

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [sheepdog] [PATCH v4 06/10] coroutine: add co_aio_sleep_ns() to allow sleep in block drivers
  2013-08-02  6:20     ` [Qemu-devel] [sheepdog] " MORITA Kazutaka
@ 2013-08-02  8:19       ` liu ping fan
  2013-08-02  9:35         ` Alex Bligh
  0 siblings, 1 reply; 19+ messages in thread
From: liu ping fan @ 2013-08-02  8:19 UTC (permalink / raw)
  To: MORITA Kazutaka
  Cc: Kevin Wolf, pingfank, sheepdog, alex, Stefan Hajnoczi,
	qemu-devel, Stefan Hajnoczi, Paolo Bonzini

On Fri, Aug 2, 2013 at 2:20 PM, MORITA Kazutaka
<morita.kazutaka@lab.ntt.co.jp> wrote:
> At Tue, 30 Jul 2013 15:58:58 +0200,
> Stefan Hajnoczi wrote:
>>
>> On Fri, Jul 26, 2013 at 03:10:48PM +0900, MORITA Kazutaka wrote:
>> > [...]
>> > +static void *sleep_thread(void *opaque)
>> > +{
>> > +    CoAioSleepCB *aio_sleep_cb = opaque;
>> > +    struct timespec req = {
>> > +        .tv_sec = aio_sleep_cb->ns / 1000000000,
>> > +        .tv_nsec = aio_sleep_cb->ns % 1000000000,
>> > +    };
>> > +    struct timespec rem;
>> > +
>> > +    while (nanosleep(&req, &rem) < 0 && errno == EINTR) {
>>
>> This breaks the Windows build and makes qemu_aio_wait() spin instead of
>> blocking (wastes CPU).
>>
>> I think Alex Bligh and Ping Fan's QEMUTimer in AioContext work is needed
>> here.  I have CCed them.  Their patches would allow you to use
>> co_sleep_ns() in qemu_aio_wait().
>
> Okay, I'll update this patch based on the AioContext timer.  I'm also
> happy to help Alex and Pingfan to finish the implementation.
>
Alex, have you made any further progress beyond v4?  If so, could you
share it with us, so that we can rebase our patches onto yours?

Thanks,
Pingfan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [sheepdog] [PATCH v4 06/10] coroutine: add co_aio_sleep_ns() to allow sleep in block drivers
  2013-08-02  8:19       ` liu ping fan
@ 2013-08-02  9:35         ` Alex Bligh
  0 siblings, 0 replies; 19+ messages in thread
From: Alex Bligh @ 2013-08-02  9:35 UTC (permalink / raw)
  To: liu ping fan, MORITA Kazutaka
  Cc: Kevin Wolf, pingfank, sheepdog, Alex Bligh, Stefan Hajnoczi,
	qemu-devel, Stefan Hajnoczi, Paolo Bonzini

Pingfan,

--On 2 August 2013 16:19:31 +0800 liu ping fan <qemulist@gmail.com> wrote:

>> Okay, I'll update this patch based on the AioContext timer.  I'm also
>> happy to help Alex and Pingfan to finish the implementation.
>>
> Alex, have you made any further progress beyond v4?  If so, could you
> share it with us, so that we can rebase our patches onto yours?

I've got no changes beyond v4 yet.

The changes I am thinking of making are:

1. changing the AioContext * in QEMUTimerList to be a pointer to a callback
   instead. This is a tiny and self-contained change and may well be
   unnecessary.

2. Whatever comes out of Paolo's long email of yesterday, the contents of
   which I have not fully digested. However, I think what this boils down
   to is calling *_notify slightly less often.

3. Someone (Paolo/Kevin?) suggested renaming the *new* timer functions
   (i.e. the ones with timerlist in the name). That won't affect anything
   current.

I suggest you go with v4 for the time being. If I end up doing a v5 and
your code doesn't sit well on top of it, I can rebase to yours or
something. I'd expect changes in v5 to be really small.

-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread

Thread overview: 19+ messages
2013-07-26  6:10 [Qemu-devel] [PATCH v4 00/10] sheepdog: reconnect server after connection failure MORITA Kazutaka
2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 01/10] ignore SIGPIPE in qemu-img and qemu-io MORITA Kazutaka
2013-07-30 13:41   ` Stefan Hajnoczi
2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 02/10] iov: handle EOF in iov_send_recv MORITA Kazutaka
2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 03/10] sheepdog: check return values of qemu_co_recv/send correctly MORITA Kazutaka
2013-07-30 13:48   ` Stefan Hajnoczi
2013-08-02  6:09     ` [Qemu-devel] [sheepdog] " MORITA Kazutaka
2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 04/10] sheepdog: handle vdi objects in resend_aio_req MORITA Kazutaka
2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 05/10] sheepdog: reload inode outside of resend_aioreq MORITA Kazutaka
2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 06/10] coroutine: add co_aio_sleep_ns() to allow sleep in block drivers MORITA Kazutaka
2013-07-30 13:58   ` Stefan Hajnoczi
2013-08-02  6:20     ` [Qemu-devel] [sheepdog] " MORITA Kazutaka
2013-08-02  8:19       ` liu ping fan
2013-08-02  9:35         ` Alex Bligh
2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 07/10] sheepdog: try to reconnect to sheepdog after network error MORITA Kazutaka
2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 08/10] sheepdog: make add_aio_request and send_aioreq void functions MORITA Kazutaka
2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 09/10] sheepdog: cancel aio requests if possible MORITA Kazutaka
2013-07-26  6:10 ` [Qemu-devel] [PATCH v4 10/10] sheepdog: check simultaneous create in resend_aioreq MORITA Kazutaka
2013-07-30 14:13 ` [Qemu-devel] [PATCH v4 00/10] sheepdog: reconnect server after connection failure Stefan Hajnoczi
