All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash
@ 2019-09-10 12:41 Max Reitz
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 1/7] curl: Keep pointer to the CURLState in CURLSocket Max Reitz
                   ` (8 more replies)
  0 siblings, 9 replies; 22+ messages in thread
From: Max Reitz @ 2019-09-10 12:41 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, qemu-stable, John Snow, qemu-devel, Max Reitz

Hi,

As reported in https://bugzilla.redhat.com/show_bug.cgi?id=1740193, our
curl block driver can spontaneously hang.  This becomes visible e.g.
when reading compressed qcow2 images:

$ qemu-img convert -p -O raw -n \
  https://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img \
  null-co://

(Hangs at 74.21 %, usually.)

A more direct way is:

$ qemu-img bench -f raw http://download.qemu.org/qemu-4.1.0.tar.xz \
    -d 1 -S 524288 -c 2

(Which simply performs two requests, and the second one hangs.  You can
use any HTTP resource (probably FTP, too) you’d like that is at least
1 MB in size.)

It turns out that this is because cURL 7.59.0 has added a protective
feature against some misuse we had in our code: curl_multi_add_handle()
must not be called from within a cURL callback, but in some cases we
did.  As of 7.59.0, this fails, our new request is not registered and
the I/O request stalls.  This is fixed by patch 6.

Patch 7 makes us check for curl_multi_add_handle()’s return code,
because if we had done that before, debugging would have been much
simpler.


On the way to fixing it, I had a look over the whole cURL code and found
a suspicious QLIST_FOREACH_SAFE() loop that actually does not seem very
safe at all.  I think this may lead to crashes, although I have never
seen any myself.  https://bugzilla.redhat.com/show_bug.cgi?id=1744602#c5
shows one in exactly the function in question, so I think it actually is
a problem.

This is fixed by patch 5, patches 1, 2, and 4 prepare for it.

(Patch 3 is kind of a misc patch that should ensure that we always end
up calling curl_multi_check_completion() whenever a request might have
been completed.)


v2:
- Patch 2: Remove the socket from the list only add the end of the
           function (yielding a nicer 5+/5- diff stat)
- Patch 3: Added
- Patch 4: Rebased on patch 3, and s/socket/ready_socket/ in one place
- Patch 5: Rebased on the changed patch 4


git-backport-diff against v1:

Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/7:[----] [--] 'curl: Keep pointer to the CURLState in CURLSocket'
002/7:[0007] [FC] 'curl: Keep *socket until the end of curl_sock_cb()'
003/7:[down] 'curl: Check completion in curl_multi_do()'
004/7:[0019] [FC] 'curl: Pass CURLSocket to curl_multi_{do,read}()'
005/7:[0002] [FC] 'curl: Report only ready sockets'
006/7:[----] [--] 'curl: Handle success in multi_check_completion'
007/7:[----] [--] 'curl: Check curl_multi_add_handle()'s return code'


Max Reitz (7):
  curl: Keep pointer to the CURLState in CURLSocket
  curl: Keep *socket until the end of curl_sock_cb()
  curl: Check completion in curl_multi_do()
  curl: Pass CURLSocket to curl_multi_do()
  curl: Report only ready sockets
  curl: Handle success in multi_check_completion
  curl: Check curl_multi_add_handle()'s return code

 block/curl.c | 133 +++++++++++++++++++++++----------------------------
 1 file changed, 59 insertions(+), 74 deletions(-)

-- 
2.21.0



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 1/7] curl: Keep pointer to the CURLState in CURLSocket
  2019-09-10 12:41 [Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash Max Reitz
@ 2019-09-10 12:41 ` Max Reitz
  2019-09-10 16:11   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 2/7] curl: Keep *socket until the end of curl_sock_cb() Max Reitz
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Max Reitz @ 2019-09-10 12:41 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, qemu-stable, John Snow, qemu-devel, Max Reitz

A follow-up patch will make curl_multi_do() and curl_multi_read() take a
CURLSocket instead of the CURLState.  They still need the latter,
though, so add a pointer to it to the former.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
---
 block/curl.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/curl.c b/block/curl.c
index d4c8e94f3e..92dc2f630e 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -80,6 +80,7 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
 #define CURL_BLOCK_OPT_TIMEOUT_DEFAULT 5
 
 struct BDRVCURLState;
+struct CURLState;
 
 static bool libcurl_initialized;
 
@@ -97,6 +98,7 @@ typedef struct CURLAIOCB {
 
 typedef struct CURLSocket {
     int fd;
+    struct CURLState *state;
     QLIST_ENTRY(CURLSocket) next;
 } CURLSocket;
 
@@ -180,6 +182,7 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
     if (!socket) {
         socket = g_new0(CURLSocket, 1);
         socket->fd = fd;
+        socket->state = state;
         QLIST_INSERT_HEAD(&state->sockets, socket, next);
     }
     socket = NULL;
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 2/7] curl: Keep *socket until the end of curl_sock_cb()
  2019-09-10 12:41 [Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash Max Reitz
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 1/7] curl: Keep pointer to the CURLState in CURLSocket Max Reitz
@ 2019-09-10 12:41 ` Max Reitz
  2019-09-10 16:11   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
  2019-09-10 21:19   ` [Qemu-devel] " John Snow
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 3/7] curl: Check completion in curl_multi_do() Max Reitz
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 22+ messages in thread
From: Max Reitz @ 2019-09-10 12:41 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, qemu-stable, John Snow, qemu-devel, Max Reitz

This does not really change anything, but it makes the code a bit easier
to follow once we use @socket as the opaque pointer for
aio_set_fd_handler().

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/curl.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index 92dc2f630e..95d7b77dc0 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -172,10 +172,6 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
 
     QLIST_FOREACH(socket, &state->sockets, next) {
         if (socket->fd == fd) {
-            if (action == CURL_POLL_REMOVE) {
-                QLIST_REMOVE(socket, next);
-                g_free(socket);
-            }
             break;
         }
     }
@@ -185,7 +181,6 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
         socket->state = state;
         QLIST_INSERT_HEAD(&state->sockets, socket, next);
     }
-    socket = NULL;
 
     trace_curl_sock_cb(action, (int)fd);
     switch (action) {
@@ -207,6 +202,11 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
             break;
     }
 
+    if (action == CURL_POLL_REMOVE) {
+        QLIST_REMOVE(socket, next);
+        g_free(socket);
+    }
+
     return 0;
 }
 
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 3/7] curl: Check completion in curl_multi_do()
  2019-09-10 12:41 [Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash Max Reitz
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 1/7] curl: Keep pointer to the CURLState in CURLSocket Max Reitz
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 2/7] curl: Keep *socket until the end of curl_sock_cb() Max Reitz
@ 2019-09-10 12:41 ` Max Reitz
  2019-09-10 16:11   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 4/7] curl: Pass CURLSocket to curl_multi_do() Max Reitz
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Max Reitz @ 2019-09-10 12:41 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, qemu-stable, John Snow, qemu-devel, Max Reitz

While it is more likely that transfers complete after some file
descriptor has data ready to read, we probably should not rely on it.
Better be safe than sorry and call curl_multi_check_completion() in
curl_multi_do(), too, just like it is done in curl_multi_read().

With this change, curl_multi_do() and curl_multi_read() are actually the
same, so drop curl_multi_read() and use curl_multi_do() as the sole FD
handler.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/curl.c | 14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index 95d7b77dc0..5838afef99 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -139,7 +139,6 @@ typedef struct BDRVCURLState {
 
 static void curl_clean_state(CURLState *s);
 static void curl_multi_do(void *arg);
-static void curl_multi_read(void *arg);
 
 #ifdef NEED_CURL_TIMER_CALLBACK
 /* Called from curl_multi_do_locked, with s->mutex held.  */
@@ -186,7 +185,7 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
     switch (action) {
         case CURL_POLL_IN:
             aio_set_fd_handler(s->aio_context, fd, false,
-                               curl_multi_read, NULL, NULL, state);
+                               curl_multi_do, NULL, NULL, state);
             break;
         case CURL_POLL_OUT:
             aio_set_fd_handler(s->aio_context, fd, false,
@@ -194,7 +193,7 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
             break;
         case CURL_POLL_INOUT:
             aio_set_fd_handler(s->aio_context, fd, false,
-                               curl_multi_read, curl_multi_do, NULL, state);
+                               curl_multi_do, curl_multi_do, NULL, state);
             break;
         case CURL_POLL_REMOVE:
             aio_set_fd_handler(s->aio_context, fd, false,
@@ -416,15 +415,6 @@ static void curl_multi_do(void *arg)
 {
     CURLState *s = (CURLState *)arg;
 
-    qemu_mutex_lock(&s->s->mutex);
-    curl_multi_do_locked(s);
-    qemu_mutex_unlock(&s->s->mutex);
-}
-
-static void curl_multi_read(void *arg)
-{
-    CURLState *s = (CURLState *)arg;
-
     qemu_mutex_lock(&s->s->mutex);
     curl_multi_do_locked(s);
     curl_multi_check_completion(s->s);
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 4/7] curl: Pass CURLSocket to curl_multi_do()
  2019-09-10 12:41 [Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash Max Reitz
                   ` (2 preceding siblings ...)
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 3/7] curl: Check completion in curl_multi_do() Max Reitz
@ 2019-09-10 12:41 ` Max Reitz
  2019-09-10 16:12   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 5/7] curl: Report only ready sockets Max Reitz
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Max Reitz @ 2019-09-10 12:41 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, qemu-stable, John Snow, qemu-devel, Max Reitz

curl_multi_do_locked() currently marks all sockets as ready.  That is
not only inefficient, but in fact unsafe (the loop is).  A follow-up
patch will change that, but to do so, curl_multi_do_locked() needs to
know exactly which socket is ready; and that is accomplished by this
patch here.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/curl.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index 5838afef99..cf2686218d 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -185,15 +185,15 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
     switch (action) {
         case CURL_POLL_IN:
             aio_set_fd_handler(s->aio_context, fd, false,
-                               curl_multi_do, NULL, NULL, state);
+                               curl_multi_do, NULL, NULL, socket);
             break;
         case CURL_POLL_OUT:
             aio_set_fd_handler(s->aio_context, fd, false,
-                               NULL, curl_multi_do, NULL, state);
+                               NULL, curl_multi_do, NULL, socket);
             break;
         case CURL_POLL_INOUT:
             aio_set_fd_handler(s->aio_context, fd, false,
-                               curl_multi_do, curl_multi_do, NULL, state);
+                               curl_multi_do, curl_multi_do, NULL, socket);
             break;
         case CURL_POLL_REMOVE:
             aio_set_fd_handler(s->aio_context, fd, false,
@@ -392,9 +392,10 @@ static void curl_multi_check_completion(BDRVCURLState *s)
 }
 
 /* Called with s->mutex held.  */
-static void curl_multi_do_locked(CURLState *s)
+static void curl_multi_do_locked(CURLSocket *ready_socket)
 {
     CURLSocket *socket, *next_socket;
+    CURLState *s = ready_socket->state;
     int running;
     int r;
 
@@ -413,12 +414,13 @@ static void curl_multi_do_locked(CURLState *s)
 
 static void curl_multi_do(void *arg)
 {
-    CURLState *s = (CURLState *)arg;
+    CURLSocket *socket = arg;
+    BDRVCURLState *s = socket->state->s;
 
-    qemu_mutex_lock(&s->s->mutex);
-    curl_multi_do_locked(s);
-    curl_multi_check_completion(s->s);
-    qemu_mutex_unlock(&s->s->mutex);
+    qemu_mutex_lock(&s->mutex);
+    curl_multi_do_locked(socket);
+    curl_multi_check_completion(s);
+    qemu_mutex_unlock(&s->mutex);
 }
 
 static void curl_multi_timeout_do(void *arg)
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 5/7] curl: Report only ready sockets
  2019-09-10 12:41 [Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash Max Reitz
                   ` (3 preceding siblings ...)
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 4/7] curl: Pass CURLSocket to curl_multi_do() Max Reitz
@ 2019-09-10 12:41 ` Max Reitz
  2019-09-10 16:12   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 6/7] curl: Handle success in multi_check_completion Max Reitz
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Max Reitz @ 2019-09-10 12:41 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, qemu-stable, John Snow, qemu-devel, Max Reitz

Instead of reporting all sockets to cURL, only report the one that has
caused curl_multi_do_locked() to be called.  This lets us get rid of the
QLIST_FOREACH_SAFE() list, which was actually wrong: SAFE foreaches are
only safe when the current element is removed in each iteration.  If it
possible for the list to be concurrently modified, we cannot guarantee
that only the current element will be removed.  Therefore, we must not
use QLIST_FOREACH_SAFE() here.

Fixes: ff5ca1664af85b24a4180d595ea6873fd3deac57
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/curl.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index cf2686218d..fd70f1ebc4 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -392,24 +392,19 @@ static void curl_multi_check_completion(BDRVCURLState *s)
 }
 
 /* Called with s->mutex held.  */
-static void curl_multi_do_locked(CURLSocket *ready_socket)
+static void curl_multi_do_locked(CURLSocket *socket)
 {
-    CURLSocket *socket, *next_socket;
-    CURLState *s = ready_socket->state;
+    BDRVCURLState *s = socket->state->s;
     int running;
     int r;
 
-    if (!s->s->multi) {
+    if (!s->multi) {
         return;
     }
 
-    /* Need to use _SAFE because curl_multi_socket_action() may trigger
-     * curl_sock_cb() which might modify this list */
-    QLIST_FOREACH_SAFE(socket, &s->sockets, next, next_socket) {
-        do {
-            r = curl_multi_socket_action(s->s->multi, socket->fd, 0, &running);
-        } while (r == CURLM_CALL_MULTI_PERFORM);
-    }
+    do {
+        r = curl_multi_socket_action(s->multi, socket->fd, 0, &running);
+    } while (r == CURLM_CALL_MULTI_PERFORM);
 }
 
 static void curl_multi_do(void *arg)
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 6/7] curl: Handle success in multi_check_completion
  2019-09-10 12:41 [Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash Max Reitz
                   ` (4 preceding siblings ...)
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 5/7] curl: Report only ready sockets Max Reitz
@ 2019-09-10 12:41 ` Max Reitz
  2019-09-10 16:13   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 7/7] curl: Check curl_multi_add_handle()'s return code Max Reitz
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Max Reitz @ 2019-09-10 12:41 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, qemu-stable, John Snow, qemu-devel, Max Reitz

Background: As of cURL 7.59.0, it verifies that several functions are
not called from within a callback.  Among these functions is
curl_multi_add_handle().

curl_read_cb() is a callback from cURL and not a coroutine.  Waking up
acb->co will lead to entering it then and there, which means the current
request will settle and the caller (if it runs in the same coroutine)
may then issue the next request.  In such a case, we will enter
curl_setup_preadv() effectively from within curl_read_cb().

Calling curl_multi_add_handle() will then fail and the new request will
not be processed.

Fix this by not letting curl_read_cb() wake up acb->co.  Instead, leave
the whole business of settling the AIOCB objects to
curl_multi_check_completion() (which is called from our timer callback
and our FD handler, so not from any cURL callbacks).

Reported-by: Natalie Gavrielov <ngavrilo@redhat.com>
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1740193
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/curl.c | 69 ++++++++++++++++++++++------------------------------
 1 file changed, 29 insertions(+), 40 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index fd70f1ebc4..c343c7ed3d 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -229,7 +229,6 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
 {
     CURLState *s = ((CURLState*)opaque);
     size_t realsize = size * nmemb;
-    int i;
 
     trace_curl_read_cb(realsize);
 
@@ -245,32 +244,6 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
     memcpy(s->orig_buf + s->buf_off, ptr, realsize);
     s->buf_off += realsize;
 
-    for(i=0; i<CURL_NUM_ACB; i++) {
-        CURLAIOCB *acb = s->acb[i];
-
-        if (!acb)
-            continue;
-
-        if ((s->buf_off >= acb->end)) {
-            size_t request_length = acb->bytes;
-
-            qemu_iovec_from_buf(acb->qiov, 0, s->orig_buf + acb->start,
-                                acb->end - acb->start);
-
-            if (acb->end - acb->start < request_length) {
-                size_t offset = acb->end - acb->start;
-                qemu_iovec_memset(acb->qiov, offset, 0,
-                                  request_length - offset);
-            }
-
-            acb->ret = 0;
-            s->acb[i] = NULL;
-            qemu_mutex_unlock(&s->s->mutex);
-            aio_co_wake(acb->co);
-            qemu_mutex_lock(&s->s->mutex);
-        }
-    }
-
 read_end:
     /* curl will error out if we do not return this value */
     return size * nmemb;
@@ -351,13 +324,14 @@ static void curl_multi_check_completion(BDRVCURLState *s)
             break;
 
         if (msg->msg == CURLMSG_DONE) {
+            int i;
             CURLState *state = NULL;
+            bool error = msg->data.result != CURLE_OK;
+
             curl_easy_getinfo(msg->easy_handle, CURLINFO_PRIVATE,
                               (char **)&state);
 
-            /* ACBs for successful messages get completed in curl_read_cb */
-            if (msg->data.result != CURLE_OK) {
-                int i;
+            if (error) {
                 static int errcount = 100;
 
                 /* Don't lose the original error message from curl, since
@@ -369,20 +343,35 @@ static void curl_multi_check_completion(BDRVCURLState *s)
                         error_report("curl: further errors suppressed");
                     }
                 }
+            }
 
-                for (i = 0; i < CURL_NUM_ACB; i++) {
-                    CURLAIOCB *acb = state->acb[i];
+            for (i = 0; i < CURL_NUM_ACB; i++) {
+                CURLAIOCB *acb = state->acb[i];
 
-                    if (acb == NULL) {
-                        continue;
-                    }
+                if (acb == NULL) {
+                    continue;
+                }
+
+                if (!error) {
+                    /* Assert that we have read all data */
+                    assert(state->buf_off >= acb->end);
+
+                    qemu_iovec_from_buf(acb->qiov, 0,
+                                        state->orig_buf + acb->start,
+                                        acb->end - acb->start);
 
-                    acb->ret = -EIO;
-                    state->acb[i] = NULL;
-                    qemu_mutex_unlock(&s->mutex);
-                    aio_co_wake(acb->co);
-                    qemu_mutex_lock(&s->mutex);
+                    if (acb->end - acb->start < acb->bytes) {
+                        size_t offset = acb->end - acb->start;
+                        qemu_iovec_memset(acb->qiov, offset, 0,
+                                          acb->bytes - offset);
+                    }
                 }
+
+                acb->ret = error ? -EIO : 0;
+                state->acb[i] = NULL;
+                qemu_mutex_unlock(&s->mutex);
+                aio_co_wake(acb->co);
+                qemu_mutex_lock(&s->mutex);
             }
 
             curl_clean_state(state);
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 7/7] curl: Check curl_multi_add_handle()'s return code
  2019-09-10 12:41 [Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash Max Reitz
                   ` (5 preceding siblings ...)
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 6/7] curl: Handle success in multi_check_completion Max Reitz
@ 2019-09-10 12:41 ` Max Reitz
  2019-09-10 16:13   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
  2019-09-11  1:23 ` [Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash John Snow
  2019-09-13 11:48 ` Max Reitz
  8 siblings, 1 reply; 22+ messages in thread
From: Max Reitz @ 2019-09-10 12:41 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, qemu-stable, John Snow, qemu-devel, Max Reitz

If we had done that all along, debugging would have been much simpler.
(Also, I/O errors are better than hangs.)

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/curl.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/block/curl.c b/block/curl.c
index c343c7ed3d..f86299378e 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -882,7 +882,13 @@ static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
     trace_curl_setup_preadv(acb->bytes, start, state->range);
     curl_easy_setopt(state->curl, CURLOPT_RANGE, state->range);
 
-    curl_multi_add_handle(s->multi, state->curl);
+    if (curl_multi_add_handle(s->multi, state->curl) != CURLM_OK) {
+        state->acb[0] = NULL;
+        acb->ret = -EIO;
+
+        curl_clean_state(state);
+        goto out;
+    }
 
     /* Tell curl it needs to kick things off */
     curl_multi_socket_action(s->multi, CURL_SOCKET_TIMEOUT, 0, &running);
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v2 1/7] curl: Keep pointer to the CURLState in CURLSocket
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 1/7] curl: Keep pointer to the CURLState in CURLSocket Max Reitz
@ 2019-09-10 16:11   ` Maxim Levitsky
  0 siblings, 0 replies; 22+ messages in thread
From: Maxim Levitsky @ 2019-09-10 16:11 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-stable, qemu-devel

On Tue, 2019-09-10 at 14:41 +0200, Max Reitz wrote:
> A follow-up patch will make curl_multi_do() and curl_multi_read() take a
> CURLSocket instead of the CURLState.  They still need the latter,
> though, so add a pointer to it to the former.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> Reviewed-by: John Snow <jsnow@redhat.com>
> ---
>  block/curl.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/block/curl.c b/block/curl.c
> index d4c8e94f3e..92dc2f630e 100644
> --- a/block/curl.c
> +++ b/block/curl.c
> @@ -80,6 +80,7 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
>  #define CURL_BLOCK_OPT_TIMEOUT_DEFAULT 5
>  
>  struct BDRVCURLState;
> +struct CURLState;
>  
>  static bool libcurl_initialized;
>  
> @@ -97,6 +98,7 @@ typedef struct CURLAIOCB {
>  
>  typedef struct CURLSocket {
>      int fd;
> +    struct CURLState *state;
>      QLIST_ENTRY(CURLSocket) next;
>  } CURLSocket;
>  
> @@ -180,6 +182,7 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
>      if (!socket) {
>          socket = g_new0(CURLSocket, 1);
>          socket->fd = fd;
> +        socket->state = state;
>          QLIST_INSERT_HEAD(&state->sockets, socket, next);
>      }
>      socket = NULL;

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v2 2/7] curl: Keep *socket until the end of curl_sock_cb()
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 2/7] curl: Keep *socket until the end of curl_sock_cb() Max Reitz
@ 2019-09-10 16:11   ` Maxim Levitsky
  2019-09-10 21:19   ` [Qemu-devel] " John Snow
  1 sibling, 0 replies; 22+ messages in thread
From: Maxim Levitsky @ 2019-09-10 16:11 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-stable, qemu-devel

On Tue, 2019-09-10 at 14:41 +0200, Max Reitz wrote:
> This does not really change anything, but it makes the code a bit easier
> to follow once we use @socket as the opaque pointer for
> aio_set_fd_handler().
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/curl.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/block/curl.c b/block/curl.c
> index 92dc2f630e..95d7b77dc0 100644
> --- a/block/curl.c
> +++ b/block/curl.c
> @@ -172,10 +172,6 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
>  
>      QLIST_FOREACH(socket, &state->sockets, next) {
>          if (socket->fd == fd) {
> -            if (action == CURL_POLL_REMOVE) {
> -                QLIST_REMOVE(socket, next);
> -                g_free(socket);
> -            }
>              break;
>          }
>      }
> @@ -185,7 +181,6 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
>          socket->state = state;
>          QLIST_INSERT_HEAD(&state->sockets, socket, next);
>      }
> -    socket = NULL;
>  
>      trace_curl_sock_cb(action, (int)fd);
>      switch (action) {
> @@ -207,6 +202,11 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
>              break;
>      }
>  
> +    if (action == CURL_POLL_REMOVE) {
> +        QLIST_REMOVE(socket, next);
> +        g_free(socket);
> +    }
> +
>      return 0;
>  }
>  
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v2 3/7] curl: Check completion in curl_multi_do()
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 3/7] curl: Check completion in curl_multi_do() Max Reitz
@ 2019-09-10 16:11   ` Maxim Levitsky
  2019-09-11  1:16     ` John Snow
  2019-09-11  6:51     ` Max Reitz
  0 siblings, 2 replies; 22+ messages in thread
From: Maxim Levitsky @ 2019-09-10 16:11 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-stable, qemu-devel

On Tue, 2019-09-10 at 14:41 +0200, Max Reitz wrote:
> While it is more likely that transfers complete after some file
> descriptor has data ready to read, we probably should not rely on it.
> Better be safe than sorry and call curl_multi_check_completion() in
> curl_multi_do(), too, just like it is done in curl_multi_read().
> 
> With this change, curl_multi_do() and curl_multi_read() are actually the
> same, so drop curl_multi_read() and use curl_multi_do() as the sole FD
> handler.

I understand the reasoning, but I still a bit worry that this
could paper over some bug/race in the future.
If curl asks us only to deal with write, that would mean
that it doesn't expect any data to be received.

Do you by a chance have an example, of this patch
affecting the code? Maybe when a unexpected error reply
is received from the server?

I don't really know the CURL library, so I probably missed
something important.

Other than that,
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>


Best regards,
	Maxim Levitsky


> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/curl.c | 14 ++------------
>  1 file changed, 2 insertions(+), 12 deletions(-)
> 
> diff --git a/block/curl.c b/block/curl.c
> index 95d7b77dc0..5838afef99 100644
> --- a/block/curl.c
> +++ b/block/curl.c
> @@ -139,7 +139,6 @@ typedef struct BDRVCURLState {
>  
>  static void curl_clean_state(CURLState *s);
>  static void curl_multi_do(void *arg);
> -static void curl_multi_read(void *arg);
>  
>  #ifdef NEED_CURL_TIMER_CALLBACK
>  /* Called from curl_multi_do_locked, with s->mutex held.  */
> @@ -186,7 +185,7 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
>      switch (action) {
>          case CURL_POLL_IN:
>              aio_set_fd_handler(s->aio_context, fd, false,
> -                               curl_multi_read, NULL, NULL, state);
> +                               curl_multi_do, NULL, NULL, state);
>              break;
>          case CURL_POLL_OUT:
>              aio_set_fd_handler(s->aio_context, fd, false,
> @@ -194,7 +193,7 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
>              break;
>          case CURL_POLL_INOUT:
>              aio_set_fd_handler(s->aio_context, fd, false,
> -                               curl_multi_read, curl_multi_do, NULL, state);
> +                               curl_multi_do, curl_multi_do, NULL, state);
>              break;
>          case CURL_POLL_REMOVE:
>              aio_set_fd_handler(s->aio_context, fd, false,
> @@ -416,15 +415,6 @@ static void curl_multi_do(void *arg)
>  {
>      CURLState *s = (CURLState *)arg;
>  
> -    qemu_mutex_lock(&s->s->mutex);
> -    curl_multi_do_locked(s);
> -    qemu_mutex_unlock(&s->s->mutex);
> -}
> -
> -static void curl_multi_read(void *arg)
> -{
> -    CURLState *s = (CURLState *)arg;
> -
>      qemu_mutex_lock(&s->s->mutex);
>      curl_multi_do_locked(s);
>      curl_multi_check_completion(s->s);




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v2 4/7] curl: Pass CURLSocket to curl_multi_do()
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 4/7] curl: Pass CURLSocket to curl_multi_do() Max Reitz
@ 2019-09-10 16:12   ` Maxim Levitsky
  0 siblings, 0 replies; 22+ messages in thread
From: Maxim Levitsky @ 2019-09-10 16:12 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-stable, qemu-devel

On Tue, 2019-09-10 at 14:41 +0200, Max Reitz wrote:
> curl_multi_do_locked() currently marks all sockets as ready.  That is
> not only inefficient, but in fact unsafe (the loop is).  A follow-up
> patch will change that, but to do so, curl_multi_do_locked() needs to
> know exactly which socket is ready; and that is accomplished by this
> patch here.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/curl.c | 20 +++++++++++---------
>  1 file changed, 11 insertions(+), 9 deletions(-)
> 
> diff --git a/block/curl.c b/block/curl.c
> index 5838afef99..cf2686218d 100644
> --- a/block/curl.c
> +++ b/block/curl.c
> @@ -185,15 +185,15 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
>      switch (action) {
>          case CURL_POLL_IN:
>              aio_set_fd_handler(s->aio_context, fd, false,
> -                               curl_multi_do, NULL, NULL, state);
> +                               curl_multi_do, NULL, NULL, socket);
>              break;
>          case CURL_POLL_OUT:
>              aio_set_fd_handler(s->aio_context, fd, false,
> -                               NULL, curl_multi_do, NULL, state);
> +                               NULL, curl_multi_do, NULL, socket);
>              break;
>          case CURL_POLL_INOUT:
>              aio_set_fd_handler(s->aio_context, fd, false,
> -                               curl_multi_do, curl_multi_do, NULL, state);
> +                               curl_multi_do, curl_multi_do, NULL, socket);
>              break;
>          case CURL_POLL_REMOVE:
>              aio_set_fd_handler(s->aio_context, fd, false,
> @@ -392,9 +392,10 @@ static void curl_multi_check_completion(BDRVCURLState *s)
>  }
>  
>  /* Called with s->mutex held.  */
> -static void curl_multi_do_locked(CURLState *s)
> +static void curl_multi_do_locked(CURLSocket *ready_socket)
>  {
>      CURLSocket *socket, *next_socket;
> +    CURLState *s = ready_socket->state;
>      int running;
>      int r;
>  
> @@ -413,12 +414,13 @@ static void curl_multi_do_locked(CURLState *s)
>  
>  static void curl_multi_do(void *arg)
>  {
> -    CURLState *s = (CURLState *)arg;
> +    CURLSocket *socket = arg;
> +    BDRVCURLState *s = socket->state->s;
>  
> -    qemu_mutex_lock(&s->s->mutex);
> -    curl_multi_do_locked(s);
> -    curl_multi_check_completion(s->s);
> -    qemu_mutex_unlock(&s->s->mutex);
> +    qemu_mutex_lock(&s->mutex);
> +    curl_multi_do_locked(socket);
> +    curl_multi_check_completion(s);
> +    qemu_mutex_unlock(&s->mutex);
>  }
>  
>  static void curl_multi_timeout_do(void *arg)

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v2 5/7] curl: Report only ready sockets
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 5/7] curl: Report only ready sockets Max Reitz
@ 2019-09-10 16:12   ` Maxim Levitsky
  0 siblings, 0 replies; 22+ messages in thread
From: Maxim Levitsky @ 2019-09-10 16:12 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-stable, qemu-devel

On Tue, 2019-09-10 at 14:41 +0200, Max Reitz wrote:
> Instead of reporting all sockets to cURL, only report the one that has
> caused curl_multi_do_locked() to be called.  This lets us get rid of the
> QLIST_FOREACH_SAFE() list, which was actually wrong: SAFE foreaches are
> only safe when the current element is removed in each iteration.  If it
> possible for the list to be concurrently modified, we cannot guarantee
> that only the current element will be removed.  Therefore, we must not
> use QLIST_FOREACH_SAFE() here.
> 
> Fixes: ff5ca1664af85b24a4180d595ea6873fd3deac57
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/curl.c | 17 ++++++-----------
>  1 file changed, 6 insertions(+), 11 deletions(-)
> 
> diff --git a/block/curl.c b/block/curl.c
> index cf2686218d..fd70f1ebc4 100644
> --- a/block/curl.c
> +++ b/block/curl.c
> @@ -392,24 +392,19 @@ static void curl_multi_check_completion(BDRVCURLState *s)
>  }
>  
>  /* Called with s->mutex held.  */
> -static void curl_multi_do_locked(CURLSocket *ready_socket)
> +static void curl_multi_do_locked(CURLSocket *socket)
Here you revert the variable name change you had in previous commit

>  {
> -    CURLSocket *socket, *next_socket;
> -    CURLState *s = ready_socket->state;
> +    BDRVCURLState *s = socket->state->s;
>      int running;
>      int r;
>  
> -    if (!s->s->multi) {
> +    if (!s->multi) {
>          return;
>      }
>  
> -    /* Need to use _SAFE because curl_multi_socket_action() may trigger
> -     * curl_sock_cb() which might modify this list */
> -    QLIST_FOREACH_SAFE(socket, &s->sockets, next, next_socket) {
> -        do {
> -            r = curl_multi_socket_action(s->s->multi, socket->fd, 0, &running);
> -        } while (r == CURLM_CALL_MULTI_PERFORM);
> -    }
> +    do {
> +        r = curl_multi_socket_action(s->multi, socket->fd, 0, &running);
> +    } while (r == CURLM_CALL_MULTI_PERFORM);
>  }
>  
>  static void curl_multi_do(void *arg)

Other than that nitpick,
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v2 6/7] curl: Handle success in multi_check_completion
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 6/7] curl: Handle success in multi_check_completion Max Reitz
@ 2019-09-10 16:13   ` Maxim Levitsky
  2019-09-13 11:20     ` Max Reitz
  0 siblings, 1 reply; 22+ messages in thread
From: Maxim Levitsky @ 2019-09-10 16:13 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-stable, qemu-devel

On Tue, 2019-09-10 at 14:41 +0200, Max Reitz wrote:
> Background: As of cURL 7.59.0, it verifies that several functions are
> not called from within a callback.  Among these functions is
> curl_multi_add_handle().
> 
> curl_read_cb() is a callback from cURL and not a coroutine.  Waking up
> acb->co will lead to entering it then and there, which means the current
> request will settle and the caller (if it runs in the same coroutine)
> may then issue the next request.  In such a case, we will enter
> curl_setup_preadv() effectively from within curl_read_cb().
> 
> Calling curl_multi_add_handle() will then fail and the new request will
> not be processed.
> 
> Fix this by not letting curl_read_cb() wake up acb->co.  Instead, leave
> the whole business of settling the AIOCB objects to
> curl_multi_check_completion() (which is called from our timer callback
> and our FD handler, so not from any cURL callbacks).
> 
> Reported-by: Natalie Gavrielov <ngavrilo@redhat.com>
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1740193
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/curl.c | 69 ++++++++++++++++++++++------------------------------
>  1 file changed, 29 insertions(+), 40 deletions(-)
> 
> diff --git a/block/curl.c b/block/curl.c
> index fd70f1ebc4..c343c7ed3d 100644
> --- a/block/curl.c
> +++ b/block/curl.c
> @@ -229,7 +229,6 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
>  {
>      CURLState *s = ((CURLState*)opaque);
>      size_t realsize = size * nmemb;
> -    int i;
>  
>      trace_curl_read_cb(realsize);
>  
> @@ -245,32 +244,6 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
>      memcpy(s->orig_buf + s->buf_off, ptr, realsize);
>      s->buf_off += realsize;
>  
> -    for(i=0; i<CURL_NUM_ACB; i++) {
> -        CURLAIOCB *acb = s->acb[i];
> -
> -        if (!acb)
> -            continue;
> -
> -        if ((s->buf_off >= acb->end)) {
> -            size_t request_length = acb->bytes;
> -
> -            qemu_iovec_from_buf(acb->qiov, 0, s->orig_buf + acb->start,
> -                                acb->end - acb->start);
> -
> -            if (acb->end - acb->start < request_length) {
> -                size_t offset = acb->end - acb->start;
> -                qemu_iovec_memset(acb->qiov, offset, 0,
> -                                  request_length - offset);
> -            }
> -
> -            acb->ret = 0;
> -            s->acb[i] = NULL;
> -            qemu_mutex_unlock(&s->s->mutex);
> -            aio_co_wake(acb->co);
> -            qemu_mutex_lock(&s->s->mutex);
> -        }
> -    }
> -
>  read_end:
>      /* curl will error out if we do not return this value */
>      return size * nmemb;
> @@ -351,13 +324,14 @@ static void curl_multi_check_completion(BDRVCURLState *s)
>              break;
>  
>          if (msg->msg == CURLMSG_DONE) {
> +            int i;
>              CURLState *state = NULL;
> +            bool error = msg->data.result != CURLE_OK;
> +
>              curl_easy_getinfo(msg->easy_handle, CURLINFO_PRIVATE,
>                                (char **)&state);
>  
> -            /* ACBs for successful messages get completed in curl_read_cb */
> -            if (msg->data.result != CURLE_OK) {
> -                int i;
> +            if (error) {
>                  static int errcount = 100;
>  
>                  /* Don't lose the original error message from curl, since
> @@ -369,20 +343,35 @@ static void curl_multi_check_completion(BDRVCURLState *s)
>                          error_report("curl: further errors suppressed");
>                      }
>                  }
> +            }
>  
> -                for (i = 0; i < CURL_NUM_ACB; i++) {
> -                    CURLAIOCB *acb = state->acb[i];
> +            for (i = 0; i < CURL_NUM_ACB; i++) {
> +                CURLAIOCB *acb = state->acb[i];
>  
> -                    if (acb == NULL) {
> -                        continue;
> -                    }
> +                if (acb == NULL) {
> +                    continue;
> +                }
> +
> +                if (!error) {
> +                    /* Assert that we have read all data */
> +                    assert(state->buf_off >= acb->end);
> +
> +                    qemu_iovec_from_buf(acb->qiov, 0,
> +                                        state->orig_buf + acb->start,
> +                                        acb->end - acb->start);
>  
> -                    acb->ret = -EIO;
> -                    state->acb[i] = NULL;
> -                    qemu_mutex_unlock(&s->mutex);
> -                    aio_co_wake(acb->co);
> -                    qemu_mutex_lock(&s->mutex);
> +                    if (acb->end - acb->start < acb->bytes) {
> +                        size_t offset = acb->end - acb->start;
> +                        qemu_iovec_memset(acb->qiov, offset, 0,
> +                                          acb->bytes - offset);
> +                    }
Original code was memsetting the tail of the buffer before waking up the coroutine.
Is this change intended?

aio_co_wake doesn't enter the co-routine if already in coroutine, but
I think that this is an aio fd handler with isn't run in co-routine itself,
so the callback could run with not yet ready data.


>                  }
> +
> +                acb->ret = error ? -EIO : 0;
> +                state->acb[i] = NULL;
> +                qemu_mutex_unlock(&s->mutex);
> +                aio_co_wake(acb->co);
> +                qemu_mutex_lock(&s->mutex);
>              }
>  
>              curl_clean_state(state);


Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v2 7/7] curl: Check curl_multi_add_handle()'s return code
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 7/7] curl: Check curl_multi_add_handle()'s return code Max Reitz
@ 2019-09-10 16:13   ` Maxim Levitsky
  0 siblings, 0 replies; 22+ messages in thread
From: Maxim Levitsky @ 2019-09-10 16:13 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-stable, qemu-devel

On Tue, 2019-09-10 at 14:41 +0200, Max Reitz wrote:
> If we had done that all along, debugging would have been much simpler.
> (Also, I/O errors are better than hangs.)
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/curl.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/block/curl.c b/block/curl.c
> index c343c7ed3d..f86299378e 100644
> --- a/block/curl.c
> +++ b/block/curl.c
> @@ -882,7 +882,13 @@ static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
>      trace_curl_setup_preadv(acb->bytes, start, state->range);
>      curl_easy_setopt(state->curl, CURLOPT_RANGE, state->range);
>  
> -    curl_multi_add_handle(s->multi, state->curl);
> +    if (curl_multi_add_handle(s->multi, state->curl) != CURLM_OK) {
> +        state->acb[0] = NULL;
> +        acb->ret = -EIO;
> +
> +        curl_clean_state(state);
> +        goto out;
> +    }
>  
>      /* Tell curl it needs to kick things off */
>      curl_multi_socket_action(s->multi, CURL_SOCKET_TIMEOUT, 0, &running);

Checking the return values is always a very good idea.
I would myself make this patch #1 in the series, since it doesn't
depend on others and it itself a bugfix.
But this is my style, so I don't mind if you leave this as is.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/7] curl: Keep *socket until the end of curl_sock_cb()
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 2/7] curl: Keep *socket until the end of curl_sock_cb() Max Reitz
  2019-09-10 16:11   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
@ 2019-09-10 21:19   ` John Snow
  1 sibling, 0 replies; 22+ messages in thread
From: John Snow @ 2019-09-10 21:19 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-stable, qemu-devel



On 9/10/19 8:41 AM, Max Reitz wrote:
> This does not really change anything, but it makes the code a bit easier
> to follow once we use @socket as the opaque pointer for
> aio_set_fd_handler().
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Max Reitz <mreitz@redhat.com>

Reviewed-by: John Snow <jsnow@redhat.com>


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v2 3/7] curl: Check completion in curl_multi_do()
  2019-09-10 16:11   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
@ 2019-09-11  1:16     ` John Snow
  2019-09-11  6:51     ` Max Reitz
  1 sibling, 0 replies; 22+ messages in thread
From: John Snow @ 2019-09-11  1:16 UTC (permalink / raw)
  To: Maxim Levitsky, Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-stable, qemu-devel



On 9/10/19 12:11 PM, Maxim Levitsky wrote:
> On Tue, 2019-09-10 at 14:41 +0200, Max Reitz wrote:
>> While it is more likely that transfers complete after some file
>> descriptor has data ready to read, we probably should not rely on it.
>> Better be safe than sorry and call curl_multi_check_completion() in
>> curl_multi_do(), too, just like it is done in curl_multi_read().
>>
>> With this change, curl_multi_do() and curl_multi_read() are actually the
>> same, so drop curl_multi_read() and use curl_multi_do() as the sole FD
>> handler.
> 
> I understand the reasoning, but I still a bit worry that this
> could paper over some bug/race in the future.
> If curl asks us only to deal with write, that would mean
> that it doesn't expect any data to be received.
> 
> Do you by a chance have an example, of this patch
> affecting the code? Maybe when a unexpected error reply
> is received from the server?
> 
> I don't really know the CURL library, so I probably missed
> something important.
> 
> Other than that,
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> 

In this case, it's because I had some doubts in V1 about when we call
the post-completion cleanup code. It didn't look obvious at a glance.

This just makes it simpler.

I don't think that by checking *more* things we're going to paper over
some race condition -- it's the opposite. If anything, this will explode
sooner rather than later.

So I think it's OK, naturally.

Reviewed-by: John Snow <jsnow@redhat.com>


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash
  2019-09-10 12:41 [Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash Max Reitz
                   ` (6 preceding siblings ...)
  2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 7/7] curl: Check curl_multi_add_handle()'s return code Max Reitz
@ 2019-09-11  1:23 ` John Snow
  2019-09-13 11:48 ` Max Reitz
  8 siblings, 0 replies; 22+ messages in thread
From: John Snow @ 2019-09-11  1:23 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-stable, qemu-devel



On 9/10/19 8:41 AM, Max Reitz wrote:
> Hi,
> 
> As reported in https://bugzilla.redhat.com/show_bug.cgi?id=1740193, our
> curl block driver can spontaneously hang.  This becomes visible e.g.
> when reading compressed qcow2 images:
> 
> $ qemu-img convert -p -O raw -n \
>   https://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img \
>   null-co://
> 
> (Hangs at 74.21 %, usually.)
> 
> A more direct way is:
> 
> $ qemu-img bench -f raw http://download.qemu.org/qemu-4.1.0.tar.xz \
>     -d 1 -S 524288 -c 2
> 
> (Which simply performs two requests, and the second one hangs.  You can
> use any HTTP resource (probably FTP, too) you’d like that is at least
> 1 MB in size.)
> 
> It turns out that this is because cURL 7.59.0 has added a protective
> feature against some misuse we had in our code: curl_multi_add_handle()
> must not be called from within a cURL callback, but in some cases we
> did.  As of 7.59.0, this fails, our new request is not registered and
> the I/O request stalls.  This is fixed by patch 6.
> 
> Patch 7 makes us check for curl_multi_add_handle()’s return code,
> because if we had done that before, debugging would have been much
> simpler.
> 
> 
> On the way to fixing it, I had a look over the whole cURL code and found
> a suspicious QLIST_FOREACH_SAFE() loop that actually does not seem very
> safe at all.  I think this may lead to crashes, although I have never
> seen any myself.  https://bugzilla.redhat.com/show_bug.cgi?id=1744602#c5
> shows one in exactly the function in question, so I think it actually is
> a problem.
> 
> This is fixed by patch 5, patches 1, 2, and 4 prepare for it.
> 
> (Patch 3 is kind of a misc patch that should ensure that we always end
> up calling curl_multi_check_completion() whenever a request might have
> been completed.)
> 
> 
> v2:
> - Patch 2: Remove the socket from the list only add the end of the
>            function (yielding a nicer 5+/5- diff stat)
> - Patch 3: Added
> - Patch 4: Rebased on patch 3, and s/socket/ready_socket/ in one place
> - Patch 5: Rebased on the changed patch 4
> 
> 
> git-backport-diff against v1:
> 
> Key:
> [----] : patches are identical
> [####] : number of functional differences between upstream/downstream patch
> [down] : patch is downstream-only
> The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively
> 
> 001/7:[----] [--] 'curl: Keep pointer to the CURLState in CURLSocket'
> 002/7:[0007] [FC] 'curl: Keep *socket until the end of curl_sock_cb()'
> 003/7:[down] 'curl: Check completion in curl_multi_do()'
> 004/7:[0019] [FC] 'curl: Pass CURLSocket to curl_multi_{do,read}()'
> 005/7:[0002] [FC] 'curl: Report only ready sockets'
> 006/7:[----] [--] 'curl: Handle success in multi_check_completion'
> 007/7:[----] [--] 'curl: Check curl_multi_add_handle()'s return code'
> 
> 
> Max Reitz (7):
>   curl: Keep pointer to the CURLState in CURLSocket
>   curl: Keep *socket until the end of curl_sock_cb()
>   curl: Check completion in curl_multi_do()
>   curl: Pass CURLSocket to curl_multi_do()
>   curl: Report only ready sockets
>   curl: Handle success in multi_check_completion
>   curl: Check curl_multi_add_handle()'s return code
> 
>  block/curl.c | 133 +++++++++++++++++++++++----------------------------
>  1 file changed, 59 insertions(+), 74 deletions(-)
> 

And for 4-7:

Reviewed-by: John Snow <jsnow@redhat.com>


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v2 3/7] curl: Check completion in curl_multi_do()
  2019-09-10 16:11   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
  2019-09-11  1:16     ` John Snow
@ 2019-09-11  6:51     ` Max Reitz
  1 sibling, 0 replies; 22+ messages in thread
From: Max Reitz @ 2019-09-11  6:51 UTC (permalink / raw)
  To: Maxim Levitsky, qemu-block; +Cc: Kevin Wolf, qemu-stable, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 3676 bytes --]

On 10.09.19 18:11, Maxim Levitsky wrote:
> On Tue, 2019-09-10 at 14:41 +0200, Max Reitz wrote:
>> While it is more likely that transfers complete after some file
>> descriptor has data ready to read, we probably should not rely on it.
>> Better be safe than sorry and call curl_multi_check_completion() in
>> curl_multi_do(), too, just like it is done in curl_multi_read().
>>
>> With this change, curl_multi_do() and curl_multi_read() are actually the
>> same, so drop curl_multi_read() and use curl_multi_do() as the sole FD
>> handler.
> 
> I understand the reasoning, but I still a bit worry that this
> could paper over some bug/race in the future.
> If curl asks us only to deal with write, that would mean
> that it doesn't expect any data to be received.

I can imagine that maybe it wants to send some data first (to close the
connection) before it really marks the request as done.

> Do you by a chance have an example, of this patch
> affecting the code? Maybe when a unexpected error reply
> is received from the server?

No, I don’t.  As John said, this is just to ensure that we always call
curl_multi_check_completion() after the read_cb might have been invoked
(and once the request is marked as “done”).

> I don't really know the CURL library, so I probably missed
> something important.

I’d wager a guess that nobody really does because otherwise block/curl.c
wouldn’t be in the “Odd Fixes” category (with no dedicated maintainer).

Max

> Other than that,
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> 
> 
> Best regards,
> 	Maxim Levitsky
> 
> 
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>  block/curl.c | 14 ++------------
>>  1 file changed, 2 insertions(+), 12 deletions(-)
>>
>> diff --git a/block/curl.c b/block/curl.c
>> index 95d7b77dc0..5838afef99 100644
>> --- a/block/curl.c
>> +++ b/block/curl.c
>> @@ -139,7 +139,6 @@ typedef struct BDRVCURLState {
>>  
>>  static void curl_clean_state(CURLState *s);
>>  static void curl_multi_do(void *arg);
>> -static void curl_multi_read(void *arg);
>>  
>>  #ifdef NEED_CURL_TIMER_CALLBACK
>>  /* Called from curl_multi_do_locked, with s->mutex held.  */
>> @@ -186,7 +185,7 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
>>      switch (action) {
>>          case CURL_POLL_IN:
>>              aio_set_fd_handler(s->aio_context, fd, false,
>> -                               curl_multi_read, NULL, NULL, state);
>> +                               curl_multi_do, NULL, NULL, state);
>>              break;
>>          case CURL_POLL_OUT:
>>              aio_set_fd_handler(s->aio_context, fd, false,
>> @@ -194,7 +193,7 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
>>              break;
>>          case CURL_POLL_INOUT:
>>              aio_set_fd_handler(s->aio_context, fd, false,
>> -                               curl_multi_read, curl_multi_do, NULL, state);
>> +                               curl_multi_do, curl_multi_do, NULL, state);
>>              break;
>>          case CURL_POLL_REMOVE:
>>              aio_set_fd_handler(s->aio_context, fd, false,
>> @@ -416,15 +415,6 @@ static void curl_multi_do(void *arg)
>>  {
>>      CURLState *s = (CURLState *)arg;
>>  
>> -    qemu_mutex_lock(&s->s->mutex);
>> -    curl_multi_do_locked(s);
>> -    qemu_mutex_unlock(&s->s->mutex);
>> -}
>> -
>> -static void curl_multi_read(void *arg)
>> -{
>> -    CURLState *s = (CURLState *)arg;
>> -
>>      qemu_mutex_lock(&s->s->mutex);
>>      curl_multi_do_locked(s);
>>      curl_multi_check_completion(s->s);
> 
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v2 6/7] curl: Handle success in multi_check_completion
  2019-09-10 16:13   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
@ 2019-09-13 11:20     ` Max Reitz
  2019-09-13 11:47       ` Maxim Levitsky
  0 siblings, 1 reply; 22+ messages in thread
From: Max Reitz @ 2019-09-13 11:20 UTC (permalink / raw)
  To: Maxim Levitsky, qemu-block; +Cc: Kevin Wolf, qemu-stable, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 6170 bytes --]

On 10.09.19 18:13, Maxim Levitsky wrote:
> On Tue, 2019-09-10 at 14:41 +0200, Max Reitz wrote:
>> Background: As of cURL 7.59.0, it verifies that several functions are
>> not called from within a callback.  Among these functions is
>> curl_multi_add_handle().
>>
>> curl_read_cb() is a callback from cURL and not a coroutine.  Waking up
>> acb->co will lead to entering it then and there, which means the current
>> request will settle and the caller (if it runs in the same coroutine)
>> may then issue the next request.  In such a case, we will enter
>> curl_setup_preadv() effectively from within curl_read_cb().
>>
>> Calling curl_multi_add_handle() will then fail and the new request will
>> not be processed.
>>
>> Fix this by not letting curl_read_cb() wake up acb->co.  Instead, leave
>> the whole business of settling the AIOCB objects to
>> curl_multi_check_completion() (which is called from our timer callback
>> and our FD handler, so not from any cURL callbacks).
>>
>> Reported-by: Natalie Gavrielov <ngavrilo@redhat.com>
>> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1740193
>> Cc: qemu-stable@nongnu.org
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>  block/curl.c | 69 ++++++++++++++++++++++------------------------------
>>  1 file changed, 29 insertions(+), 40 deletions(-)
>>
>> diff --git a/block/curl.c b/block/curl.c
>> index fd70f1ebc4..c343c7ed3d 100644
>> --- a/block/curl.c
>> +++ b/block/curl.c
>> @@ -229,7 +229,6 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
>>  {
>>      CURLState *s = ((CURLState*)opaque);
>>      size_t realsize = size * nmemb;
>> -    int i;
>>  
>>      trace_curl_read_cb(realsize);
>>  
>> @@ -245,32 +244,6 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
>>      memcpy(s->orig_buf + s->buf_off, ptr, realsize);
>>      s->buf_off += realsize;
>>  
>> -    for(i=0; i<CURL_NUM_ACB; i++) {
>> -        CURLAIOCB *acb = s->acb[i];
>> -
>> -        if (!acb)
>> -            continue;
>> -
>> -        if ((s->buf_off >= acb->end)) {
>> -            size_t request_length = acb->bytes;
>> -
>> -            qemu_iovec_from_buf(acb->qiov, 0, s->orig_buf + acb->start,
>> -                                acb->end - acb->start);
>> -
>> -            if (acb->end - acb->start < request_length) {
>> -                size_t offset = acb->end - acb->start;
>> -                qemu_iovec_memset(acb->qiov, offset, 0,
>> -                                  request_length - offset);
>> -            }
>> -
>> -            acb->ret = 0;
>> -            s->acb[i] = NULL;
>> -            qemu_mutex_unlock(&s->s->mutex);
>> -            aio_co_wake(acb->co);
>> -            qemu_mutex_lock(&s->s->mutex);
>> -        }
>> -    }
>> -
>>  read_end:
>>      /* curl will error out if we do not return this value */
>>      return size * nmemb;
>> @@ -351,13 +324,14 @@ static void curl_multi_check_completion(BDRVCURLState *s)
>>              break;
>>  
>>          if (msg->msg == CURLMSG_DONE) {
>> +            int i;
>>              CURLState *state = NULL;
>> +            bool error = msg->data.result != CURLE_OK;
>> +
>>              curl_easy_getinfo(msg->easy_handle, CURLINFO_PRIVATE,
>>                                (char **)&state);
>>  
>> -            /* ACBs for successful messages get completed in curl_read_cb */
>> -            if (msg->data.result != CURLE_OK) {
>> -                int i;
>> +            if (error) {
>>                  static int errcount = 100;
>>  
>>                  /* Don't lose the original error message from curl, since
>> @@ -369,20 +343,35 @@ static void curl_multi_check_completion(BDRVCURLState *s)
>>                          error_report("curl: further errors suppressed");
>>                      }
>>                  }
>> +            }
>>  
>> -                for (i = 0; i < CURL_NUM_ACB; i++) {
>> -                    CURLAIOCB *acb = state->acb[i];
>> +            for (i = 0; i < CURL_NUM_ACB; i++) {
>> +                CURLAIOCB *acb = state->acb[i];
>>  
>> -                    if (acb == NULL) {
>> -                        continue;
>> -                    }
>> +                if (acb == NULL) {
>> +                    continue;
>> +                }
>> +
>> +                if (!error) {
>> +                    /* Assert that we have read all data */
>> +                    assert(state->buf_off >= acb->end);
>> +
>> +                    qemu_iovec_from_buf(acb->qiov, 0,
>> +                                        state->orig_buf + acb->start,
>> +                                        acb->end - acb->start);
>>  
>> -                    acb->ret = -EIO;
>> -                    state->acb[i] = NULL;
>> -                    qemu_mutex_unlock(&s->mutex);
>> -                    aio_co_wake(acb->co);
>> -                    qemu_mutex_lock(&s->mutex);
>> +                    if (acb->end - acb->start < acb->bytes) {
>> +                        size_t offset = acb->end - acb->start;
>> +                        qemu_iovec_memset(acb->qiov, offset, 0,
>> +                                          acb->bytes - offset);
>> +                    }
> Original code was memsetting the tail of the buffer before waking up the coroutine.
> Is this change intended?
> 
> aio_co_wake doesn't enter the co-routine if already in coroutine, but
> I think that this is an aio fd handler with isn't run in co-routine itself,
> so the callback could run with not yet ready data.

(Sorry, I don’t know why I forgot to reply.)

I don’t see how anything changes.  aio_co_wake() is still called after
the qemu_iovec_memset().

Max

>>                  }
>> +
>> +                acb->ret = error ? -EIO : 0;
>> +                state->acb[i] = NULL;
>> +                qemu_mutex_unlock(&s->mutex);
>> +                aio_co_wake(acb->co);
>> +                qemu_mutex_lock(&s->mutex);
>>              }
>>  
>>              curl_clean_state(state);
> 
> 
> Best regards,
> 	Maxim Levitsky
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v2 6/7] curl: Handle success in multi_check_completion
  2019-09-13 11:20     ` Max Reitz
@ 2019-09-13 11:47       ` Maxim Levitsky
  0 siblings, 0 replies; 22+ messages in thread
From: Maxim Levitsky @ 2019-09-13 11:47 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-stable, qemu-devel

On Fri, 2019-09-13 at 13:20 +0200, Max Reitz wrote:
> On 10.09.19 18:13, Maxim Levitsky wrote:
> > On Tue, 2019-09-10 at 14:41 +0200, Max Reitz wrote:
> > > Background: As of cURL 7.59.0, it verifies that several functions are
> > > not called from within a callback.  Among these functions is
> > > curl_multi_add_handle().
> > > 
> > > curl_read_cb() is a callback from cURL and not a coroutine.  Waking up
> > > acb->co will lead to entering it then and there, which means the current
> > > request will settle and the caller (if it runs in the same coroutine)
> > > may then issue the next request.  In such a case, we will enter
> > > curl_setup_preadv() effectively from within curl_read_cb().
> > > 
> > > Calling curl_multi_add_handle() will then fail and the new request will
> > > not be processed.
> > > 
> > > Fix this by not letting curl_read_cb() wake up acb->co.  Instead, leave
> > > the whole business of settling the AIOCB objects to
> > > curl_multi_check_completion() (which is called from our timer callback
> > > and our FD handler, so not from any cURL callbacks).
> > > 
> > > Reported-by: Natalie Gavrielov <ngavrilo@redhat.com>
> > > Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1740193
> > > Cc: qemu-stable@nongnu.org
> > > Signed-off-by: Max Reitz <mreitz@redhat.com>
> > > ---
> > >  block/curl.c | 69 ++++++++++++++++++++++------------------------------
> > >  1 file changed, 29 insertions(+), 40 deletions(-)
> > > 
> > > diff --git a/block/curl.c b/block/curl.c
> > > index fd70f1ebc4..c343c7ed3d 100644
> > > --- a/block/curl.c
> > > +++ b/block/curl.c
> > > @@ -229,7 +229,6 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
> > >  {
> > >      CURLState *s = ((CURLState*)opaque);
> > >      size_t realsize = size * nmemb;
> > > -    int i;
> > >  
> > >      trace_curl_read_cb(realsize);
> > >  
> > > @@ -245,32 +244,6 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
> > >      memcpy(s->orig_buf + s->buf_off, ptr, realsize);
> > >      s->buf_off += realsize;
> > >  
> > > -    for(i=0; i<CURL_NUM_ACB; i++) {
> > > -        CURLAIOCB *acb = s->acb[i];
> > > -
> > > -        if (!acb)
> > > -            continue;
> > > -
> > > -        if ((s->buf_off >= acb->end)) {
> > > -            size_t request_length = acb->bytes;
> > > -
> > > -            qemu_iovec_from_buf(acb->qiov, 0, s->orig_buf + acb->start,
> > > -                                acb->end - acb->start);
> > > -
> > > -            if (acb->end - acb->start < request_length) {
> > > -                size_t offset = acb->end - acb->start;
> > > -                qemu_iovec_memset(acb->qiov, offset, 0,
> > > -                                  request_length - offset);
> > > -            }
> > > -
> > > -            acb->ret = 0;
> > > -            s->acb[i] = NULL;
> > > -            qemu_mutex_unlock(&s->s->mutex);
> > > -            aio_co_wake(acb->co);
> > > -            qemu_mutex_lock(&s->s->mutex);
> > > -        }
> > > -    }
> > > -
> > >  read_end:
> > >      /* curl will error out if we do not return this value */
> > >      return size * nmemb;
> > > @@ -351,13 +324,14 @@ static void curl_multi_check_completion(BDRVCURLState *s)
> > >              break;
> > >  
> > >          if (msg->msg == CURLMSG_DONE) {
> > > +            int i;
> > >              CURLState *state = NULL;
> > > +            bool error = msg->data.result != CURLE_OK;
> > > +
> > >              curl_easy_getinfo(msg->easy_handle, CURLINFO_PRIVATE,
> > >                                (char **)&state);
> > >  
> > > -            /* ACBs for successful messages get completed in curl_read_cb */
> > > -            if (msg->data.result != CURLE_OK) {
> > > -                int i;
> > > +            if (error) {
> > >                  static int errcount = 100;
> > >  
> > >                  /* Don't lose the original error message from curl, since
> > > @@ -369,20 +343,35 @@ static void curl_multi_check_completion(BDRVCURLState *s)
> > >                          error_report("curl: further errors suppressed");
> > >                      }
> > >                  }
> > > +            }
> > >  
> > > -                for (i = 0; i < CURL_NUM_ACB; i++) {
> > > -                    CURLAIOCB *acb = state->acb[i];
> > > +            for (i = 0; i < CURL_NUM_ACB; i++) {
> > > +                CURLAIOCB *acb = state->acb[i];
> > >  
> > > -                    if (acb == NULL) {
> > > -                        continue;
> > > -                    }
> > > +                if (acb == NULL) {
> > > +                    continue;
> > > +                }
> > > +
> > > +                if (!error) {
> > > +                    /* Assert that we have read all data */
> > > +                    assert(state->buf_off >= acb->end);
> > > +
> > > +                    qemu_iovec_from_buf(acb->qiov, 0,
> > > +                                        state->orig_buf + acb->start,
> > > +                                        acb->end - acb->start);
> > >  
> > > -                    acb->ret = -EIO;
> > > -                    state->acb[i] = NULL;
> > > -                    qemu_mutex_unlock(&s->mutex);
> > > -                    aio_co_wake(acb->co);
> > > -                    qemu_mutex_lock(&s->mutex);
> > > +                    if (acb->end - acb->start < acb->bytes) {
> > > +                        size_t offset = acb->end - acb->start;
> > > +                        qemu_iovec_memset(acb->qiov, offset, 0,
> > > +                                          acb->bytes - offset);
> > > +                    }
> > 
> > Original code was memsetting the tail of the buffer before waking up the coroutine.
> > Is this change intended?
> > 
> > aio_co_wake doesn't enter the co-routine if already in coroutine, but
> > I think that this is an aio fd handler with isn't run in co-routine itself,
> > so the callback could run with not yet ready data.
> 
> (Sorry, I don’t know why I forgot to reply.)
> 
> I don’t see how anything changes.  aio_co_wake() is still called after
> the qemu_iovec_memset().
Oops, in this specific areas of the patch, I totally missed the diff markers.

So,

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>


Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash
  2019-09-10 12:41 [Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash Max Reitz
                   ` (7 preceding siblings ...)
  2019-09-11  1:23 ` [Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash John Snow
@ 2019-09-13 11:48 ` Max Reitz
  8 siblings, 0 replies; 22+ messages in thread
From: Max Reitz @ 2019-09-13 11:48 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, John Snow, qemu-devel, qemu-stable


[-- Attachment #1.1: Type: text/plain, Size: 1979 bytes --]

On 10.09.19 14:41, Max Reitz wrote:
> Hi,
> 
> As reported in https://bugzilla.redhat.com/show_bug.cgi?id=1740193, our
> curl block driver can spontaneously hang.  This becomes visible e.g.
> when reading compressed qcow2 images:
> 
> $ qemu-img convert -p -O raw -n \
>   https://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img \
>   null-co://
> 
> (Hangs at 74.21 %, usually.)
> 
> A more direct way is:
> 
> $ qemu-img bench -f raw http://download.qemu.org/qemu-4.1.0.tar.xz \
>     -d 1 -S 524288 -c 2
> 
> (Which simply performs two requests, and the second one hangs.  You can
> use any HTTP resource (probably FTP, too) you’d like that is at least
> 1 MB in size.)
> 
> It turns out that this is because cURL 7.59.0 has added a protective
> feature against some misuse we had in our code: curl_multi_add_handle()
> must not be called from within a cURL callback, but in some cases we
> did.  As of 7.59.0, this fails, our new request is not registered and
> the I/O request stalls.  This is fixed by patch 6.
> 
> Patch 7 makes us check for curl_multi_add_handle()’s return code,
> because if we had done that before, debugging would have been much
> simpler.
> 
> 
> On the way to fixing it, I had a look over the whole cURL code and found
> a suspicious QLIST_FOREACH_SAFE() loop that actually does not seem very
> safe at all.  I think this may lead to crashes, although I have never
> seen any myself.  https://bugzilla.redhat.com/show_bug.cgi?id=1744602#c5
> shows one in exactly the function in question, so I think it actually is
> a problem.
> 
> This is fixed by patch 5, patches 1, 2, and 4 prepare for it.
> 
> (Patch 3 is kind of a misc patch that should ensure that we always end
> up calling curl_multi_check_completion() whenever a request might have
> been completed.)

Thanks for the review, applied to my block branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2019-09-13 11:50 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-10 12:41 [Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash Max Reitz
2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 1/7] curl: Keep pointer to the CURLState in CURLSocket Max Reitz
2019-09-10 16:11   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 2/7] curl: Keep *socket until the end of curl_sock_cb() Max Reitz
2019-09-10 16:11   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
2019-09-10 21:19   ` [Qemu-devel] " John Snow
2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 3/7] curl: Check completion in curl_multi_do() Max Reitz
2019-09-10 16:11   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
2019-09-11  1:16     ` John Snow
2019-09-11  6:51     ` Max Reitz
2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 4/7] curl: Pass CURLSocket to curl_multi_do() Max Reitz
2019-09-10 16:12   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 5/7] curl: Report only ready sockets Max Reitz
2019-09-10 16:12   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 6/7] curl: Handle success in multi_check_completion Max Reitz
2019-09-10 16:13   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
2019-09-13 11:20     ` Max Reitz
2019-09-13 11:47       ` Maxim Levitsky
2019-09-10 12:41 ` [Qemu-devel] [PATCH v2 7/7] curl: Check curl_multi_add_handle()'s return code Max Reitz
2019-09-10 16:13   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
2019-09-11  1:23 ` [Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash John Snow
2019-09-13 11:48 ` Max Reitz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.