All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server
@ 2020-08-18 11:08 David Edmondson
  2020-08-18 11:08 ` [RFC PATCH 1/9] block/curl: Add an 'offset' parameter, affecting all range requests David Edmondson
                   ` (9 more replies)
  0 siblings, 10 replies; 15+ messages in thread
From: David Edmondson @ 2020-08-18 11:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Markus Armbruster, Max Reitz,
	David Edmondson, Stefan Hajnoczi

When using qemu-img to convert an image that is hosted on an HTTP
server to some faster local (or pseudo-local) storage, the overall
performance can be improved by reading data from the HTTP server in
larger blocks and by caching and re-using blocks already read. This
set of patches implements both of these, and adds a further patch
allowing an offset to be added to all of the HTTP requests.

The first patch (block/curl: Add an 'offset' parameter, affecting all
range requests) allows the user to add an arbitrary offset to all
range requests sent to the HTTP server. This is useful if the image to
be read from the HTTP server is embedded in another file (for example
an uncompressed tar file). It avoids the need to first download the
file containing the source image and extract it (both of which will
require writing the image to local storage). It is logically distinct
from the rest of the patches and somewhat use-case specific.

The remaining patches implement block based retrieval of data from the
HTTP server and, optionally, caching of those blocks in memory.

The existing HTTP implementation simply reads whatever data is
requested by the caller, with the option for a user-specified amount
of readahead. This is poor for performance because most IO requests
(from QCOW2, for example) are for relatively small amounts of data,
typically no more than 64kB. This does not allow the underlying TCP
connections to achieve peak throughput.

The existing readhead mechanism is also intended to work in
conjunction with the HTTP driver's attempt to piggy-back a new IO
request on one that is already in flight. This works, but is often
defeated because it relies on the existing IO request *completely*
satisfying any subsequent request that might piggy-back onto it. This
is rarely the case and, particularly when used with "readahead", can
result in the same data being downloaded repeatedly.

The observed performance will depend greatly on the environment, but
when using qemu-img to retrieve a 1GiB QCOW2 image from an HTTPS
server, the following was observed:

| approach                                   | time (hh:mm:ss) |
|--------------------------------------------+-----------------|
| QCOW2 over HTTPS (existing implementation) |        00:00:59 |
| 256kB blocks, 8 cached blocks              |        00:00:42 |
| 2MB blocks, 100 cached blocks              |        00:00:34 |

By way of comparison, aria2c (a dedicated HTTP download client) can
retrieve the same image in 19 seconds. Obviously this is without any
QCOW2 layer.

David Edmondson (9):
  block/curl: Add an 'offset' parameter, affecting all range requests
  block/curl: Remove readahead support
  block/curl: Tracing
  block/curl: Perform IO in fixed size chunks
  block/curl: Allow the blocksize to be specified by the user
  block/curl: Cache downloaded blocks
  block/curl: Allow the user to control the number of cache blocks
  block/curl: Allow 16 sockets/ACB
  block/curl: Add readahead support

 block/curl.c                          | 515 ++++++++++++++++++++++----
 block/io.c                            |   4 +
 block/linux-aio.c                     |   6 +
 block/trace-events                    |  18 +-
 docs/system/device-url-syntax.rst.inc |  15 +
 qapi/block-core.json                  |  11 +-
 6 files changed, 488 insertions(+), 81 deletions(-)

-- 
2.27.0



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [RFC PATCH 1/9] block/curl: Add an 'offset' parameter, affecting all range requests
  2020-08-18 11:08 [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server David Edmondson
@ 2020-08-18 11:08 ` David Edmondson
  2020-08-18 11:08 ` [RFC PATCH 2/9] block/curl: Remove readahead support David Edmondson
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: David Edmondson @ 2020-08-18 11:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Markus Armbruster, Max Reitz,
	David Edmondson, Stefan Hajnoczi

A new 'offset' parameter affects all range requests that are sent to
the remote server. The value, in bytes, is simply added to any byte
offset values passed in to the driver.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
---
 block/curl.c                          | 12 +++++++++++-
 docs/system/device-url-syntax.rst.inc |  4 ++++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/block/curl.c b/block/curl.c
index 4f907c47be..32ec760f66 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -74,6 +74,7 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
 #define CURL_BLOCK_OPT_PASSWORD_SECRET "password-secret"
 #define CURL_BLOCK_OPT_PROXY_USERNAME "proxy-username"
 #define CURL_BLOCK_OPT_PROXY_PASSWORD_SECRET "proxy-password-secret"
+#define CURL_BLOCK_OPT_OFFSET "offset"
 
 #define CURL_BLOCK_OPT_READAHEAD_DEFAULT (256 * 1024)
 #define CURL_BLOCK_OPT_SSLVERIFY_DEFAULT true
@@ -135,6 +136,7 @@ typedef struct BDRVCURLState {
     char *password;
     char *proxyusername;
     char *proxypassword;
+    size_t offset;
 } BDRVCURLState;
 
 static void curl_clean_state(CURLState *s);
@@ -658,6 +660,11 @@ static QemuOptsList runtime_opts = {
             .type = QEMU_OPT_STRING,
             .help = "ID of secret used as password for HTTP proxy auth",
         },
+        {
+            .name = CURL_BLOCK_OPT_OFFSET,
+            .type = QEMU_OPT_SIZE,
+            .help = "Offset into underlying content"
+        },
         { /* end of list */ }
     },
 };
@@ -769,6 +776,8 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
         }
     }
 
+    s->offset = qemu_opt_get_size(opts, CURL_BLOCK_OPT_OFFSET, 0);
+
     trace_curl_open(file);
     qemu_co_queue_init(&s->free_state_waitq);
     s->aio_context = bdrv_get_aio_context(bs);
@@ -899,7 +908,8 @@ static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
     }
     state->acb[0] = acb;
 
-    snprintf(state->range, 127, "%" PRIu64 "-%" PRIu64, start, end);
+    snprintf(state->range, 127, "%" PRIu64 "-%" PRIu64,
+             s->offset + start, s->offset + end);
     trace_curl_setup_preadv(acb->bytes, start, state->range);
     curl_easy_setopt(state->curl, CURLOPT_RANGE, state->range);
 
diff --git a/docs/system/device-url-syntax.rst.inc b/docs/system/device-url-syntax.rst.inc
index 88d7a372a7..33f1ddfe6d 100644
--- a/docs/system/device-url-syntax.rst.inc
+++ b/docs/system/device-url-syntax.rst.inc
@@ -197,6 +197,10 @@ These are specified using a special URL syntax.
       get the size of the image to be downloaded. If not set, the
       default timeout of 5 seconds is used.
 
+   ``offset``
+      Add an offset, in bytes, to all range requests sent to the
+      remote server.
+
    Note that when passing options to qemu explicitly, ``driver`` is the
    value of <protocol>.
 
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [RFC PATCH 2/9] block/curl: Remove readahead support
  2020-08-18 11:08 [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server David Edmondson
  2020-08-18 11:08 ` [RFC PATCH 1/9] block/curl: Add an 'offset' parameter, affecting all range requests David Edmondson
@ 2020-08-18 11:08 ` David Edmondson
  2020-08-18 11:08 ` [RFC PATCH 3/9] block/curl: Tracing David Edmondson
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: David Edmondson @ 2020-08-18 11:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Markus Armbruster, Max Reitz,
	David Edmondson, Stefan Hajnoczi

Block based caching and the current readahead support do not interact
well, so remove readahead support before adding block
caching. Readahead will be re-added later.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
---
 block/curl.c                          | 23 ++++-------------------
 docs/system/device-url-syntax.rst.inc |  7 -------
 qapi/block-core.json                  |  4 ----
 3 files changed, 4 insertions(+), 30 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index 32ec760f66..d0c74d7de5 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -65,7 +65,6 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
 #define CURL_TIMEOUT_MAX 10000
 
 #define CURL_BLOCK_OPT_URL       "url"
-#define CURL_BLOCK_OPT_READAHEAD "readahead"
 #define CURL_BLOCK_OPT_SSLVERIFY "sslverify"
 #define CURL_BLOCK_OPT_TIMEOUT "timeout"
 #define CURL_BLOCK_OPT_COOKIE    "cookie"
@@ -76,7 +75,6 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
 #define CURL_BLOCK_OPT_PROXY_PASSWORD_SECRET "proxy-password-secret"
 #define CURL_BLOCK_OPT_OFFSET "offset"
 
-#define CURL_BLOCK_OPT_READAHEAD_DEFAULT (256 * 1024)
 #define CURL_BLOCK_OPT_SSLVERIFY_DEFAULT true
 #define CURL_BLOCK_OPT_TIMEOUT_DEFAULT 5
 
@@ -124,7 +122,6 @@ typedef struct BDRVCURLState {
     uint64_t len;
     CURLState states[CURL_NUM_STATES];
     char *url;
-    size_t readahead_size;
     bool sslverify;
     uint64_t timeout;
     char *cookie;
@@ -615,11 +612,6 @@ static QemuOptsList runtime_opts = {
             .type = QEMU_OPT_STRING,
             .help = "URL to open",
         },
-        {
-            .name = CURL_BLOCK_OPT_READAHEAD,
-            .type = QEMU_OPT_SIZE,
-            .help = "Readahead size",
-        },
         {
             .name = CURL_BLOCK_OPT_SSLVERIFY,
             .type = QEMU_OPT_BOOL,
@@ -705,14 +697,6 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
         goto out_noclean;
     }
 
-    s->readahead_size = qemu_opt_get_size(opts, CURL_BLOCK_OPT_READAHEAD,
-                                          CURL_BLOCK_OPT_READAHEAD_DEFAULT);
-    if ((s->readahead_size & 0x1ff) != 0) {
-        error_setg(errp, "HTTP_READAHEAD_SIZE %zd is not a multiple of 512",
-                   s->readahead_size);
-        goto out_noclean;
-    }
-
     s->timeout = qemu_opt_get_number(opts, CURL_BLOCK_OPT_TIMEOUT,
                                      CURL_BLOCK_OPT_TIMEOUT_DEFAULT);
     if (s->timeout > CURL_TIMEOUT_MAX) {
@@ -898,7 +882,7 @@ static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
     state->buf_off = 0;
     g_free(state->orig_buf);
     state->buf_start = start;
-    state->buf_len = MIN(acb->end + s->readahead_size, s->len - start);
+    state->buf_len = MIN(acb->end, s->len - start);
     end = start + state->buf_len - 1;
     state->orig_buf = g_try_malloc(state->buf_len);
     if (state->buf_len && state->orig_buf == NULL) {
@@ -971,8 +955,9 @@ static void curl_refresh_filename(BlockDriverState *bs)
 {
     BDRVCURLState *s = bs->opaque;
 
-    /* "readahead" and "timeout" do not change the guest-visible data,
-     * so ignore them */
+    /*
+     * "timeout" does not change the guest-visible data, so ignore it.
+     */
     if (s->sslverify != CURL_BLOCK_OPT_SSLVERIFY_DEFAULT ||
         s->cookie || s->username || s->password || s->proxyusername ||
         s->proxypassword)
diff --git a/docs/system/device-url-syntax.rst.inc b/docs/system/device-url-syntax.rst.inc
index 33f1ddfe6d..bc38b9df38 100644
--- a/docs/system/device-url-syntax.rst.inc
+++ b/docs/system/device-url-syntax.rst.inc
@@ -174,13 +174,6 @@ These are specified using a special URL syntax.
    ``url``
       The full URL when passing options to the driver explicitly.
 
-   ``readahead``
-      The amount of data to read ahead with each range request to the
-      remote server. This value may optionally have the suffix 'T', 'G',
-      'M', 'K', 'k' or 'b'. If it does not have a suffix, it will be
-      assumed to be in bytes. The value must be a multiple of 512 bytes.
-      It defaults to 256k.
-
    ``sslverify``
       Whether to verify the remote server's certificate when connecting
       over SSL. It can have the value 'on' or 'off'. It defaults to
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 197bdc1c36..d6f5e91cc3 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3752,9 +3752,6 @@
 #
 # @url: URL of the image file
 #
-# @readahead: Size of the read-ahead cache; must be a multiple of
-#             512 (defaults to 256 kB)
-#
 # @timeout: Timeout for connections, in seconds (defaults to 5)
 #
 # @username: Username for authentication (defaults to none)
@@ -3771,7 +3768,6 @@
 ##
 { 'struct': 'BlockdevOptionsCurlBase',
   'data': { 'url': 'str',
-            '*readahead': 'int',
             '*timeout': 'int',
             '*username': 'str',
             '*password-secret': 'str',
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [RFC PATCH 3/9] block/curl: Tracing
  2020-08-18 11:08 [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server David Edmondson
  2020-08-18 11:08 ` [RFC PATCH 1/9] block/curl: Add an 'offset' parameter, affecting all range requests David Edmondson
  2020-08-18 11:08 ` [RFC PATCH 2/9] block/curl: Remove readahead support David Edmondson
@ 2020-08-18 11:08 ` David Edmondson
  2020-08-18 11:08 ` [RFC PATCH 4/9] block/curl: Perform IO in fixed size chunks David Edmondson
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: David Edmondson @ 2020-08-18 11:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Markus Armbruster, Max Reitz,
	David Edmondson, Stefan Hajnoczi

Add some more trace functions to the IO path.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
---
 block/curl.c       | 10 +++++++++-
 block/io.c         |  4 ++++
 block/linux-aio.c  |  6 ++++++
 block/trace-events | 13 ++++++++++++-
 4 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index d0c74d7de5..d2f4de46c9 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -299,6 +299,8 @@ static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
         {
             char *buf = state->orig_buf + (start - state->buf_start);
 
+            trace_curl_pending_hit(qemu_coroutine_self(),
+                                   start, len);
             qemu_iovec_from_buf(acb->qiov, 0, buf, clamped_len);
             if (clamped_len < len) {
                 qemu_iovec_memset(acb->qiov, clamped_len, 0, len - clamped_len);
@@ -316,6 +318,8 @@ static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
         {
             int j;
 
+            trace_curl_pending_piggyback(qemu_coroutine_self(),
+                                         start, len);
             acb->start = start - state->buf_start;
             acb->end = acb->start + clamped_len;
 
@@ -328,6 +332,8 @@ static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
         }
     }
 
+    trace_curl_pending_miss(qemu_coroutine_self(), start, len);
+
     return false;
 }
 
@@ -894,7 +900,7 @@ static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
 
     snprintf(state->range, 127, "%" PRIu64 "-%" PRIu64,
              s->offset + start, s->offset + end);
-    trace_curl_setup_preadv(acb->bytes, start, state->range);
+    trace_curl_setup_preadv(qemu_coroutine_self(), start, acb->bytes);
     curl_easy_setopt(state->curl, CURLOPT_RANGE, state->range);
 
     if (curl_multi_add_handle(s->multi, state->curl) != CURLM_OK) {
@@ -923,10 +929,12 @@ static int coroutine_fn curl_co_preadv(BlockDriverState *bs,
         .bytes = bytes
     };
 
+    trace_curl_co_preadv(qemu_coroutine_self(), offset, bytes);
     curl_setup_preadv(bs, &acb);
     while (acb.ret == -EINPROGRESS) {
         qemu_coroutine_yield();
     }
+    trace_curl_co_preadv_done(qemu_coroutine_self());
     return acb.ret;
 }
 
diff --git a/block/io.c b/block/io.c
index ad3a51ed53..f816a46bf0 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1194,6 +1194,8 @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
         return -ENOMEDIUM;
     }
 
+    trace_bdrv_driver_pwritev(qemu_coroutine_self(), offset, bytes);
+
     if (drv->bdrv_co_pwritev_part) {
         ret = drv->bdrv_co_pwritev_part(bs, offset, bytes, qiov, qiov_offset,
                                         flags & bs->supported_write_flags);
@@ -1253,6 +1255,8 @@ emulate_flags:
         qemu_iovec_destroy(&local_qiov);
     }
 
+    trace_bdrv_driver_pwritev_done(qemu_coroutine_self());
+
     return ret;
 }
 
diff --git a/block/linux-aio.c b/block/linux-aio.c
index 3c0527c2bf..811e9ff94c 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -15,6 +15,7 @@
 #include "qemu/event_notifier.h"
 #include "qemu/coroutine.h"
 #include "qapi/error.h"
+#include "trace.h"
 
 #include <libaio.h>
 
@@ -391,6 +392,8 @@ int coroutine_fn laio_co_submit(BlockDriverState *bs, LinuxAioState *s, int fd,
         .qiov       = qiov,
     };
 
+    trace_laio_co_submit(qemu_coroutine_self(), offset, qiov->size);
+
     ret = laio_do_submit(fd, &laiocb, offset, type);
     if (ret < 0) {
         return ret;
@@ -399,6 +402,9 @@ int coroutine_fn laio_co_submit(BlockDriverState *bs, LinuxAioState *s, int fd,
     if (laiocb.ret == -EINPROGRESS) {
         qemu_coroutine_yield();
     }
+
+    trace_laio_co_submit_done(qemu_coroutine_self());
+
     return laiocb.ret;
 }
 
diff --git a/block/trace-events b/block/trace-events
index 9158335061..0b52d2ca1d 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -17,6 +17,8 @@ bdrv_co_pwrite_zeroes(void *bs, int64_t offset, int count, int flags) "bs %p off
 bdrv_co_do_copy_on_readv(void *bs, int64_t offset, unsigned int bytes, int64_t cluster_offset, int64_t cluster_bytes) "bs %p offset %"PRId64" bytes %u cluster_offset %"PRId64" cluster_bytes %"PRId64
 bdrv_co_copy_range_from(void *src, uint64_t src_offset, void *dst, uint64_t dst_offset, uint64_t bytes, int read_flags, int write_flags) "src %p offset %"PRIu64" dst %p offset %"PRIu64" bytes %"PRIu64" rw flags 0x%x 0x%x"
 bdrv_co_copy_range_to(void *src, uint64_t src_offset, void *dst, uint64_t dst_offset, uint64_t bytes, int read_flags, int write_flags) "src %p offset %"PRIu64" dst %p offset %"PRIu64" bytes %"PRIu64" rw flags 0x%x 0x%x"
+bdrv_driver_pwritev(void *co, uint64_t offset, uint64_t bytes) "co %p writes 0x%"PRIx64" + 0x%"PRIx64
+bdrv_driver_pwritev_done(void *co) "co %p done"
 
 # stream.c
 stream_one_iteration(void *s, int64_t offset, uint64_t bytes, int is_allocated) "s %p offset %" PRId64 " bytes %" PRIu64 " is_allocated %d"
@@ -196,7 +198,12 @@ curl_sock_cb(int action, int fd) "sock action %d on fd %d"
 curl_read_cb(size_t realsize) "just reading %zu bytes"
 curl_open(const char *file) "opening %s"
 curl_open_size(uint64_t size) "size = %" PRIu64
-curl_setup_preadv(uint64_t bytes, uint64_t start, const char *range) "reading %" PRIu64 " at %" PRIu64 " (%s)"
+curl_co_preadv(void *co, uint64_t offset, uint64_t bytes) "co %p requests 0x%" PRIx64 " + 0x%" PRIx64
+curl_co_preadv_done(void *co) "co %p done"
+curl_setup_preadv(void *co, uint64_t offset, uint64_t bytes) "co %p requests 0x%" PRIx64 " + 0x%" PRIx64
+curl_pending_hit(void *co, uint64_t start, uint64_t len) "co %p finds 0x%" PRIx64 " + 0x%" PRIx64
+curl_pending_piggyback(void *co, uint64_t start, uint64_t len) "co %p pending 0x%" PRIx64 " + 0x%" PRIx64
+curl_pending_miss(void *co, uint64_t start, uint64_t len) "co %p misses 0x%" PRIx64 " + 0x%" PRIx64
 curl_close(void) "close"
 
 # file-posix.c
@@ -222,3 +229,7 @@ sheepdog_snapshot_create_inode(const char *name, uint32_t snap, uint32_t vdi) "s
 
 # ssh.c
 sftp_error(const char *op, const char *ssh_err, int ssh_err_code, int sftp_err_code) "%s failed: %s (libssh error code: %d, sftp error code: %d)"
+
+# linux-aio.c
+laio_co_submit(void *co, uint64_t offset, uint64_t bytes) "co %p writes 0x%"PRIx64" + 0x%"PRIx64
+laio_co_submit_done(void *co) "co %p done"
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [RFC PATCH 4/9] block/curl: Perform IO in fixed size chunks
  2020-08-18 11:08 [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server David Edmondson
                   ` (2 preceding siblings ...)
  2020-08-18 11:08 ` [RFC PATCH 3/9] block/curl: Tracing David Edmondson
@ 2020-08-18 11:08 ` David Edmondson
  2020-08-18 11:08 ` [RFC PATCH 5/9] block/curl: Allow the blocksize to be specified by the user David Edmondson
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: David Edmondson @ 2020-08-18 11:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Markus Armbruster, Max Reitz,
	David Edmondson, Stefan Hajnoczi

Do all IO requests to the remote server in 256kB chunks.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
---
 block/curl.c       | 151 ++++++++++++++++++++++++++++++++-------------
 block/trace-events |   2 +
 2 files changed, 109 insertions(+), 44 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index d2f4de46c9..cfc518efda 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -78,6 +78,14 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
 #define CURL_BLOCK_OPT_SSLVERIFY_DEFAULT true
 #define CURL_BLOCK_OPT_TIMEOUT_DEFAULT 5
 
+/* Must be a non-zero power of 2. */
+#define CURL_BLOCK_SIZE (256 * 1024)
+
+/* Align "n" to the start of the containing block. */
+#define CURL_BLOCK_ALIGN(n) ((n) & ~(CURL_BLOCK_SIZE - 1))
+/* The offset of "n" within its' block. */
+#define CURL_BLOCK_OFFSET(n) ((n) & (CURL_BLOCK_SIZE - 1))
+
 struct BDRVCURLState;
 struct CURLState;
 
@@ -86,11 +94,18 @@ static bool libcurl_initialized;
 typedef struct CURLAIOCB {
     Coroutine *co;
     QEMUIOVector *qiov;
+    uint64_t qiov_offset; /* Offset in qiov to place data. */
 
     uint64_t offset;
     uint64_t bytes;
     int ret;
 
+    /*
+     * start and end indicate the subset of the surrounding
+     * CURL_BLOCK_SIZE sized block that is the subject of this
+     * IOCB. They are offsets from the beginning of the underlying
+     * buffer.
+     */
     size_t start;
     size_t end;
 } CURLAIOCB;
@@ -110,7 +125,6 @@ typedef struct CURLState
     char *orig_buf;
     uint64_t buf_start;
     size_t buf_off;
-    size_t buf_len;
     char range[128];
     char errmsg[CURL_ERROR_SIZE];
     char in_use;
@@ -259,11 +273,11 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
         goto read_end;
     }
 
-    if (s->buf_off >= s->buf_len) {
+    if (s->buf_off >= CURL_BLOCK_SIZE) {
         /* buffer full, read nothing */
         goto read_end;
     }
-    realsize = MIN(realsize, s->buf_len - s->buf_off);
+    realsize = MIN(realsize, CURL_BLOCK_SIZE - s->buf_off);
     memcpy(s->orig_buf + s->buf_off, ptr, realsize);
     s->buf_off += realsize;
 
@@ -281,35 +295,44 @@ static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
     uint64_t clamped_end = MIN(end, s->len);
     uint64_t clamped_len = clamped_end - start;
 
-    for (i=0; i<CURL_NUM_STATES; i++) {
+    for (i = 0; i < CURL_NUM_STATES; i++) {
         CURLState *state = &s->states[i];
-        uint64_t buf_end = (state->buf_start + state->buf_off);
-        uint64_t buf_fend = (state->buf_start + state->buf_len);
+        /* The end of the currently valid data. */
+        uint64_t buf_end = state->buf_start + state->buf_off;
+        /* The end of the valid data when the IO completes. */
+        uint64_t buf_fend = state->buf_start + CURL_BLOCK_SIZE;
 
         if (!state->orig_buf)
             continue;
         if (!state->buf_off)
             continue;
 
-        // Does the existing buffer cover our section?
+        /*
+         * Does the existing buffer cover our section?
+         */
         if ((start >= state->buf_start) &&
             (start <= buf_end) &&
             (clamped_end >= state->buf_start) &&
             (clamped_end <= buf_end))
         {
-            char *buf = state->orig_buf + (start - state->buf_start);
+            char *buf = state->orig_buf + CURL_BLOCK_OFFSET(start);
 
             trace_curl_pending_hit(qemu_coroutine_self(),
                                    start, len);
-            qemu_iovec_from_buf(acb->qiov, 0, buf, clamped_len);
+            qemu_iovec_from_buf(acb->qiov, acb->qiov_offset, buf, clamped_len);
             if (clamped_len < len) {
-                qemu_iovec_memset(acb->qiov, clamped_len, 0, len - clamped_len);
+                qemu_iovec_memset(acb->qiov, acb->qiov_offset + clamped_len,
+                                  0, len - clamped_len);
             }
             acb->ret = 0;
             return true;
         }
 
-        // Wait for unfinished chunks
+        /*
+         * If an in-progress IO will provide the required data, wait
+         * for it to complete - the initiator will complete this
+         * aiocb.
+         */
         if (state->in_use &&
             (start >= state->buf_start) &&
             (start <= buf_fend) &&
@@ -320,10 +343,10 @@ static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
 
             trace_curl_pending_piggyback(qemu_coroutine_self(),
                                          start, len);
-            acb->start = start - state->buf_start;
+            acb->start = CURL_BLOCK_OFFSET(start);
             acb->end = acb->start + clamped_len;
 
-            for (j=0; j<CURL_NUM_ACB; j++) {
+            for (j = 0; j < CURL_NUM_ACB; j++) {
                 if (!state->acb[j]) {
                     state->acb[j] = acb;
                     return true;
@@ -377,7 +400,7 @@ static void curl_multi_check_completion(BDRVCURLState *s)
             for (i = 0; i < CURL_NUM_ACB; i++) {
                 CURLAIOCB *acb = state->acb[i];
 
-                if (acb == NULL) {
+                if (!acb) {
                     continue;
                 }
 
@@ -385,14 +408,15 @@ static void curl_multi_check_completion(BDRVCURLState *s)
                     /* Assert that we have read all data */
                     assert(state->buf_off >= acb->end);
 
-                    qemu_iovec_from_buf(acb->qiov, 0,
+                    qemu_iovec_from_buf(acb->qiov, acb->qiov_offset,
                                         state->orig_buf + acb->start,
                                         acb->end - acb->start);
 
                     if (acb->end - acb->start < acb->bytes) {
                         size_t offset = acb->end - acb->start;
-                        qemu_iovec_memset(acb->qiov, offset, 0,
-                                          acb->bytes - offset);
+
+                        qemu_iovec_memset(acb->qiov, acb->qiov_offset + offset,
+                                          0, acb->bytes - offset);
                     }
                 }
 
@@ -539,6 +563,7 @@ static int curl_init_state(BDRVCURLState *s, CURLState *state)
 static void curl_clean_state(CURLState *s)
 {
     int j;
+
     for (j = 0; j < CURL_NUM_ACB; j++) {
         assert(!s->acb[j]);
     }
@@ -856,18 +881,26 @@ static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
 
     BDRVCURLState *s = bs->opaque;
 
-    uint64_t start = acb->offset;
-    uint64_t end;
+    /*
+     * Our caller must ensure that this request does not span two
+     * blocks.
+     */
+    assert(CURL_BLOCK_ALIGN(acb->offset) ==
+           CURL_BLOCK_ALIGN(acb->offset + acb->bytes - 1));
 
     qemu_mutex_lock(&s->mutex);
 
-    // In case we have the requested data already (e.g. read-ahead),
-    // we can just call the callback and be done.
-    if (curl_find_buf(s, start, acb->bytes, acb)) {
+    /*
+     * Check whether the requested data can be found in an existing or
+     * pending IO request.
+     */
+    if (curl_find_buf(s, acb->offset, acb->bytes, acb)) {
         goto out;
     }
 
-    // No cache found, so let's start a new request
+    /*
+     * No cache found, so let's start a new request.
+     */
     for (;;) {
         state = curl_find_state(s);
         if (state) {
@@ -882,16 +915,15 @@ static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
         goto out;
     }
 
-    acb->start = 0;
-    acb->end = MIN(acb->bytes, s->len - start);
+    acb->start = CURL_BLOCK_OFFSET(acb->offset);
+    acb->end = acb->start + MIN(acb->bytes, s->len - acb->offset);
 
     state->buf_off = 0;
-    g_free(state->orig_buf);
-    state->buf_start = start;
-    state->buf_len = MIN(acb->end, s->len - start);
-    end = start + state->buf_len - 1;
-    state->orig_buf = g_try_malloc(state->buf_len);
-    if (state->buf_len && state->orig_buf == NULL) {
+    state->buf_start = CURL_BLOCK_ALIGN(acb->offset);
+    if (!state->orig_buf) {
+        state->orig_buf = g_try_malloc(CURL_BLOCK_SIZE);
+    }
+    if (!state->orig_buf) {
         curl_clean_state(state);
         acb->ret = -ENOMEM;
         goto out;
@@ -899,8 +931,10 @@ static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
     state->acb[0] = acb;
 
     snprintf(state->range, 127, "%" PRIu64 "-%" PRIu64,
-             s->offset + start, s->offset + end);
-    trace_curl_setup_preadv(qemu_coroutine_self(), start, acb->bytes);
+             s->offset + state->buf_start,
+             s->offset + state->buf_start + CURL_BLOCK_SIZE);
+    trace_curl_setup_preadv(qemu_coroutine_self(), state->buf_start,
+                            CURL_BLOCK_SIZE);
     curl_easy_setopt(state->curl, CURLOPT_RANGE, state->range);
 
     if (curl_multi_add_handle(s->multi, state->curl) != CURLM_OK) {
@@ -921,21 +955,50 @@ out:
 static int coroutine_fn curl_co_preadv(BlockDriverState *bs,
         uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
-    CURLAIOCB acb = {
-        .co = qemu_coroutine_self(),
-        .ret = -EINPROGRESS,
-        .qiov = qiov,
-        .offset = offset,
-        .bytes = bytes
-    };
+    /*
+     * The lower layer does all IO in single CURL_BLOCK_SIZE sized and
+     * aligned chunks and cannot handle an IO that spans two blocks,
+     * so split the request here.
+     */
+    int ret = 0;
+    uint64_t qiov_offset = 0;
+    uint64_t off = offset;
 
     trace_curl_co_preadv(qemu_coroutine_self(), offset, bytes);
-    curl_setup_preadv(bs, &acb);
-    while (acb.ret == -EINPROGRESS) {
-        qemu_coroutine_yield();
+
+    while (bytes > 0) {
+        uint64_t len = MIN(bytes, CURL_BLOCK_SIZE - CURL_BLOCK_OFFSET(off));
+        CURLAIOCB acb = {
+            .co = qemu_coroutine_self(),
+            .ret = -EINPROGRESS,
+            .qiov = qiov,
+            .qiov_offset = qiov_offset,
+            .offset = off,
+            .bytes = len,
+        };
+
+        trace_curl_co_preadv_segment(qemu_coroutine_self(), off, len);
+
+        curl_setup_preadv(bs, &acb);
+        while (acb.ret == -EINPROGRESS) {
+            qemu_coroutine_yield();
+        }
+
+        ret = acb.ret;
+        if (ret != 0) {
+            return ret;
+        }
+
+        trace_curl_co_preadv_segment_done(qemu_coroutine_self());
+
+        qiov_offset += len;
+        off += len;
+        bytes -= len;
     }
+
     trace_curl_co_preadv_done(qemu_coroutine_self());
-    return acb.ret;
+
+    return ret;
 }
 
 static void curl_close(BlockDriverState *bs)
diff --git a/block/trace-events b/block/trace-events
index 0b52d2ca1d..72b1e927bf 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -200,6 +200,8 @@ curl_open(const char *file) "opening %s"
 curl_open_size(uint64_t size) "size = %" PRIu64
 curl_co_preadv(void *co, uint64_t offset, uint64_t bytes) "co %p requests 0x%" PRIx64 " + 0x%" PRIx64
 curl_co_preadv_done(void *co) "co %p done"
+curl_co_preadv_segment(void *co, uint64_t offset, uint64_t bytes) "co %p requests 0x%" PRIx64 " + 0x%" PRIx64
+curl_co_preadv_segment_done(void *co) "co %p done"
 curl_setup_preadv(void *co, uint64_t offset, uint64_t bytes) "co %p requests 0x%" PRIx64 " + 0x%" PRIx64
 curl_pending_hit(void *co, uint64_t start, uint64_t len) "co %p finds 0x%" PRIx64 " + 0x%" PRIx64
 curl_pending_piggyback(void *co, uint64_t start, uint64_t len) "co %p pending 0x%" PRIx64 " + 0x%" PRIx64
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [RFC PATCH 5/9] block/curl: Allow the blocksize to be specified by the user
  2020-08-18 11:08 [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server David Edmondson
                   ` (3 preceding siblings ...)
  2020-08-18 11:08 ` [RFC PATCH 4/9] block/curl: Perform IO in fixed size chunks David Edmondson
@ 2020-08-18 11:08 ` David Edmondson
  2020-08-24 13:19   ` Markus Armbruster
  2020-08-18 11:08 ` [RFC PATCH 6/9] block/curl: Cache downloaded blocks David Edmondson
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 15+ messages in thread
From: David Edmondson @ 2020-08-18 11:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Markus Armbruster, Max Reitz,
	David Edmondson, Stefan Hajnoczi

Rather than a fixed 256kB blocksize, allow the user to specify the
size used. It must be a non-zero power of two, defaulting to 256kB.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
---
 block/curl.c                          | 73 +++++++++++++++++----------
 docs/system/device-url-syntax.rst.inc |  7 +++
 qapi/block-core.json                  |  4 ++
 3 files changed, 58 insertions(+), 26 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index cfc518efda..b2d02818a9 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -74,17 +74,12 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
 #define CURL_BLOCK_OPT_PROXY_USERNAME "proxy-username"
 #define CURL_BLOCK_OPT_PROXY_PASSWORD_SECRET "proxy-password-secret"
 #define CURL_BLOCK_OPT_OFFSET "offset"
+#define CURL_BLOCK_OPT_BLOCKSIZE "blocksize"
 
 #define CURL_BLOCK_OPT_SSLVERIFY_DEFAULT true
 #define CURL_BLOCK_OPT_TIMEOUT_DEFAULT 5
-
 /* Must be a non-zero power of 2. */
-#define CURL_BLOCK_SIZE (256 * 1024)
-
-/* Align "n" to the start of the containing block. */
-#define CURL_BLOCK_ALIGN(n) ((n) & ~(CURL_BLOCK_SIZE - 1))
-/* The offset of "n" within its' block. */
-#define CURL_BLOCK_OFFSET(n) ((n) & (CURL_BLOCK_SIZE - 1))
+#define CURL_BLOCK_OPT_BLOCKSIZE_DEFAULT (256 * 1024)
 
 struct BDRVCURLState;
 struct CURLState;
@@ -102,7 +97,7 @@ typedef struct CURLAIOCB {
 
     /*
      * start and end indicate the subset of the surrounding
-     * CURL_BLOCK_SIZE sized block that is the subject of this
+     * BDRVCURLState.blocksize sized block that is the subject of this
      * IOCB. They are offsets from the beginning of the underlying
      * buffer.
      */
@@ -148,11 +143,24 @@ typedef struct BDRVCURLState {
     char *proxyusername;
     char *proxypassword;
     size_t offset;
+    size_t blocksize;
 } BDRVCURLState;
 
 static void curl_clean_state(CURLState *s);
 static void curl_multi_do(void *arg);
 
+/* Align "n" to the start of the containing block. */
+static inline uint64_t curl_block_align(BDRVCURLState *s, uint64_t n)
+{
+    return n & ~(s->blocksize - 1);
+}
+
+/* The offset of "n" within its' block. */
+static inline uint64_t curl_block_offset(BDRVCURLState *s, uint64_t n)
+{
+    return n & (s->blocksize - 1);
+}
+
 #ifdef NEED_CURL_TIMER_CALLBACK
 /* Called from curl_multi_do_locked, with s->mutex held.  */
 static int curl_timer_cb(CURLM *multi, long timeout_ms, void *opaque)
@@ -264,22 +272,23 @@ static size_t curl_header_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
 /* Called from curl_multi_do_locked, with s->mutex held.  */
 static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
 {
-    CURLState *s = ((CURLState*)opaque);
+    CURLState *state = (CURLState *)opaque;
+    BDRVCURLState *s = state->s;
     size_t realsize = size * nmemb;
 
     trace_curl_read_cb(realsize);
 
-    if (!s || !s->orig_buf) {
+    if (!state || !state->orig_buf) {
         goto read_end;
     }
 
-    if (s->buf_off >= CURL_BLOCK_SIZE) {
+    if (state->buf_off >= s->blocksize) {
         /* buffer full, read nothing */
         goto read_end;
     }
-    realsize = MIN(realsize, CURL_BLOCK_SIZE - s->buf_off);
-    memcpy(s->orig_buf + s->buf_off, ptr, realsize);
-    s->buf_off += realsize;
+    realsize = MIN(realsize, s->blocksize - state->buf_off);
+    memcpy(state->orig_buf + state->buf_off, ptr, realsize);
+    state->buf_off += realsize;
 
 read_end:
     /* curl will error out if we do not return this value */
@@ -300,7 +309,7 @@ static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
         /* The end of the currently valid data. */
         uint64_t buf_end = state->buf_start + state->buf_off;
         /* The end of the valid data when the IO completes. */
-        uint64_t buf_fend = state->buf_start + CURL_BLOCK_SIZE;
+        uint64_t buf_fend = state->buf_start + s->blocksize;
 
         if (!state->orig_buf)
             continue;
@@ -315,7 +324,7 @@ static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
             (clamped_end >= state->buf_start) &&
             (clamped_end <= buf_end))
         {
-            char *buf = state->orig_buf + CURL_BLOCK_OFFSET(start);
+            char *buf = state->orig_buf + curl_block_offset(s, start);
 
             trace_curl_pending_hit(qemu_coroutine_self(),
                                    start, len);
@@ -343,7 +352,7 @@ static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
 
             trace_curl_pending_piggyback(qemu_coroutine_self(),
                                          start, len);
-            acb->start = CURL_BLOCK_OFFSET(start);
+            acb->start = curl_block_offset(s, start);
             acb->end = acb->start + clamped_len;
 
             for (j = 0; j < CURL_NUM_ACB; j++) {
@@ -688,6 +697,11 @@ static QemuOptsList runtime_opts = {
             .type = QEMU_OPT_SIZE,
             .help = "Offset into underlying content"
         },
+        {
+            .name = CURL_BLOCK_OPT_BLOCKSIZE,
+            .type = QEMU_OPT_SIZE,
+            .help = "Block size for IO requests"
+        },
         { /* end of list */ }
     },
 };
@@ -792,6 +806,12 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
     }
 
     s->offset = qemu_opt_get_size(opts, CURL_BLOCK_OPT_OFFSET, 0);
+    s->blocksize = qemu_opt_get_size(opts, CURL_BLOCK_OPT_BLOCKSIZE,
+                                     CURL_BLOCK_OPT_BLOCKSIZE_DEFAULT);
+    if ((s->blocksize & (s->blocksize - 1)) != 0) {
+        error_setg(errp, "blocksize must be a non-zero power of two");
+        goto out_noclean;
+    }
 
     trace_curl_open(file);
     qemu_co_queue_init(&s->free_state_waitq);
@@ -885,8 +905,8 @@ static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
      * Our caller must ensure that this request does not span two
      * blocks.
      */
-    assert(CURL_BLOCK_ALIGN(acb->offset) ==
-           CURL_BLOCK_ALIGN(acb->offset + acb->bytes - 1));
+    assert(curl_block_align(s, acb->offset) ==
+           curl_block_align(s, acb->offset + acb->bytes - 1));
 
     qemu_mutex_lock(&s->mutex);
 
@@ -915,13 +935,13 @@ static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
         goto out;
     }
 
-    acb->start = CURL_BLOCK_OFFSET(acb->offset);
+    acb->start = curl_block_offset(s, acb->offset);
     acb->end = acb->start + MIN(acb->bytes, s->len - acb->offset);
 
     state->buf_off = 0;
-    state->buf_start = CURL_BLOCK_ALIGN(acb->offset);
+    state->buf_start = curl_block_align(s, acb->offset);
     if (!state->orig_buf) {
-        state->orig_buf = g_try_malloc(CURL_BLOCK_SIZE);
+        state->orig_buf = g_try_malloc(s->blocksize);
     }
     if (!state->orig_buf) {
         curl_clean_state(state);
@@ -932,9 +952,9 @@ static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
 
     snprintf(state->range, 127, "%" PRIu64 "-%" PRIu64,
              s->offset + state->buf_start,
-             s->offset + state->buf_start + CURL_BLOCK_SIZE);
+             s->offset + state->buf_start + s->blocksize);
     trace_curl_setup_preadv(qemu_coroutine_self(), state->buf_start,
-                            CURL_BLOCK_SIZE);
+                            s->blocksize);
     curl_easy_setopt(state->curl, CURLOPT_RANGE, state->range);
 
     if (curl_multi_add_handle(s->multi, state->curl) != CURLM_OK) {
@@ -955,8 +975,9 @@ out:
 static int coroutine_fn curl_co_preadv(BlockDriverState *bs,
         uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
+    BDRVCURLState *s = bs->opaque;
     /*
-     * The lower layer does all IO in single CURL_BLOCK_SIZE sized and
+     * The lower layer does all IO in single s->blocksize sized and
      * aligned chunks and cannot handle an IO that spans two blocks,
      * so split the request here.
      */
@@ -967,7 +988,7 @@ static int coroutine_fn curl_co_preadv(BlockDriverState *bs,
     trace_curl_co_preadv(qemu_coroutine_self(), offset, bytes);
 
     while (bytes > 0) {
-        uint64_t len = MIN(bytes, CURL_BLOCK_SIZE - CURL_BLOCK_OFFSET(off));
+        uint64_t len = MIN(bytes, s->blocksize - curl_block_offset(s, off));
         CURLAIOCB acb = {
             .co = qemu_coroutine_self(),
             .ret = -EINPROGRESS,
diff --git a/docs/system/device-url-syntax.rst.inc b/docs/system/device-url-syntax.rst.inc
index bc38b9df38..ee504ee41a 100644
--- a/docs/system/device-url-syntax.rst.inc
+++ b/docs/system/device-url-syntax.rst.inc
@@ -194,6 +194,13 @@ These are specified using a special URL syntax.
       Add an offset, in bytes, to all range requests sent to the
       remote server.
 
+   ``blocksize``
+      The size of all IO requests sent to the remote server. This
+      value may optionally have the suffix 'T', 'G', 'M', 'K', 'k' or
+      'b'. If it does not have a suffix, it will be assumed to be in
+      bytes. The value must be a non-zero power of two.  It defaults
+      to 256kB.
+
    Note that when passing options to qemu explicitly, ``driver`` is the
    value of <protocol>.
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index d6f5e91cc3..cd16197e1e 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3764,10 +3764,14 @@
 # @proxy-password-secret: ID of a QCryptoSecret object providing a password
 #                         for proxy authentication (defaults to no password)
 #
+# @blocksize: Size of all IO requests sent to the remote server; must
+#             be a non-zero power of two (defaults to 1 256kB)
+#
 # Since: 2.9
 ##
 { 'struct': 'BlockdevOptionsCurlBase',
   'data': { 'url': 'str',
+            '*blocksize': 'int',
             '*timeout': 'int',
             '*username': 'str',
             '*password-secret': 'str',
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [RFC PATCH 6/9] block/curl: Cache downloaded blocks
  2020-08-18 11:08 [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server David Edmondson
                   ` (4 preceding siblings ...)
  2020-08-18 11:08 ` [RFC PATCH 5/9] block/curl: Allow the blocksize to be specified by the user David Edmondson
@ 2020-08-18 11:08 ` David Edmondson
  2020-08-18 11:08 ` [RFC PATCH 7/9] block/curl: Allow the user to control the number of cache blocks David Edmondson
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: David Edmondson @ 2020-08-18 11:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Markus Armbruster, Max Reitz,
	David Edmondson, Stefan Hajnoczi

In the hope that they will be referred to multiple times, cache the
blocks downloaded from the remote server.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
---
 block/curl.c       | 321 +++++++++++++++++++++++++++++++++++++++------
 block/trace-events |   3 +
 2 files changed, 287 insertions(+), 37 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index b2d02818a9..0ea9eedebd 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -81,11 +81,29 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
 /* Must be a non-zero power of 2. */
 #define CURL_BLOCK_OPT_BLOCKSIZE_DEFAULT (256 * 1024)
 
+/* The maximum number of blocks to store in the cache. */
+#define CURL_BLOCK_CACHE_MAX_BLOCKS 100
+/* The number of heads in the hash table. */
+#define CURL_BLOCK_CACHE_HASH 37
+
 struct BDRVCURLState;
 struct CURLState;
 
 static bool libcurl_initialized;
 
+typedef struct block {
+    QLIST_ENTRY(block) hash; /* Blocks with the same hash value. */
+    QLIST_ENTRY(block) free; /* Block free list. */
+    QTAILQ_ENTRY(block) lru; /* LRU list. */
+    bool hashed; /* block_t contains data and is hashed. */
+    int use;     /* Use count. */
+
+    uint64_t start; /* Offset of first byte. */
+    uint64_t count; /* Valid bytes. */
+
+    char *buf;      /* Data. */
+} block_t;
+
 typedef struct CURLAIOCB {
     Coroutine *co;
     QEMUIOVector *qiov;
@@ -117,12 +135,11 @@ typedef struct CURLState
     CURLAIOCB *acb[CURL_NUM_ACB];
     CURL *curl;
     QLIST_HEAD(, CURLSocket) sockets;
-    char *orig_buf;
-    uint64_t buf_start;
     size_t buf_off;
     char range[128];
     char errmsg[CURL_ERROR_SIZE];
     char in_use;
+    block_t *cache_block;
 } CURLState;
 
 typedef struct BDRVCURLState {
@@ -144,11 +161,17 @@ typedef struct BDRVCURLState {
     char *proxypassword;
     size_t offset;
     size_t blocksize;
+    int cache_allocated; /* The number of block_t currently allocated. */
+    QLIST_HEAD(, block) cache_free;
+    QTAILQ_HEAD(, block) cache_lru;
+    QLIST_HEAD(, block) * cache_hash;
 } BDRVCURLState;
 
 static void curl_clean_state(CURLState *s);
 static void curl_multi_do(void *arg);
 
+static void curl_cache_free(BDRVCURLState *s, block_t *b);
+
 /* Align "n" to the start of the containing block. */
 static inline uint64_t curl_block_align(BDRVCURLState *s, uint64_t n)
 {
@@ -161,6 +184,198 @@ static inline uint64_t curl_block_offset(BDRVCURLState *s, uint64_t n)
     return n & (s->blocksize - 1);
 }
 
+static uint64_t curl_cache_hash(BDRVCURLState *s, uint64_t n)
+{
+    return curl_block_align(s, n) % CURL_BLOCK_CACHE_HASH;
+}
+
+static bool curl_cache_init(BDRVCURLState *s)
+{
+    s->cache_allocated = 0;
+
+    QLIST_INIT(&s->cache_free);
+    QTAILQ_INIT(&s->cache_lru);
+
+    s->cache_hash = g_try_malloc(CURL_BLOCK_CACHE_HASH * sizeof(s->cache_hash));
+    if (!s->cache_hash) {
+        return false;
+    }
+
+    for (int i = 0; i < CURL_BLOCK_CACHE_HASH; i++) {
+        QLIST_INIT(&s->cache_hash[i]);
+    }
+
+    return true;
+}
+
+static void curl_cache_deinit(BDRVCURLState *s)
+{
+    block_t *b;
+
+    /*
+     * Cache blocks are either in the hash table or on the free list.
+     */
+    for (int i = 0; i < CURL_BLOCK_CACHE_HASH; i++) {
+        while (!QLIST_EMPTY(&s->cache_hash[i])) {
+            b = QLIST_FIRST(&s->cache_hash[i]);
+            QLIST_REMOVE(b, hash);
+            b->hashed = false;
+            curl_cache_free(s, b);
+        }
+    }
+
+    while (!QLIST_EMPTY(&s->cache_free)) {
+        b = QLIST_FIRST(&s->cache_free);
+        QLIST_REMOVE(b, free);
+        curl_cache_free(s, b);
+    }
+
+    assert(s->cache_allocated == 0);
+
+    g_free(s->cache_hash);
+    s->cache_hash = NULL;
+}
+
+static block_t *curl_cache_alloc(BDRVCURLState *s)
+{
+    block_t *b = g_try_malloc0(sizeof(*b));
+
+    if (!b) {
+        return NULL;
+    }
+
+    b->buf = g_try_malloc(s->blocksize);
+    if (!b->buf) {
+        g_free(b);
+        return NULL;
+    }
+
+    s->cache_allocated++;
+
+    trace_curl_cache_alloc(s->cache_allocated);
+
+    return b;
+}
+
+static void curl_cache_free(BDRVCURLState *s, block_t *b)
+{
+    assert(b->use == 0);
+    assert(!b->hashed);
+
+    g_free(b->buf);
+    g_free(b);
+
+    s->cache_allocated--;
+
+    trace_curl_cache_free(s->cache_allocated);
+}
+
+static block_t *curl_cache_get(BDRVCURLState *s)
+{
+    block_t *b = NULL;
+
+    /* If there is one on the free list, use it. */
+    if (!QLIST_EMPTY(&s->cache_free)) {
+        b = QLIST_FIRST(&s->cache_free);
+        QLIST_REMOVE(b, free);
+
+        assert(b->use == 0);
+        assert(!b->hashed);
+
+        b->use++;
+        goto done;
+    }
+
+    /* If not at the limit, try get a new one. */
+    if (s->cache_allocated < CURL_BLOCK_CACHE_MAX_BLOCKS) {
+        b = curl_cache_alloc(s);
+        if (b) {
+            b->use++;
+            goto done;
+        }
+    }
+
+    /* Take one from the LRU list. */
+    if (!QTAILQ_EMPTY(&s->cache_lru)) {
+        b = QTAILQ_FIRST(&s->cache_lru);
+        QTAILQ_REMOVE(&s->cache_lru, b, lru);
+
+        /* Remove it from the hash. */
+        QLIST_REMOVE(b, hash);
+
+        assert(b->use == 0);
+
+        b->hashed = false;
+        b->use++;
+        goto done;
+    }
+
+ done:
+    return b;
+}
+
+static void curl_cache_put(BDRVCURLState *s, block_t *b, bool valid)
+{
+    b->use--;
+
+    if (valid) {
+        /* If it's not hashed, hash it now. */
+        if (!b->hashed) {
+            b->hashed = true;
+            QLIST_INSERT_HEAD(&s->cache_hash[curl_cache_hash(s, b->start)],
+                              b, hash);
+        }
+
+        /* If the block is no longer being used, put it on the LRU list. */
+        if (b->use == 0) {
+            QTAILQ_INSERT_TAIL(&s->cache_lru, b, lru);
+        }
+    } else {
+        b->hashed = false;
+        QLIST_INSERT_HEAD(&s->cache_free, b, free);
+    }
+}
+
+static block_t *cache_lookup(BDRVCURLState *s, uint64_t start)
+{
+    block_t *b;
+
+    QLIST_FOREACH(b, &s->cache_hash[curl_cache_hash(s, start)], hash) {
+        if (b->start <= start && start < b->start + b->count) {
+            assert(b->hashed);
+            b->use++;
+
+            /* Remove from the LRU list. */
+            QTAILQ_REMOVE(&s->cache_lru, b, lru);
+
+            return b;
+        }
+    }
+
+    return NULL;
+}
+
+static bool curl_cache_find(BDRVCURLState *s, CURLAIOCB *acb)
+{
+    block_t *b;
+
+    b = cache_lookup(s, acb->offset);
+    if (!b) {
+        return false;
+    }
+
+    trace_curl_cache_hit(qemu_coroutine_self(), acb->offset, acb->bytes);
+
+    qemu_iovec_from_buf(acb->qiov, acb->qiov_offset,
+                        b->buf + curl_block_offset(s, acb->offset),
+                        acb->bytes);
+
+    curl_cache_put(s, b, true);
+
+    acb->ret = 0;
+    return true;
+}
+
 #ifdef NEED_CURL_TIMER_CALLBACK
 /* Called from curl_multi_do_locked, with s->mutex held.  */
 static int curl_timer_cb(CURLM *multi, long timeout_ms, void *opaque)
@@ -274,11 +489,12 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
 {
     CURLState *state = (CURLState *)opaque;
     BDRVCURLState *s = state->s;
+    block_t *b = state->cache_block;
     size_t realsize = size * nmemb;
 
     trace_curl_read_cb(realsize);
 
-    if (!state || !state->orig_buf) {
+    if (!state || !b) {
         goto read_end;
     }
 
@@ -287,8 +503,9 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
         goto read_end;
     }
     realsize = MIN(realsize, s->blocksize - state->buf_off);
-    memcpy(state->orig_buf + state->buf_off, ptr, realsize);
+    memcpy(b->buf + state->buf_off, ptr, realsize);
     state->buf_off += realsize;
+    b->count += realsize;
 
 read_end:
     /* curl will error out if we do not return this value */
@@ -296,35 +513,38 @@ read_end:
 }
 
 /* Called with s->mutex held.  */
-static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
-                          CURLAIOCB *acb)
+static bool curl_pending_find(BDRVCURLState *s, CURLAIOCB *acb)
 {
     int i;
+    uint64_t start = acb->offset;
+    uint64_t len = acb->bytes;
     uint64_t end = start + len;
     uint64_t clamped_end = MIN(end, s->len);
     uint64_t clamped_len = clamped_end - start;
 
     for (i = 0; i < CURL_NUM_STATES; i++) {
         CURLState *state = &s->states[i];
-        /* The end of the currently valid data. */
-        uint64_t buf_end = state->buf_start + state->buf_off;
-        /* The end of the valid data when the IO completes. */
-        uint64_t buf_fend = state->buf_start + s->blocksize;
+        block_t *b = state->cache_block;
+        uint64_t buf_end, buf_fend;
 
-        if (!state->orig_buf)
-            continue;
-        if (!state->buf_off)
+        if (!b) {
             continue;
+        }
+
+        /* The end of the currently valid data. */
+        buf_end = b->start + state->buf_off;
+        /* The end of the valid data when the IO completes. */
+        buf_fend = b->start + s->blocksize;
 
         /*
          * Does the existing buffer cover our section?
          */
-        if ((start >= state->buf_start) &&
+        if ((start >= b->start) &&
             (start <= buf_end) &&
-            (clamped_end >= state->buf_start) &&
+            (clamped_end >= b->start) &&
             (clamped_end <= buf_end))
         {
-            char *buf = state->orig_buf + curl_block_offset(s, start);
+            char *buf = b->buf + curl_block_offset(s, start);
 
             trace_curl_pending_hit(qemu_coroutine_self(),
                                    start, len);
@@ -343,9 +563,9 @@ static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
          * aiocb.
          */
         if (state->in_use &&
-            (start >= state->buf_start) &&
+            (start >= b->start) &&
             (start <= buf_fend) &&
-            (clamped_end >= state->buf_start) &&
+            (clamped_end >= b->start) &&
             (clamped_end <= buf_fend))
         {
             int j;
@@ -388,10 +608,10 @@ static void curl_multi_check_completion(BDRVCURLState *s)
             int i;
             CURLState *state = NULL;
             bool error = msg->data.result != CURLE_OK;
+            block_t *b;
 
             curl_easy_getinfo(msg->easy_handle, CURLINFO_PRIVATE,
                               (char **)&state);
-
             if (error) {
                 static int errcount = 100;
 
@@ -406,6 +626,8 @@ static void curl_multi_check_completion(BDRVCURLState *s)
                 }
             }
 
+            b = state->cache_block;
+
             for (i = 0; i < CURL_NUM_ACB; i++) {
                 CURLAIOCB *acb = state->acb[i];
 
@@ -418,7 +640,7 @@ static void curl_multi_check_completion(BDRVCURLState *s)
                     assert(state->buf_off >= acb->end);
 
                     qemu_iovec_from_buf(acb->qiov, acb->qiov_offset,
-                                        state->orig_buf + acb->start,
+                                        b->buf + acb->start,
                                         acb->end - acb->start);
 
                     if (acb->end - acb->start < acb->bytes) {
@@ -436,6 +658,9 @@ static void curl_multi_check_completion(BDRVCURLState *s)
                 qemu_mutex_lock(&s->mutex);
             }
 
+            curl_cache_put(s, b, true);
+            state->cache_block = NULL;
+
             curl_clean_state(state);
             break;
         }
@@ -612,8 +837,10 @@ static void curl_detach_aio_context(BlockDriverState *bs)
             curl_easy_cleanup(s->states[i].curl);
             s->states[i].curl = NULL;
         }
-        g_free(s->states[i].orig_buf);
-        s->states[i].orig_buf = NULL;
+        if (s->states[i].cache_block) {
+            curl_cache_free(s, s->states[i].cache_block);
+            s->states[i].cache_block = NULL;
+        }
     }
     if (s->multi) {
         curl_multi_cleanup(s->multi);
@@ -868,6 +1095,10 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
     }
     trace_curl_open_size(s->len);
 
+    if (!curl_cache_init(s)) {
+        goto out;
+    }
+
     qemu_mutex_lock(&s->mutex);
     curl_clean_state(state);
     qemu_mutex_unlock(&s->mutex);
@@ -898,6 +1129,7 @@ static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
 {
     CURLState *state;
     int running;
+    block_t *b;
 
     BDRVCURLState *s = bs->opaque;
 
@@ -910,11 +1142,16 @@ static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
 
     qemu_mutex_lock(&s->mutex);
 
+    /* Can this request be handled using a cached block? */
+    if (curl_cache_find(s, acb)) {
+        goto out;
+    }
+
     /*
-     * Check whether the requested data can be found in an existing or
-     * pending IO request.
+     * Check whether the requested data can be found in a pending IO
+     * request.
      */
-    if (curl_find_buf(s, acb->offset, acb->bytes, acb)) {
+    if (curl_pending_find(s, acb)) {
         goto out;
     }
 
@@ -935,25 +1172,34 @@ static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
         goto out;
     }
 
-    acb->start = curl_block_offset(s, acb->offset);
-    acb->end = acb->start + MIN(acb->bytes, s->len - acb->offset);
-
-    state->buf_off = 0;
-    state->buf_start = curl_block_align(s, acb->offset);
-    if (!state->orig_buf) {
-        state->orig_buf = g_try_malloc(s->blocksize);
-    }
-    if (!state->orig_buf) {
+    b = curl_cache_get(s);
+    if (!b) {
         curl_clean_state(state);
         acb->ret = -ENOMEM;
         goto out;
     }
+
+    state->cache_block = b;
+
+    /*
+     * Any already cached or in-progress IO for
+     * curl_cache_base(acb->offset) would have been found by
+     * curl_cache_find() or curl_pending_find() respectively, so this
+     * must be a new request for that block.
+     */
+    b->start = curl_block_align(s, acb->offset);
+    b->count = 0; /* Nothing read so far. */
+
+    acb->start = curl_block_offset(s, acb->offset);
+    acb->end = acb->start + MIN(acb->bytes, s->len - acb->offset);
+
+    state->buf_off = 0;
     state->acb[0] = acb;
 
     snprintf(state->range, 127, "%" PRIu64 "-%" PRIu64,
-             s->offset + state->buf_start,
-             s->offset + state->buf_start + s->blocksize);
-    trace_curl_setup_preadv(qemu_coroutine_self(), state->buf_start,
+             s->offset + b->start,
+             s->offset + b->start + s->blocksize);
+    trace_curl_setup_preadv(qemu_coroutine_self(), b->start,
                             s->blocksize);
     curl_easy_setopt(state->curl, CURLOPT_RANGE, state->range);
 
@@ -1027,6 +1273,7 @@ static void curl_close(BlockDriverState *bs)
     BDRVCURLState *s = bs->opaque;
 
     trace_curl_close();
+    curl_cache_deinit(s);
     curl_detach_aio_context(bs);
     qemu_mutex_destroy(&s->mutex);
 
diff --git a/block/trace-events b/block/trace-events
index 72b1e927bf..deac37c4ad 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -206,7 +206,10 @@ curl_setup_preadv(void *co, uint64_t offset, uint64_t bytes) "co %p requests 0x%
 curl_pending_hit(void *co, uint64_t start, uint64_t len) "co %p finds 0x%" PRIx64 " + 0x%" PRIx64
 curl_pending_piggyback(void *co, uint64_t start, uint64_t len) "co %p pending 0x%" PRIx64 " + 0x%" PRIx64
 curl_pending_miss(void *co, uint64_t start, uint64_t len) "co %p misses 0x%" PRIx64 " + 0x%" PRIx64
+curl_cache_hit(void *co, uint64_t start, uint64_t len) "co %p finds 0x%" PRIx64 " + 0x%" PRIx64
 curl_close(void) "close"
+curl_cache_alloc(size_t n) "%zu cache blocks allocated"
+curl_cache_free(size_t n) "%zu cache blocks allocated"
 
 # file-posix.c
 file_xfs_write_zeroes(const char *error) "cannot write zero range (%s)"
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [RFC PATCH 7/9] block/curl: Allow the user to control the number of cache blocks
  2020-08-18 11:08 [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server David Edmondson
                   ` (5 preceding siblings ...)
  2020-08-18 11:08 ` [RFC PATCH 6/9] block/curl: Cache downloaded blocks David Edmondson
@ 2020-08-18 11:08 ` David Edmondson
  2020-08-18 11:08 ` [RFC PATCH 8/9] block/curl: Allow 16 sockets/ACB David Edmondson
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: David Edmondson @ 2020-08-18 11:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Markus Armbruster, Max Reitz,
	David Edmondson, Stefan Hajnoczi

Rather than using a fixed number, allow the user to specify the number
of cache blocks allocated. This cannot be less than the number of Curl
states and defaults to that value.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
---
 block/curl.c                          | 20 +++++++++++++++++---
 docs/system/device-url-syntax.rst.inc |  4 ++++
 qapi/block-core.json                  |  4 ++++
 3 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index 0ea9eedebd..27fa77c351 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -75,14 +75,15 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
 #define CURL_BLOCK_OPT_PROXY_PASSWORD_SECRET "proxy-password-secret"
 #define CURL_BLOCK_OPT_OFFSET "offset"
 #define CURL_BLOCK_OPT_BLOCKSIZE "blocksize"
+#define CURL_BLOCK_OPT_BLOCKCOUNT "blockcount"
 
 #define CURL_BLOCK_OPT_SSLVERIFY_DEFAULT true
 #define CURL_BLOCK_OPT_TIMEOUT_DEFAULT 5
 /* Must be a non-zero power of 2. */
 #define CURL_BLOCK_OPT_BLOCKSIZE_DEFAULT (256 * 1024)
+/* The defaultnumber of blocks to store in the cache. */
+#define CURL_BLOCK_OPT_BLOCKCOUNT_DEFAULT (CURL_NUM_STATES)
 
-/* The maximum number of blocks to store in the cache. */
-#define CURL_BLOCK_CACHE_MAX_BLOCKS 100
 /* The number of heads in the hash table. */
 #define CURL_BLOCK_CACHE_HASH 37
 
@@ -161,6 +162,7 @@ typedef struct BDRVCURLState {
     char *proxypassword;
     size_t offset;
     size_t blocksize;
+    int cache_max;
     int cache_allocated; /* The number of block_t currently allocated. */
     QLIST_HEAD(, block) cache_free;
     QTAILQ_HEAD(, block) cache_lru;
@@ -287,7 +289,7 @@ static block_t *curl_cache_get(BDRVCURLState *s)
     }
 
     /* If not at the limit, try get a new one. */
-    if (s->cache_allocated < CURL_BLOCK_CACHE_MAX_BLOCKS) {
+    if (s->cache_allocated < s->cache_max) {
         b = curl_cache_alloc(s);
         if (b) {
             b->use++;
@@ -929,6 +931,11 @@ static QemuOptsList runtime_opts = {
             .type = QEMU_OPT_SIZE,
             .help = "Block size for IO requests"
         },
+        {
+            .name = CURL_BLOCK_OPT_BLOCKCOUNT,
+            .type = QEMU_OPT_SIZE,
+            .help = "Maximum number of cached blocks"
+        },
         { /* end of list */ }
     },
 };
@@ -1039,6 +1046,13 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
         error_setg(errp, "blocksize must be a non-zero power of two");
         goto out_noclean;
     }
+    s->cache_max = qemu_opt_get_size(opts, CURL_BLOCK_OPT_BLOCKCOUNT,
+                                     CURL_BLOCK_OPT_BLOCKCOUNT_DEFAULT);
+    if (s->cache_max < CURL_NUM_STATES) {
+        error_setg(errp, "blockcount must be larger than %d",
+            CURL_NUM_STATES - 1);
+        goto out_noclean;
+    }
 
     trace_curl_open(file);
     qemu_co_queue_init(&s->free_state_waitq);
diff --git a/docs/system/device-url-syntax.rst.inc b/docs/system/device-url-syntax.rst.inc
index ee504ee41a..56843cb38f 100644
--- a/docs/system/device-url-syntax.rst.inc
+++ b/docs/system/device-url-syntax.rst.inc
@@ -201,6 +201,10 @@ These are specified using a special URL syntax.
       bytes. The value must be a non-zero power of two.  It defaults
       to 256kB.
 
+   ``blockcount``
+      The number of ``blocksize`` blocks that the system may allocate
+      to store data read from the remote server.
+
    Note that when passing options to qemu explicitly, ``driver`` is the
    value of <protocol>.
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index cd16197e1e..91888166fa 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3767,11 +3767,15 @@
 # @blocksize: Size of all IO requests sent to the remote server; must
 #             be a non-zero power of two (defaults to 1 256kB)
 #
+# @blockcount: The number of IO blocks used to cache data from the
+#              remote server.
+#
 # Since: 2.9
 ##
 { 'struct': 'BlockdevOptionsCurlBase',
   'data': { 'url': 'str',
             '*blocksize': 'int',
+            '*blockcount': 'int',
             '*timeout': 'int',
             '*username': 'str',
             '*password-secret': 'str',
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [RFC PATCH 8/9] block/curl: Allow 16 sockets/ACB
  2020-08-18 11:08 [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server David Edmondson
                   ` (6 preceding siblings ...)
  2020-08-18 11:08 ` [RFC PATCH 7/9] block/curl: Allow the user to control the number of cache blocks David Edmondson
@ 2020-08-18 11:08 ` David Edmondson
  2020-08-18 11:08 ` [RFC PATCH 9/9] block/curl: Add readahead support David Edmondson
  2020-08-19 14:11 ` [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server Stefan Hajnoczi
  9 siblings, 0 replies; 15+ messages in thread
From: David Edmondson @ 2020-08-18 11:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Markus Armbruster, Max Reitz,
	David Edmondson, Stefan Hajnoczi

qemu-img allows up to 16 coroutines when performing IO. Ensure that
there is a Curl socket and ACB available to each of them.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
---
 block/curl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index 27fa77c351..8ee314739a 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -60,8 +60,8 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
 #define PROTOCOLS (CURLPROTO_HTTP | CURLPROTO_HTTPS | \
                    CURLPROTO_FTP | CURLPROTO_FTPS)
 
-#define CURL_NUM_STATES 8
-#define CURL_NUM_ACB    8
+#define CURL_NUM_STATES 16
+#define CURL_NUM_ACB    16
 #define CURL_TIMEOUT_MAX 10000
 
 #define CURL_BLOCK_OPT_URL       "url"
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [RFC PATCH 9/9] block/curl: Add readahead support
  2020-08-18 11:08 [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server David Edmondson
                   ` (7 preceding siblings ...)
  2020-08-18 11:08 ` [RFC PATCH 8/9] block/curl: Allow 16 sockets/ACB David Edmondson
@ 2020-08-18 11:08 ` David Edmondson
  2020-08-19 14:11 ` [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server Stefan Hajnoczi
  9 siblings, 0 replies; 15+ messages in thread
From: David Edmondson @ 2020-08-18 11:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Markus Armbruster, Max Reitz,
	David Edmondson, Stefan Hajnoczi

Re-add support for a readahead parameter, which is the number of bytes
added to the request from the upper layer before breaking the request
into blocks. The default is zero. The number of bytes specified has no
alignment requirements.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
---
 block/curl.c                          | 11 +++++++++++
 docs/system/device-url-syntax.rst.inc |  7 +++++++
 qapi/block-core.json                  |  3 +++
 3 files changed, 21 insertions(+)

diff --git a/block/curl.c b/block/curl.c
index 8ee314739a..a182a55b93 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -65,6 +65,7 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
 #define CURL_TIMEOUT_MAX 10000
 
 #define CURL_BLOCK_OPT_URL       "url"
+#define CURL_BLOCK_OPT_READAHEAD "readahead"
 #define CURL_BLOCK_OPT_SSLVERIFY "sslverify"
 #define CURL_BLOCK_OPT_TIMEOUT "timeout"
 #define CURL_BLOCK_OPT_COOKIE    "cookie"
@@ -149,6 +150,7 @@ typedef struct BDRVCURLState {
     uint64_t len;
     CURLState states[CURL_NUM_STATES];
     char *url;
+    size_t readahead_size;
     bool sslverify;
     uint64_t timeout;
     char *cookie;
@@ -881,6 +883,11 @@ static QemuOptsList runtime_opts = {
             .type = QEMU_OPT_STRING,
             .help = "URL to open",
         },
+        {
+            .name = CURL_BLOCK_OPT_READAHEAD,
+            .type = QEMU_OPT_SIZE,
+            .help = "Readahead size",
+        },
         {
             .name = CURL_BLOCK_OPT_SSLVERIFY,
             .type = QEMU_OPT_BOOL,
@@ -976,6 +983,8 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
         goto out_noclean;
     }
 
+    s->readahead_size = qemu_opt_get_size(opts, CURL_BLOCK_OPT_READAHEAD, 0);
+
     s->timeout = qemu_opt_get_number(opts, CURL_BLOCK_OPT_TIMEOUT,
                                      CURL_BLOCK_OPT_TIMEOUT_DEFAULT);
     if (s->timeout > CURL_TIMEOUT_MAX) {
@@ -1247,6 +1256,8 @@ static int coroutine_fn curl_co_preadv(BlockDriverState *bs,
 
     trace_curl_co_preadv(qemu_coroutine_self(), offset, bytes);
 
+    bytes += s->readahead_size;
+
     while (bytes > 0) {
         uint64_t len = MIN(bytes, s->blocksize - curl_block_offset(s, off));
         CURLAIOCB acb = {
diff --git a/docs/system/device-url-syntax.rst.inc b/docs/system/device-url-syntax.rst.inc
index 56843cb38f..58245e017c 100644
--- a/docs/system/device-url-syntax.rst.inc
+++ b/docs/system/device-url-syntax.rst.inc
@@ -174,6 +174,13 @@ These are specified using a special URL syntax.
    ``url``
       The full URL when passing options to the driver explicitly.
 
+   ``readahead``
+      The amount of data to read ahead with each range request to the
+      remote server. This value may optionally have the suffix 'T', 'G',
+      'M', 'K', 'k' or 'b'. If it does not have a suffix, it will be
+      assumed to be in bytes. The value must be a multiple of 512 bytes.
+      It defaults to 256k.
+
    ``sslverify``
       Whether to verify the remote server's certificate when connecting
       over SSL. It can have the value 'on' or 'off'. It defaults to
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 91888166fa..f4092ccc14 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3752,6 +3752,8 @@
 #
 # @url: URL of the image file
 #
+# @readahead: Amount of read-ahead (defaults to 0)
+#
 # @timeout: Timeout for connections, in seconds (defaults to 5)
 #
 # @username: Username for authentication (defaults to none)
@@ -3776,6 +3778,7 @@
   'data': { 'url': 'str',
             '*blocksize': 'int',
             '*blockcount': 'int',
+            '*readahead': 'int',
             '*timeout': 'int',
             '*username': 'str',
             '*password-secret': 'str',
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server
  2020-08-18 11:08 [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server David Edmondson
                   ` (8 preceding siblings ...)
  2020-08-18 11:08 ` [RFC PATCH 9/9] block/curl: Add readahead support David Edmondson
@ 2020-08-19 14:11 ` Stefan Hajnoczi
  2020-08-19 14:19   ` David Edmondson
  9 siblings, 1 reply; 15+ messages in thread
From: Stefan Hajnoczi @ 2020-08-19 14:11 UTC (permalink / raw)
  To: David Edmondson
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Markus Armbruster, qemu-devel,
	Max Reitz

[-- Attachment #1: Type: text/plain, Size: 758 bytes --]

On Tue, Aug 18, 2020 at 12:08:36PM +0100, David Edmondson wrote:
> When using qemu-img to convert an image that is hosted on an HTTP
> server to some faster local (or pseudo-local) storage, the overall
> performance can be improved by reading data from the HTTP server in
> larger blocks and by caching and re-using blocks already read. This
> set of patches implements both of these, and adds a further patch
> allowing an offset to be added to all of the HTTP requests.

Hi David,
Thanks for posting this! Kevin and Max are the maintainers in this area,
but I wanted to ask an initial question:

Is caching curl-specific or could this be implemented as a block filter
driver so that it can be stacked on top of other network protocols too?

Thanks,
Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server
  2020-08-19 14:11 ` [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server Stefan Hajnoczi
@ 2020-08-19 14:19   ` David Edmondson
  2020-08-25 10:03     ` Max Reitz
  0 siblings, 1 reply; 15+ messages in thread
From: David Edmondson @ 2020-08-19 14:19 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Fam Zheng, qemu-block, qemu-devel, Markus Armbruster,
	Max Reitz

On Wednesday, 2020-08-19 at 15:11:37 +01, Stefan Hajnoczi wrote:

> On Tue, Aug 18, 2020 at 12:08:36PM +0100, David Edmondson wrote:
>> When using qemu-img to convert an image that is hosted on an HTTP
>> server to some faster local (or pseudo-local) storage, the overall
>> performance can be improved by reading data from the HTTP server in
>> larger blocks and by caching and re-using blocks already read. This
>> set of patches implements both of these, and adds a further patch
>> allowing an offset to be added to all of the HTTP requests.
>
> Hi David,
> Thanks for posting this! Kevin and Max are the maintainers in this area,
> but I wanted to ask an initial question:
>
> Is caching curl-specific or could this be implemented as a block filter
> driver so that it can be stacked on top of other network protocols too?

This implementation is curl specific, as you probably surmised. I will
look at implementing something similar as a block filter.

dme.
-- 
Facts don't do what I want them to.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 5/9] block/curl: Allow the blocksize to be specified by the user
  2020-08-18 11:08 ` [RFC PATCH 5/9] block/curl: Allow the blocksize to be specified by the user David Edmondson
@ 2020-08-24 13:19   ` Markus Armbruster
  2020-08-24 14:21     ` David Edmondson
  0 siblings, 1 reply; 15+ messages in thread
From: Markus Armbruster @ 2020-08-24 13:19 UTC (permalink / raw)
  To: David Edmondson
  Cc: Kevin Wolf, Fam Zheng, qemu-block, qemu-devel, Max Reitz,
	Stefan Hajnoczi

David Edmondson <david.edmondson@oracle.com> writes:

> Rather than a fixed 256kB blocksize, allow the user to specify the
> size used. It must be a non-zero power of two, defaulting to 256kB.

Nitpick: any power of two is non-zero.  Scratch "non-zero".

> Signed-off-by: David Edmondson <david.edmondson@oracle.com>
> ---
[...]
> diff --git a/docs/system/device-url-syntax.rst.inc b/docs/system/device-url-syntax.rst.inc
> index bc38b9df38..ee504ee41a 100644
> --- a/docs/system/device-url-syntax.rst.inc
> +++ b/docs/system/device-url-syntax.rst.inc
> @@ -194,6 +194,13 @@ These are specified using a special URL syntax.
>        Add an offset, in bytes, to all range requests sent to the
>        remote server.
>  
> +   ``blocksize``
> +      The size of all IO requests sent to the remote server. This
> +      value may optionally have the suffix 'T', 'G', 'M', 'K', 'k' or
> +      'b'. If it does not have a suffix, it will be assumed to be in
> +      bytes. The value must be a non-zero power of two.  It defaults
> +      to 256kB.
> +
>     Note that when passing options to qemu explicitly, ``driver`` is the
>     value of <protocol>.
>  
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index d6f5e91cc3..cd16197e1e 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -3764,10 +3764,14 @@
>  # @proxy-password-secret: ID of a QCryptoSecret object providing a password
>  #                         for proxy authentication (defaults to no password)
>  #
> +# @blocksize: Size of all IO requests sent to the remote server; must
> +#             be a non-zero power of two (defaults to 1 256kB)

Scratch "non-zero".

"(defaults to 1 256kB)" confuses me.  Do you mean "(defaults to 256kB)"?

Please add "(since 5.2)".

> +#
>  # Since: 2.9
>  ##
>  { 'struct': 'BlockdevOptionsCurlBase',
>    'data': { 'url': 'str',
> +            '*blocksize': 'int',

Should we use 'size' rather than 'int'?

>              '*timeout': 'int',
>              '*username': 'str',
>              '*password-secret': 'str',



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 5/9] block/curl: Allow the blocksize to be specified by the user
  2020-08-24 13:19   ` Markus Armbruster
@ 2020-08-24 14:21     ` David Edmondson
  0 siblings, 0 replies; 15+ messages in thread
From: David Edmondson @ 2020-08-24 14:21 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Kevin Wolf, Fam Zheng, qemu-block, qemu-devel, Max Reitz,
	Stefan Hajnoczi

On Monday, 2020-08-24 at 15:19:24 +02, Markus Armbruster wrote:

> David Edmondson <david.edmondson@oracle.com> writes:
>
>> Rather than a fixed 256kB blocksize, allow the user to specify the
>> size used. It must be a non-zero power of two, defaulting to 256kB.
>
> Nitpick: any power of two is non-zero.  Scratch "non-zero".

Indeed.

>> Signed-off-by: David Edmondson <david.edmondson@oracle.com>
>> ---
> [...]
>> diff --git a/docs/system/device-url-syntax.rst.inc b/docs/system/device-url-syntax.rst.inc
>> index bc38b9df38..ee504ee41a 100644
>> --- a/docs/system/device-url-syntax.rst.inc
>> +++ b/docs/system/device-url-syntax.rst.inc
>> @@ -194,6 +194,13 @@ These are specified using a special URL syntax.
>>        Add an offset, in bytes, to all range requests sent to the
>>        remote server.
>>  
>> +   ``blocksize``
>> +      The size of all IO requests sent to the remote server. This
>> +      value may optionally have the suffix 'T', 'G', 'M', 'K', 'k' or
>> +      'b'. If it does not have a suffix, it will be assumed to be in
>> +      bytes. The value must be a non-zero power of two.  It defaults
>> +      to 256kB.
>> +
>>     Note that when passing options to qemu explicitly, ``driver`` is the
>>     value of <protocol>.
>>  
>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> index d6f5e91cc3..cd16197e1e 100644
>> --- a/qapi/block-core.json
>> +++ b/qapi/block-core.json
>> @@ -3764,10 +3764,14 @@
>>  # @proxy-password-secret: ID of a QCryptoSecret object providing a password
>>  #                         for proxy authentication (defaults to no password)
>>  #
>> +# @blocksize: Size of all IO requests sent to the remote server; must
>> +#             be a non-zero power of two (defaults to 1 256kB)
>
> Scratch "non-zero".
>
> "(defaults to 1 256kB)" confuses me.  Do you mean "(defaults to 256kB)"?

Yes, thanks for catching it.

> Please add "(since 5.2)".

Will do.

>> +#
>>  # Since: 2.9
>>  ##
>>  { 'struct': 'BlockdevOptionsCurlBase',
>>    'data': { 'url': 'str',
>> +            '*blocksize': 'int',
>
> Should we use 'size' rather than 'int'?

Yes.

>>              '*timeout': 'int',
>>              '*username': 'str',
>>              '*password-secret': 'str',

dme.
-- 
But are you safe Miss Gradenko?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server
  2020-08-19 14:19   ` David Edmondson
@ 2020-08-25 10:03     ` Max Reitz
  0 siblings, 0 replies; 15+ messages in thread
From: Max Reitz @ 2020-08-25 10:03 UTC (permalink / raw)
  To: David Edmondson, Stefan Hajnoczi
  Cc: Kevin Wolf, Fam Zheng, Markus Armbruster, qemu-block, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 1798 bytes --]

On 19.08.20 16:19, David Edmondson wrote:
> On Wednesday, 2020-08-19 at 15:11:37 +01, Stefan Hajnoczi wrote:
> 
>> On Tue, Aug 18, 2020 at 12:08:36PM +0100, David Edmondson wrote:
>>> When using qemu-img to convert an image that is hosted on an HTTP
>>> server to some faster local (or pseudo-local) storage, the overall
>>> performance can be improved by reading data from the HTTP server in
>>> larger blocks and by caching and re-using blocks already read. This
>>> set of patches implements both of these, and adds a further patch
>>> allowing an offset to be added to all of the HTTP requests.
>>
>> Hi David,
>> Thanks for posting this! Kevin and Max are the maintainers in this area,
>> but I wanted to ask an initial question:
>>
>> Is caching curl-specific or could this be implemented as a block filter
>> driver so that it can be stacked on top of other network protocols too?
> 
> This implementation is curl specific, as you probably surmised. I will
> look at implementing something similar as a block filter.

I think from an implementation standpoint the best would be if we could
just use such a generic caching block filter above all curl nodes so we
can drop all caching from curl.

However, I suppose then we’d at least have the problem of how to put
this cache node on top of all curl nodes without breaking compatibility,
which may be impossible.

OTOH, maybe it would be fine to leave the new cache optional, and just
leave the curl driver itself as it is.  Which would also mean that
wouldn’t need readahead support in the cache driver.

But if we do need this full cache directly in the curl driver, is it at
least possible to share most of the caching code between it and a
(potential future) dedicated cache block driver?

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-08-25 10:04 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-18 11:08 [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server David Edmondson
2020-08-18 11:08 ` [RFC PATCH 1/9] block/curl: Add an 'offset' parameter, affecting all range requests David Edmondson
2020-08-18 11:08 ` [RFC PATCH 2/9] block/curl: Remove readahead support David Edmondson
2020-08-18 11:08 ` [RFC PATCH 3/9] block/curl: Tracing David Edmondson
2020-08-18 11:08 ` [RFC PATCH 4/9] block/curl: Perform IO in fixed size chunks David Edmondson
2020-08-18 11:08 ` [RFC PATCH 5/9] block/curl: Allow the blocksize to be specified by the user David Edmondson
2020-08-24 13:19   ` Markus Armbruster
2020-08-24 14:21     ` David Edmondson
2020-08-18 11:08 ` [RFC PATCH 6/9] block/curl: Cache downloaded blocks David Edmondson
2020-08-18 11:08 ` [RFC PATCH 7/9] block/curl: Allow the user to control the number of cache blocks David Edmondson
2020-08-18 11:08 ` [RFC PATCH 8/9] block/curl: Allow 16 sockets/ACB David Edmondson
2020-08-18 11:08 ` [RFC PATCH 9/9] block/curl: Add readahead support David Edmondson
2020-08-19 14:11 ` [RFC PATCH 0/9] block/curl: Add caching of data downloaded from the remote server Stefan Hajnoczi
2020-08-19 14:19   ` David Edmondson
2020-08-25 10:03     ` Max Reitz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.