* [Qemu-devel] [PATCH v3 0/6] Auto-fragment large transactions at the block layer
@ 2016-07-15 18:31 Eric Blake
  2016-07-15 18:31 ` [Qemu-devel] [PATCH v3 1/6] block: Fragment reads to max transfer length Eric Blake
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Eric Blake @ 2016-07-15 18:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, qemu-block, stefanha

We have max_transfer documented in BlockLimits, but while we
honor it during pwrite_zeroes, we were blindly ignoring it
during pwritev and preadv, leading to multiple drivers having
to implement fragmentation themselves.  This series moves
fragmentation into the block layer, then fixes the NBD and iscsi
drivers to rely on it.

qcow2 still fragments its own requests, but that is because
cluster boundaries require additional work beyond what the block
layer can automatically provide.

Prerequisite: Kevin's latest block branch PULL request

Also available as a tag at:
git fetch git://repo.or.cz/qemu/ericb.git nbd-fragment-v3

Changes since v2:
- patch 1, 3: change return semantics to be 0, not bytes, on success,
since at least one caller asserts during 'make check' otherwise

001/6:[0008] [FC] 'block: Fragment reads to max transfer length'
002/6:[----] [--] 'raw_bsd: Don't advertise flags not supported by protocol layer'
003/6:[0002] [FC] 'block: Fragment writes to max transfer length'
004/6:[----] [--] 'nbd: Rely on block layer to break up large requests'
005/6:[----] [--] 'nbd: Drop unused offset parameter'
006/6:[----] [--] 'iscsi: Rely on block layer to break up large requests'

Eric Blake (6):
  block: Fragment reads to max transfer length
  raw_bsd: Don't advertise flags not supported by protocol layer
  block: Fragment writes to max transfer length
  nbd: Rely on block layer to break up large requests
  nbd: Drop unused offset parameter
  iscsi: Rely on block layer to break up large requests

 include/block/nbd.h |  1 -
 nbd/nbd-internal.h  |  4 +--
 block/io.c          | 90 +++++++++++++++++++++++++++++++++++++++--------------
 block/iscsi.c       | 14 +++------
 block/nbd-client.c  | 78 +++++++++++++---------------------------------
 block/nbd.c         | 12 ++-----
 block/raw_bsd.c     |  6 ++--
 nbd/common.c        |  5 +--
 8 files changed, 103 insertions(+), 107 deletions(-)

-- 
2.5.5

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v3 1/6] block: Fragment reads to max transfer length
  2016-07-15 18:31 [Qemu-devel] [PATCH v3 0/6] Auto-fragment large transactions at the block layer Eric Blake
@ 2016-07-15 18:31 ` Eric Blake
  2016-07-15 18:32 ` [Qemu-devel] [PATCH v3 2/6] raw_bsd: Don't advertise flags not supported by protocol layer Eric Blake
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Eric Blake @ 2016-07-15 18:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, qemu-block, stefanha, Max Reitz

Drivers should be able to rely on the block layer honoring the
max transfer length, rather than needing to return -EINVAL
(iscsi) or manually fragment things (nbd).  This patch adds
the fragmentation in the block layer, after requests have been
aligned (fragmenting before alignment would lead to multiple
unaligned requests, rather than just the head and tail).

The return value on success was previously nebulous (sometimes
zero, sometimes the length read); and fragmenting could introduce
yet other non-zero values if we returned the last length read.  But
as at least some callers are sloppy and expect only zero on
success, it is easiest to just guarantee 0.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v3: Slam 'ret' to 0, not 'bytes', as some callers choke otherwise
v2: Fix uninitialized use of 'ret' for an all-zero read beyond eof
---
 block/io.c | 55 ++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 34 insertions(+), 21 deletions(-)

diff --git a/block/io.c b/block/io.c
index 2887394..ceff694 100644
--- a/block/io.c
+++ b/block/io.c
@@ -971,8 +971,8 @@ err:

 /*
  * Forwards an already correctly aligned request to the BlockDriver. This
- * handles copy on read and zeroing after EOF; any other features must be
- * implemented by the caller.
+ * handles copy on read, zeroing after EOF, and fragmentation of large
+ * reads; any other features must be implemented by the caller.
  */
 static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
     BdrvTrackedRequest *req, int64_t offset, unsigned int bytes,
@@ -980,12 +980,16 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
 {
     int64_t total_bytes, max_bytes;
     int ret;
+    uint64_t bytes_remaining = bytes;
+    int max_transfer;

     assert(is_power_of_2(align));
     assert((offset & (align - 1)) == 0);
     assert((bytes & (align - 1)) == 0);
     assert(!qiov || bytes == qiov->size);
     assert((bs->open_flags & BDRV_O_NO_IO) == 0);
+    max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX),
+                                   align);

     /* TODO: We would need a per-BDS .supported_read_flags and
      * potential fallback support, if we ever implement any read flags
@@ -1024,7 +1028,7 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
         }
     }

-    /* Forward the request to the BlockDriver */
+    /* Forward the request to the BlockDriver, possibly fragmenting it */
     total_bytes = bdrv_getlength(bs);
     if (total_bytes < 0) {
         ret = total_bytes;
@@ -1032,30 +1036,39 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
     }

     max_bytes = ROUND_UP(MAX(0, total_bytes - offset), align);
-    if (bytes <= max_bytes) {
+    if (bytes <= max_bytes && bytes <= max_transfer) {
         ret = bdrv_driver_preadv(bs, offset, bytes, qiov, 0);
-    } else if (max_bytes > 0) {
-        QEMUIOVector local_qiov;
-
-        qemu_iovec_init(&local_qiov, qiov->niov);
-        qemu_iovec_concat(&local_qiov, qiov, 0, max_bytes);
-
-        ret = bdrv_driver_preadv(bs, offset, max_bytes, &local_qiov, 0);
-
-        qemu_iovec_destroy(&local_qiov);
-    } else {
-        ret = 0;
+        goto out;
     }

-    /* Reading beyond end of file is supposed to produce zeroes */
-    if (ret == 0 && total_bytes < offset + bytes) {
-        uint64_t zero_offset = MAX(0, total_bytes - offset);
-        uint64_t zero_bytes = offset + bytes - zero_offset;
-        qemu_iovec_memset(qiov, zero_offset, 0, zero_bytes);
+    while (bytes_remaining) {
+        int num;
+
+        if (max_bytes) {
+            QEMUIOVector local_qiov;
+
+            num = MIN(bytes_remaining, MIN(max_bytes, max_transfer));
+            assert(num);
+            qemu_iovec_init(&local_qiov, qiov->niov);
+            qemu_iovec_concat(&local_qiov, qiov, bytes - bytes_remaining, num);
+
+            ret = bdrv_driver_preadv(bs, offset + bytes - bytes_remaining,
+                                     num, &local_qiov, 0);
+            max_bytes -= num;
+            qemu_iovec_destroy(&local_qiov);
+        } else {
+            num = bytes_remaining;
+            ret = qemu_iovec_memset(qiov, bytes - bytes_remaining, 0,
+                                    bytes_remaining);
+        }
+        if (ret < 0) {
+            goto out;
+        }
+        bytes_remaining -= num;
     }

 out:
-    return ret;
+    return ret < 0 ? ret : 0;
 }

 /*
-- 
2.5.5


* [Qemu-devel] [PATCH v3 2/6] raw_bsd: Don't advertise flags not supported by protocol layer
  2016-07-15 18:31 [Qemu-devel] [PATCH v3 0/6] Auto-fragment large transactions at the block layer Eric Blake
  2016-07-15 18:31 ` [Qemu-devel] [PATCH v3 1/6] block: Fragment reads to max transfer length Eric Blake
@ 2016-07-15 18:32 ` Eric Blake
  2016-07-15 18:32 ` [Qemu-devel] [PATCH v3 3/6] block: Fragment writes to max transfer length Eric Blake
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Eric Blake @ 2016-07-15 18:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, qemu-block, stefanha, Max Reitz

The raw format layer supports all flags via passthrough - but
it only makes sense to pass through flags that the lower layer
actually supports.

The next patch gives stronger reasoning for why this is correct.
At the moment, the raw format layer ignores the max_transfer
limit of its protocol layer, and issuing the qemu-io command
'w -f 0 40m' to an NBD server that lacks FUA will pass the entire
40m request to the NBD driver, which then fragments the request
itself into a 32m write, 8m write, and flush.  But once the block
layer starts honoring limits and fragmenting requests, the raw
driver will hand the NBD driver two separate requests; if both
requests have BDRV_REQ_FUA set, then this would result in a 32m
write, flush, 8m write, and second flush.  By having the raw
layer no longer advertise FUA support when the protocol layer
lacks it, we are back to a single flush at the block layer for
the overall 40m request.

Note that 'w -f -z 0 40m' does not currently exhibit the same
problem, because there the fragmentation does not occur until
the NBD layer (the raw layer has .bdrv_co_pwrite_zeroes, and
the NBD layer doesn't advertise max_pwrite_zeroes to constrain
things at the raw layer).  But the problem is latent: without
this patch we would again have too many flushes once the NBD
layer implements support for the new NBD_CMD_WRITE_ZEROES
command, if it sets max_pwrite_zeroes to the same 32m limit
recommended by the NBD protocol.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

---
v3: no change
v2: no code change, but hoist earlier in series and reword commit
---
 block/raw_bsd.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index 5f9dd29..d767413 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -192,8 +192,10 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
                     Error **errp)
 {
     bs->sg = bs->file->bs->sg;
-    bs->supported_write_flags = BDRV_REQ_FUA;
-    bs->supported_zero_flags = BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP;
+    bs->supported_write_flags = BDRV_REQ_FUA &
+        bs->file->bs->supported_write_flags;
+    bs->supported_zero_flags = (BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
+        bs->file->bs->supported_zero_flags;

     if (bs->probed && !bdrv_is_read_only(bs)) {
         fprintf(stderr,
-- 
2.5.5


* [Qemu-devel] [PATCH v3 3/6] block: Fragment writes to max transfer length
  2016-07-15 18:31 [Qemu-devel] [PATCH v3 0/6] Auto-fragment large transactions at the block layer Eric Blake
  2016-07-15 18:31 ` [Qemu-devel] [PATCH v3 1/6] block: Fragment reads to max transfer length Eric Blake
  2016-07-15 18:32 ` [Qemu-devel] [PATCH v3 2/6] raw_bsd: Don't advertise flags not supported by protocol layer Eric Blake
@ 2016-07-15 18:32 ` Eric Blake
  2016-07-15 18:32 ` [Qemu-devel] [PATCH v3 4/6] nbd: Rely on block layer to break up large requests Eric Blake
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Eric Blake @ 2016-07-15 18:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, qemu-block, stefanha, Max Reitz

Drivers should be able to rely on the block layer honoring the
max transfer length, rather than needing to return -EINVAL
(iscsi) or manually fragment things (nbd).  We already fragment
write zeroes at the block layer; this patch adds the fragmentation
for normal writes, after requests have been aligned (fragmenting
before alignment would lead to multiple unaligned requests, rather
than just the head and tail).

When fragmenting a large request where FUA was requested, but
where we know that FUA is implemented by flushing all requests
rather than the given request, then we can still get by with
only one flush.  Note, however, that the earlier raw format
driver patch in this series is needed to avoid a regression in
the number of flushes actually issued.

The return value was previously nebulous on success (sometimes
zero, sometimes the length written); since we never have a short
write, and since fragmenting may store yet another positive
value in 'ret', change the function to always return 0 on success,
matching what we do in bdrv_aligned_preadv().

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v3: another return semantic tweak
v2: Tweak success return semantics to match read
---
 block/io.c | 35 +++++++++++++++++++++++++++++++++--
 1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/block/io.c b/block/io.c
index ceff694..86db77e 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1269,7 +1269,8 @@ fail:
 }

 /*
- * Forwards an already correctly aligned write request to the BlockDriver.
+ * Forwards an already correctly aligned write request to the BlockDriver,
+ * after possibly fragmenting it.
  */
 static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
     BdrvTrackedRequest *req, int64_t offset, unsigned int bytes,
@@ -1281,6 +1282,8 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,

     int64_t start_sector = offset >> BDRV_SECTOR_BITS;
     int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
+    uint64_t bytes_remaining = bytes;
+    int max_transfer;

     assert(is_power_of_2(align));
     assert((offset & (align - 1)) == 0);
@@ -1288,6 +1291,8 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
     assert(!qiov || bytes == qiov->size);
     assert((bs->open_flags & BDRV_O_NO_IO) == 0);
     assert(!(flags & ~BDRV_REQ_MASK));
+    max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX),
+                                   align);

     waited = wait_serialising_requests(req);
     assert(!waited || !req->serialising);
@@ -1310,9 +1315,34 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
     } else if (flags & BDRV_REQ_ZERO_WRITE) {
         bdrv_debug_event(bs, BLKDBG_PWRITEV_ZERO);
         ret = bdrv_co_do_pwrite_zeroes(bs, offset, bytes, flags);
-    } else {
+    } else if (bytes <= max_transfer) {
         bdrv_debug_event(bs, BLKDBG_PWRITEV);
         ret = bdrv_driver_pwritev(bs, offset, bytes, qiov, flags);
+    } else {
+        bdrv_debug_event(bs, BLKDBG_PWRITEV);
+        while (bytes_remaining) {
+            int num = MIN(bytes_remaining, max_transfer);
+            QEMUIOVector local_qiov;
+            int local_flags = flags;
+
+            assert(num);
+            if (num < bytes_remaining && (flags & BDRV_REQ_FUA) &&
+                !(bs->supported_write_flags & BDRV_REQ_FUA)) {
+                /* If FUA is going to be emulated by flush, we only
+                 * need to flush on the last iteration */
+                local_flags &= ~BDRV_REQ_FUA;
+            }
+            qemu_iovec_init(&local_qiov, qiov->niov);
+            qemu_iovec_concat(&local_qiov, qiov, bytes - bytes_remaining, num);
+
+            ret = bdrv_driver_pwritev(bs, offset + bytes - bytes_remaining,
+                                      num, &local_qiov, local_flags);
+            qemu_iovec_destroy(&local_qiov);
+            if (ret < 0) {
+                break;
+            }
+            bytes_remaining -= num;
+        }
     }
     bdrv_debug_event(bs, BLKDBG_PWRITEV_DONE);

@@ -1324,6 +1354,7 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,

     if (ret >= 0) {
         bs->total_sectors = MAX(bs->total_sectors, end_sector);
+        ret = 0;
     }

     return ret;
-- 
2.5.5


* [Qemu-devel] [PATCH v3 4/6] nbd: Rely on block layer to break up large requests
  2016-07-15 18:31 [Qemu-devel] [PATCH v3 0/6] Auto-fragment large transactions at the block layer Eric Blake
                   ` (2 preceding siblings ...)
  2016-07-15 18:32 ` [Qemu-devel] [PATCH v3 3/6] block: Fragment writes to max transfer length Eric Blake
@ 2016-07-15 18:32 ` Eric Blake
  2016-07-18  8:16   ` Paolo Bonzini
  2016-07-15 18:32 ` [Qemu-devel] [PATCH v3 5/6] nbd: Drop unused offset parameter Eric Blake
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Eric Blake @ 2016-07-15 18:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, qemu-block, stefanha, Paolo Bonzini, Max Reitz

Now that the block layer will honor max_transfer, we can simplify
our code to rely on that guarantee.

The readv code can call directly into nbd-client, just as the
writev code has done since commit 52a4650.

Interestingly enough, while qemu-io 'w 0 40m' splits into a 32M
and an 8M transaction, 'w -z 0 40m' splits into two 16M requests
and an 8M request,
because the block layer caps the bounce buffer for writing zeroes
at 16M.  When we later introduce support for NBD_CMD_WRITE_ZEROES,
we can get a full 32M zero write (or larger, if the client and
server negotiate that write zeroes can use a larger size than
ordinary writes).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/nbd-client.c | 51 ++++++++-------------------------------------------
 block/nbd.c        | 12 +++---------
 2 files changed, 11 insertions(+), 52 deletions(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index 4cc408d..f1fb58b 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -217,15 +217,15 @@ static void nbd_coroutine_end(NbdClientSession *s,
     }
 }

-static int nbd_co_readv_1(BlockDriverState *bs, int64_t sector_num,
-                          int nb_sectors, QEMUIOVector *qiov,
-                          int offset)
+int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
+                        int nb_sectors, QEMUIOVector *qiov)
 {
     NbdClientSession *client = nbd_get_client_session(bs);
     struct nbd_request request = { .type = NBD_CMD_READ };
     struct nbd_reply reply;
     ssize_t ret;

+    assert(nb_sectors <= NBD_MAX_SECTORS);
     request.from = sector_num * 512;
     request.len = nb_sectors * 512;

@@ -234,16 +234,15 @@ static int nbd_co_readv_1(BlockDriverState *bs, int64_t sector_num,
     if (ret < 0) {
         reply.error = -ret;
     } else {
-        nbd_co_receive_reply(client, &request, &reply, qiov, offset);
+        nbd_co_receive_reply(client, &request, &reply, qiov, 0);
     }
     nbd_coroutine_end(client, &request);
     return -reply.error;

 }

-static int nbd_co_writev_1(BlockDriverState *bs, int64_t sector_num,
-                           int nb_sectors, QEMUIOVector *qiov,
-                           int offset, int flags)
+int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
+                         int nb_sectors, QEMUIOVector *qiov, int flags)
 {
     NbdClientSession *client = nbd_get_client_session(bs);
     struct nbd_request request = { .type = NBD_CMD_WRITE };
@@ -255,11 +254,12 @@ static int nbd_co_writev_1(BlockDriverState *bs, int64_t sector_num,
         request.type |= NBD_CMD_FLAG_FUA;
     }

+    assert(nb_sectors <= NBD_MAX_SECTORS);
     request.from = sector_num * 512;
     request.len = nb_sectors * 512;

     nbd_coroutine_start(client, &request);
-    ret = nbd_co_send_request(bs, &request, qiov, offset);
+    ret = nbd_co_send_request(bs, &request, qiov, 0);
     if (ret < 0) {
         reply.error = -ret;
     } else {
@@ -269,41 +269,6 @@ static int nbd_co_writev_1(BlockDriverState *bs, int64_t sector_num,
     return -reply.error;
 }

-int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
-                        int nb_sectors, QEMUIOVector *qiov)
-{
-    int offset = 0;
-    int ret;
-    while (nb_sectors > NBD_MAX_SECTORS) {
-        ret = nbd_co_readv_1(bs, sector_num, NBD_MAX_SECTORS, qiov, offset);
-        if (ret < 0) {
-            return ret;
-        }
-        offset += NBD_MAX_SECTORS * 512;
-        sector_num += NBD_MAX_SECTORS;
-        nb_sectors -= NBD_MAX_SECTORS;
-    }
-    return nbd_co_readv_1(bs, sector_num, nb_sectors, qiov, offset);
-}
-
-int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
-                         int nb_sectors, QEMUIOVector *qiov, int flags)
-{
-    int offset = 0;
-    int ret;
-    while (nb_sectors > NBD_MAX_SECTORS) {
-        ret = nbd_co_writev_1(bs, sector_num, NBD_MAX_SECTORS, qiov, offset,
-                              flags);
-        if (ret < 0) {
-            return ret;
-        }
-        offset += NBD_MAX_SECTORS * 512;
-        sector_num += NBD_MAX_SECTORS;
-        nb_sectors -= NBD_MAX_SECTORS;
-    }
-    return nbd_co_writev_1(bs, sector_num, nb_sectors, qiov, offset, flags);
-}
-
 int nbd_client_co_flush(BlockDriverState *bs)
 {
     NbdClientSession *client = nbd_get_client_session(bs);
diff --git a/block/nbd.c b/block/nbd.c
index 08e5b67..8a13078 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -349,12 +349,6 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
     return ret;
 }

-static int nbd_co_readv(BlockDriverState *bs, int64_t sector_num,
-                        int nb_sectors, QEMUIOVector *qiov)
-{
-    return nbd_client_co_readv(bs, sector_num, nb_sectors, qiov);
-}
-
 static int nbd_co_flush(BlockDriverState *bs)
 {
     return nbd_client_co_flush(bs);
@@ -450,7 +444,7 @@ static BlockDriver bdrv_nbd = {
     .instance_size              = sizeof(BDRVNBDState),
     .bdrv_parse_filename        = nbd_parse_filename,
     .bdrv_file_open             = nbd_open,
-    .bdrv_co_readv              = nbd_co_readv,
+    .bdrv_co_readv              = nbd_client_co_readv,
     .bdrv_co_writev_flags       = nbd_client_co_writev,
     .bdrv_close                 = nbd_close,
     .bdrv_co_flush_to_os        = nbd_co_flush,
@@ -468,7 +462,7 @@ static BlockDriver bdrv_nbd_tcp = {
     .instance_size              = sizeof(BDRVNBDState),
     .bdrv_parse_filename        = nbd_parse_filename,
     .bdrv_file_open             = nbd_open,
-    .bdrv_co_readv              = nbd_co_readv,
+    .bdrv_co_readv              = nbd_client_co_readv,
     .bdrv_co_writev_flags       = nbd_client_co_writev,
     .bdrv_close                 = nbd_close,
     .bdrv_co_flush_to_os        = nbd_co_flush,
@@ -486,7 +480,7 @@ static BlockDriver bdrv_nbd_unix = {
     .instance_size              = sizeof(BDRVNBDState),
     .bdrv_parse_filename        = nbd_parse_filename,
     .bdrv_file_open             = nbd_open,
-    .bdrv_co_readv              = nbd_co_readv,
+    .bdrv_co_readv              = nbd_client_co_readv,
     .bdrv_co_writev_flags       = nbd_client_co_writev,
     .bdrv_close                 = nbd_close,
     .bdrv_co_flush_to_os        = nbd_co_flush,
-- 
2.5.5


* [Qemu-devel] [PATCH v3 5/6] nbd: Drop unused offset parameter
  2016-07-15 18:31 [Qemu-devel] [PATCH v3 0/6] Auto-fragment large transactions at the block layer Eric Blake
                   ` (3 preceding siblings ...)
  2016-07-15 18:32 ` [Qemu-devel] [PATCH v3 4/6] nbd: Rely on block layer to break up large requests Eric Blake
@ 2016-07-15 18:32 ` Eric Blake
  2016-07-18  8:16   ` Paolo Bonzini
  2016-07-15 18:32 ` [Qemu-devel] [PATCH v3 6/6] iscsi: Rely on block layer to break up large requests Eric Blake
  2016-07-19 16:12 ` [Qemu-devel] [PATCH v3 0/6] Auto-fragment large transactions at the block layer Stefan Hajnoczi
  6 siblings, 1 reply; 11+ messages in thread
From: Eric Blake @ 2016-07-15 18:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, qemu-block, stefanha, Paolo Bonzini, Max Reitz

Now that NBD relies on the block layer to fragment things, we no
longer need to track an offset argument for which fragment of
a request we are actually servicing.

While at it, use true and false instead of 0 and 1 for a bool
parameter.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

---
v3: no change
v2: minor formatting tweak [Kevin]
---
 include/block/nbd.h |  1 -
 nbd/nbd-internal.h  |  4 ++--
 block/nbd-client.c  | 31 ++++++++++++++++---------------
 nbd/common.c        |  5 +----
 4 files changed, 19 insertions(+), 22 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index eeda3eb..503f514 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -89,7 +89,6 @@ enum {
 ssize_t nbd_wr_syncv(QIOChannel *ioc,
                      struct iovec *iov,
                      size_t niov,
-                     size_t offset,
                      size_t length,
                      bool do_read);
 int nbd_receive_negotiate(QIOChannel *ioc, const char *name, uint32_t *flags,
diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
index 26a9f4d..93a6ca8 100644
--- a/nbd/nbd-internal.h
+++ b/nbd/nbd-internal.h
@@ -101,14 +101,14 @@ static inline ssize_t read_sync(QIOChannel *ioc, void *buffer, size_t size)
      * our request/reply.  Synchronization is done with recv_coroutine, so
      * that this is coroutine-safe.
      */
-    return nbd_wr_syncv(ioc, &iov, 1, 0, size, true);
+    return nbd_wr_syncv(ioc, &iov, 1, size, true);
 }

 static inline ssize_t write_sync(QIOChannel *ioc, void *buffer, size_t size)
 {
     struct iovec iov = { .iov_base = buffer, .iov_len = size };

-    return nbd_wr_syncv(ioc, &iov, 1, 0, size, false);
+    return nbd_wr_syncv(ioc, &iov, 1, size, false);
 }

 struct NBDTLSHandshakeData {
diff --git a/block/nbd-client.c b/block/nbd-client.c
index f1fb58b..f184844 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -116,7 +116,7 @@ static void nbd_restart_write(void *opaque)

 static int nbd_co_send_request(BlockDriverState *bs,
                                struct nbd_request *request,
-                               QEMUIOVector *qiov, int offset)
+                               QEMUIOVector *qiov)
 {
     NbdClientSession *s = nbd_get_client_session(bs);
     AioContext *aio_context;
@@ -149,8 +149,8 @@ static int nbd_co_send_request(BlockDriverState *bs,
         qio_channel_set_cork(s->ioc, true);
         rc = nbd_send_request(s->ioc, request);
         if (rc >= 0) {
-            ret = nbd_wr_syncv(s->ioc, qiov->iov, qiov->niov,
-                               offset, request->len, 0);
+            ret = nbd_wr_syncv(s->ioc, qiov->iov, qiov->niov, request->len,
+                               false);
             if (ret != request->len) {
                 rc = -EIO;
             }
@@ -167,8 +167,9 @@ static int nbd_co_send_request(BlockDriverState *bs,
 }

 static void nbd_co_receive_reply(NbdClientSession *s,
-    struct nbd_request *request, struct nbd_reply *reply,
-    QEMUIOVector *qiov, int offset)
+                                 struct nbd_request *request,
+                                 struct nbd_reply *reply,
+                                 QEMUIOVector *qiov)
 {
     int ret;

@@ -181,8 +182,8 @@ static void nbd_co_receive_reply(NbdClientSession *s,
         reply->error = EIO;
     } else {
         if (qiov && reply->error == 0) {
-            ret = nbd_wr_syncv(s->ioc, qiov->iov, qiov->niov,
-                               offset, request->len, 1);
+            ret = nbd_wr_syncv(s->ioc, qiov->iov, qiov->niov, request->len,
+                               true);
             if (ret != request->len) {
                 reply->error = EIO;
             }
@@ -230,11 +231,11 @@ int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
     request.len = nb_sectors * 512;

     nbd_coroutine_start(client, &request);
-    ret = nbd_co_send_request(bs, &request, NULL, 0);
+    ret = nbd_co_send_request(bs, &request, NULL);
     if (ret < 0) {
         reply.error = -ret;
     } else {
-        nbd_co_receive_reply(client, &request, &reply, qiov, 0);
+        nbd_co_receive_reply(client, &request, &reply, qiov);
     }
     nbd_coroutine_end(client, &request);
     return -reply.error;
@@ -259,11 +260,11 @@ int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
     request.len = nb_sectors * 512;

     nbd_coroutine_start(client, &request);
-    ret = nbd_co_send_request(bs, &request, qiov, 0);
+    ret = nbd_co_send_request(bs, &request, qiov);
     if (ret < 0) {
         reply.error = -ret;
     } else {
-        nbd_co_receive_reply(client, &request, &reply, NULL, 0);
+        nbd_co_receive_reply(client, &request, &reply, NULL);
     }
     nbd_coroutine_end(client, &request);
     return -reply.error;
@@ -284,11 +285,11 @@ int nbd_client_co_flush(BlockDriverState *bs)
     request.len = 0;

     nbd_coroutine_start(client, &request);
-    ret = nbd_co_send_request(bs, &request, NULL, 0);
+    ret = nbd_co_send_request(bs, &request, NULL);
     if (ret < 0) {
         reply.error = -ret;
     } else {
-        nbd_co_receive_reply(client, &request, &reply, NULL, 0);
+        nbd_co_receive_reply(client, &request, &reply, NULL);
     }
     nbd_coroutine_end(client, &request);
     return -reply.error;
@@ -309,11 +310,11 @@ int nbd_client_co_discard(BlockDriverState *bs, int64_t sector_num,
     request.len = nb_sectors * 512;

     nbd_coroutine_start(client, &request);
-    ret = nbd_co_send_request(bs, &request, NULL, 0);
+    ret = nbd_co_send_request(bs, &request, NULL);
     if (ret < 0) {
         reply.error = -ret;
     } else {
-        nbd_co_receive_reply(client, &request, &reply, NULL, 0);
+        nbd_co_receive_reply(client, &request, &reply, NULL);
     }
     nbd_coroutine_end(client, &request);
     return -reply.error;
diff --git a/nbd/common.c b/nbd/common.c
index 8ddb2dd..b583a4f 100644
--- a/nbd/common.c
+++ b/nbd/common.c
@@ -23,7 +23,6 @@
 ssize_t nbd_wr_syncv(QIOChannel *ioc,
                      struct iovec *iov,
                      size_t niov,
-                     size_t offset,
                      size_t length,
                      bool do_read)
 {
@@ -33,9 +32,7 @@ ssize_t nbd_wr_syncv(QIOChannel *ioc,
     struct iovec *local_iov_head = local_iov;
     unsigned int nlocal_iov = niov;

-    nlocal_iov = iov_copy(local_iov, nlocal_iov,
-                          iov, niov,
-                          offset, length);
+    nlocal_iov = iov_copy(local_iov, nlocal_iov, iov, niov, 0, length);

     while (nlocal_iov > 0) {
         ssize_t len;
-- 
2.5.5


* [Qemu-devel] [PATCH v3 6/6] iscsi: Rely on block layer to break up large requests
  2016-07-15 18:31 [Qemu-devel] [PATCH v3 0/6] Auto-fragment large transactions at the block layer Eric Blake
                   ` (4 preceding siblings ...)
  2016-07-15 18:32 ` [Qemu-devel] [PATCH v3 5/6] nbd: Drop unused offset parameter Eric Blake
@ 2016-07-15 18:32 ` Eric Blake
  2016-07-18  8:16   ` Paolo Bonzini
  2016-07-19 16:12 ` [Qemu-devel] [PATCH v3 0/6] Auto-fragment large transactions at the block layer Stefan Hajnoczi
  6 siblings, 1 reply; 11+ messages in thread
From: Eric Blake @ 2016-07-15 18:32 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, famz, qemu-block, stefanha, Ronnie Sahlberg,
	Paolo Bonzini, Peter Lieven, Max Reitz

Now that the block layer honors max_transfer, we don't need to
bother with an EINVAL on overlarge requests, but can instead
assert that requests are well-behaved.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/iscsi.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index cf1e9e7..bdc7ade 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -472,11 +472,8 @@ iscsi_co_writev_flags(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
         return -EINVAL;
     }

-    if (bs->bl.max_transfer &&
-        nb_sectors << BDRV_SECTOR_BITS > bs->bl.max_transfer) {
-        error_report("iSCSI Error: Write of %d sectors exceeds max_xfer_len "
-                     "of %" PRIu32 " bytes", nb_sectors, bs->bl.max_transfer);
-        return -EINVAL;
+    if (bs->bl.max_transfer) {
+        assert(nb_sectors << BDRV_SECTOR_BITS <= bs->bl.max_transfer);
     }

     lba = sector_qemu2lun(sector_num, iscsilun);
@@ -650,11 +647,8 @@ static int coroutine_fn iscsi_co_readv(BlockDriverState *bs,
         return -EINVAL;
     }

-    if (bs->bl.max_transfer &&
-        nb_sectors << BDRV_SECTOR_BITS > bs->bl.max_transfer) {
-        error_report("iSCSI Error: Read of %d sectors exceeds max_xfer_len "
-                     "of %" PRIu32 " bytes", nb_sectors, bs->bl.max_transfer);
-        return -EINVAL;
+    if (bs->bl.max_transfer) {
+        assert(nb_sectors << BDRV_SECTOR_BITS <= bs->bl.max_transfer);
     }

     if (iscsilun->lbprz && nb_sectors >= ISCSI_CHECKALLOC_THRES &&
-- 
2.5.5


* Re: [Qemu-devel] [PATCH v3 4/6] nbd: Rely on block layer to break up large requests
  2016-07-15 18:32 ` [Qemu-devel] [PATCH v3 4/6] nbd: Rely on block layer to break up large requests Eric Blake
@ 2016-07-18  8:16   ` Paolo Bonzini
  0 siblings, 0 replies; 11+ messages in thread
From: Paolo Bonzini @ 2016-07-18  8:16 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: kwolf, famz, qemu-block, stefanha, Max Reitz



On 15/07/2016 20:32, Eric Blake wrote:
> Now that the block layer will honor max_transfer, we can simplify
> our code to rely on that guarantee.
> 
> The readv code can call directly into nbd-client, just as the
> writev code has done since commit 52a4650.
> 
> Interestingly enough, while qemu-io 'w 0 40m' splits into a 32M
> and 8M transaction, 'w -z 0 40m' splits into two 16M and an 8M,
> because the block layer caps the bounce buffer for writing zeroes
> at 16M.  When we later introduce support for NBD_CMD_WRITE_ZEROES,
> we can get a full 32M zero write (or larger, if the client and
> server negotiate that write zeroes can use a larger size than
> ordinary writes).
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> Reviewed-by: Fam Zheng <famz@redhat.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/nbd-client.c | 51 ++++++++-------------------------------------------
>  block/nbd.c        | 12 +++---------
>  2 files changed, 11 insertions(+), 52 deletions(-)
> 
> diff --git a/block/nbd-client.c b/block/nbd-client.c
> index 4cc408d..f1fb58b 100644
> --- a/block/nbd-client.c
> +++ b/block/nbd-client.c
> @@ -217,15 +217,15 @@ static void nbd_coroutine_end(NbdClientSession *s,
>      }
>  }
> 
> -static int nbd_co_readv_1(BlockDriverState *bs, int64_t sector_num,
> -                          int nb_sectors, QEMUIOVector *qiov,
> -                          int offset)
> +int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
> +                        int nb_sectors, QEMUIOVector *qiov)
>  {
>      NbdClientSession *client = nbd_get_client_session(bs);
>      struct nbd_request request = { .type = NBD_CMD_READ };
>      struct nbd_reply reply;
>      ssize_t ret;
> 
> +    assert(nb_sectors <= NBD_MAX_SECTORS);
>      request.from = sector_num * 512;
>      request.len = nb_sectors * 512;
> 
> @@ -234,16 +234,15 @@ static int nbd_co_readv_1(BlockDriverState *bs, int64_t sector_num,
>      if (ret < 0) {
>          reply.error = -ret;
>      } else {
> -        nbd_co_receive_reply(client, &request, &reply, qiov, offset);
> +        nbd_co_receive_reply(client, &request, &reply, qiov, 0);
>      }
>      nbd_coroutine_end(client, &request);
>      return -reply.error;
> 
>  }
> 
> -static int nbd_co_writev_1(BlockDriverState *bs, int64_t sector_num,
> -                           int nb_sectors, QEMUIOVector *qiov,
> -                           int offset, int flags)
> +int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
> +                         int nb_sectors, QEMUIOVector *qiov, int flags)
>  {
>      NbdClientSession *client = nbd_get_client_session(bs);
>      struct nbd_request request = { .type = NBD_CMD_WRITE };
> @@ -255,11 +254,12 @@ static int nbd_co_writev_1(BlockDriverState *bs, int64_t sector_num,
>          request.type |= NBD_CMD_FLAG_FUA;
>      }
> 
> +    assert(nb_sectors <= NBD_MAX_SECTORS);
>      request.from = sector_num * 512;
>      request.len = nb_sectors * 512;
> 
>      nbd_coroutine_start(client, &request);
> -    ret = nbd_co_send_request(bs, &request, qiov, offset);
> +    ret = nbd_co_send_request(bs, &request, qiov, 0);
>      if (ret < 0) {
>          reply.error = -ret;
>      } else {
> @@ -269,41 +269,6 @@ static int nbd_co_writev_1(BlockDriverState *bs, int64_t sector_num,
>      return -reply.error;
>  }
> 
> -int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
> -                        int nb_sectors, QEMUIOVector *qiov)
> -{
> -    int offset = 0;
> -    int ret;
> -    while (nb_sectors > NBD_MAX_SECTORS) {
> -        ret = nbd_co_readv_1(bs, sector_num, NBD_MAX_SECTORS, qiov, offset);
> -        if (ret < 0) {
> -            return ret;
> -        }
> -        offset += NBD_MAX_SECTORS * 512;
> -        sector_num += NBD_MAX_SECTORS;
> -        nb_sectors -= NBD_MAX_SECTORS;
> -    }
> -    return nbd_co_readv_1(bs, sector_num, nb_sectors, qiov, offset);
> -}
> -
> -int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
> -                         int nb_sectors, QEMUIOVector *qiov, int flags)
> -{
> -    int offset = 0;
> -    int ret;
> -    while (nb_sectors > NBD_MAX_SECTORS) {
> -        ret = nbd_co_writev_1(bs, sector_num, NBD_MAX_SECTORS, qiov, offset,
> -                              flags);
> -        if (ret < 0) {
> -            return ret;
> -        }
> -        offset += NBD_MAX_SECTORS * 512;
> -        sector_num += NBD_MAX_SECTORS;
> -        nb_sectors -= NBD_MAX_SECTORS;
> -    }
> -    return nbd_co_writev_1(bs, sector_num, nb_sectors, qiov, offset, flags);
> -}
> -
>  int nbd_client_co_flush(BlockDriverState *bs)
>  {
>      NbdClientSession *client = nbd_get_client_session(bs);
> diff --git a/block/nbd.c b/block/nbd.c
> index 08e5b67..8a13078 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -349,12 +349,6 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
>      return ret;
>  }
> 
> -static int nbd_co_readv(BlockDriverState *bs, int64_t sector_num,
> -                        int nb_sectors, QEMUIOVector *qiov)
> -{
> -    return nbd_client_co_readv(bs, sector_num, nb_sectors, qiov);
> -}
> -
>  static int nbd_co_flush(BlockDriverState *bs)
>  {
>      return nbd_client_co_flush(bs);
> @@ -450,7 +444,7 @@ static BlockDriver bdrv_nbd = {
>      .instance_size              = sizeof(BDRVNBDState),
>      .bdrv_parse_filename        = nbd_parse_filename,
>      .bdrv_file_open             = nbd_open,
> -    .bdrv_co_readv              = nbd_co_readv,
> +    .bdrv_co_readv              = nbd_client_co_readv,
>      .bdrv_co_writev_flags       = nbd_client_co_writev,
>      .bdrv_close                 = nbd_close,
>      .bdrv_co_flush_to_os        = nbd_co_flush,
> @@ -468,7 +462,7 @@ static BlockDriver bdrv_nbd_tcp = {
>      .instance_size              = sizeof(BDRVNBDState),
>      .bdrv_parse_filename        = nbd_parse_filename,
>      .bdrv_file_open             = nbd_open,
> -    .bdrv_co_readv              = nbd_co_readv,
> +    .bdrv_co_readv              = nbd_client_co_readv,
>      .bdrv_co_writev_flags       = nbd_client_co_writev,
>      .bdrv_close                 = nbd_close,
>      .bdrv_co_flush_to_os        = nbd_co_flush,
> @@ -486,7 +480,7 @@ static BlockDriver bdrv_nbd_unix = {
>      .instance_size              = sizeof(BDRVNBDState),
>      .bdrv_parse_filename        = nbd_parse_filename,
>      .bdrv_file_open             = nbd_open,
> -    .bdrv_co_readv              = nbd_co_readv,
> +    .bdrv_co_readv              = nbd_client_co_readv,
>      .bdrv_co_writev_flags       = nbd_client_co_writev,
>      .bdrv_close                 = nbd_close,
>      .bdrv_co_flush_to_os        = nbd_co_flush,
> 

Acked-by: Paolo Bonzini <pbonzini@redhat.com>


* Re: [Qemu-devel] [PATCH v3 5/6] nbd: Drop unused offset parameter
  2016-07-15 18:32 ` [Qemu-devel] [PATCH v3 5/6] nbd: Drop unused offset parameter Eric Blake
@ 2016-07-18  8:16   ` Paolo Bonzini
  0 siblings, 0 replies; 11+ messages in thread
From: Paolo Bonzini @ 2016-07-18  8:16 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: kwolf, famz, qemu-block, stefanha, Max Reitz



On 15/07/2016 20:32, Eric Blake wrote:
> Now that NBD relies on the block layer to fragment things, we no
> longer need to track an offset argument for which fragment of
> a request we are actually servicing.
> 
> While at it, use true and false instead of 0 and 1 for a bool
> parameter.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> Reviewed-by: Fam Zheng <famz@redhat.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> 
> ---
> v3: no change
> v2: minor formatting tweak [Kevin]
> ---
>  include/block/nbd.h |  1 -
>  nbd/nbd-internal.h  |  4 ++--
>  block/nbd-client.c  | 31 ++++++++++++++++---------------
>  nbd/common.c        |  5 +----
>  4 files changed, 19 insertions(+), 22 deletions(-)
> 
> diff --git a/include/block/nbd.h b/include/block/nbd.h
> index eeda3eb..503f514 100644
> --- a/include/block/nbd.h
> +++ b/include/block/nbd.h
> @@ -89,7 +89,6 @@ enum {
>  ssize_t nbd_wr_syncv(QIOChannel *ioc,
>                       struct iovec *iov,
>                       size_t niov,
> -                     size_t offset,
>                       size_t length,
>                       bool do_read);
>  int nbd_receive_negotiate(QIOChannel *ioc, const char *name, uint32_t *flags,
> diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
> index 26a9f4d..93a6ca8 100644
> --- a/nbd/nbd-internal.h
> +++ b/nbd/nbd-internal.h
> @@ -101,14 +101,14 @@ static inline ssize_t read_sync(QIOChannel *ioc, void *buffer, size_t size)
>       * our request/reply.  Synchronization is done with recv_coroutine, so
>       * that this is coroutine-safe.
>       */
> -    return nbd_wr_syncv(ioc, &iov, 1, 0, size, true);
> +    return nbd_wr_syncv(ioc, &iov, 1, size, true);
>  }
> 
>  static inline ssize_t write_sync(QIOChannel *ioc, void *buffer, size_t size)
>  {
>      struct iovec iov = { .iov_base = buffer, .iov_len = size };
> 
> -    return nbd_wr_syncv(ioc, &iov, 1, 0, size, false);
> +    return nbd_wr_syncv(ioc, &iov, 1, size, false);
>  }
> 
>  struct NBDTLSHandshakeData {
> diff --git a/block/nbd-client.c b/block/nbd-client.c
> index f1fb58b..f184844 100644
> --- a/block/nbd-client.c
> +++ b/block/nbd-client.c
> @@ -116,7 +116,7 @@ static void nbd_restart_write(void *opaque)
> 
>  static int nbd_co_send_request(BlockDriverState *bs,
>                                 struct nbd_request *request,
> -                               QEMUIOVector *qiov, int offset)
> +                               QEMUIOVector *qiov)
>  {
>      NbdClientSession *s = nbd_get_client_session(bs);
>      AioContext *aio_context;
> @@ -149,8 +149,8 @@ static int nbd_co_send_request(BlockDriverState *bs,
>          qio_channel_set_cork(s->ioc, true);
>          rc = nbd_send_request(s->ioc, request);
>          if (rc >= 0) {
> -            ret = nbd_wr_syncv(s->ioc, qiov->iov, qiov->niov,
> -                               offset, request->len, 0);
> +            ret = nbd_wr_syncv(s->ioc, qiov->iov, qiov->niov, request->len,
> +                               false);
>              if (ret != request->len) {
>                  rc = -EIO;
>              }
> @@ -167,8 +167,9 @@ static int nbd_co_send_request(BlockDriverState *bs,
>  }
> 
>  static void nbd_co_receive_reply(NbdClientSession *s,
> -    struct nbd_request *request, struct nbd_reply *reply,
> -    QEMUIOVector *qiov, int offset)
> +                                 struct nbd_request *request,
> +                                 struct nbd_reply *reply,
> +                                 QEMUIOVector *qiov)
>  {
>      int ret;
> 
> @@ -181,8 +182,8 @@ static void nbd_co_receive_reply(NbdClientSession *s,
>          reply->error = EIO;
>      } else {
>          if (qiov && reply->error == 0) {
> -            ret = nbd_wr_syncv(s->ioc, qiov->iov, qiov->niov,
> -                               offset, request->len, 1);
> +            ret = nbd_wr_syncv(s->ioc, qiov->iov, qiov->niov, request->len,
> +                               true);
>              if (ret != request->len) {
>                  reply->error = EIO;
>              }
> @@ -230,11 +231,11 @@ int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
>      request.len = nb_sectors * 512;
> 
>      nbd_coroutine_start(client, &request);
> -    ret = nbd_co_send_request(bs, &request, NULL, 0);
> +    ret = nbd_co_send_request(bs, &request, NULL);
>      if (ret < 0) {
>          reply.error = -ret;
>      } else {
> -        nbd_co_receive_reply(client, &request, &reply, qiov, 0);
> +        nbd_co_receive_reply(client, &request, &reply, qiov);
>      }
>      nbd_coroutine_end(client, &request);
>      return -reply.error;
> @@ -259,11 +260,11 @@ int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
>      request.len = nb_sectors * 512;
> 
>      nbd_coroutine_start(client, &request);
> -    ret = nbd_co_send_request(bs, &request, qiov, 0);
> +    ret = nbd_co_send_request(bs, &request, qiov);
>      if (ret < 0) {
>          reply.error = -ret;
>      } else {
> -        nbd_co_receive_reply(client, &request, &reply, NULL, 0);
> +        nbd_co_receive_reply(client, &request, &reply, NULL);
>      }
>      nbd_coroutine_end(client, &request);
>      return -reply.error;
> @@ -284,11 +285,11 @@ int nbd_client_co_flush(BlockDriverState *bs)
>      request.len = 0;
> 
>      nbd_coroutine_start(client, &request);
> -    ret = nbd_co_send_request(bs, &request, NULL, 0);
> +    ret = nbd_co_send_request(bs, &request, NULL);
>      if (ret < 0) {
>          reply.error = -ret;
>      } else {
> -        nbd_co_receive_reply(client, &request, &reply, NULL, 0);
> +        nbd_co_receive_reply(client, &request, &reply, NULL);
>      }
>      nbd_coroutine_end(client, &request);
>      return -reply.error;
> @@ -309,11 +310,11 @@ int nbd_client_co_discard(BlockDriverState *bs, int64_t sector_num,
>      request.len = nb_sectors * 512;
> 
>      nbd_coroutine_start(client, &request);
> -    ret = nbd_co_send_request(bs, &request, NULL, 0);
> +    ret = nbd_co_send_request(bs, &request, NULL);
>      if (ret < 0) {
>          reply.error = -ret;
>      } else {
> -        nbd_co_receive_reply(client, &request, &reply, NULL, 0);
> +        nbd_co_receive_reply(client, &request, &reply, NULL);
>      }
>      nbd_coroutine_end(client, &request);
>      return -reply.error;
> diff --git a/nbd/common.c b/nbd/common.c
> index 8ddb2dd..b583a4f 100644
> --- a/nbd/common.c
> +++ b/nbd/common.c
> @@ -23,7 +23,6 @@
>  ssize_t nbd_wr_syncv(QIOChannel *ioc,
>                       struct iovec *iov,
>                       size_t niov,
> -                     size_t offset,
>                       size_t length,
>                       bool do_read)
>  {
> @@ -33,9 +32,7 @@ ssize_t nbd_wr_syncv(QIOChannel *ioc,
>      struct iovec *local_iov_head = local_iov;
>      unsigned int nlocal_iov = niov;
> 
> -    nlocal_iov = iov_copy(local_iov, nlocal_iov,
> -                          iov, niov,
> -                          offset, length);
> +    nlocal_iov = iov_copy(local_iov, nlocal_iov, iov, niov, 0, length);
> 
>      while (nlocal_iov > 0) {
>          ssize_t len;
> 

Acked-by: Paolo Bonzini <pbonzini@redhat.com>


* Re: [Qemu-devel] [PATCH v3 6/6] iscsi: Rely on block layer to break up large requests
  2016-07-15 18:32 ` [Qemu-devel] [PATCH v3 6/6] iscsi: Rely on block layer to break up large requests Eric Blake
@ 2016-07-18  8:16   ` Paolo Bonzini
  0 siblings, 0 replies; 11+ messages in thread
From: Paolo Bonzini @ 2016-07-18  8:16 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: kwolf, famz, qemu-block, stefanha, Ronnie Sahlberg, Peter Lieven,
	Max Reitz



On 15/07/2016 20:32, Eric Blake wrote:
> Now that the block layer honors max_transfer, we don't need to
> bother with an EINVAL on overlarge requests, but can instead
> assert that requests are well-behaved.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> Reviewed-by: Fam Zheng <famz@redhat.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/iscsi.c | 14 ++++----------
>  1 file changed, 4 insertions(+), 10 deletions(-)
> 
> diff --git a/block/iscsi.c b/block/iscsi.c
> index cf1e9e7..bdc7ade 100644
> --- a/block/iscsi.c
> +++ b/block/iscsi.c
> @@ -472,11 +472,8 @@ iscsi_co_writev_flags(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
>          return -EINVAL;
>      }
> 
> -    if (bs->bl.max_transfer &&
> -        nb_sectors << BDRV_SECTOR_BITS > bs->bl.max_transfer) {
> -        error_report("iSCSI Error: Write of %d sectors exceeds max_xfer_len "
> -                     "of %" PRIu32 " bytes", nb_sectors, bs->bl.max_transfer);
> -        return -EINVAL;
> +    if (bs->bl.max_transfer) {
> +        assert(nb_sectors << BDRV_SECTOR_BITS <= bs->bl.max_transfer);
>      }
> 
>      lba = sector_qemu2lun(sector_num, iscsilun);
> @@ -650,11 +647,8 @@ static int coroutine_fn iscsi_co_readv(BlockDriverState *bs,
>          return -EINVAL;
>      }
> 
> -    if (bs->bl.max_transfer &&
> -        nb_sectors << BDRV_SECTOR_BITS > bs->bl.max_transfer) {
> -        error_report("iSCSI Error: Read of %d sectors exceeds max_xfer_len "
> -                     "of %" PRIu32 " bytes", nb_sectors, bs->bl.max_transfer);
> -        return -EINVAL;
> +    if (bs->bl.max_transfer) {
> +        assert(nb_sectors << BDRV_SECTOR_BITS <= bs->bl.max_transfer);
>      }
> 
>      if (iscsilun->lbprz && nb_sectors >= ISCSI_CHECKALLOC_THRES &&
> 

Acked-by: Paolo Bonzini <pbonzini@redhat.com>


* Re: [Qemu-devel] [PATCH v3 0/6] Auto-fragment large transactions at the block layer
  2016-07-15 18:31 [Qemu-devel] [PATCH v3 0/6] Auto-fragment large transactions at the block layer Eric Blake
                   ` (5 preceding siblings ...)
  2016-07-15 18:32 ` [Qemu-devel] [PATCH v3 6/6] iscsi: Rely on block layer to break up large requests Eric Blake
@ 2016-07-19 16:12 ` Stefan Hajnoczi
  6 siblings, 0 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2016-07-19 16:12 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, kwolf, famz, stefanha, qemu-block


On Fri, Jul 15, 2016 at 12:31:58PM -0600, Eric Blake wrote:
> We have max_transfer documented in BlockLimits, but while we
> honor it during pwrite_zeroes, we were blindly ignoring it
> during pwritev and preadv, leading to multiple drivers having
> to implement fragmentation themselves.  This series moves
> fragmentation to the block layer, then fixes the NBD and iscsi
> driver to use it.
> 
> qcow2 still does self-fragmenting, but that's because of cluster
> boundaries where it really has to do additional work beyond what
> the block layer can automatically provide.
> 
> Prerequisite: Kevin's latest block branch PULL request
> 
> Also available as a tag at:
> git fetch git://repo.or.cz/qemu/ericb.git nbd-fragment-v3
> 
> Changes since v2:
> - patch 1, 3: change return semantics to be 0, not bytes, on success,
> since at least one caller asserts during 'make check' otherwise
> 
> 001/6:[0008] [FC] 'block: Fragment reads to max transfer length'
> 002/6:[----] [--] 'raw_bsd: Don't advertise flags not supported by protocol layer'
> 003/6:[0002] [FC] 'block: Fragment writes to max transfer length'
> 004/6:[----] [--] 'nbd: Rely on block layer to break up large requests'
> 005/6:[----] [--] 'nbd: Drop unused offset parameter'
> 006/6:[----] [--] 'iscsi: Rely on block layer to break up large requests'
> 
> Eric Blake (6):
>   block: Fragment reads to max transfer length
>   raw_bsd: Don't advertise flags not supported by protocol layer
>   block: Fragment writes to max transfer length
>   nbd: Rely on block layer to break up large requests
>   nbd: Drop unused offset parameter
>   iscsi: Rely on block layer to break up large requests
> 
>  include/block/nbd.h |  1 -
>  nbd/nbd-internal.h  |  4 +--
>  block/io.c          | 90 +++++++++++++++++++++++++++++++++++++++--------------
>  block/iscsi.c       | 14 +++------
>  block/nbd-client.c  | 78 +++++++++++++---------------------------------
>  block/nbd.c         | 12 ++-----
>  block/raw_bsd.c     |  6 ++--
>  nbd/common.c        |  5 +--
>  8 files changed, 103 insertions(+), 107 deletions(-)
> 
> -- 
> 2.5.5
> 
> 

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan



