All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v3 00/22] Byte-based block limits
@ 2016-06-23 22:37 Eric Blake
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 01/22] block: Tighter assertions on bdrv_aligned_pwritev() Eric Blake
                   ` (22 more replies)
  0 siblings, 23 replies; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, famz, stefanha

BlockLimits is currently an ugly mix of byte limits vs.
sector limits.  Unify it.  Fix some bugs I found in
bdrv_aligned_preadv() while at it.

Prequisite: none (built on Kevin's block branch, which is currently
merged into master)

Also available as a tag at:
git fetch git://repo.or.cz/qemu/ericb.git nbd-limits-v3

Since v2:
- address review comments, add R-b on uncontested patches
- split a couple of patches that did too much at once
- add a couple of new patches at the end for things I noticed
during cleanups
- see individual patches for more details

001/22:[----] [--] 'block: Tighter assertions on bdrv_aligned_pwritev()'
002/22:[----] [--] 'block: Document supported flags during bdrv_aligned_preadv()'
003/22:[----] [--] 'block: Fix harmless off-by-one in bdrv_aligned_preadv()'
004/22:[0001] [FC] 'nbd: Allow larger requests'
005/22:[----] [--] 'nbd: Advertise realistic limits to block layer'
006/22:[----] [--] 'iscsi: Advertise realistic limits to block layer'
007/22:[down] 'scsi: Advertise limits by blocksize, not 512'
008/22:[----] [-C] 'block: Give nonzero result to blk_get_max_transfer_length()'
009/22:[0010] [FC] 'blkdebug: Set request_alignment during .bdrv_refresh_limits()'
010/22:[----] [--] 'iscsi: Set request_alignment during .bdrv_refresh_limits()'
011/22:[----] [--] 'qcow2: Set request_alignment during .bdrv_refresh_limits()'
012/22:[----] [--] 'raw-win32: Set request_alignment during .bdrv_refresh_limits()'
013/22:[----] [--] 'block: Set request_alignment during .bdrv_refresh_limits()'
014/22:[----] [--] 'block: Set default request_alignment during bdrv_refresh_limits()'
015/22:[0017] [FC] 'block: Switch transfer length bounds to byte-based'
016/22:[down] 'block: Wording tweaks to write zeroes limits'
017/22:[0007] [FC] 'block: Switch discard length bounds to byte-based'
018/22:[down] 'block: Drop raw_refresh_limits()'
019/22:[0012] [FC] 'block: Split bdrv_merge_limits() from bdrv_refresh_limits()'
020/22:[0006] [FC] 'block: Move request_alignment into BlockLimit'
021/22:[down] 'block: Fix error message style'
022/22:[down] 'block: Use bool as appropriate for BDS members'

Since v1:
- drop things already done in Kevin's work
- rebase
- split out lots of cleanup work to bdrv_refresh_limits() so
that qemu-iotests does not gain any problems on 77

Eric Blake (22):
  block: Tighter assertions on bdrv_aligned_pwritev()
  block: Document supported flags during bdrv_aligned_preadv()
  block: Fix harmless off-by-one in bdrv_aligned_preadv()
  nbd: Allow larger requests
  nbd: Advertise realistic limits to block layer
  iscsi: Advertise realistic limits to block layer
  scsi: Advertise limits by blocksize, not 512
  block: Give nonzero result to blk_get_max_transfer_length()
  blkdebug: Set request_alignment during .bdrv_refresh_limits()
  iscsi: Set request_alignment during .bdrv_refresh_limits()
  qcow2: Set request_alignment during .bdrv_refresh_limits()
  raw-win32: Set request_alignment during .bdrv_refresh_limits()
  block: Set request_alignment during .bdrv_refresh_limits()
  block: Set default request_alignment during bdrv_refresh_limits()
  block: Switch transfer length bounds to byte-based
  block: Wording tweaks to write zeroes limits
  block: Switch discard length bounds to byte-based
  block: Drop raw_refresh_limits()
  block: Split bdrv_merge_limits() from bdrv_refresh_limits()
  block: Move request_alignment into BlockLimit
  block: Fix error message style
  block: Use bool as appropriate for BDS members

 qapi/block-core.json           |  3 +-
 include/block/block.h          |  8 ++--
 include/block/block_int.h      | 57 +++++++++++++++---------
 include/block/nbd.h            |  2 +
 include/sysemu/block-backend.h |  2 +-
 block.c                        | 25 ++++++-----
 block/blkdebug.c               | 19 ++++++--
 block/block-backend.c          |  9 ++--
 block/bochs.c                  |  9 +++-
 block/cloop.c                  |  9 +++-
 block/crypto.c                 |  4 +-
 block/dmg.c                    |  9 +++-
 block/io.c                     | 98 +++++++++++++++++++++++-------------------
 block/iscsi.c                  | 45 ++++++++++---------
 block/nbd-client.c             |  4 --
 block/nbd.c                    |  4 +-
 block/qcow.c                   |  2 +-
 block/qcow2.c                  |  9 ++--
 block/raw-posix.c              | 24 +++++------
 block/raw-win32.c              | 10 ++---
 block/raw_bsd.c                |  8 +---
 block/vvfat.c                  | 11 +++--
 hw/block/virtio-blk.c          | 10 ++---
 hw/scsi/scsi-generic.c         | 15 ++++---
 qemu-img.c                     |  9 ++--
 25 files changed, 231 insertions(+), 174 deletions(-)

-- 
2.5.5

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 01/22] block: Tighter assertions on bdrv_aligned_pwritev()
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 02/22] block: Document supported flags during bdrv_aligned_preadv() Eric Blake
                   ` (21 subsequent siblings)
  22 siblings, 0 replies; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, famz, stefanha, Max Reitz

For symmetry with bdrv_aligned_preadv(), assert that the caller
really has aligned things properly. This requires adding an align
parameter, which is used now only in the new asserts, but will
come in handy in a later patch that adds auto-fragmentation to the
max transfer size, since that value need not always be a multiple
of the alignment, and therefore must be rounded down.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

---
v3: no change
v2: new patch
---
 block/io.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/block/io.c b/block/io.c
index 7cf3645..b95e856 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1254,7 +1254,7 @@ fail:
  */
 static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
     BdrvTrackedRequest *req, int64_t offset, unsigned int bytes,
-    QEMUIOVector *qiov, int flags)
+    int64_t align, QEMUIOVector *qiov, int flags)
 {
     BlockDriver *drv = bs->drv;
     bool waited;
@@ -1263,6 +1263,9 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
     int64_t start_sector = offset >> BDRV_SECTOR_BITS;
     int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);

+    assert(is_power_of_2(align));
+    assert((offset & (align - 1)) == 0);
+    assert((bytes & (align - 1)) == 0);
     assert(!qiov || bytes == qiov->size);
     assert((bs->open_flags & BDRV_O_NO_IO) == 0);
     assert(!(flags & ~BDRV_REQ_MASK));
@@ -1349,7 +1352,7 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BlockDriverState *bs,

         memset(buf + head_padding_bytes, 0, zero_bytes);
         ret = bdrv_aligned_pwritev(bs, req, offset & ~(align - 1), align,
-                                   &local_qiov,
+                                   align, &local_qiov,
                                    flags & ~BDRV_REQ_ZERO_WRITE);
         if (ret < 0) {
             goto fail;
@@ -1362,7 +1365,7 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BlockDriverState *bs,
     if (bytes >= align) {
         /* Write the aligned part in the middle. */
         uint64_t aligned_bytes = bytes & ~(align - 1);
-        ret = bdrv_aligned_pwritev(bs, req, offset, aligned_bytes,
+        ret = bdrv_aligned_pwritev(bs, req, offset, aligned_bytes, align,
                                    NULL, flags);
         if (ret < 0) {
             goto fail;
@@ -1386,7 +1389,7 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BlockDriverState *bs,
         bdrv_debug_event(bs, BLKDBG_PWRITEV_RMW_AFTER_TAIL);

         memset(buf, 0, bytes);
-        ret = bdrv_aligned_pwritev(bs, req, offset, align,
+        ret = bdrv_aligned_pwritev(bs, req, offset, align, align,
                                    &local_qiov, flags & ~BDRV_REQ_ZERO_WRITE);
     }
 fail:
@@ -1511,7 +1514,7 @@ int coroutine_fn bdrv_co_pwritev(BlockDriverState *bs,
         bytes = ROUND_UP(bytes, align);
     }

-    ret = bdrv_aligned_pwritev(bs, &req, offset, bytes,
+    ret = bdrv_aligned_pwritev(bs, &req, offset, bytes, align,
                                use_local_qiov ? &local_qiov : qiov,
                                flags);

-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 02/22] block: Document supported flags during bdrv_aligned_preadv()
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 01/22] block: Tighter assertions on bdrv_aligned_pwritev() Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 03/22] block: Fix harmless off-by-one in bdrv_aligned_preadv() Eric Blake
                   ` (20 subsequent siblings)
  22 siblings, 0 replies; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, famz, stefanha, Max Reitz

We don't pass any flags on to drivers to handle.  Tighten an
assert to explain why we pass 0 to bdrv_driver_preadv(), and add
some comments on things to be aware of if we want to turn on
per-BDS BDRV_REQ_FUA support during reads in the future.  Also,
document that we may want to consider using unmap during
copy-on-read operations where the read is all zeroes.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

---
v3: no change
v2: retitle from 'Honor flags during bdrv_aligned_preadv()', and
change scope of patch
---
 block/io.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index b95e856..994d3fa 100644
--- a/block/io.c
+++ b/block/io.c
@@ -945,6 +945,9 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BlockDriverState *bs,

     if (drv->bdrv_co_pwrite_zeroes &&
         buffer_is_zero(bounce_buffer, iov.iov_len)) {
+        /* FIXME: Should we (perhaps conditionally) be setting
+         * BDRV_REQ_MAY_UNMAP, if it will allow for a sparser copy
+         * that still correctly reads as zero? */
         ret = bdrv_co_do_pwrite_zeroes(bs, cluster_offset, cluster_bytes, 0);
     } else {
         /* This does not change the data on the disk, it is not necessary
@@ -987,7 +990,12 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
     assert((bytes & (align - 1)) == 0);
     assert(!qiov || bytes == qiov->size);
     assert((bs->open_flags & BDRV_O_NO_IO) == 0);
-    assert(!(flags & ~BDRV_REQ_MASK));
+
+    /* TODO: We would need a per-BDS .supported_read_flags and
+     * potential fallback support, if we ever implement any read flags
+     * to pass through to drivers.  For now, there aren't any
+     * passthrough flags.  */
+    assert(!(flags & ~(BDRV_REQ_NO_SERIALISING | BDRV_REQ_COPY_ON_READ)));

     /* Handle Copy on Read and associated serialisation */
     if (flags & BDRV_REQ_COPY_ON_READ) {
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 03/22] block: Fix harmless off-by-one in bdrv_aligned_preadv()
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 01/22] block: Tighter assertions on bdrv_aligned_pwritev() Eric Blake
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 02/22] block: Document supported flags during bdrv_aligned_preadv() Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 04/22] nbd: Allow larger requests Eric Blake
                   ` (19 subsequent siblings)
  22 siblings, 0 replies; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, famz, stefanha, Max Reitz

If the amount of data to read ends exactly on the total size
of the bs, then we were wasting time creating a local qiov
to read the data in preparation for what would normally be
appending zeroes beyond the end, even though this corner case
has nothing further to do.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

---
v3: no change
v2: no change, add R-b
---
 block/io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index 994d3fa..82c9ff0 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1036,7 +1036,7 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
     }

     max_bytes = ROUND_UP(MAX(0, total_bytes - offset), align);
-    if (bytes < max_bytes) {
+    if (bytes <= max_bytes) {
         ret = bdrv_driver_preadv(bs, offset, bytes, qiov, 0);
     } else if (max_bytes > 0) {
         QEMUIOVector local_qiov;
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 04/22] nbd: Allow larger requests
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (2 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 03/22] block: Fix harmless off-by-one in bdrv_aligned_preadv() Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 05/22] nbd: Advertise realistic limits to block layer Eric Blake
                   ` (18 subsequent siblings)
  22 siblings, 0 replies; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, kwolf, famz, stefanha, qemu-stable, Paolo Bonzini, Max Reitz

The NBD layer was breaking up request at a limit of 2040 sectors
(just under 1M) to cater to old qemu-nbd. But the server limit
was raised to 32M in commit 2d8214885 to match the kernel, more
than three years ago; and the upstream NBD Protocol is proposing
documentation that without any explicit communication to state
otherwise, a client should be able to safely assume that a 32M
transaction will work.  It is time to rely on the larger sizing,
and any downstream distro that cares about maximum
interoperability to older qemu-nbd servers can just tweak the
value of #define NBD_MAX_SECTORS.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

---
v3: rebase to master, cc qemu-stable
v2: new patch
---
 include/block/nbd.h | 2 ++
 block/nbd-client.c  | 4 ----
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index df1f804..eeda3eb 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -77,6 +77,8 @@ enum {

 /* Maximum size of a single READ/WRITE data buffer */
 #define NBD_MAX_BUFFER_SIZE (32 * 1024 * 1024)
+#define NBD_MAX_SECTORS (NBD_MAX_BUFFER_SIZE / BDRV_SECTOR_SIZE)
+
 /* Maximum size of an export name. The NBD spec requires 256 and
  * suggests that servers support up to 4096, but we stick to only the
  * required size so that we can stack-allocate the names, and because
diff --git a/block/nbd-client.c b/block/nbd-client.c
index 4d13444..420bce8 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -269,10 +269,6 @@ static int nbd_co_writev_1(BlockDriverState *bs, int64_t sector_num,
     return -reply.error;
 }

-/* qemu-nbd has a limit of slightly less than 1M per request.  Try to
- * remain aligned to 4K. */
-#define NBD_MAX_SECTORS 2040
-
 int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
                         int nb_sectors, QEMUIOVector *qiov)
 {
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 05/22] nbd: Advertise realistic limits to block layer
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (3 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 04/22] nbd: Allow larger requests Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 06/22] iscsi: " Eric Blake
                   ` (17 subsequent siblings)
  22 siblings, 0 replies; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, famz, stefanha, Paolo Bonzini, Max Reitz

We were basing the advertisement of maximum discard and transfer
length off of UINT32_MAX, but since the rest of the block layer
has signed int limits on a transaction, nothing could ever reach
that maximum, and we risk overflowing an int once things are
converted to byte-based rather than sector-based limits.  What's
more, we DO have a much smaller limit: both the current kernel
and qemu-nbd have a hard limit of 32M on a read or write
transaction, and while they may also permit up to a full 32 bits
on a discard transaction, the upstream NBD protocol is proposing
wording that without any explicit advertisement otherwise,
clients should limit ALL requests to the same limits as read and
write, even though the other requests do not actually require as
many bytes across the wire.  So the better limit to tell the
block layer is 32M for both values.

Behavior doesn't actually change with this patch (the block layer
is currently ignoring the max_transfer advertisements); but when
that problem is fixed in a later series, this patch will prevent
the exposure of a latent bug.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

---
v3: enhance commit message
v2: new patch
---
 block/nbd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 6015e8b..bf67c8a 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -362,8 +362,8 @@ static int nbd_co_flush(BlockDriverState *bs)

 static void nbd_refresh_limits(BlockDriverState *bs, Error **errp)
 {
-    bs->bl.max_discard = UINT32_MAX >> BDRV_SECTOR_BITS;
-    bs->bl.max_transfer_length = UINT32_MAX >> BDRV_SECTOR_BITS;
+    bs->bl.max_discard = NBD_MAX_SECTORS;
+    bs->bl.max_transfer_length = NBD_MAX_SECTORS;
 }

 static int nbd_co_discard(BlockDriverState *bs, int64_t sector_num,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 06/22] iscsi: Advertise realistic limits to block layer
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (4 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 05/22] nbd: Advertise realistic limits to block layer Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 07/22] scsi: Advertise limits by blocksize, not 512 Eric Blake
                   ` (16 subsequent siblings)
  22 siblings, 0 replies; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, kwolf, famz, stefanha, Ronnie Sahlberg,
	Paolo Bonzini, Peter Lieven, Max Reitz

The function sector_limits_lun2qemu() returns a value in units of
the block layer's 512-byte sector, and can be as large as
0x40000000, which is much larger than the block layer's inherent
limit of BDRV_REQUEST_MAX_SECTORS.  The block layer already
handles '0' as a synonym to the inherent limit, and it is nicer
to return this value than it is to calculate an arbitrary
maximum, for two reasons: we want to ensure that the block layer
continues to special-case '0' as 'no limit beyond the inherent
limits'; and we want to be able to someday expand the block
layer to allow 64-bit limits, where auditing for uses of
BDRV_REQUEST_MAX_SECTORS will help us make sure we aren't
artificially constraining iscsi to old block layer limits.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

---
v3: no change
v2: new patch
---
 block/iscsi.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 7e78ade..4290e41 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1697,7 +1697,9 @@ static void iscsi_close(BlockDriverState *bs)

 static int sector_limits_lun2qemu(int64_t sector, IscsiLun *iscsilun)
 {
-    return MIN(sector_lun2qemu(sector, iscsilun), INT_MAX / 2 + 1);
+    int limit = MIN(sector_lun2qemu(sector, iscsilun), INT_MAX / 2 + 1);
+
+    return limit < BDRV_REQUEST_MAX_SECTORS ? limit : 0;
 }

 static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 07/22] scsi: Advertise limits by blocksize, not 512
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (5 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 06/22] iscsi: " Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-24  5:22   ` Fam Zheng
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 08/22] block: Give nonzero result to blk_get_max_transfer_length() Eric Blake
                   ` (15 subsequent siblings)
  22 siblings, 1 reply; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, famz, stefanha, qemu-stable, Paolo Bonzini

s->blocksize may be larger than 512, in which case our
tweaks to max_xfer_len and opt_xfer_len must be scaled
appropriately.

Reported-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
CC: qemu-stable@nongnu.org

---
v3: new patch
---
 hw/scsi/scsi-generic.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 6a2d89a..75e227d 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -225,7 +225,8 @@ static void scsi_read_complete(void * opaque, int ret)
     if (s->type == TYPE_DISK &&
         r->req.cmd.buf[0] == INQUIRY &&
         r->req.cmd.buf[2] == 0xb0) {
-        uint32_t max_xfer_len = blk_get_max_transfer_length(s->conf.blk);
+        uint32_t max_xfer_len = blk_get_max_transfer_length(s->conf.blk) /
+            (s->blocksize / BDRV_SECTOR_SIZE);
         if (max_xfer_len) {
             stl_be_p(&r->buf[8], max_xfer_len);
             /* Also take care of the opt xfer len. */
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 08/22] block: Give nonzero result to blk_get_max_transfer_length()
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (6 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 07/22] scsi: Advertise limits by blocksize, not 512 Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-24  5:24   ` Fam Zheng
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 09/22] blkdebug: Set request_alignment during .bdrv_refresh_limits() Eric Blake
                   ` (14 subsequent siblings)
  22 siblings, 1 reply; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, kwolf, famz, stefanha, Max Reitz, Michael S. Tsirkin,
	Paolo Bonzini

Making all callers special-case 0 as unlimited is awkward,
and we DO have a hard maximum of BDRV_REQUEST_MAX_SECTORS given
our current block layer API limits.

In the case of scsi, this means that we now always advertise a
limit to the guest, even in cases where the underlying layers
previously use 0 for no inherent limit beyond the block layer.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>

---
v3: rebase to scsi limits fix
v2: new patch
---
 block/block-backend.c  |  7 ++++---
 hw/block/virtio-blk.c  |  3 +--
 hw/scsi/scsi-generic.c | 12 ++++++------
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 34500e6..1fb070b 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1303,15 +1303,16 @@ int blk_get_flags(BlockBackend *blk)
     }
 }

+/* Returns the maximum transfer length, in sectors; guaranteed nonzero */
 int blk_get_max_transfer_length(BlockBackend *blk)
 {
     BlockDriverState *bs = blk_bs(blk);
+    int max = 0;

     if (bs) {
-        return bs->bl.max_transfer_length;
-    } else {
-        return 0;
+        max = bs->bl.max_transfer_length;
     }
+    return MIN_NON_ZERO(max, BDRV_REQUEST_MAX_SECTORS);
 }

 int blk_get_max_iov(BlockBackend *blk)
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 284e646..1d2792e 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -382,7 +382,7 @@ static int multireq_compare(const void *a, const void *b)
 void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb)
 {
     int i = 0, start = 0, num_reqs = 0, niov = 0, nb_sectors = 0;
-    int max_xfer_len = 0;
+    int max_xfer_len;
     int64_t sector_num = 0;

     if (mrb->num_reqs == 1) {
@@ -392,7 +392,6 @@ void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb)
     }

     max_xfer_len = blk_get_max_transfer_length(mrb->reqs[0]->dev->blk);
-    max_xfer_len = MIN_NON_ZERO(max_xfer_len, BDRV_REQUEST_MAX_SECTORS);

     qsort(mrb->reqs, mrb->num_reqs, sizeof(*mrb->reqs),
           &multireq_compare);
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 75e227d..0cb8568 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -227,12 +227,12 @@ static void scsi_read_complete(void * opaque, int ret)
         r->req.cmd.buf[2] == 0xb0) {
         uint32_t max_xfer_len = blk_get_max_transfer_length(s->conf.blk) /
             (s->blocksize / BDRV_SECTOR_SIZE);
-        if (max_xfer_len) {
-            stl_be_p(&r->buf[8], max_xfer_len);
-            /* Also take care of the opt xfer len. */
-            if (ldl_be_p(&r->buf[12]) > max_xfer_len) {
-                stl_be_p(&r->buf[12], max_xfer_len);
-            }
+
+        assert(max_xfer_len);
+        stl_be_p(&r->buf[8], max_xfer_len);
+        /* Also take care of the opt xfer len. */
+        if (ldl_be_p(&r->buf[12]) > max_xfer_len) {
+            stl_be_p(&r->buf[12], max_xfer_len);
         }
     }
     scsi_req_data(&r->req, len);
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 09/22] blkdebug: Set request_alignment during .bdrv_refresh_limits()
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (7 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 08/22] block: Give nonzero result to blk_get_max_transfer_length() Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-24  5:42   ` Fam Zheng
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 10/22] iscsi: " Eric Blake
                   ` (13 subsequent siblings)
  22 siblings, 1 reply; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, kwolf, famz, stefanha, Max Reitz, Markus Armbruster

We want to eventually stick request_alignment alongside other
BlockLimits, but first, we must ensure it is populated at the
same time as all other limits, rather than being a special case
that is set only when a block is first opened.

Note that when the user does not provide "align", then we were
defaulting to bs->request_alignment - but at this stage in the
initialization, that was always 512.  We were also rejecting an
explicit "align":0 from the user; this patch now allows that,
as an explicit request for the default alignment (which may not
always be 512 in the future).

qemu-iotests 77 is particularly sensitive to the fact that we
can specify an artificial alignment override in blkdebug, and
that override must continue to work even when limits are
refreshed on an already open device.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v3: rework to allow user to specify "align":0 for default
v2: new patch
---
 qapi/block-core.json |  3 ++-
 block/blkdebug.c     | 19 +++++++++++++++----
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 98a20d2..ac8f5f6 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1961,7 +1961,8 @@
 #
 # @config:          #optional filename of the configuration file
 #
-# @align:           #optional required alignment for requests in bytes
+# @align:           #optional required alignment for requests in bytes,
+#                   must be power of 2, or 0 for default
 #
 # @inject-error:    #optional array of error injection descriptions
 #
diff --git a/block/blkdebug.c b/block/blkdebug.c
index 20d25bd..54b6870 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -37,6 +37,7 @@
 typedef struct BDRVBlkdebugState {
     int state;
     int new_state;
+    int align;

     QLIST_HEAD(, BlkdebugRule) rules[BLKDBG__MAX];
     QSIMPLEQ_HEAD(, BlkdebugRule) active_rules;
@@ -382,10 +383,10 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
     }

     /* Set request alignment */
-    align = qemu_opt_get_size(opts, "align", bs->request_alignment);
-    if (align > 0 && align < INT_MAX && !(align & (align - 1))) {
-        bs->request_alignment = align;
-    } else {
+    align = qemu_opt_get_size(opts, "align", 0);
+    if (align < INT_MAX && is_power_of_2(align)) {
+        s->align = align;
+    } else if (align) {
         error_setg(errp, "Invalid alignment");
         ret = -EINVAL;
         goto fail_unref;
@@ -720,6 +721,15 @@ static void blkdebug_refresh_filename(BlockDriverState *bs, QDict *options)
     bs->full_open_options = opts;
 }

+static void blkdebug_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    BDRVBlkdebugState *s = bs->opaque;
+
+    if (s->align) {
+        bs->request_alignment = s->align;
+    }
+}
+
 static int blkdebug_reopen_prepare(BDRVReopenState *reopen_state,
                                    BlockReopenQueue *queue, Error **errp)
 {
@@ -738,6 +748,7 @@ static BlockDriver bdrv_blkdebug = {
     .bdrv_getlength         = blkdebug_getlength,
     .bdrv_truncate          = blkdebug_truncate,
     .bdrv_refresh_filename  = blkdebug_refresh_filename,
+    .bdrv_refresh_limits    = blkdebug_refresh_limits,

     .bdrv_aio_readv         = blkdebug_aio_readv,
     .bdrv_aio_writev        = blkdebug_aio_writev,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 10/22] iscsi: Set request_alignment during .bdrv_refresh_limits()
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (8 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 09/22] blkdebug: Set request_alignment during .bdrv_refresh_limits() Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 11/22] qcow2: " Eric Blake
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, kwolf, famz, stefanha, Ronnie Sahlberg,
	Paolo Bonzini, Peter Lieven, Max Reitz

We want to eventually stick request_alignment alongside other
BlockLimits, but first, we must ensure it is populated at the
same time as all other limits, rather than being a special case
that is set only when a block is first opened.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>

---
v3: no change
v2: new patch
---
 block/iscsi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 4290e41..b4661c9 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1588,7 +1588,6 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags,
         goto out;
     }
     bs->total_sectors = sector_lun2qemu(iscsilun->num_blocks, iscsilun);
-    bs->request_alignment = iscsilun->block_size;

     /* We don't have any emulation for devices other than disks and CD-ROMs, so
      * this must be sg ioctl compatible. We force it to be sg, otherwise qemu
@@ -1710,6 +1709,8 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
     IscsiLun *iscsilun = bs->opaque;
     uint32_t max_xfer_len = iscsilun->use_16_for_rw ? 0xffffffff : 0xffff;

+    bs->request_alignment = iscsilun->block_size;
+
     if (iscsilun->bl.max_xfer_len) {
         max_xfer_len = MIN(max_xfer_len, iscsilun->bl.max_xfer_len);
     }
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 11/22] qcow2: Set request_alignment during .bdrv_refresh_limits()
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (9 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 10/22] iscsi: " Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 12/22] raw-win32: " Eric Blake
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, famz, stefanha, Max Reitz

We want to eventually stick request_alignment alongside other
BlockLimits, but first, we must ensure it is populated at the
same time as all other limits, rather than being a special case
that is set only when a block is first opened.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>

---
v3: no change
v2: new patch
---
 block/qcow2.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 23f666d..48f80b6 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -981,9 +981,6 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
         }

         bs->encrypted = 1;
-
-        /* Encryption works on a sector granularity */
-        bs->request_alignment = BDRV_SECTOR_SIZE;
     }

     s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */
@@ -1202,6 +1199,10 @@ static void qcow2_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     BDRVQcow2State *s = bs->opaque;

+    if (bs->encrypted) {
+        /* Encryption works on a sector granularity */
+        bs->request_alignment = BDRV_SECTOR_SIZE;
+    }
     bs->bl.pwrite_zeroes_alignment = s->cluster_size;
 }

-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 12/22] raw-win32: Set request_alignment during .bdrv_refresh_limits()
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (10 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 11/22] qcow2: " Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 13/22] block: " Eric Blake
                   ` (10 subsequent siblings)
  22 siblings, 0 replies; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, famz, stefanha, Max Reitz

We want to eventually stick request_alignment alongside other
BlockLimits, but first, we must ensure it is populated at the
same time as all other limits, rather than being a special case
that is set only when a block is first opened.

In this case, raw_probe_alignment() already did what we needed,
so just fix its signature and wire it in correctly.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>

---
v3: no change
v2: new patch
---
 block/raw-win32.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/raw-win32.c b/block/raw-win32.c
index fd23891..88382d9 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -222,7 +222,7 @@ static void raw_attach_aio_context(BlockDriverState *bs,
     }
 }

-static void raw_probe_alignment(BlockDriverState *bs)
+static void raw_probe_alignment(BlockDriverState *bs, Error **errp)
 {
     BDRVRawState *s = bs->opaque;
     DWORD sectorsPerCluster, freeClusters, totalClusters, count;
@@ -365,7 +365,6 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
         win32_aio_attach_aio_context(s->aio, bdrv_get_aio_context(bs));
     }

-    raw_probe_alignment(bs);
     ret = 0;
 fail:
     qemu_opts_del(opts);
@@ -550,6 +549,7 @@ BlockDriver bdrv_file = {
     .bdrv_needs_filename = true,
     .bdrv_parse_filename = raw_parse_filename,
     .bdrv_file_open     = raw_open,
+    .bdrv_refresh_limits = raw_probe_alignment,
     .bdrv_close         = raw_close,
     .bdrv_create        = raw_create,
     .bdrv_has_zero_init = bdrv_has_zero_init_1,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 13/22] block: Set request_alignment during .bdrv_refresh_limits()
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (11 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 12/22] raw-win32: " Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 14/22] block: Set default request_alignment during bdrv_refresh_limits() Eric Blake
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, famz, stefanha, Max Reitz

We want to eventually stick request_alignment alongside other
BlockLimits, but first, we must ensure it is populated at the
same time as all other limits, rather than being a special case
that is set only when a block is first opened.

Add a .bdrv_refresh_limits() to all four of our legacy devices
that will always be sector-only (bochs, cloop, dmg, vvfat), in
spite of their recent conversion to expose a byte interface.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>

---
v3: no change
v2: new patch
---
 block/bochs.c | 7 ++++++-
 block/cloop.c | 7 ++++++-
 block/dmg.c   | 7 ++++++-
 block/vvfat.c | 7 ++++++-
 4 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/block/bochs.c b/block/bochs.c
index 6c8d0f3..182c50b 100644
--- a/block/bochs.c
+++ b/block/bochs.c
@@ -105,7 +105,6 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
     int ret;

     bs->read_only = 1; // no write support yet
-    bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */

     ret = bdrv_pread(bs->file->bs, 0, &bochs, sizeof(bochs));
     if (ret < 0) {
@@ -189,6 +188,11 @@ fail:
     return ret;
 }

+static void bochs_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */
+}
+
 static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
 {
     BDRVBochsState *s = bs->opaque;
@@ -283,6 +287,7 @@ static BlockDriver bdrv_bochs = {
     .instance_size	= sizeof(BDRVBochsState),
     .bdrv_probe		= bochs_probe,
     .bdrv_open		= bochs_open,
+    .bdrv_refresh_limits = bochs_refresh_limits,
     .bdrv_co_preadv = bochs_co_preadv,
     .bdrv_close		= bochs_close,
 };
diff --git a/block/cloop.c b/block/cloop.c
index ea5a92b..d574003 100644
--- a/block/cloop.c
+++ b/block/cloop.c
@@ -67,7 +67,6 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
     int ret;

     bs->read_only = 1;
-    bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */

     /* read header */
     ret = bdrv_pread(bs->file->bs, 128, &s->block_size, 4);
@@ -199,6 +198,11 @@ fail:
     return ret;
 }

+static void cloop_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */
+}
+
 static inline int cloop_read_block(BlockDriverState *bs, int block_num)
 {
     BDRVCloopState *s = bs->opaque;
@@ -280,6 +284,7 @@ static BlockDriver bdrv_cloop = {
     .instance_size  = sizeof(BDRVCloopState),
     .bdrv_probe     = cloop_probe,
     .bdrv_open      = cloop_open,
+    .bdrv_refresh_limits = cloop_refresh_limits,
     .bdrv_co_preadv = cloop_co_preadv,
     .bdrv_close     = cloop_close,
 };
diff --git a/block/dmg.c b/block/dmg.c
index 06eb513..1e53cd8 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -439,7 +439,6 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
     int ret;

     bs->read_only = 1;
-    bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */

     s->n_chunks = 0;
     s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
@@ -547,6 +546,11 @@ fail:
     return ret;
 }

+static void dmg_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */
+}
+
 static inline int is_sector_in_chunk(BDRVDMGState* s,
                 uint32_t chunk_num, uint64_t sector_num)
 {
@@ -720,6 +724,7 @@ static BlockDriver bdrv_dmg = {
     .instance_size  = sizeof(BDRVDMGState),
     .bdrv_probe     = dmg_probe,
     .bdrv_open      = dmg_open,
+    .bdrv_refresh_limits = dmg_refresh_limits,
     .bdrv_co_preadv = dmg_co_preadv,
     .bdrv_close     = dmg_close,
 };
diff --git a/block/vvfat.c b/block/vvfat.c
index 5569450..4d44636 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1177,7 +1177,6 @@ static int vvfat_open(BlockDriverState *bs, QDict *options, int flags,
         bs->read_only = 0;
     }

-    bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */
     bs->total_sectors = cyls * heads * secs;

     if (init_directories(s, dirname, heads, secs, errp)) {
@@ -1209,6 +1208,11 @@ fail:
     return ret;
 }

+static void vvfat_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */
+}
+
 static inline void vvfat_close_current_file(BDRVVVFATState *s)
 {
     if(s->current_mapping) {
@@ -3046,6 +3050,7 @@ static BlockDriver bdrv_vvfat = {

     .bdrv_parse_filename    = vvfat_parse_filename,
     .bdrv_file_open         = vvfat_open,
+    .bdrv_refresh_limits    = vvfat_refresh_limits,
     .bdrv_close             = vvfat_close,

     .bdrv_co_preadv         = vvfat_co_preadv,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 14/22] block: Set default request_alignment during bdrv_refresh_limits()
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (12 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 13/22] block: " Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 15/22] block: Switch transfer length bounds to byte-based Eric Blake
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, famz, stefanha, Max Reitz

We want to eventually stick request_alignment alongside other
BlockLimits, but first, we must ensure it is populated at the
same time as all other limits, rather than being a special case
that is set only when a block is first opened.

Now that all drivers have been updated to supply an override
of request_alignment during their .bdrv_refresh_limits(), as
needed, the block layer itself can defer setting the default
alignment until part of the overall bdrv_refresh_limits().

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>

---
v3: no change
v2: new patch
---
 block.c    | 1 -
 block/io.c | 3 +++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index f4648e9..c2fbf06 100644
--- a/block.c
+++ b/block.c
@@ -937,7 +937,6 @@ static int bdrv_open_common(BlockDriverState *bs, BdrvChild *file,
         goto fail_opts;
     }

-    bs->request_alignment = drv->bdrv_co_preadv ? 1 : 512;
     bs->read_only = !(bs->open_flags & BDRV_O_RDWR);

     if (use_bdrv_whitelist && !bdrv_is_whitelisted(drv, bs->read_only)) {
diff --git a/block/io.c b/block/io.c
index 82c9ff0..323e822 100644
--- a/block/io.c
+++ b/block/io.c
@@ -78,6 +78,9 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
         return;
     }

+    /* Default alignment based on whether driver has byte interface */
+    bs->request_alignment = drv->bdrv_co_preadv ? 1 : 512;
+
     /* Take some limits from the children as a default */
     if (bs->file) {
         bdrv_refresh_limits(bs->file->bs, &local_err);
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 15/22] block: Switch transfer length bounds to byte-based
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (13 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 14/22] block: Set default request_alignment during bdrv_refresh_limits() Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-24  6:06   ` Fam Zheng
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 16/22] block: Wording tweaks to write zeroes limits Eric Blake
                   ` (7 subsequent siblings)
  22 siblings, 1 reply; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, kwolf, famz, stefanha, Max Reitz, Ronnie Sahlberg,
	Paolo Bonzini, Peter Lieven, Michael S. Tsirkin

Sector-based limits are awkward to think about; in our on-going
quest to move to byte-based interfaces, convert max_transfer_length
and opt_transfer_length.  Rename them (dropping the _length suffix)
so that the compiler will help us catch the change in semantics
across any rebased code, and improve the documentation.  Use unsigned
values, so that we don't have to worry about negative values and
so that bit-twiddling is easier; however, we are still constrained
by 2^31 of signed int in most APIs.

When a value comes from an external source (iscsi and raw-posix),
sanitize the results to ensure that opt_transfer is a power of 2.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v3: rebase to scsi limits fix, sanitize rather than assert on external
values
v2: Hoist unrelated cleanups earlier, use name 'max_transfer' in more
places, tweak iscsi calculations
---
 include/block/block_int.h      | 11 +++++++----
 include/sysemu/block-backend.h |  2 +-
 block/block-backend.c          | 10 +++++-----
 block/io.c                     | 23 +++++++++++------------
 block/iscsi.c                  | 23 +++++++++++++++--------
 block/nbd.c                    |  2 +-
 block/raw-posix.c              |  4 ++--
 hw/block/virtio-blk.c          |  9 +++++----
 hw/scsi/scsi-generic.c         | 12 ++++++------
 qemu-img.c                     |  8 ++++----
 10 files changed, 57 insertions(+), 47 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 2057156..7d2b152 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -338,11 +338,14 @@ typedef struct BlockLimits {
      * power of 2, and less than max_pwrite_zeroes if that is set */
     uint32_t pwrite_zeroes_alignment;

-    /* optimal transfer length in sectors */
-    int opt_transfer_length;
+    /* optimal transfer length in bytes (must be power of 2, and
+     * multiple of bs->request_alignment), or 0 if no preferred size */
+    uint32_t opt_transfer;

-    /* maximal transfer length in sectors */
-    int max_transfer_length;
+    /* maximal transfer length in bytes (need not be power of 2, but
+     * should be multiple of opt_transfer), or 0 for no 32-bit limit.
+     * For now, anything larger than INT_MAX is clamped down. */
+    uint32_t max_transfer;

     /* memory alignment so that no bounce buffer is needed */
     size_t min_mem_alignment;
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index c04af8e..2469a1c 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -170,7 +170,7 @@ bool blk_is_available(BlockBackend *blk);
 void blk_lock_medium(BlockBackend *blk, bool locked);
 void blk_eject(BlockBackend *blk, bool eject_flag);
 int blk_get_flags(BlockBackend *blk);
-int blk_get_max_transfer_length(BlockBackend *blk);
+uint32_t blk_get_max_transfer(BlockBackend *blk);
 int blk_get_max_iov(BlockBackend *blk);
 void blk_set_guest_block_size(BlockBackend *blk, int align);
 void *blk_try_blockalign(BlockBackend *blk, size_t size);
diff --git a/block/block-backend.c b/block/block-backend.c
index 1fb070b..e042544 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1303,16 +1303,16 @@ int blk_get_flags(BlockBackend *blk)
     }
 }

-/* Returns the maximum transfer length, in sectors; guaranteed nonzero */
-int blk_get_max_transfer_length(BlockBackend *blk)
+/* Returns the maximum transfer length, in bytes; guaranteed nonzero */
+uint32_t blk_get_max_transfer(BlockBackend *blk)
 {
     BlockDriverState *bs = blk_bs(blk);
-    int max = 0;
+    uint32_t max = 0;

     if (bs) {
-        max = bs->bl.max_transfer_length;
+        max = bs->bl.max_transfer;
     }
-    return MIN_NON_ZERO(max, BDRV_REQUEST_MAX_SECTORS);
+    return MIN_NON_ZERO(max, INT_MAX);
 }

 int blk_get_max_iov(BlockBackend *blk)
diff --git a/block/io.c b/block/io.c
index 323e822..8ca9d43 100644
--- a/block/io.c
+++ b/block/io.c
@@ -88,8 +88,8 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
             error_propagate(errp, local_err);
             return;
         }
-        bs->bl.opt_transfer_length = bs->file->bs->bl.opt_transfer_length;
-        bs->bl.max_transfer_length = bs->file->bs->bl.max_transfer_length;
+        bs->bl.opt_transfer = bs->file->bs->bl.opt_transfer;
+        bs->bl.max_transfer = bs->file->bs->bl.max_transfer;
         bs->bl.min_mem_alignment = bs->file->bs->bl.min_mem_alignment;
         bs->bl.opt_mem_alignment = bs->file->bs->bl.opt_mem_alignment;
         bs->bl.max_iov = bs->file->bs->bl.max_iov;
@@ -107,12 +107,10 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
             error_propagate(errp, local_err);
             return;
         }
-        bs->bl.opt_transfer_length =
-            MAX(bs->bl.opt_transfer_length,
-                bs->backing->bs->bl.opt_transfer_length);
-        bs->bl.max_transfer_length =
-            MIN_NON_ZERO(bs->bl.max_transfer_length,
-                         bs->backing->bs->bl.max_transfer_length);
+        bs->bl.opt_transfer = MAX(bs->bl.opt_transfer,
+                                  bs->backing->bs->bl.opt_transfer);
+        bs->bl.max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
+                                           bs->backing->bs->bl.max_transfer);
         bs->bl.opt_mem_alignment =
             MAX(bs->bl.opt_mem_alignment,
                 bs->backing->bs->bl.opt_mem_alignment);
@@ -1156,7 +1154,8 @@ int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num,
     return bdrv_co_do_readv(bs, sector_num, nb_sectors, qiov, 0);
 }

-#define MAX_WRITE_ZEROES_BOUNCE_BUFFER 32768
+/* Maximum buffer for write zeroes fallback, in bytes */
+#define MAX_WRITE_ZEROES_BOUNCE_BUFFER (32768 << BDRV_SECTOR_BITS)

 static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
     int64_t offset, int count, BdrvRequestFlags flags)
@@ -1214,7 +1213,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,

         if (ret == -ENOTSUP) {
             /* Fall back to bounce buffer if write zeroes is unsupported */
-            int max_xfer_len = MIN_NON_ZERO(bs->bl.max_transfer_length,
+            int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
                                             MAX_WRITE_ZEROES_BOUNCE_BUFFER);
             BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;

@@ -1225,7 +1224,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
                 write_flags &= ~BDRV_REQ_FUA;
                 need_flush = true;
             }
-            num = MIN(num, max_xfer_len << BDRV_SECTOR_BITS);
+            num = MIN(num, max_transfer);
             iov.iov_len = num;
             if (iov.iov_base == NULL) {
                 iov.iov_base = qemu_try_blockalign(bs, num);
@@ -1242,7 +1241,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
             /* Keep bounce buffer around if it is big enough for all
              * all future requests.
              */
-            if (num < max_xfer_len << BDRV_SECTOR_BITS) {
+            if (num < max_transfer) {
                 qemu_vfree(iov.iov_base);
                 iov.iov_base = NULL;
             }
diff --git a/block/iscsi.c b/block/iscsi.c
index b4661c9..368687d 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -473,9 +473,10 @@ iscsi_co_writev_flags(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
         return -EINVAL;
     }

-    if (bs->bl.max_transfer_length && nb_sectors > bs->bl.max_transfer_length) {
+    if (bs->bl.max_transfer &&
+        nb_sectors << BDRV_SECTOR_BITS > bs->bl.max_transfer) {
         error_report("iSCSI Error: Write of %d sectors exceeds max_xfer_len "
-                     "of %d sectors", nb_sectors, bs->bl.max_transfer_length);
+                     "of %" PRIu32 " bytes", nb_sectors, bs->bl.max_transfer);
         return -EINVAL;
     }

@@ -650,9 +651,10 @@ static int coroutine_fn iscsi_co_readv(BlockDriverState *bs,
         return -EINVAL;
     }

-    if (bs->bl.max_transfer_length && nb_sectors > bs->bl.max_transfer_length) {
+    if (bs->bl.max_transfer &&
+        nb_sectors << BDRV_SECTOR_BITS > bs->bl.max_transfer) {
         error_report("iSCSI Error: Read of %d sectors exceeds max_xfer_len "
-                     "of %d sectors", nb_sectors, bs->bl.max_transfer_length);
+                     "of %" PRIu32 " bytes", nb_sectors, bs->bl.max_transfer);
         return -EINVAL;
     }

@@ -1707,7 +1709,7 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
      * iscsi_open(): iscsi targets don't change their limits. */

     IscsiLun *iscsilun = bs->opaque;
-    uint32_t max_xfer_len = iscsilun->use_16_for_rw ? 0xffffffff : 0xffff;
+    uint64_t max_xfer_len = iscsilun->use_16_for_rw ? 0xffffffff : 0xffff;

     bs->request_alignment = iscsilun->block_size;

@@ -1715,7 +1717,9 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
         max_xfer_len = MIN(max_xfer_len, iscsilun->bl.max_xfer_len);
     }

-    bs->bl.max_transfer_length = sector_limits_lun2qemu(max_xfer_len, iscsilun);
+    if (max_xfer_len * iscsilun->block_size < INT_MAX) {
+        bs->bl.max_transfer = max_xfer_len * iscsilun->block_size;
+    }

     if (iscsilun->lbp.lbpu) {
         if (iscsilun->bl.max_unmap < 0xffffffff) {
@@ -1738,8 +1742,11 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
     } else {
         bs->bl.pwrite_zeroes_alignment = iscsilun->block_size;
     }
-    bs->bl.opt_transfer_length =
-        sector_limits_lun2qemu(iscsilun->bl.opt_xfer_len, iscsilun);
+    if (iscsilun->bl.opt_xfer_len &&
+        iscsilun->bl.opt_xfer_len < INT_MAX / iscsilun->block_size) {
+        bs->bl.opt_transfer = pow2floor(iscsilun->bl.opt_xfer_len *
+                                        iscsilun->block_size);
+    }
 }

 /* Note that this will not re-establish a connection with an iSCSI target - it
diff --git a/block/nbd.c b/block/nbd.c
index bf67c8a..f5511ea 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -363,7 +363,7 @@ static int nbd_co_flush(BlockDriverState *bs)
 static void nbd_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     bs->bl.max_discard = NBD_MAX_SECTORS;
-    bs->bl.max_transfer_length = NBD_MAX_SECTORS;
+    bs->bl.max_transfer = NBD_MAX_BUFFER_SIZE;
 }

 static int nbd_co_discard(BlockDriverState *bs, int64_t sector_num,
diff --git a/block/raw-posix.c b/block/raw-posix.c
index bef7a67..8da2f94 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -745,8 +745,8 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
     if (!fstat(s->fd, &st)) {
         if (S_ISBLK(st.st_mode)) {
             int ret = hdev_get_max_transfer_length(s->fd);
-            if (ret >= 0) {
-                bs->bl.max_transfer_length = ret;
+            if (ret > 0 && ret <= BDRV_REQUEST_MAX_SECTORS) {
+                bs->bl.max_transfer = pow2floor(ret << BDRV_SECTOR_BITS);
             }
         }
     }
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 1d2792e..5000c27 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -382,7 +382,7 @@ static int multireq_compare(const void *a, const void *b)
 void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb)
 {
     int i = 0, start = 0, num_reqs = 0, niov = 0, nb_sectors = 0;
-    int max_xfer_len;
+    uint32_t max_transfer;
     int64_t sector_num = 0;

     if (mrb->num_reqs == 1) {
@@ -391,7 +391,7 @@ void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb)
         return;
     }

-    max_xfer_len = blk_get_max_transfer_length(mrb->reqs[0]->dev->blk);
+    max_transfer = blk_get_max_transfer(mrb->reqs[0]->dev->blk);

     qsort(mrb->reqs, mrb->num_reqs, sizeof(*mrb->reqs),
           &multireq_compare);
@@ -407,8 +407,9 @@ void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb)
              */
             if (sector_num + nb_sectors != req->sector_num ||
                 niov > blk_get_max_iov(blk) - req->qiov.niov ||
-                req->qiov.size / BDRV_SECTOR_SIZE > max_xfer_len ||
-                nb_sectors > max_xfer_len - req->qiov.size / BDRV_SECTOR_SIZE) {
+                req->qiov.size > max_transfer ||
+                nb_sectors > (max_transfer -
+                              req->qiov.size) / BDRV_SECTOR_SIZE) {
                 submit_requests(blk, mrb, start, num_reqs, niov);
                 num_reqs = 0;
             }
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 0cb8568..7a588a7 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -225,14 +225,14 @@ static void scsi_read_complete(void * opaque, int ret)
     if (s->type == TYPE_DISK &&
         r->req.cmd.buf[0] == INQUIRY &&
         r->req.cmd.buf[2] == 0xb0) {
-        uint32_t max_xfer_len = blk_get_max_transfer_length(s->conf.blk) /
-            (s->blocksize / BDRV_SECTOR_SIZE);
+        uint32_t max_transfer =
+            blk_get_max_transfer(s->conf.blk) / s->blocksize;

-        assert(max_xfer_len);
-        stl_be_p(&r->buf[8], max_xfer_len);
+        assert(max_transfer);
+        stl_be_p(&r->buf[8], max_transfer);
         /* Also take care of the opt xfer len. */
-        if (ldl_be_p(&r->buf[12]) > max_xfer_len) {
-            stl_be_p(&r->buf[12], max_xfer_len);
+        if (ldl_be_p(&r->buf[12]) > max_transfer) {
+            stl_be_p(&r->buf[12], max_transfer);
         }
     }
     scsi_req_data(&r->req, len);
diff --git a/qemu-img.c b/qemu-img.c
index 14e2661..cf9876d 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2078,13 +2078,13 @@ static int img_convert(int argc, char **argv)
     }
     out_bs = blk_bs(out_blk);

-    /* increase bufsectors from the default 4096 (2M) if opt_transfer_length
+    /* increase bufsectors from the default 4096 (2M) if opt_transfer
      * or discard_alignment of the out_bs is greater. Limit to 32768 (16MB)
      * as maximum. */
     bufsectors = MIN(32768,
-                     MAX(bufsectors, MAX(out_bs->bl.opt_transfer_length,
-                                         out_bs->bl.discard_alignment))
-                    );
+                     MAX(bufsectors,
+                         MAX(out_bs->bl.opt_transfer >> BDRV_SECTOR_BITS,
+                             out_bs->bl.discard_alignment)));

     if (skip_create) {
         int64_t output_sectors = blk_nb_sectors(out_blk);
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 16/22] block: Wording tweaks to write zeroes limits
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (14 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 15/22] block: Switch transfer length bounds to byte-based Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-24  6:12   ` Fam Zheng
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 17/22] block: Switch discard length bounds to byte-based Eric Blake
                   ` (6 subsequent siblings)
  22 siblings, 1 reply; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, famz, stefanha, Max Reitz

Improve the documentation of the write zeroes limits, to mention
additional constraints that drivers should observe.  Worth squashing
into commit cf081fca, if that hadn't been pushed already :)

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v3: new patch, split off from "block: Switch discard length bounds..."
---
 include/block/block_int.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 7d2b152..7a4a00f 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -331,11 +331,14 @@ typedef struct BlockLimits {
     int64_t discard_alignment;

     /* maximum number of bytes that can zeroized at once (since it is
-     * signed, it must be < 2G, if set) */
+     * signed, it must be < 2G, if set), should be multiple of
+     * pwrite_zeroes_alignment. May be 0 if no inherent 32-bit limit */
     int32_t max_pwrite_zeroes;

     /* optimal alignment for write zeroes requests in bytes, must be
-     * power of 2, and less than max_pwrite_zeroes if that is set */
+     * power of 2, less than max_pwrite_zeroes if that is set, and
+     * multiple of bs->request_alignment. May be 0 if
+     * bs->request_alignment is good enough */
     uint32_t pwrite_zeroes_alignment;

     /* optimal transfer length in bytes (must be power of 2, and
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 17/22] block: Switch discard length bounds to byte-based
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (15 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 16/22] block: Wording tweaks to write zeroes limits Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-24  6:43   ` Fam Zheng
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 18/22] block: Drop raw_refresh_limits() Eric Blake
                   ` (5 subsequent siblings)
  22 siblings, 1 reply; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, kwolf, famz, stefanha, Max Reitz, Ronnie Sahlberg,
	Paolo Bonzini, Peter Lieven

Sector-based limits are awkward to think about; in our on-going
quest to move to byte-based interfaces, convert max_discard and
discard_alignment.  Rename them, using 'pdiscard' as an aid to
track which remaining discard interfaces need conversion, and so
that the compiler will help us catch the change in semantics
across any rebased code.  The BlockLimits type is now completely
byte-based; and in iscsi.c, sector_limits_lun2qemu() is no
longer needed.

pdiscard_alignment is made unsigned (we use power-of-2 alignments
as bitmasks, where unsigned is easier to think about) while
leaving max_pdiscard signed (since we still have an 'int'
interface); this is comparable to what commit cf081fc did for
write zeroes limits.  We may later want to make everything an
unsigned 64-bit limit - but that requires a bigger code audit.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v3: split out write_zeroes wording tweaks, improve commit message
v2: rebase nbd and iscsi limits across earlier improvements
---
 include/block/block_int.h | 14 ++++++++++----
 block/io.c                | 16 +++++++++-------
 block/iscsi.c             | 19 ++++++-------------
 block/nbd.c               |  2 +-
 qemu-img.c                |  3 ++-
 5 files changed, 28 insertions(+), 26 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 7a4a00f..388ef80 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -324,11 +324,17 @@ struct BlockDriver {
 };

 typedef struct BlockLimits {
-    /* maximum number of sectors that can be discarded at once */
-    int max_discard;
+    /* maximum number of bytes that can be discarded at once (since it
+     * is signed, it must be < 2G, if set), should be multiple of
+     * pdiscard_alignment, but need not be power of 2. May be 0 if no
+     * inherent 32-bit limit */
+    int32_t max_pdiscard;

-    /* optimal alignment for discard requests in sectors */
-    int64_t discard_alignment;
+    /* optimal alignment for discard requests in bytes, must be power
+     * of 2, less than max_discard if that is set, and multiple of
+     * bs->request_alignment. May be 0 if bs->request_alignment is
+     * good enough */
+    uint32_t pdiscard_alignment;

     /* maximum number of bytes that can zeroized at once (since it is
      * signed, it must be < 2G, if set), should be multiple of
diff --git a/block/io.c b/block/io.c
index 8ca9d43..0f15d05 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2368,19 +2368,21 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
         goto out;
     }

-    max_discard = MIN_NON_ZERO(bs->bl.max_discard, BDRV_REQUEST_MAX_SECTORS);
+    max_discard = MIN_NON_ZERO(bs->bl.max_pdiscard >> BDRV_SECTOR_BITS,
+                               BDRV_REQUEST_MAX_SECTORS);
     while (nb_sectors > 0) {
         int ret;
         int num = nb_sectors;
+        int discard_alignment = bs->bl.pdiscard_alignment >> BDRV_SECTOR_BITS;

         /* align request */
-        if (bs->bl.discard_alignment &&
-            num >= bs->bl.discard_alignment &&
-            sector_num % bs->bl.discard_alignment) {
-            if (num > bs->bl.discard_alignment) {
-                num = bs->bl.discard_alignment;
+        if (discard_alignment &&
+            num >= discard_alignment &&
+            sector_num % discard_alignment) {
+            if (num > discard_alignment) {
+                num = discard_alignment;
             }
-            num -= sector_num % bs->bl.discard_alignment;
+            num -= sector_num % discard_alignment;
         }

         /* limit request size */
diff --git a/block/iscsi.c b/block/iscsi.c
index 368687d..0d16c31 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1696,13 +1696,6 @@ static void iscsi_close(BlockDriverState *bs)
     memset(iscsilun, 0, sizeof(IscsiLun));
 }

-static int sector_limits_lun2qemu(int64_t sector, IscsiLun *iscsilun)
-{
-    int limit = MIN(sector_lun2qemu(sector, iscsilun), INT_MAX / 2 + 1);
-
-    return limit < BDRV_REQUEST_MAX_SECTORS ? limit : 0;
-}
-
 static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     /* We don't actually refresh here, but just return data queried in
@@ -1722,14 +1715,14 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
     }

     if (iscsilun->lbp.lbpu) {
-        if (iscsilun->bl.max_unmap < 0xffffffff) {
-            bs->bl.max_discard =
-                sector_limits_lun2qemu(iscsilun->bl.max_unmap, iscsilun);
+        if (iscsilun->bl.max_unmap < 0xffffffff / iscsilun->block_size) {
+            bs->bl.max_pdiscard =
+                iscsilun->bl.max_unmap * iscsilun->block_size;
         }
-        bs->bl.discard_alignment =
-            sector_limits_lun2qemu(iscsilun->bl.opt_unmap_gran, iscsilun);
+        bs->bl.pdiscard_alignment =
+            iscsilun->bl.opt_unmap_gran * iscsilun->block_size;
     } else {
-        bs->bl.discard_alignment = iscsilun->block_size >> BDRV_SECTOR_BITS;
+        bs->bl.pdiscard_alignment = iscsilun->block_size;
     }

     if (iscsilun->bl.max_ws_len < 0xffffffff / iscsilun->block_size) {
diff --git a/block/nbd.c b/block/nbd.c
index f5511ea..08e5b67 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -362,7 +362,7 @@ static int nbd_co_flush(BlockDriverState *bs)

 static void nbd_refresh_limits(BlockDriverState *bs, Error **errp)
 {
-    bs->bl.max_discard = NBD_MAX_SECTORS;
+    bs->bl.max_pdiscard = NBD_MAX_BUFFER_SIZE;
     bs->bl.max_transfer = NBD_MAX_BUFFER_SIZE;
 }

diff --git a/qemu-img.c b/qemu-img.c
index cf9876d..1237d61 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2084,7 +2084,8 @@ static int img_convert(int argc, char **argv)
     bufsectors = MIN(32768,
                      MAX(bufsectors,
                          MAX(out_bs->bl.opt_transfer >> BDRV_SECTOR_BITS,
-                             out_bs->bl.discard_alignment)));
+                             out_bs->bl.pdiscard_alignment >>
+                             BDRV_SECTOR_BITS)));

     if (skip_create) {
         int64_t output_sectors = blk_nb_sectors(out_blk);
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 18/22] block: Drop raw_refresh_limits()
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (16 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 17/22] block: Switch discard length bounds to byte-based Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-24  6:44   ` Fam Zheng
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 19/22] block: Split bdrv_merge_limits() from bdrv_refresh_limits() Eric Blake
                   ` (4 subsequent siblings)
  22 siblings, 1 reply; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, famz, stefanha, Max Reitz

The raw block driver was blindly copying all limits from bs->file,
even though: 1. the main bdrv_refresh_limits() already does this
for many of the limits, and 2. blindly copying from the children
can weaken any stricter limits that were already inherited from
the backing chain during the main bdrv_refresh_limits().  Also,
a future patch is about to move .request_alignment into
BlockLimits, and that is a limit that should NOT be copied from
other layers in the BDS chain.

Thus, we can completely drop raw_refresh_limits(), and rely on
the block layer setting up the proper limits.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v3: new patch, split out from 'block: Split bdrv_merge_limits()...'
---
 block/raw_bsd.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index 7f63791..5855e84 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -1,6 +1,6 @@
 /* BlockDriver implementation for "raw"
  *
- * Copyright (C) 2010, 2013, Red Hat, Inc.
+ * Copyright (C) 2010-2016 Red Hat, Inc.
  * Copyright (C) 2010, Blue Swirl <blauwirbel@gmail.com>
  * Copyright (C) 2009, Anthony Liguori <aliguori@us.ibm.com>
  *
@@ -150,11 +150,6 @@ static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
     return bdrv_get_info(bs->file->bs, bdi);
 }

-static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
-{
-    bs->bl = bs->file->bs->bl;
-}
-
 static int raw_truncate(BlockDriverState *bs, int64_t offset)
 {
     return bdrv_truncate(bs->file->bs, offset);
@@ -252,7 +247,6 @@ BlockDriver bdrv_raw = {
     .bdrv_getlength       = &raw_getlength,
     .has_variable_length  = true,
     .bdrv_get_info        = &raw_get_info,
-    .bdrv_refresh_limits  = &raw_refresh_limits,
     .bdrv_probe_blocksizes = &raw_probe_blocksizes,
     .bdrv_probe_geometry  = &raw_probe_geometry,
     .bdrv_media_changed   = &raw_media_changed,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 19/22] block: Split bdrv_merge_limits() from bdrv_refresh_limits()
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (17 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 18/22] block: Drop raw_refresh_limits() Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-24  6:48   ` Fam Zheng
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 20/22] block: Move request_alignment into BlockLimit Eric Blake
                   ` (3 subsequent siblings)
  22 siblings, 1 reply; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, famz, stefanha, Max Reitz

During bdrv_merge_limits(), we were computing initial limits
based on another BDS in two places.  At first glance, the two
computations are not identical (one is doing straight copying,
the other is doing merging towards or away from zero) - but
when you realize that the first round is starting with all-0
memory, all of the merging happens to work.  Factoring out the
merging makes it easier to track how two BDS limits are merged,
in case we have future reasons to merge in even more limits.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v3: Split raw block driver changes to its own patch, make new
function static
v2: new patch
---
 block/io.c | 31 +++++++++++++------------------
 1 file changed, 13 insertions(+), 18 deletions(-)

diff --git a/block/io.c b/block/io.c
index 0f15d05..69dbbd3 100644
--- a/block/io.c
+++ b/block/io.c
@@ -67,6 +67,17 @@ static void bdrv_parent_drained_end(BlockDriverState *bs)
     }
 }

+static void bdrv_merge_limits(BlockLimits *dst, const BlockLimits *src)
+{
+    dst->opt_transfer = MAX(dst->opt_transfer, src->opt_transfer);
+    dst->max_transfer = MIN_NON_ZERO(dst->max_transfer, src->max_transfer);
+    dst->opt_mem_alignment = MAX(dst->opt_mem_alignment,
+                                 src->opt_mem_alignment);
+    dst->min_mem_alignment = MAX(dst->min_mem_alignment,
+                                 src->min_mem_alignment);
+    dst->max_iov = MIN_NON_ZERO(dst->max_iov, src->max_iov);
+}
+
 void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     BlockDriver *drv = bs->drv;
@@ -88,11 +99,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
             error_propagate(errp, local_err);
             return;
         }
-        bs->bl.opt_transfer = bs->file->bs->bl.opt_transfer;
-        bs->bl.max_transfer = bs->file->bs->bl.max_transfer;
-        bs->bl.min_mem_alignment = bs->file->bs->bl.min_mem_alignment;
-        bs->bl.opt_mem_alignment = bs->file->bs->bl.opt_mem_alignment;
-        bs->bl.max_iov = bs->file->bs->bl.max_iov;
+        bdrv_merge_limits(&bs->bl, &bs->file->bs->bl);
     } else {
         bs->bl.min_mem_alignment = 512;
         bs->bl.opt_mem_alignment = getpagesize();
@@ -107,19 +114,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
             error_propagate(errp, local_err);
             return;
         }
-        bs->bl.opt_transfer = MAX(bs->bl.opt_transfer,
-                                  bs->backing->bs->bl.opt_transfer);
-        bs->bl.max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
-                                           bs->backing->bs->bl.max_transfer);
-        bs->bl.opt_mem_alignment =
-            MAX(bs->bl.opt_mem_alignment,
-                bs->backing->bs->bl.opt_mem_alignment);
-        bs->bl.min_mem_alignment =
-            MAX(bs->bl.min_mem_alignment,
-                bs->backing->bs->bl.min_mem_alignment);
-        bs->bl.max_iov =
-            MIN(bs->bl.max_iov,
-                bs->backing->bs->bl.max_iov);
+        bdrv_merge_limits(&bs->bl, &bs->backing->bs->bl);
     }

     /* Then let the driver override it */
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 20/22] block: Move request_alignment into BlockLimit
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (18 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 19/22] block: Split bdrv_merge_limits() from bdrv_refresh_limits() Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-24  7:07   ` Fam Zheng
  2016-06-24 13:45   ` Kevin Wolf
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 21/22] block: Fix error message style Eric Blake
                   ` (2 subsequent siblings)
  22 siblings, 2 replies; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, kwolf, famz, stefanha, Max Reitz, Ronnie Sahlberg,
	Paolo Bonzini, Peter Lieven

It makes more sense to have ALL block size limit constraints
in the same struct.  Improve the documentation while at it.

Simplify a couple of conditionals, now that we have audited and
documented that request_alignment is always non-zero.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v3: drop dead conditionals
v2: drop hacks for save/restore of alignment, now that earlier
patches handled setting up BlockLimits more sanely
---
 include/block/block_int.h | 22 +++++++++++++---------
 block.c                   |  2 +-
 block/blkdebug.c          |  2 +-
 block/bochs.c             |  2 +-
 block/cloop.c             |  2 +-
 block/dmg.c               |  2 +-
 block/io.c                | 14 +++++++-------
 block/iscsi.c             |  2 +-
 block/qcow2.c             |  2 +-
 block/raw-posix.c         | 16 ++++++++--------
 block/raw-win32.c         |  6 +++---
 block/vvfat.c             |  2 +-
 12 files changed, 39 insertions(+), 35 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 388ef80..845800e 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -324,6 +324,12 @@ struct BlockDriver {
 };

 typedef struct BlockLimits {
+    /* Alignment requirement, in bytes, for offset/length of I/O
+     * requests. Must be a power of 2 less than INT_MAX. A value of 0
+     * defaults to 1 for drivers with modern byte interfaces, and to
+     * 512 otherwise. */
+    uint32_t request_alignment;
+
     /* maximum number of bytes that can be discarded at once (since it
      * is signed, it must be < 2G, if set), should be multiple of
      * pdiscard_alignment, but need not be power of 2. May be 0 if no
@@ -332,8 +338,8 @@ typedef struct BlockLimits {

     /* optimal alignment for discard requests in bytes, must be power
      * of 2, less than max_discard if that is set, and multiple of
-     * bs->request_alignment. May be 0 if bs->request_alignment is
-     * good enough */
+     * bl.request_alignment. May be 0 if bl.request_alignment is good
+     * enough */
     uint32_t pdiscard_alignment;

     /* maximum number of bytes that can zeroized at once (since it is
@@ -343,12 +349,12 @@ typedef struct BlockLimits {

     /* optimal alignment for write zeroes requests in bytes, must be
      * power of 2, less than max_pwrite_zeroes if that is set, and
-     * multiple of bs->request_alignment. May be 0 if
-     * bs->request_alignment is good enough */
+     * multiple of bl.request_alignment. May be 0 if
+     * bl.request_alignment is good enough */
     uint32_t pwrite_zeroes_alignment;

     /* optimal transfer length in bytes (must be power of 2, and
-     * multiple of bs->request_alignment), or 0 if no preferred size */
+     * multiple of bl.request_alignment), or 0 if no preferred size */
     uint32_t opt_transfer;

     /* maximal transfer length in bytes (need not be power of 2, but
@@ -356,10 +362,10 @@ typedef struct BlockLimits {
      * For now, anything larger than INT_MAX is clamped down. */
     uint32_t max_transfer;

-    /* memory alignment so that no bounce buffer is needed */
+    /* memory alignment, in bytes so that no bounce buffer is needed */
     size_t min_mem_alignment;

-    /* memory alignment for bounce buffer */
+    /* memory alignment, in bytes, for bounce buffer */
     size_t opt_mem_alignment;

     /* maximum number of iovec elements */
@@ -465,8 +471,6 @@ struct BlockDriverState {
     /* I/O Limits */
     BlockLimits bl;

-    /* Alignment requirement for offset/length of I/O requests */
-    unsigned int request_alignment;
     /* Flags honored during pwrite (so far: BDRV_REQ_FUA) */
     unsigned int supported_write_flags;
     /* Flags honored during pwrite_zeroes (so far: BDRV_REQ_FUA,
diff --git a/block.c b/block.c
index c2fbf06..34894ad 100644
--- a/block.c
+++ b/block.c
@@ -1016,7 +1016,7 @@ static int bdrv_open_common(BlockDriverState *bs, BdrvChild *file,

     assert(bdrv_opt_mem_align(bs) != 0);
     assert(bdrv_min_mem_align(bs) != 0);
-    assert(is_power_of_2(bs->request_alignment) || bdrv_is_sg(bs));
+    assert(is_power_of_2(bs->bl.request_alignment));

     qemu_opts_del(opts);
     return 0;
diff --git a/block/blkdebug.c b/block/blkdebug.c
index 54b6870..b6ecee3 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -726,7 +726,7 @@ static void blkdebug_refresh_limits(BlockDriverState *bs, Error **errp)
     BDRVBlkdebugState *s = bs->opaque;

     if (s->align) {
-        bs->request_alignment = s->align;
+        bs->bl.request_alignment = s->align;
     }
 }

diff --git a/block/bochs.c b/block/bochs.c
index 182c50b..4194f1d 100644
--- a/block/bochs.c
+++ b/block/bochs.c
@@ -190,7 +190,7 @@ fail:

 static void bochs_refresh_limits(BlockDriverState *bs, Error **errp)
 {
-    bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */
+    bs->bl.request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O */
 }

 static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
diff --git a/block/cloop.c b/block/cloop.c
index d574003..b5dc286 100644
--- a/block/cloop.c
+++ b/block/cloop.c
@@ -200,7 +200,7 @@ fail:

 static void cloop_refresh_limits(BlockDriverState *bs, Error **errp)
 {
-    bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */
+    bs->bl.request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O */
 }

 static inline int cloop_read_block(BlockDriverState *bs, int block_num)
diff --git a/block/dmg.c b/block/dmg.c
index 1e53cd8..9612c21 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -548,7 +548,7 @@ fail:

 static void dmg_refresh_limits(BlockDriverState *bs, Error **errp)
 {
-    bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */
+    bs->bl.request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O */
 }

 static inline int is_sector_in_chunk(BDRVDMGState* s,
diff --git a/block/io.c b/block/io.c
index 69dbbd3..b9e53e3 100644
--- a/block/io.c
+++ b/block/io.c
@@ -90,7 +90,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
     }

     /* Default alignment based on whether driver has byte interface */
-    bs->request_alignment = drv->bdrv_co_preadv ? 1 : 512;
+    bs->bl.request_alignment = drv->bdrv_co_preadv ? 1 : 512;

     /* Take some limits from the children as a default */
     if (bs->file) {
@@ -459,7 +459,7 @@ static int bdrv_get_cluster_size(BlockDriverState *bs)

     ret = bdrv_get_info(bs, &bdi);
     if (ret < 0 || bdi.cluster_size == 0) {
-        return bs->request_alignment;
+        return bs->bl.request_alignment;
     } else {
         return bdi.cluster_size;
     }
@@ -1068,7 +1068,7 @@ int coroutine_fn bdrv_co_preadv(BlockDriverState *bs,
     BlockDriver *drv = bs->drv;
     BdrvTrackedRequest req;

-    uint64_t align = bs->request_alignment;
+    uint64_t align = bs->bl.request_alignment;
     uint8_t *head_buf = NULL;
     uint8_t *tail_buf = NULL;
     QEMUIOVector local_qiov;
@@ -1164,8 +1164,8 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
     int tail = 0;

     int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX);
-    int alignment = MAX(bs->bl.pwrite_zeroes_alignment ?: 1,
-                        bs->request_alignment);
+    int alignment = MAX(bs->bl.pwrite_zeroes_alignment,
+                        bs->bl.request_alignment);

     assert(is_power_of_2(alignment));
     head = offset & (alignment - 1);
@@ -1324,7 +1324,7 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BlockDriverState *bs,
     uint8_t *buf = NULL;
     QEMUIOVector local_qiov;
     struct iovec iov;
-    uint64_t align = bs->request_alignment;
+    uint64_t align = bs->bl.request_alignment;
     unsigned int head_padding_bytes, tail_padding_bytes;
     int ret = 0;

@@ -1411,7 +1411,7 @@ int coroutine_fn bdrv_co_pwritev(BlockDriverState *bs,
     BdrvRequestFlags flags)
 {
     BdrvTrackedRequest req;
-    uint64_t align = bs->request_alignment;
+    uint64_t align = bs->bl.request_alignment;
     uint8_t *head_buf = NULL;
     uint8_t *tail_buf = NULL;
     QEMUIOVector local_qiov;
diff --git a/block/iscsi.c b/block/iscsi.c
index 0d16c31..6e8d4fe 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1704,7 +1704,7 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
     IscsiLun *iscsilun = bs->opaque;
     uint64_t max_xfer_len = iscsilun->use_16_for_rw ? 0xffffffff : 0xffff;

-    bs->request_alignment = iscsilun->block_size;
+    bs->bl.request_alignment = iscsilun->block_size;

     if (iscsilun->bl.max_xfer_len) {
         max_xfer_len = MIN(max_xfer_len, iscsilun->bl.max_xfer_len);
diff --git a/block/qcow2.c b/block/qcow2.c
index 48f80b6..fdf13cb 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1201,7 +1201,7 @@ static void qcow2_refresh_limits(BlockDriverState *bs, Error **errp)

     if (bs->encrypted) {
         /* Encryption works on a sector granularity */
-        bs->request_alignment = BDRV_SECTOR_SIZE;
+        bs->bl.request_alignment = BDRV_SECTOR_SIZE;
     }
     bs->bl.pwrite_zeroes_alignment = s->cluster_size;
 }
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 8da2f94..d3d7cce 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -302,22 +302,22 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
     /* For SCSI generic devices the alignment is not really used.
        With buffered I/O, we don't have any restrictions. */
     if (bdrv_is_sg(bs) || !s->needs_alignment) {
-        bs->request_alignment = 1;
+        bs->bl.request_alignment = 1;
         s->buf_align = 1;
         return;
     }

-    bs->request_alignment = 0;
+    bs->bl.request_alignment = 0;
     s->buf_align = 0;
     /* Let's try to use the logical blocksize for the alignment. */
-    if (probe_logical_blocksize(fd, &bs->request_alignment) < 0) {
-        bs->request_alignment = 0;
+    if (probe_logical_blocksize(fd, &bs->bl.request_alignment) < 0) {
+        bs->bl.request_alignment = 0;
     }
 #ifdef CONFIG_XFS
     if (s->is_xfs) {
         struct dioattr da;
         if (xfsctl(NULL, fd, XFS_IOC_DIOINFO, &da) >= 0) {
-            bs->request_alignment = da.d_miniosz;
+            bs->bl.request_alignment = da.d_miniosz;
             /* The kernel returns wrong information for d_mem */
             /* s->buf_align = da.d_mem; */
         }
@@ -337,19 +337,19 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
         qemu_vfree(buf);
     }

-    if (!bs->request_alignment) {
+    if (!bs->bl.request_alignment) {
         size_t align;
         buf = qemu_memalign(s->buf_align, max_align);
         for (align = 512; align <= max_align; align <<= 1) {
             if (raw_is_io_aligned(fd, buf, align)) {
-                bs->request_alignment = align;
+                bs->bl.request_alignment = align;
                 break;
             }
         }
         qemu_vfree(buf);
     }

-    if (!s->buf_align || !bs->request_alignment) {
+    if (!s->buf_align || !bs->bl.request_alignment) {
         error_setg(errp, "Could not find working O_DIRECT alignment. "
                          "Try cache.direct=off.");
     }
diff --git a/block/raw-win32.c b/block/raw-win32.c
index 88382d9..62edb1a 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -230,14 +230,14 @@ static void raw_probe_alignment(BlockDriverState *bs, Error **errp)
     BOOL status;

     if (s->type == FTYPE_CD) {
-        bs->request_alignment = 2048;
+        bs->bl.request_alignment = 2048;
         return;
     }
     if (s->type == FTYPE_HARDDISK) {
         status = DeviceIoControl(s->hfile, IOCTL_DISK_GET_DRIVE_GEOMETRY_EX,
                                  NULL, 0, &dg, sizeof(dg), &count, NULL);
         if (status != 0) {
-            bs->request_alignment = dg.Geometry.BytesPerSector;
+            bs->bl.request_alignment = dg.Geometry.BytesPerSector;
             return;
         }
         /* try GetDiskFreeSpace too */
@@ -247,7 +247,7 @@ static void raw_probe_alignment(BlockDriverState *bs, Error **errp)
         GetDiskFreeSpace(s->drive_path, &sectorsPerCluster,
                          &dg.Geometry.BytesPerSector,
                          &freeClusters, &totalClusters);
-        bs->request_alignment = dg.Geometry.BytesPerSector;
+        bs->bl.request_alignment = dg.Geometry.BytesPerSector;
     }
 }

diff --git a/block/vvfat.c b/block/vvfat.c
index 4d44636..fc948cb 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1210,7 +1210,7 @@ fail:

 static void vvfat_refresh_limits(BlockDriverState *bs, Error **errp)
 {
-    bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */
+    bs->bl.request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O */
 }

 static inline void vvfat_close_current_file(BDRVVVFATState *s)
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 21/22] block: Fix error message style
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (19 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 20/22] block: Move request_alignment into BlockLimit Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-24  7:09   ` Fam Zheng
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 22/22] block: Use bool as appropriate for BDS members Eric Blake
  2016-06-24 14:32 ` [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Kevin Wolf
  22 siblings, 1 reply; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, kwolf, famz, stefanha, Max Reitz

error_setg() is not supposed to be used for multi-sentence
messages; tweak the message to append a hint instead.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v3: new patch
---
 block/raw-posix.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index d3d7cce..c979ac3 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -350,8 +350,8 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
     }

     if (!s->buf_align || !bs->bl.request_alignment) {
-        error_setg(errp, "Could not find working O_DIRECT alignment. "
-                         "Try cache.direct=off.");
+        error_setg(errp, "Could not find working O_DIRECT alignment");
+        error_append_hint(errp, "Try cache.direct=off\n");
     }
 }

-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PATCH v3 22/22] block: Use bool as appropriate for BDS members
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (20 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 21/22] block: Fix error message style Eric Blake
@ 2016-06-23 22:37 ` Eric Blake
  2016-06-24  7:12   ` Fam Zheng
  2016-06-24 14:32 ` [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Kevin Wolf
  22 siblings, 1 reply; 40+ messages in thread
From: Eric Blake @ 2016-06-23 22:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, kwolf, famz, stefanha, Max Reitz, Ronnie Sahlberg,
	Paolo Bonzini, Peter Lieven

Using int for values that are only used as booleans is confusing.
While at it, rearrange a couple of members so that all the bools
are contiguous.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v3: new patch
---
 include/block/block.h     |  8 ++++----
 include/block/block_int.h | 13 +++++++------
 block.c                   | 22 +++++++++++-----------
 block/bochs.c             |  2 +-
 block/cloop.c             |  2 +-
 block/crypto.c            |  4 ++--
 block/dmg.c               |  2 +-
 block/iscsi.c             |  2 +-
 block/qcow.c              |  2 +-
 block/qcow2.c             |  2 +-
 block/vvfat.c             |  4 ++--
 11 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 733a8ec..211a0f2 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -362,8 +362,8 @@ int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
 int bdrv_is_allocated_above(BlockDriverState *top, BlockDriverState *base,
                             int64_t sector_num, int nb_sectors, int *pnum);

-int bdrv_is_read_only(BlockDriverState *bs);
-int bdrv_is_sg(BlockDriverState *bs);
+bool bdrv_is_read_only(BlockDriverState *bs);
+bool bdrv_is_sg(BlockDriverState *bs);
 bool bdrv_is_inserted(BlockDriverState *bs);
 int bdrv_media_changed(BlockDriverState *bs);
 void bdrv_lock_medium(BlockDriverState *bs, bool locked);
@@ -390,8 +390,8 @@ BlockDriverState *bdrv_first(BdrvNextIterator *it);
 BlockDriverState *bdrv_next(BdrvNextIterator *it);

 BlockDriverState *bdrv_next_monitor_owned(BlockDriverState *bs);
-int bdrv_is_encrypted(BlockDriverState *bs);
-int bdrv_key_required(BlockDriverState *bs);
+bool bdrv_is_encrypted(BlockDriverState *bs);
+bool bdrv_key_required(BlockDriverState *bs);
 int bdrv_set_key(BlockDriverState *bs, const char *key);
 void bdrv_add_key(BlockDriverState *bs, const char *key, Error **errp);
 int bdrv_query_missing_keys(void);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 845800e..206f982 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -429,14 +429,15 @@ struct BdrvChild {
 struct BlockDriverState {
     int64_t total_sectors; /* if we are reading a disk image, give its
                               size in sectors */
-    int read_only; /* if true, the media is read only */
     int open_flags; /* flags used to open the file, re-used for re-open */
-    int encrypted; /* if true, the media is encrypted */
-    int valid_key; /* if true, a valid encryption key has been set */
-    int sg;        /* if true, the device is a /dev/sg* */
-    int copy_on_read; /* if true, copy read backing sectors into image
+    bool read_only; /* if true, the media is read only */
+    bool encrypted; /* if true, the media is encrypted */
+    bool valid_key; /* if true, a valid encryption key has been set */
+    bool sg;        /* if true, the device is a /dev/sg* */
+    bool probed;    /* if true, format was probed rather than specified */
+
+    int copy_on_read; /* if nonzero, copy read backing sectors into image.
                          note this is a reference count */
-    bool probed;

     BlockDriver *drv; /* NULL means no media */
     void *opaque;
diff --git a/block.c b/block.c
index 34894ad..947df29 100644
--- a/block.c
+++ b/block.c
@@ -2183,9 +2183,9 @@ static void bdrv_close(BlockDriverState *bs)
         bs->backing_file[0] = '\0';
         bs->backing_format[0] = '\0';
         bs->total_sectors = 0;
-        bs->encrypted = 0;
-        bs->valid_key = 0;
-        bs->sg = 0;
+        bs->encrypted = false;
+        bs->valid_key = false;
+        bs->sg = false;
         QDECREF(bs->options);
         QDECREF(bs->explicit_options);
         bs->options = NULL;
@@ -2643,30 +2643,30 @@ void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr)
     *nb_sectors_ptr = nb_sectors < 0 ? 0 : nb_sectors;
 }

-int bdrv_is_read_only(BlockDriverState *bs)
+bool bdrv_is_read_only(BlockDriverState *bs)
 {
     return bs->read_only;
 }

-int bdrv_is_sg(BlockDriverState *bs)
+bool bdrv_is_sg(BlockDriverState *bs)
 {
     return bs->sg;
 }

-int bdrv_is_encrypted(BlockDriverState *bs)
+bool bdrv_is_encrypted(BlockDriverState *bs)
 {
     if (bs->backing && bs->backing->bs->encrypted) {
-        return 1;
+        return true;
     }
     return bs->encrypted;
 }

-int bdrv_key_required(BlockDriverState *bs)
+bool bdrv_key_required(BlockDriverState *bs)
 {
     BdrvChild *backing = bs->backing;

     if (backing && backing->bs->encrypted && !backing->bs->valid_key) {
-        return 1;
+        return true;
     }
     return (bs->encrypted && !bs->valid_key);
 }
@@ -2688,10 +2688,10 @@ int bdrv_set_key(BlockDriverState *bs, const char *key)
     }
     ret = bs->drv->bdrv_set_key(bs, key);
     if (ret < 0) {
-        bs->valid_key = 0;
+        bs->valid_key = false;
     } else if (!bs->valid_key) {
         /* call the change callback now, we skipped it on open */
-        bs->valid_key = 1;
+        bs->valid_key = true;
         bdrv_parent_cb_change_media(bs, true);
     }
     return ret;
diff --git a/block/bochs.c b/block/bochs.c
index 4194f1d..6427ad4 100644
--- a/block/bochs.c
+++ b/block/bochs.c
@@ -104,7 +104,7 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
     struct bochs_header bochs;
     int ret;

-    bs->read_only = 1; // no write support yet
+    bs->read_only = true; /* no write support yet */

     ret = bdrv_pread(bs->file->bs, 0, &bochs, sizeof(bochs));
     if (ret < 0) {
diff --git a/block/cloop.c b/block/cloop.c
index b5dc286..8f046e1 100644
--- a/block/cloop.c
+++ b/block/cloop.c
@@ -66,7 +66,7 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
     uint32_t offsets_size, max_compressed_block_size = 1, i;
     int ret;

-    bs->read_only = 1;
+    bs->read_only = true;

     /* read header */
     ret = bdrv_pread(bs->file->bs, 128, &s->block_size, 4);
diff --git a/block/crypto.c b/block/crypto.c
index 758e14e..ec1f247 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -322,8 +322,8 @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
         goto cleanup;
     }

-    bs->encrypted = 1;
-    bs->valid_key = 1;
+    bs->encrypted = true;
+    bs->valid_key = true;

     ret = 0;
  cleanup:
diff --git a/block/dmg.c b/block/dmg.c
index 9612c21..11a0673 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -438,7 +438,7 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
     int64_t offset;
     int ret;

-    bs->read_only = 1;
+    bs->read_only = true;

     s->n_chunks = 0;
     s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
diff --git a/block/iscsi.c b/block/iscsi.c
index 6e8d4fe..ac351f3 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1596,7 +1596,7 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags,
      * will try to read from the device to guess the image format.
      */
     if (iscsilun->type != TYPE_DISK && iscsilun->type != TYPE_ROM) {
-        bs->sg = 1;
+        bs->sg = true;
     }

     task = iscsi_do_inquiry(iscsilun->iscsi, iscsilun->lun, 1,
diff --git a/block/qcow.c b/block/qcow.c
index 312af52..e4175b8 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -174,7 +174,7 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
             goto fail;
         }

-        bs->encrypted = 1;
+        bs->encrypted = true;
     }
     s->cluster_bits = header.cluster_bits;
     s->cluster_size = 1 << s->cluster_bits;
diff --git a/block/qcow2.c b/block/qcow2.c
index fdf13cb..0178931 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -980,7 +980,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
             goto fail;
         }

-        bs->encrypted = 1;
+        bs->encrypted = true;
     }

     s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */
diff --git a/block/vvfat.c b/block/vvfat.c
index fc948cb..55b5759 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1158,7 +1158,7 @@ static int vvfat_open(BlockDriverState *bs, QDict *options, int flags,
     s->current_cluster=0xffffffff;

     /* read only is the default for safety */
-    bs->read_only = 1;
+    bs->read_only = true;
     s->qcow = s->write_target = NULL;
     s->qcow_filename = NULL;
     s->fat2 = NULL;
@@ -1174,7 +1174,7 @@ static int vvfat_open(BlockDriverState *bs, QDict *options, int flags,
         if (ret < 0) {
             goto fail;
         }
-        bs->read_only = 0;
+        bs->read_only = false;
     }

     bs->total_sectors = cyls * heads * secs;
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v3 07/22] scsi: Advertise limits by blocksize, not 512
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 07/22] scsi: Advertise limits by blocksize, not 512 Eric Blake
@ 2016-06-24  5:22   ` Fam Zheng
  0 siblings, 0 replies; 40+ messages in thread
From: Fam Zheng @ 2016-06-24  5:22 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, qemu-block, kwolf, stefanha, qemu-stable, Paolo Bonzini

On Thu, 06/23 16:37, Eric Blake wrote:
> s->blocksize may be larger than 512, in which case our
> tweaks to max_xfer_len and opt_xfer_len must be scaled
> appropriately.
> 
> Reported-by: Fam Zheng <famz@redhat.com>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> CC: qemu-stable@nongnu.org
> 
> ---
> v3: new patch
> ---
>  hw/scsi/scsi-generic.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
> index 6a2d89a..75e227d 100644
> --- a/hw/scsi/scsi-generic.c
> +++ b/hw/scsi/scsi-generic.c
> @@ -225,7 +225,8 @@ static void scsi_read_complete(void * opaque, int ret)
>      if (s->type == TYPE_DISK &&
>          r->req.cmd.buf[0] == INQUIRY &&
>          r->req.cmd.buf[2] == 0xb0) {
> -        uint32_t max_xfer_len = blk_get_max_transfer_length(s->conf.blk);
> +        uint32_t max_xfer_len = blk_get_max_transfer_length(s->conf.blk) /
> +            (s->blocksize / BDRV_SECTOR_SIZE);
>          if (max_xfer_len) {
>              stl_be_p(&r->buf[8], max_xfer_len);
>              /* Also take care of the opt xfer len. */
> -- 
> 2.5.5
> 

Reviewed-by: Fam Zheng <famz@redhat.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v3 08/22] block: Give nonzero result to blk_get_max_transfer_length()
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 08/22] block: Give nonzero result to blk_get_max_transfer_length() Eric Blake
@ 2016-06-24  5:24   ` Fam Zheng
  0 siblings, 0 replies; 40+ messages in thread
From: Fam Zheng @ 2016-06-24  5:24 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, qemu-block, kwolf, stefanha, Max Reitz,
	Michael S. Tsirkin, Paolo Bonzini

On Thu, 06/23 16:37, Eric Blake wrote:
> Making all callers special-case 0 as unlimited is awkward,
> and we DO have a hard maximum of BDRV_REQUEST_MAX_SECTORS given
> our current block layer API limits.
> 
> In the case of scsi, this means that we now always advertise a
> limit to the guest, even in cases where the underlying layers
> previously use 0 for no inherent limit beyond the block layer.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
> 
> ---
> v3: rebase to scsi limits fix
> v2: new patch
> ---
>  block/block-backend.c  |  7 ++++---
>  hw/block/virtio-blk.c  |  3 +--
>  hw/scsi/scsi-generic.c | 12 ++++++------
>  3 files changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 34500e6..1fb070b 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -1303,15 +1303,16 @@ int blk_get_flags(BlockBackend *blk)
>      }
>  }
> 
> +/* Returns the maximum transfer length, in sectors; guaranteed nonzero */
>  int blk_get_max_transfer_length(BlockBackend *blk)
>  {
>      BlockDriverState *bs = blk_bs(blk);
> +    int max = 0;
> 
>      if (bs) {
> -        return bs->bl.max_transfer_length;
> -    } else {
> -        return 0;
> +        max = bs->bl.max_transfer_length;
>      }
> +    return MIN_NON_ZERO(max, BDRV_REQUEST_MAX_SECTORS);
>  }
> 
>  int blk_get_max_iov(BlockBackend *blk)
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index 284e646..1d2792e 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -382,7 +382,7 @@ static int multireq_compare(const void *a, const void *b)
>  void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb)
>  {
>      int i = 0, start = 0, num_reqs = 0, niov = 0, nb_sectors = 0;
> -    int max_xfer_len = 0;
> +    int max_xfer_len;
>      int64_t sector_num = 0;
> 
>      if (mrb->num_reqs == 1) {
> @@ -392,7 +392,6 @@ void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb)
>      }
> 
>      max_xfer_len = blk_get_max_transfer_length(mrb->reqs[0]->dev->blk);
> -    max_xfer_len = MIN_NON_ZERO(max_xfer_len, BDRV_REQUEST_MAX_SECTORS);
> 
>      qsort(mrb->reqs, mrb->num_reqs, sizeof(*mrb->reqs),
>            &multireq_compare);
> diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
> index 75e227d..0cb8568 100644
> --- a/hw/scsi/scsi-generic.c
> +++ b/hw/scsi/scsi-generic.c
> @@ -227,12 +227,12 @@ static void scsi_read_complete(void * opaque, int ret)
>          r->req.cmd.buf[2] == 0xb0) {
>          uint32_t max_xfer_len = blk_get_max_transfer_length(s->conf.blk) /
>              (s->blocksize / BDRV_SECTOR_SIZE);
> -        if (max_xfer_len) {
> -            stl_be_p(&r->buf[8], max_xfer_len);
> -            /* Also take care of the opt xfer len. */
> -            if (ldl_be_p(&r->buf[12]) > max_xfer_len) {
> -                stl_be_p(&r->buf[12], max_xfer_len);
> -            }
> +
> +        assert(max_xfer_len);
> +        stl_be_p(&r->buf[8], max_xfer_len);
> +        /* Also take care of the opt xfer len. */
> +        if (ldl_be_p(&r->buf[12]) > max_xfer_len) {
> +            stl_be_p(&r->buf[12], max_xfer_len);
>          }
>      }
>      scsi_req_data(&r->req, len);
> -- 
> 2.5.5
> 

Reviewed-by: Fam Zheng <famz@redhat.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v3 09/22] blkdebug: Set request_alignment during .bdrv_refresh_limits()
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 09/22] blkdebug: Set request_alignment during .bdrv_refresh_limits() Eric Blake
@ 2016-06-24  5:42   ` Fam Zheng
  0 siblings, 0 replies; 40+ messages in thread
From: Fam Zheng @ 2016-06-24  5:42 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, qemu-block, kwolf, stefanha, Max Reitz, Markus Armbruster

On Thu, 06/23 16:37, Eric Blake wrote:
> We want to eventually stick request_alignment alongside other
> BlockLimits, but first, we must ensure it is populated at the
> same time as all other limits, rather than being a special case
> that is set only when a block is first opened.
> 
> Note that when the user does not provide "align", then we were
> defaulting to bs->request_alignment - but at this stage in the
> initialization, that was always 512.  We were also rejecting an
> explicit "align":0 from the user; this patch now allows that,
> as an explicit request for the default alignment (which may not
> always be 512 in the future).
> 
> qemu-iotests 77 is particularly sensitive to the fact that we
> can specify an artificial alignment override in blkdebug, and
> that override must continue to work even when limits are
> refreshed on an already open device.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> ---
> v3: rework to allow user to specify "align":0 for default
> v2: new patch
> ---
>  qapi/block-core.json |  3 ++-
>  block/blkdebug.c     | 19 +++++++++++++++----
>  2 files changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 98a20d2..ac8f5f6 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1961,7 +1961,8 @@
>  #
>  # @config:          #optional filename of the configuration file
>  #
> -# @align:           #optional required alignment for requests in bytes
> +# @align:           #optional required alignment for requests in bytes,
> +#                   must be power of 2, or 0 for default
>  #
>  # @inject-error:    #optional array of error injection descriptions
>  #
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index 20d25bd..54b6870 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -37,6 +37,7 @@
>  typedef struct BDRVBlkdebugState {
>      int state;
>      int new_state;
> +    int align;
> 
>      QLIST_HEAD(, BlkdebugRule) rules[BLKDBG__MAX];
>      QSIMPLEQ_HEAD(, BlkdebugRule) active_rules;
> @@ -382,10 +383,10 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
>      }
> 
>      /* Set request alignment */
> -    align = qemu_opt_get_size(opts, "align", bs->request_alignment);
> -    if (align > 0 && align < INT_MAX && !(align & (align - 1))) {
> -        bs->request_alignment = align;
> -    } else {
> +    align = qemu_opt_get_size(opts, "align", 0);
> +    if (align < INT_MAX && is_power_of_2(align)) {
> +        s->align = align;
> +    } else if (align) {
>          error_setg(errp, "Invalid alignment");
>          ret = -EINVAL;
>          goto fail_unref;
> @@ -720,6 +721,15 @@ static void blkdebug_refresh_filename(BlockDriverState *bs, QDict *options)
>      bs->full_open_options = opts;
>  }
> 
> +static void blkdebug_refresh_limits(BlockDriverState *bs, Error **errp)
> +{
> +    BDRVBlkdebugState *s = bs->opaque;
> +
> +    if (s->align) {
> +        bs->request_alignment = s->align;
> +    }
> +}
> +
>  static int blkdebug_reopen_prepare(BDRVReopenState *reopen_state,
>                                     BlockReopenQueue *queue, Error **errp)
>  {
> @@ -738,6 +748,7 @@ static BlockDriver bdrv_blkdebug = {
>      .bdrv_getlength         = blkdebug_getlength,
>      .bdrv_truncate          = blkdebug_truncate,
>      .bdrv_refresh_filename  = blkdebug_refresh_filename,
> +    .bdrv_refresh_limits    = blkdebug_refresh_limits,
> 
>      .bdrv_aio_readv         = blkdebug_aio_readv,
>      .bdrv_aio_writev        = blkdebug_aio_writev,
> -- 
> 2.5.5
> 

Reviewed-by: Fam Zheng <famz@redhat.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v3 15/22] block: Switch transfer length bounds to byte-based
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 15/22] block: Switch transfer length bounds to byte-based Eric Blake
@ 2016-06-24  6:06   ` Fam Zheng
  0 siblings, 0 replies; 40+ messages in thread
From: Fam Zheng @ 2016-06-24  6:06 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, qemu-block, kwolf, stefanha, Max Reitz,
	Ronnie Sahlberg, Paolo Bonzini, Peter Lieven, Michael S. Tsirkin

On Thu, 06/23 16:37, Eric Blake wrote:
> Sector-based limits are awkward to think about; in our on-going
> quest to move to byte-based interfaces, convert max_transfer_length
> and opt_transfer_length.  Rename them (dropping the _length suffix)
> so that the compiler will help us catch the change in semantics
> across any rebased code, and improve the documentation.  Use unsigned
> values, so that we don't have to worry about negative values and
> so that bit-twiddling is easier; however, we are still constrained
> by 2^31 of signed int in most APIs.
> 
> When a value comes from an external source (iscsi and raw-posix),
> sanitize the results to ensure that opt_transfer is a power of 2.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>

Reviewed-by: Fam Zheng <famz@redhat.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v3 16/22] block: Wording tweaks to write zeroes limits
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 16/22] block: Wording tweaks to write zeroes limits Eric Blake
@ 2016-06-24  6:12   ` Fam Zheng
  2016-06-24 14:10     ` Eric Blake
  0 siblings, 1 reply; 40+ messages in thread
From: Fam Zheng @ 2016-06-24  6:12 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, kwolf, stefanha, Max Reitz

On Thu, 06/23 16:37, Eric Blake wrote:
> Improve the documentation of the write zeroes limits, to mention
> additional constraints that drivers should observe.  Worth squashing
> into commit cf081fca, if that hadn't been pushed already :)
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> ---
> v3: new patch, split off from "block: Switch discard length bounds..."
> ---
>  include/block/block_int.h | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 7d2b152..7a4a00f 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -331,11 +331,14 @@ typedef struct BlockLimits {
>      int64_t discard_alignment;
> 
>      /* maximum number of bytes that can zeroized at once (since it is
> -     * signed, it must be < 2G, if set) */
> +     * signed, it must be < 2G, if set), should be multiple of
> +     * pwrite_zeroes_alignment. May be 0 if no inherent 32-bit limit */

"inherent 32-bit limit"? What is special about 32-bit other than this field is
32-bit? Anyway,

Reviewed-by: Fam Zheng <famz@redhat.com>

>      int32_t max_pwrite_zeroes;
> 
>      /* optimal alignment for write zeroes requests in bytes, must be
> -     * power of 2, and less than max_pwrite_zeroes if that is set */
> +     * power of 2, less than max_pwrite_zeroes if that is set, and
> +     * multiple of bs->request_alignment. May be 0 if
> +     * bs->request_alignment is good enough */
>      uint32_t pwrite_zeroes_alignment;
> 
>      /* optimal transfer length in bytes (must be power of 2, and
> -- 
> 2.5.5
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v3 17/22] block: Switch discard length bounds to byte-based
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 17/22] block: Switch discard length bounds to byte-based Eric Blake
@ 2016-06-24  6:43   ` Fam Zheng
  2016-06-24 14:15     ` Eric Blake
  0 siblings, 1 reply; 40+ messages in thread
From: Fam Zheng @ 2016-06-24  6:43 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, qemu-block, kwolf, stefanha, Max Reitz,
	Ronnie Sahlberg, Paolo Bonzini, Peter Lieven

On Thu, 06/23 16:37, Eric Blake wrote:
> Sector-based limits are awkward to think about; in our on-going
> quest to move to byte-based interfaces, convert max_discard and
> discard_alignment.  Rename them, using 'pdiscard' as an aid to
> track which remaining discard interfaces need conversion, and so
> that the compiler will help us catch the change in semantics
> across any rebased code.  The BlockLimits type is now completely
> byte-based; and in iscsi.c, sector_limits_lun2qemu() is no
> longer needed.
> 
> pdiscard_alignment is made unsigned (we use power-of-2 alignments
> as bitmasks, where unsigned is easier to think about) while
> leaving max_pdiscard signed (since we still have an 'int'
> interface); this is comparable to what commit cf081fc did for
> write zeroes limits.  We may later want to make everything an
> unsigned 64-bit limit - but that requires a bigger code audit.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> ---
> v3: split out write_zeroes wording tweaks, improve commit message
> v2: rebase nbd and iscsi limits across earlier improvements
> ---
>  include/block/block_int.h | 14 ++++++++++----
>  block/io.c                | 16 +++++++++-------
>  block/iscsi.c             | 19 ++++++-------------
>  block/nbd.c               |  2 +-
>  qemu-img.c                |  3 ++-
>  5 files changed, 28 insertions(+), 26 deletions(-)
> 
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 7a4a00f..388ef80 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -324,11 +324,17 @@ struct BlockDriver {
>  };
> 
>  typedef struct BlockLimits {
> -    /* maximum number of sectors that can be discarded at once */
> -    int max_discard;
> +    /* maximum number of bytes that can be discarded at once (since it
> +     * is signed, it must be < 2G, if set), should be multiple of
> +     * pdiscard_alignment, but need not be power of 2. May be 0 if no
> +     * inherent 32-bit limit */
> +    int32_t max_pdiscard;
> 
> -    /* optimal alignment for discard requests in sectors */
> -    int64_t discard_alignment;
> +    /* optimal alignment for discard requests in bytes, must be power
> +     * of 2, less than max_discard if that is set, and multiple of

s/max_discard/max_pdiscard/

> +     * bs->request_alignment. May be 0 if bs->request_alignment is
> +     * good enough */
> +    uint32_t pdiscard_alignment;
> 
>      /* maximum number of bytes that can zeroized at once (since it is
>       * signed, it must be < 2G, if set), should be multiple of
> diff --git a/block/io.c b/block/io.c
> index 8ca9d43..0f15d05 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2368,19 +2368,21 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
>          goto out;
>      }
> 
> -    max_discard = MIN_NON_ZERO(bs->bl.max_discard, BDRV_REQUEST_MAX_SECTORS);
> +    max_discard = MIN_NON_ZERO(bs->bl.max_pdiscard >> BDRV_SECTOR_BITS,
> +                               BDRV_REQUEST_MAX_SECTORS);
>      while (nb_sectors > 0) {
>          int ret;
>          int num = nb_sectors;
> +        int discard_alignment = bs->bl.pdiscard_alignment >> BDRV_SECTOR_BITS;
> 
>          /* align request */
> -        if (bs->bl.discard_alignment &&
> -            num >= bs->bl.discard_alignment &&
> -            sector_num % bs->bl.discard_alignment) {
> -            if (num > bs->bl.discard_alignment) {
> -                num = bs->bl.discard_alignment;
> +        if (discard_alignment &&
> +            num >= discard_alignment &&
> +            sector_num % discard_alignment) {
> +            if (num > discard_alignment) {
> +                num = discard_alignment;
>              }
> -            num -= sector_num % bs->bl.discard_alignment;
> +            num -= sector_num % discard_alignment;

Or just

               num = discard_alignment - sector_num % discard_alignment;

without the if.

Otherwise looks good,

Reviewed-by: Fam Zheng <famz@redhat.com>

>          }
> 
>          /* limit request size */
> diff --git a/block/iscsi.c b/block/iscsi.c
> index 368687d..0d16c31 100644
> --- a/block/iscsi.c
> +++ b/block/iscsi.c
> @@ -1696,13 +1696,6 @@ static void iscsi_close(BlockDriverState *bs)
>      memset(iscsilun, 0, sizeof(IscsiLun));
>  }
> 
> -static int sector_limits_lun2qemu(int64_t sector, IscsiLun *iscsilun)
> -{
> -    int limit = MIN(sector_lun2qemu(sector, iscsilun), INT_MAX / 2 + 1);
> -
> -    return limit < BDRV_REQUEST_MAX_SECTORS ? limit : 0;
> -}
> -
>  static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
>  {
>      /* We don't actually refresh here, but just return data queried in
> @@ -1722,14 +1715,14 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
>      }
> 
>      if (iscsilun->lbp.lbpu) {
> -        if (iscsilun->bl.max_unmap < 0xffffffff) {
> -            bs->bl.max_discard =
> -                sector_limits_lun2qemu(iscsilun->bl.max_unmap, iscsilun);
> +        if (iscsilun->bl.max_unmap < 0xffffffff / iscsilun->block_size) {
> +            bs->bl.max_pdiscard =
> +                iscsilun->bl.max_unmap * iscsilun->block_size;
>          }
> -        bs->bl.discard_alignment =
> -            sector_limits_lun2qemu(iscsilun->bl.opt_unmap_gran, iscsilun);
> +        bs->bl.pdiscard_alignment =
> +            iscsilun->bl.opt_unmap_gran * iscsilun->block_size;
>      } else {
> -        bs->bl.discard_alignment = iscsilun->block_size >> BDRV_SECTOR_BITS;
> +        bs->bl.pdiscard_alignment = iscsilun->block_size;
>      }
> 
>      if (iscsilun->bl.max_ws_len < 0xffffffff / iscsilun->block_size) {

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v3 18/22] block: Drop raw_refresh_limits()
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 18/22] block: Drop raw_refresh_limits() Eric Blake
@ 2016-06-24  6:44   ` Fam Zheng
  0 siblings, 0 replies; 40+ messages in thread
From: Fam Zheng @ 2016-06-24  6:44 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, kwolf, stefanha, Max Reitz

On Thu, 06/23 16:37, Eric Blake wrote:
> The raw block driver was blindly copying all limits from bs->file,
> even though: 1. the main bdrv_refresh_limits() already does this
> for many of the limits, and 2. blindly copying from the children
> can weaken any stricter limits that were already inherited from
> the backing chain during the main bdrv_refresh_limits().  Also,
> a future patch is about to move .request_alignment into
> BlockLimits, and that is a limit that should NOT be copied from
> other layers in the BDS chain.
> 
> Thus, we can completely drop raw_refresh_limits(), and rely on
> the block layer setting up the proper limits.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> ---
> v3: new patch, split out from 'block: Split bdrv_merge_limits()...'
> ---
>  block/raw_bsd.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/block/raw_bsd.c b/block/raw_bsd.c
> index 7f63791..5855e84 100644
> --- a/block/raw_bsd.c
> +++ b/block/raw_bsd.c
> @@ -1,6 +1,6 @@
>  /* BlockDriver implementation for "raw"
>   *
> - * Copyright (C) 2010, 2013, Red Hat, Inc.
> + * Copyright (C) 2010-2016 Red Hat, Inc.
>   * Copyright (C) 2010, Blue Swirl <blauwirbel@gmail.com>
>   * Copyright (C) 2009, Anthony Liguori <aliguori@us.ibm.com>
>   *
> @@ -150,11 +150,6 @@ static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
>      return bdrv_get_info(bs->file->bs, bdi);
>  }
> 
> -static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
> -{
> -    bs->bl = bs->file->bs->bl;
> -}
> -
>  static int raw_truncate(BlockDriverState *bs, int64_t offset)
>  {
>      return bdrv_truncate(bs->file->bs, offset);
> @@ -252,7 +247,6 @@ BlockDriver bdrv_raw = {
>      .bdrv_getlength       = &raw_getlength,
>      .has_variable_length  = true,
>      .bdrv_get_info        = &raw_get_info,
> -    .bdrv_refresh_limits  = &raw_refresh_limits,
>      .bdrv_probe_blocksizes = &raw_probe_blocksizes,
>      .bdrv_probe_geometry  = &raw_probe_geometry,
>      .bdrv_media_changed   = &raw_media_changed,
> -- 
> 2.5.5
> 

Reviewed-by: Fam Zheng <famz@redhat.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v3 19/22] block: Split bdrv_merge_limits() from bdrv_refresh_limits()
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 19/22] block: Split bdrv_merge_limits() from bdrv_refresh_limits() Eric Blake
@ 2016-06-24  6:48   ` Fam Zheng
  0 siblings, 0 replies; 40+ messages in thread
From: Fam Zheng @ 2016-06-24  6:48 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, kwolf, stefanha, Max Reitz

On Thu, 06/23 16:37, Eric Blake wrote:
> During bdrv_merge_limits(), we were computing initial limits
> based on another BDS in two places.  At first glance, the two
> computations are not identical (one is doing straight copying,
> the other is doing merging towards or away from zero) - but
> when you realize that the first round is starting with all-0
> memory, all of the merging happens to work.  Factoring out the
> merging makes it easier to track how two BDS limits are merged,
> in case we have future reasons to merge in even more limits.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> ---
> v3: Split raw block driver changes to its own patch, make new
> function static
> v2: new patch
> ---
>  block/io.c | 31 +++++++++++++------------------
>  1 file changed, 13 insertions(+), 18 deletions(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 0f15d05..69dbbd3 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -67,6 +67,17 @@ static void bdrv_parent_drained_end(BlockDriverState *bs)
>      }
>  }
> 
> +static void bdrv_merge_limits(BlockLimits *dst, const BlockLimits *src)
> +{
> +    dst->opt_transfer = MAX(dst->opt_transfer, src->opt_transfer);
> +    dst->max_transfer = MIN_NON_ZERO(dst->max_transfer, src->max_transfer);
> +    dst->opt_mem_alignment = MAX(dst->opt_mem_alignment,
> +                                 src->opt_mem_alignment);
> +    dst->min_mem_alignment = MAX(dst->min_mem_alignment,
> +                                 src->min_mem_alignment);
> +    dst->max_iov = MIN_NON_ZERO(dst->max_iov, src->max_iov);
> +}
> +
>  void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
>  {
>      BlockDriver *drv = bs->drv;
> @@ -88,11 +99,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
>              error_propagate(errp, local_err);
>              return;
>          }
> -        bs->bl.opt_transfer = bs->file->bs->bl.opt_transfer;
> -        bs->bl.max_transfer = bs->file->bs->bl.max_transfer;
> -        bs->bl.min_mem_alignment = bs->file->bs->bl.min_mem_alignment;
> -        bs->bl.opt_mem_alignment = bs->file->bs->bl.opt_mem_alignment;
> -        bs->bl.max_iov = bs->file->bs->bl.max_iov;
> +        bdrv_merge_limits(&bs->bl, &bs->file->bs->bl);
>      } else {
>          bs->bl.min_mem_alignment = 512;
>          bs->bl.opt_mem_alignment = getpagesize();
> @@ -107,19 +114,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
>              error_propagate(errp, local_err);
>              return;
>          }
> -        bs->bl.opt_transfer = MAX(bs->bl.opt_transfer,
> -                                  bs->backing->bs->bl.opt_transfer);
> -        bs->bl.max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
> -                                           bs->backing->bs->bl.max_transfer);
> -        bs->bl.opt_mem_alignment =
> -            MAX(bs->bl.opt_mem_alignment,
> -                bs->backing->bs->bl.opt_mem_alignment);
> -        bs->bl.min_mem_alignment =
> -            MAX(bs->bl.min_mem_alignment,
> -                bs->backing->bs->bl.min_mem_alignment);
> -        bs->bl.max_iov =
> -            MIN(bs->bl.max_iov,
> -                bs->backing->bs->bl.max_iov);
> +        bdrv_merge_limits(&bs->bl, &bs->backing->bs->bl);
>      }
> 
>      /* Then let the driver override it */
> -- 
> 2.5.5
> 

Reviewed-by: Fam Zheng <famz@redhat.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v3 20/22] block: Move request_alignment into BlockLimit
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 20/22] block: Move request_alignment into BlockLimit Eric Blake
@ 2016-06-24  7:07   ` Fam Zheng
  2016-06-24 13:45   ` Kevin Wolf
  1 sibling, 0 replies; 40+ messages in thread
From: Fam Zheng @ 2016-06-24  7:07 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, qemu-block, kwolf, stefanha, Max Reitz,
	Ronnie Sahlberg, Paolo Bonzini, Peter Lieven

On Thu, 06/23 16:37, Eric Blake wrote:
> It makes more sense to have ALL block size limit constraints
> in the same struct.  Improve the documentation while at it.
> 
> Simplify a couple of conditionals, now that we have audited and
> documented that request_alignment is always non-zero.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>

Reviewed-by: Fam Zheng <famz@redhat.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v3 21/22] block: Fix error message style
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 21/22] block: Fix error message style Eric Blake
@ 2016-06-24  7:09   ` Fam Zheng
  0 siblings, 0 replies; 40+ messages in thread
From: Fam Zheng @ 2016-06-24  7:09 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, kwolf, stefanha, Max Reitz

On Thu, 06/23 16:37, Eric Blake wrote:
> error_setg() is not supposed to be used for multi-sentence
> messages; tweak the message to append a hint instead.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> ---
> v3: new patch
> ---
>  block/raw-posix.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block/raw-posix.c b/block/raw-posix.c
> index d3d7cce..c979ac3 100644
> --- a/block/raw-posix.c
> +++ b/block/raw-posix.c
> @@ -350,8 +350,8 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
>      }
> 
>      if (!s->buf_align || !bs->bl.request_alignment) {
> -        error_setg(errp, "Could not find working O_DIRECT alignment. "
> -                         "Try cache.direct=off.");
> +        error_setg(errp, "Could not find working O_DIRECT alignment");
> +        error_append_hint(errp, "Try cache.direct=off\n");
>      }
>  }
> 
> -- 
> 2.5.5
> 

Reviewed-by: Fam Zheng <famz@redhat.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v3 22/22] block: Use bool as appropriate for BDS members
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 22/22] block: Use bool as appropriate for BDS members Eric Blake
@ 2016-06-24  7:12   ` Fam Zheng
  0 siblings, 0 replies; 40+ messages in thread
From: Fam Zheng @ 2016-06-24  7:12 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, qemu-block, kwolf, stefanha, Max Reitz,
	Ronnie Sahlberg, Paolo Bonzini, Peter Lieven

On Thu, 06/23 16:37, Eric Blake wrote:
> Using int for values that are only used as booleans is confusing.
> While at it, rearrange a couple of members so that all the bools
> are contiguous.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>

Reviewed-by: Fam Zheng <famz@redhat.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v3 20/22] block: Move request_alignment into BlockLimit
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 20/22] block: Move request_alignment into BlockLimit Eric Blake
  2016-06-24  7:07   ` Fam Zheng
@ 2016-06-24 13:45   ` Kevin Wolf
  2016-06-24 14:17     ` Eric Blake
  1 sibling, 1 reply; 40+ messages in thread
From: Kevin Wolf @ 2016-06-24 13:45 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, qemu-block, famz, stefanha, Max Reitz,
	Ronnie Sahlberg, Paolo Bonzini, Peter Lieven

Am 24.06.2016 um 00:37 hat Eric Blake geschrieben:
> It makes more sense to have ALL block size limit constraints
> in the same struct.  Improve the documentation while at it.
> 
> Simplify a couple of conditionals, now that we have audited and
> documented that request_alignment is always non-zero.

Now you mention in the commit message that request_alignment is always
non-zero and this is supposedly documented...

> Signed-off-by: Eric Blake <eblake@redhat.com>
> 

> @@ -324,6 +324,12 @@ struct BlockDriver {
>  };
> 
>  typedef struct BlockLimits {
> +    /* Alignment requirement, in bytes, for offset/length of I/O
> +     * requests. Must be a power of 2 less than INT_MAX. A value of 0
> +     * defaults to 1 for drivers with modern byte interfaces, and to
> +     * 512 otherwise. */
> +    uint32_t request_alignment;

...but you didn't actually change the documentation.

This is the only problem I saw in this series, so if you post a new
version of the comment, I can change it while applying the series.

Kevin

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v3 16/22] block: Wording tweaks to write zeroes limits
  2016-06-24  6:12   ` Fam Zheng
@ 2016-06-24 14:10     ` Eric Blake
  0 siblings, 0 replies; 40+ messages in thread
From: Eric Blake @ 2016-06-24 14:10 UTC (permalink / raw)
  To: Fam Zheng; +Cc: qemu-devel, qemu-block, kwolf, stefanha, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 1686 bytes --]

On 06/24/2016 12:12 AM, Fam Zheng wrote:
> On Thu, 06/23 16:37, Eric Blake wrote:
>> Improve the documentation of the write zeroes limits, to mention
>> additional constraints that drivers should observe.  Worth squashing
>> into commit cf081fca, if that hadn't been pushed already :)
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>>
>> ---
>> v3: new patch, split off from "block: Switch discard length bounds..."
>> ---
>>  include/block/block_int.h | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/block/block_int.h b/include/block/block_int.h
>> index 7d2b152..7a4a00f 100644
>> --- a/include/block/block_int.h
>> +++ b/include/block/block_int.h
>> @@ -331,11 +331,14 @@ typedef struct BlockLimits {
>>      int64_t discard_alignment;
>>
>>      /* maximum number of bytes that can zeroized at once (since it is
>> -     * signed, it must be < 2G, if set) */
>> +     * signed, it must be < 2G, if set), should be multiple of
>> +     * pwrite_zeroes_alignment. May be 0 if no inherent 32-bit limit */
> 
> "inherent 32-bit limit"? What is special about 32-bit other than this field is
> 32-bit? Anyway,
> 
> Reviewed-by: Fam Zheng <famz@redhat.com>

Someday, we may want to bump it up to a 64-bit field, and convert the
driver callback for write_zeroes to take a 64-bit size.  But first we'd
have to audit to make sure all drivers properly advertise their maximum
(for example, NBD is still hard-coded to a 32-bit maximum write_zeroes,
even if the callback would permit a 64-bit size).

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v3 17/22] block: Switch discard length bounds to byte-based
  2016-06-24  6:43   ` Fam Zheng
@ 2016-06-24 14:15     ` Eric Blake
  2016-06-24 14:29       ` Kevin Wolf
  0 siblings, 1 reply; 40+ messages in thread
From: Eric Blake @ 2016-06-24 14:15 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, qemu-block, kwolf, stefanha, Max Reitz,
	Ronnie Sahlberg, Paolo Bonzini, Peter Lieven

[-- Attachment #1: Type: text/plain, Size: 2388 bytes --]

On 06/24/2016 12:43 AM, Fam Zheng wrote:
> On Thu, 06/23 16:37, Eric Blake wrote:
>> Sector-based limits are awkward to think about; in our on-going
>> quest to move to byte-based interfaces, convert max_discard and
>> discard_alignment.  Rename them, using 'pdiscard' as an aid to
>> track which remaining discard interfaces need conversion, and so
>> that the compiler will help us catch the change in semantics
>> across any rebased code.  The BlockLimits type is now completely
>> byte-based; and in iscsi.c, sector_limits_lun2qemu() is no
>> longer needed.
>>
>> pdiscard_alignment is made unsigned (we use power-of-2 alignments
>> as bitmasks, where unsigned is easier to think about) while
>> leaving max_pdiscard signed (since we still have an 'int'
>> interface); this is comparable to what commit cf081fc did for
>> write zeroes limits.  We may later want to make everything an
>> unsigned 64-bit limit - but that requires a bigger code audit.
>>

>> -    /* optimal alignment for discard requests in sectors */
>> -    int64_t discard_alignment;
>> +    /* optimal alignment for discard requests in bytes, must be power
>> +     * of 2, less than max_discard if that is set, and multiple of
> 
> s/max_discard/max_pdiscard/

Maintainer could touch it up on pull request.


>>          /* align request */
>> -        if (bs->bl.discard_alignment &&
>> -            num >= bs->bl.discard_alignment &&
>> -            sector_num % bs->bl.discard_alignment) {
>> -            if (num > bs->bl.discard_alignment) {
>> -                num = bs->bl.discard_alignment;
>> +        if (discard_alignment &&
>> +            num >= discard_alignment &&
>> +            sector_num % discard_alignment) {
>> +            if (num > discard_alignment) {
>> +                num = discard_alignment;
>>              }
>> -            num -= sector_num % bs->bl.discard_alignment;
>> +            num -= sector_num % discard_alignment;
> 
> Or just
> 
>                num = discard_alignment - sector_num % discard_alignment;
> 
> without the if.
> 

Sure. It all gets simplified later when I switch to bdrv_co_pdiscard().
 Up to the maintainer.

> Otherwise looks good,
> 
> Reviewed-by: Fam Zheng <famz@redhat.com>
> 


-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v3 20/22] block: Move request_alignment into BlockLimit
  2016-06-24 13:45   ` Kevin Wolf
@ 2016-06-24 14:17     ` Eric Blake
  0 siblings, 0 replies; 40+ messages in thread
From: Eric Blake @ 2016-06-24 14:17 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: qemu-devel, qemu-block, famz, stefanha, Max Reitz,
	Ronnie Sahlberg, Paolo Bonzini, Peter Lieven

[-- Attachment #1: Type: text/plain, Size: 1491 bytes --]

On 06/24/2016 07:45 AM, Kevin Wolf wrote:
> Am 24.06.2016 um 00:37 hat Eric Blake geschrieben:
>> It makes more sense to have ALL block size limit constraints
>> in the same struct.  Improve the documentation while at it.
>>
>> Simplify a couple of conditionals, now that we have audited and
>> documented that request_alignment is always non-zero.
> 
> Now you mention in the commit message that request_alignment is always
> non-zero and this is supposedly documented...
> 
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>>
> 
>> @@ -324,6 +324,12 @@ struct BlockDriver {
>>  };
>>
>>  typedef struct BlockLimits {
>> +    /* Alignment requirement, in bytes, for offset/length of I/O
>> +     * requests. Must be a power of 2 less than INT_MAX. A value of 0
>> +     * defaults to 1 for drivers with modern byte interfaces, and to
>> +     * 512 otherwise. */
>> +    uint32_t request_alignment;
> 
> ...but you didn't actually change the documentation.

Oops.  Blame late night patch editing :)

> 
> This is the only problem I saw in this series, so if you post a new
> version of the comment, I can change it while applying the series.

/* Alignment requirement, in bytes, for offset/length of I/O
 * requests. Must be a power of 2 less than INT_MAX; defaults to
 * 1 for drivers with modern byte interfaces, and to 512
 * otherwise. */

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v3 17/22] block: Switch discard length bounds to byte-based
  2016-06-24 14:15     ` Eric Blake
@ 2016-06-24 14:29       ` Kevin Wolf
  0 siblings, 0 replies; 40+ messages in thread
From: Kevin Wolf @ 2016-06-24 14:29 UTC (permalink / raw)
  To: Eric Blake
  Cc: Fam Zheng, qemu-devel, qemu-block, stefanha, Max Reitz,
	Ronnie Sahlberg, Paolo Bonzini, Peter Lieven

[-- Attachment #1: Type: text/plain, Size: 2586 bytes --]

Am 24.06.2016 um 16:15 hat Eric Blake geschrieben:
> On 06/24/2016 12:43 AM, Fam Zheng wrote:
> > On Thu, 06/23 16:37, Eric Blake wrote:
> >> Sector-based limits are awkward to think about; in our on-going
> >> quest to move to byte-based interfaces, convert max_discard and
> >> discard_alignment.  Rename them, using 'pdiscard' as an aid to
> >> track which remaining discard interfaces need conversion, and so
> >> that the compiler will help us catch the change in semantics
> >> across any rebased code.  The BlockLimits type is now completely
> >> byte-based; and in iscsi.c, sector_limits_lun2qemu() is no
> >> longer needed.
> >>
> >> pdiscard_alignment is made unsigned (we use power-of-2 alignments
> >> as bitmasks, where unsigned is easier to think about) while
> >> leaving max_pdiscard signed (since we still have an 'int'
> >> interface); this is comparable to what commit cf081fc did for
> >> write zeroes limits.  We may later want to make everything an
> >> unsigned 64-bit limit - but that requires a bigger code audit.
> >>
> 
> >> -    /* optimal alignment for discard requests in sectors */
> >> -    int64_t discard_alignment;
> >> +    /* optimal alignment for discard requests in bytes, must be power
> >> +     * of 2, less than max_discard if that is set, and multiple of
> > 
> > s/max_discard/max_pdiscard/
> 
> Maintainer could touch it up on pull request.

Okay, no problem.

> >>          /* align request */
> >> -        if (bs->bl.discard_alignment &&
> >> -            num >= bs->bl.discard_alignment &&
> >> -            sector_num % bs->bl.discard_alignment) {
> >> -            if (num > bs->bl.discard_alignment) {
> >> -                num = bs->bl.discard_alignment;
> >> +        if (discard_alignment &&
> >> +            num >= discard_alignment &&
> >> +            sector_num % discard_alignment) {
> >> +            if (num > discard_alignment) {
> >> +                num = discard_alignment;
> >>              }
> >> -            num -= sector_num % bs->bl.discard_alignment;
> >> +            num -= sector_num % discard_alignment;
> > 
> > Or just
> > 
> >                num = discard_alignment - sector_num % discard_alignment;
> > 
> > without the if.
> > 
> 
> Sure. It all gets simplified later when I switch to bdrv_co_pdiscard().
>  Up to the maintainer.

This is an actual code change and not a bug fix, so I'll leave this one
alone. We can always have a follow-up patch, but as you say your other
work will simplify it anyway, I guess that's not necessary.

Kevin

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PATCH v3 00/22] Byte-based block limits
  2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
                   ` (21 preceding siblings ...)
  2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 22/22] block: Use bool as appropriate for BDS members Eric Blake
@ 2016-06-24 14:32 ` Kevin Wolf
  22 siblings, 0 replies; 40+ messages in thread
From: Kevin Wolf @ 2016-06-24 14:32 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, qemu-block, famz, stefanha

Am 24.06.2016 um 00:37 hat Eric Blake geschrieben:
> BlockLimits is currently an ugly mix of byte limits vs.
> sector limits.  Unify it.  Fix some bugs I found in
> bdrv_aligned_preadv() while at it.
> 
> Prequisite: none (built on Kevin's block branch, which is currently
> merged into master)
> 
> Also available as a tag at:
> git fetch git://repo.or.cz/qemu/ericb.git nbd-limits-v3

Thanks, fixed up the two comments and applied to my block branch.

Kevin

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2016-06-24 14:32 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 01/22] block: Tighter assertions on bdrv_aligned_pwritev() Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 02/22] block: Document supported flags during bdrv_aligned_preadv() Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 03/22] block: Fix harmless off-by-one in bdrv_aligned_preadv() Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 04/22] nbd: Allow larger requests Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 05/22] nbd: Advertise realistic limits to block layer Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 06/22] iscsi: " Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 07/22] scsi: Advertise limits by blocksize, not 512 Eric Blake
2016-06-24  5:22   ` Fam Zheng
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 08/22] block: Give nonzero result to blk_get_max_transfer_length() Eric Blake
2016-06-24  5:24   ` Fam Zheng
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 09/22] blkdebug: Set request_alignment during .bdrv_refresh_limits() Eric Blake
2016-06-24  5:42   ` Fam Zheng
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 10/22] iscsi: " Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 11/22] qcow2: " Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 12/22] raw-win32: " Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 13/22] block: " Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 14/22] block: Set default request_alignment during bdrv_refresh_limits() Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 15/22] block: Switch transfer length bounds to byte-based Eric Blake
2016-06-24  6:06   ` Fam Zheng
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 16/22] block: Wording tweaks to write zeroes limits Eric Blake
2016-06-24  6:12   ` Fam Zheng
2016-06-24 14:10     ` Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 17/22] block: Switch discard length bounds to byte-based Eric Blake
2016-06-24  6:43   ` Fam Zheng
2016-06-24 14:15     ` Eric Blake
2016-06-24 14:29       ` Kevin Wolf
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 18/22] block: Drop raw_refresh_limits() Eric Blake
2016-06-24  6:44   ` Fam Zheng
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 19/22] block: Split bdrv_merge_limits() from bdrv_refresh_limits() Eric Blake
2016-06-24  6:48   ` Fam Zheng
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 20/22] block: Move request_alignment into BlockLimit Eric Blake
2016-06-24  7:07   ` Fam Zheng
2016-06-24 13:45   ` Kevin Wolf
2016-06-24 14:17     ` Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 21/22] block: Fix error message style Eric Blake
2016-06-24  7:09   ` Fam Zheng
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 22/22] block: Use bool as appropriate for BDS members Eric Blake
2016-06-24  7:12   ` Fam Zheng
2016-06-24 14:32 ` [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Kevin Wolf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.