All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard
@ 2016-07-15 23:22 Eric Blake
  2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 01/19] block: Convert bdrv_co_discard() to byte-based Eric Blake
                   ` (19 more replies)
  0 siblings, 20 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:22 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, pbonzini, kwolf, stefanha

Allow NBD to pass a byte-aligned discard request over the wire.

Prerequisite: Kevin's block branch merged with current qemu.git master,
plus my work on auto-fragmenting (v3 at the moment):
https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg03550.html

Also available as a tag at:
git fetch git://repo.or.cz/qemu/ericb.git nbd-discard-v2

This is a merge of two series both posted before soft freeze, both
of which had initial positive review with only a few changes needed:
byte-based block discard
https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg06491.html
Switch raw NBD to byte-based
https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg07030.html

I'd still hoping for one other series to make it into 2.7, based
on being posted prior to soft freeze:
nbd: efficient write zeroes
https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg07463.html

Changes since v1:
- rebase to latest master (several contextual conflicts)
- fix bug with aio_pdiscard using a stack variable qiov after
its scope ends
- tweak raw_bsd to remain sector-based if probing occurred

001/19:[----] [--] 'block: Convert bdrv_co_discard() to byte-based'
002/19:[0004] [FC] 'block: Convert bdrv_discard() to byte-based'
003/19:[0034] [FC] 'block: Switch BlockRequest to byte-based'
004/19:[0008] [FC] 'block: Convert bdrv_aio_discard() to byte-based'
005/19:[----] [--] 'block: Convert BB interface to byte-based discards'
006/19:[----] [--] 'raw-posix: Switch paio_submit() to byte-based'
007/19:[----] [--] 'rbd: Switch rbd_start_aio() to byte-based'
008/19:[----] [--] 'block: Convert .bdrv_aio_discard() to byte-based'
009/19:[----] [--] 'block: Add .bdrv_co_pdiscard() driver callback'
010/19:[----] [-C] 'blkreplay: Switch .bdrv_co_discard() to byte-based'
011/19:[----] [--] 'gluster: Switch .bdrv_co_discard() to byte-based'
012/19:[----] [--] 'iscsi: Switch .bdrv_co_discard() to byte-based'
013/19:[----] [--] 'nbd: Switch .bdrv_co_discard() to byte-based'
014/19:[----] [--] 'qcow2: Switch .bdrv_co_discard() to byte-based'
015/19:[----] [-C] 'raw_bsd: Switch .bdrv_co_discard() to byte-based'
016/19:[----] [--] 'sheepdog: Switch .bdrv_co_discard() to byte-based'
017/19:[----] [--] 'block: Kill .bdrv_co_discard()'
018/19:[----] [--] 'nbd: Convert to byte-based interface'
019/19:[0028] [FC] 'raw_bsd: Convert to byte-based interface'

Eric Blake (19):
  block: Convert bdrv_co_discard() to byte-based
  block: Convert bdrv_discard() to byte-based
  block: Switch BlockRequest to byte-based
  block: Convert bdrv_aio_discard() to byte-based
  block: Convert BB interface to byte-based discards
  raw-posix: Switch paio_submit() to byte-based
  rbd: Switch rbd_start_aio() to byte-based
  block: Convert .bdrv_aio_discard() to byte-based
  block: Add .bdrv_co_pdiscard() driver callback
  blkreplay: Switch .bdrv_co_discard() to byte-based
  gluster: Switch .bdrv_co_discard() to byte-based
  iscsi: Switch .bdrv_co_discard() to byte-based
  nbd: Switch .bdrv_co_discard() to byte-based
  qcow2: Switch .bdrv_co_discard() to byte-based
  raw_bsd: Switch .bdrv_co_discard() to byte-based
  sheepdog: Switch .bdrv_co_discard() to byte-based
  block: Kill .bdrv_co_discard()
  nbd: Convert to byte-based interface
  raw_bsd: Convert to byte-based interface

 block/nbd-client.h             |  11 ++-
 include/block/block.h          |  10 +--
 include/block/block_int.h      |   8 +--
 include/block/nbd.h            |   1 -
 include/sysemu/block-backend.h |   9 ++-
 block/blkreplay.c              |   8 +--
 block/block-backend.c          |  22 +++---
 block/gluster.c                |  14 ++--
 block/io.c                     | 154 ++++++++++++++++++++---------------------
 block/iscsi.c                  |  18 +++--
 block/mirror.c                 |   5 +-
 block/nbd-client.c             |  41 ++++++-----
 block/nbd.c                    |  24 +++----
 block/qcow2-refcount.c         |   4 +-
 block/qcow2.c                  |  10 +--
 block/raw-posix.c              |  24 +++----
 block/raw-win32.c              |  19 ++---
 block/raw_bsd.c                |  53 ++++++++------
 block/rbd.c                    |  29 ++++----
 block/sheepdog.c               |  17 +++--
 hw/block/xen_disk.c            |   7 +-
 hw/ide/core.c                  |   6 +-
 hw/scsi/scsi-disk.c            |   8 +--
 nbd/server.c                   |  19 ++---
 qemu-io-cmds.c                 |   3 +-
 block/trace-events             |   4 +-
 26 files changed, 260 insertions(+), 268 deletions(-)

-- 
2.5.5

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 01/19] block: Convert bdrv_co_discard() to byte-based
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
@ 2016-07-15 23:22 ` Eric Blake
  2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 02/19] block: Convert bdrv_discard() " Eric Blake
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:22 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, pbonzini, kwolf, stefanha, Max Reitz, Fam Zheng

Another step towards byte-based interfaces everywhere.  Replace
the sector-based bdrv_co_discard() with a new byte-based
bdrv_co_pdiscard(), which silently ignores any unaligned head
or tail.  Driver callbacks will be converted in followup patches.

By calculating the alignment outside of the loop, and clamping
the max discard to an aligned value, we can simplify the actions
done within the loop.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---

Yes, this patch is yet one more place that will need to be fixed up
before 2.7, in order to support iscsi devices that advertise a 15M
opt_discard and max_discard.  I plan on submitting that as a followup
series (as a bug fix, that qualifies as post-hard-freeze; while this
series has been around since before soft freeze but probably doesn't
qualify for inclusion once hard freeze hits)

 include/block/block.h |  2 +-
 block/blkreplay.c     |  3 ++-
 block/block-backend.c |  3 ++-
 block/io.c            | 67 +++++++++++++++++++++++++++------------------------
 block/raw_bsd.c       |  3 ++-
 5 files changed, 42 insertions(+), 36 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 616d8b9..4f5cebf 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -342,7 +342,7 @@ void coroutine_fn bdrv_co_drain(BlockDriverState *bs);
 void bdrv_drain_all(void);

 int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
-int bdrv_co_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
+int bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset, int count);
 int bdrv_has_zero_init_1(BlockDriverState *bs);
 int bdrv_has_zero_init(BlockDriverState *bs);
 bool bdrv_unallocated_blocks_are_zero(BlockDriverState *bs);
diff --git a/block/blkreplay.c b/block/blkreplay.c
index 3368c8c..c69e5a5 100755
--- a/block/blkreplay.c
+++ b/block/blkreplay.c
@@ -118,7 +118,8 @@ static int coroutine_fn blkreplay_co_discard(BlockDriverState *bs,
     int64_t sector_num, int nb_sectors)
 {
     uint64_t reqid = request_id++;
-    int ret = bdrv_co_discard(bs->file->bs, sector_num, nb_sectors);
+    int ret = bdrv_co_pdiscard(bs->file->bs, sector_num << BDRV_SECTOR_BITS,
+                               nb_sectors << BDRV_SECTOR_BITS);
     block_request_create(reqid, bs, qemu_coroutine_self());
     qemu_coroutine_yield();

diff --git a/block/block-backend.c b/block/block-backend.c
index f9cea1b..d982cf9 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1113,7 +1113,8 @@ int blk_co_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors)
         return ret;
     }

-    return bdrv_co_discard(blk_bs(blk), sector_num, nb_sectors);
+    return bdrv_co_pdiscard(blk_bs(blk), sector_num << BDRV_SECTOR_BITS,
+                            nb_sectors << BDRV_SECTOR_BITS);
 }

 int blk_co_flush(BlockBackend *blk)
diff --git a/block/io.c b/block/io.c
index 86db77e..4e04df2 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2198,7 +2198,8 @@ static void coroutine_fn bdrv_aio_discard_co_entry(void *opaque)
     BlockAIOCBCoroutine *acb = opaque;
     BlockDriverState *bs = acb->common.bs;

-    acb->req.error = bdrv_co_discard(bs, acb->req.sector, acb->req.nb_sectors);
+    acb->req.error = bdrv_co_pdiscard(bs, acb->req.sector << BDRV_SECTOR_BITS,
+                                      acb->req.nb_sectors << BDRV_SECTOR_BITS);
     bdrv_co_complete(acb);
 }

@@ -2378,20 +2379,22 @@ static void coroutine_fn bdrv_discard_co_entry(void *opaque)
 {
     DiscardCo *rwco = opaque;

-    rwco->ret = bdrv_co_discard(rwco->bs, rwco->sector_num, rwco->nb_sectors);
+    rwco->ret = bdrv_co_pdiscard(rwco->bs, rwco->sector_num << BDRV_SECTOR_BITS,
+                                 rwco->nb_sectors << BDRV_SECTOR_BITS);
 }

-int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
-                                 int nb_sectors)
+int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset,
+                                  int count)
 {
     BdrvTrackedRequest req;
-    int max_discard, ret;
+    int max_pdiscard, ret;
+    int head, align;

     if (!bs->drv) {
         return -ENOMEDIUM;
     }

-    ret = bdrv_check_request(bs, sector_num, nb_sectors);
+    ret = bdrv_check_byte_request(bs, offset, count);
     if (ret < 0) {
         return ret;
     } else if (bs->read_only) {
@@ -2408,45 +2411,45 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
         return 0;
     }

-    tracked_request_begin(&req, bs, sector_num << BDRV_SECTOR_BITS,
-                          nb_sectors << BDRV_SECTOR_BITS, BDRV_TRACKED_DISCARD);
+    /* Discard is advisory, so ignore any unaligned head or tail */
+    align = MAX(BDRV_SECTOR_SIZE,
+                MAX(bs->bl.pdiscard_alignment, bs->bl.request_alignment));
+    assert(is_power_of_2(align));
+    head = MIN(count, -offset & (align - 1));
+    if (head) {
+        count -= head;
+        offset += head;
+    }
+    count = QEMU_ALIGN_DOWN(count, align);
+    if (!count) {
+        return 0;
+    }
+
+    tracked_request_begin(&req, bs, offset, count, BDRV_TRACKED_DISCARD);

     ret = notifier_with_return_list_notify(&bs->before_write_notifiers, &req);
     if (ret < 0) {
         goto out;
     }

-    max_discard = MIN_NON_ZERO(bs->bl.max_pdiscard >> BDRV_SECTOR_BITS,
-                               BDRV_REQUEST_MAX_SECTORS);
-    while (nb_sectors > 0) {
+    max_pdiscard = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_pdiscard, INT_MAX),
+                                   align);
+
+    while (count > 0) {
         int ret;
-        int num = nb_sectors;
-        int discard_alignment = bs->bl.pdiscard_alignment >> BDRV_SECTOR_BITS;
-
-        /* align request */
-        if (discard_alignment &&
-            num >= discard_alignment &&
-            sector_num % discard_alignment) {
-            if (num > discard_alignment) {
-                num = discard_alignment;
-            }
-            num -= sector_num % discard_alignment;
-        }
-
-        /* limit request size */
-        if (num > max_discard) {
-            num = max_discard;
-        }
+        int num = MIN(count, max_pdiscard);

         if (bs->drv->bdrv_co_discard) {
-            ret = bs->drv->bdrv_co_discard(bs, sector_num, num);
+            ret = bs->drv->bdrv_co_discard(bs, offset >> BDRV_SECTOR_BITS,
+                                           num >> BDRV_SECTOR_BITS);
         } else {
             BlockAIOCB *acb;
             CoroutineIOCompletion co = {
                 .coroutine = qemu_coroutine_self(),
             };

-            acb = bs->drv->bdrv_aio_discard(bs, sector_num, nb_sectors,
+            acb = bs->drv->bdrv_aio_discard(bs, offset >> BDRV_SECTOR_BITS,
+                                            num >> BDRV_SECTOR_BITS,
                                             bdrv_co_io_em_complete, &co);
             if (acb == NULL) {
                 ret = -EIO;
@@ -2460,8 +2463,8 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
             goto out;
         }

-        sector_num += num;
-        nb_sectors -= num;
+        offset += num;
+        count -= num;
     }
     ret = 0;
 out:
diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index d767413..68f0a91 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -137,7 +137,8 @@ static int coroutine_fn raw_co_pwrite_zeroes(BlockDriverState *bs,
 static int coroutine_fn raw_co_discard(BlockDriverState *bs,
                                        int64_t sector_num, int nb_sectors)
 {
-    return bdrv_co_discard(bs->file->bs, sector_num, nb_sectors);
+    return bdrv_co_pdiscard(bs->file->bs, sector_num << BDRV_SECTOR_BITS,
+                            nb_sectors << BDRV_SECTOR_BITS);
 }

 static int64_t raw_getlength(BlockDriverState *bs)
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 02/19] block: Convert bdrv_discard() to byte-based
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
  2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 01/19] block: Convert bdrv_co_discard() to byte-based Eric Blake
@ 2016-07-15 23:22 ` Eric Blake
  2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 03/19] block: Switch BlockRequest " Eric Blake
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:22 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, pbonzini, kwolf, stefanha, Max Reitz, Fam Zheng

Another step towards byte-based interfaces everywhere.  Replace
the sector-based bdrv_discard() with a new byte-based
bdrv_pdiscard(), which silently ignores any unaligned head
or tail.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

---
v2: rebase to master, trivial enough to keep R-b
---
 include/block/block.h  |  2 +-
 block/block-backend.c  |  3 ++-
 block/io.c             | 19 +++++++++----------
 block/qcow2-refcount.c |  4 +---
 4 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 4f5cebf..94cabbb 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -341,7 +341,7 @@ void bdrv_drain(BlockDriverState *bs);
 void coroutine_fn bdrv_co_drain(BlockDriverState *bs);
 void bdrv_drain_all(void);

-int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
+int bdrv_pdiscard(BlockDriverState *bs, int64_t offset, int count);
 int bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset, int count);
 int bdrv_has_zero_init_1(BlockDriverState *bs);
 int bdrv_has_zero_init(BlockDriverState *bs);
diff --git a/block/block-backend.c b/block/block-backend.c
index d982cf9..83b6407 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1512,7 +1512,8 @@ int blk_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors)
         return ret;
     }

-    return bdrv_discard(blk_bs(blk), sector_num, nb_sectors);
+    return bdrv_pdiscard(blk_bs(blk), sector_num << BDRV_SECTOR_BITS,
+                         nb_sectors << BDRV_SECTOR_BITS);
 }

 int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
diff --git a/block/io.c b/block/io.c
index 4e04df2..2b4dc6e 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2371,16 +2371,15 @@ int bdrv_flush(BlockDriverState *bs)

 typedef struct DiscardCo {
     BlockDriverState *bs;
-    int64_t sector_num;
-    int nb_sectors;
+    int64_t offset;
+    int count;
     int ret;
 } DiscardCo;
-static void coroutine_fn bdrv_discard_co_entry(void *opaque)
+static void coroutine_fn bdrv_pdiscard_co_entry(void *opaque)
 {
     DiscardCo *rwco = opaque;

-    rwco->ret = bdrv_co_pdiscard(rwco->bs, rwco->sector_num << BDRV_SECTOR_BITS,
-                                 rwco->nb_sectors << BDRV_SECTOR_BITS);
+    rwco->ret = bdrv_co_pdiscard(rwco->bs, rwco->offset, rwco->count);
 }

 int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset,
@@ -2474,23 +2473,23 @@ out:
     return ret;
 }

-int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors)
+int bdrv_pdiscard(BlockDriverState *bs, int64_t offset, int count)
 {
     Coroutine *co;
     DiscardCo rwco = {
         .bs = bs,
-        .sector_num = sector_num,
-        .nb_sectors = nb_sectors,
+        .offset = offset,
+        .count = count,
         .ret = NOT_DONE,
     };

     if (qemu_in_coroutine()) {
         /* Fast-path if already in coroutine context */
-        bdrv_discard_co_entry(&rwco);
+        bdrv_pdiscard_co_entry(&rwco);
     } else {
         AioContext *aio_context = bdrv_get_aio_context(bs);

-        co = qemu_coroutine_create(bdrv_discard_co_entry, &rwco);
+        co = qemu_coroutine_create(bdrv_pdiscard_co_entry, &rwco);
         qemu_coroutine_enter(co);
         while (rwco.ret == NOT_DONE) {
             aio_poll(aio_context, true);
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 49b6ce6..cbfb3fe 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -615,9 +615,7 @@ void qcow2_process_discards(BlockDriverState *bs, int ret)

         /* Discard is optional, ignore the return value */
         if (ret >= 0) {
-            bdrv_discard(bs->file->bs,
-                         d->offset >> BDRV_SECTOR_BITS,
-                         d->bytes >> BDRV_SECTOR_BITS);
+            bdrv_pdiscard(bs->file->bs, d->offset, d->bytes);
         }

         g_free(d);
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 03/19] block: Switch BlockRequest to byte-based
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
  2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 01/19] block: Convert bdrv_co_discard() to byte-based Eric Blake
  2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 02/19] block: Convert bdrv_discard() " Eric Blake
@ 2016-07-15 23:22 ` Eric Blake
  2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 04/19] block: Convert bdrv_aio_discard() " Eric Blake
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:22 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, pbonzini, kwolf, stefanha, Fam Zheng, Max Reitz

BlockRequest is the internal struct used by bdrv_aio_*.  At the
moment, all such calls were sector-based, but we will eventually
convert to byte-based; start by changing the internal variables
to be byte-based.  No change to behavior, although the read and
write code can now go byte-based through more of the stack.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v2: Don't pass out-of-scope local qiov through aio, rebase to master
---
 block/io.c | 62 ++++++++++++++++++++++++++++++--------------------------------
 1 file changed, 30 insertions(+), 32 deletions(-)

diff --git a/block/io.c b/block/io.c
index 2b4dc6e..478aade 100644
--- a/block/io.c
+++ b/block/io.c
@@ -33,14 +33,13 @@

 #define NOT_DONE 0x7fffffff /* used while emulated sync operation in progress */

-static BlockAIOCB *bdrv_co_aio_rw_vector(BdrvChild *child,
-                                         int64_t sector_num,
-                                         QEMUIOVector *qiov,
-                                         int nb_sectors,
-                                         BdrvRequestFlags flags,
-                                         BlockCompletionFunc *cb,
-                                         void *opaque,
-                                         bool is_write);
+static BlockAIOCB *bdrv_co_aio_prw_vector(BdrvChild *child,
+                                          int64_t offset,
+                                          QEMUIOVector *qiov,
+                                          BdrvRequestFlags flags,
+                                          BlockCompletionFunc *cb,
+                                          void *opaque,
+                                          bool is_write);
 static void coroutine_fn bdrv_co_do_rw(void *opaque);
 static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
     int64_t offset, int count, BdrvRequestFlags flags);
@@ -2014,8 +2013,9 @@ BlockAIOCB *bdrv_aio_readv(BdrvChild *child, int64_t sector_num,
 {
     trace_bdrv_aio_readv(child->bs, sector_num, nb_sectors, opaque);

-    return bdrv_co_aio_rw_vector(child, sector_num, qiov, nb_sectors, 0,
-                                 cb, opaque, false);
+    assert(nb_sectors << BDRV_SECTOR_BITS == qiov->size);
+    return bdrv_co_aio_prw_vector(child, sector_num << BDRV_SECTOR_BITS, qiov,
+                                  0, cb, opaque, false);
 }

 BlockAIOCB *bdrv_aio_writev(BdrvChild *child, int64_t sector_num,
@@ -2024,8 +2024,9 @@ BlockAIOCB *bdrv_aio_writev(BdrvChild *child, int64_t sector_num,
 {
     trace_bdrv_aio_writev(child->bs, sector_num, nb_sectors, opaque);

-    return bdrv_co_aio_rw_vector(child, sector_num, qiov, nb_sectors, 0,
-                                 cb, opaque, true);
+    assert(nb_sectors << BDRV_SECTOR_BITS == qiov->size);
+    return bdrv_co_aio_prw_vector(child, sector_num << BDRV_SECTOR_BITS, qiov,
+                                  0, cb, opaque, true);
 }

 void bdrv_aio_cancel(BlockAIOCB *acb)
@@ -2061,8 +2062,8 @@ typedef struct BlockRequest {
     union {
         /* Used during read, write, trim */
         struct {
-            int64_t sector;
-            int nb_sectors;
+            int64_t offset;
+            int bytes;
             int flags;
             QEMUIOVector *qiov;
         };
@@ -2126,24 +2127,23 @@ static void coroutine_fn bdrv_co_do_rw(void *opaque)
     BlockAIOCBCoroutine *acb = opaque;

     if (!acb->is_write) {
-        acb->req.error = bdrv_co_do_readv(acb->child, acb->req.sector,
-            acb->req.nb_sectors, acb->req.qiov, acb->req.flags);
+        acb->req.error = bdrv_co_preadv(acb->child, acb->req.offset,
+            acb->req.qiov->size, acb->req.qiov, acb->req.flags);
     } else {
-        acb->req.error = bdrv_co_do_writev(acb->child, acb->req.sector,
-            acb->req.nb_sectors, acb->req.qiov, acb->req.flags);
+        acb->req.error = bdrv_co_pwritev(acb->child, acb->req.offset,
+            acb->req.qiov->size, acb->req.qiov, acb->req.flags);
     }

     bdrv_co_complete(acb);
 }

-static BlockAIOCB *bdrv_co_aio_rw_vector(BdrvChild *child,
-                                         int64_t sector_num,
-                                         QEMUIOVector *qiov,
-                                         int nb_sectors,
-                                         BdrvRequestFlags flags,
-                                         BlockCompletionFunc *cb,
-                                         void *opaque,
-                                         bool is_write)
+static BlockAIOCB *bdrv_co_aio_prw_vector(BdrvChild *child,
+                                          int64_t offset,
+                                          QEMUIOVector *qiov,
+                                          BdrvRequestFlags flags,
+                                          BlockCompletionFunc *cb,
+                                          void *opaque,
+                                          bool is_write)
 {
     Coroutine *co;
     BlockAIOCBCoroutine *acb;
@@ -2152,8 +2152,7 @@ static BlockAIOCB *bdrv_co_aio_rw_vector(BdrvChild *child,
     acb->child = child;
     acb->need_bh = true;
     acb->req.error = -EINPROGRESS;
-    acb->req.sector = sector_num;
-    acb->req.nb_sectors = nb_sectors;
+    acb->req.offset = offset;
     acb->req.qiov = qiov;
     acb->req.flags = flags;
     acb->is_write = is_write;
@@ -2198,8 +2197,7 @@ static void coroutine_fn bdrv_aio_discard_co_entry(void *opaque)
     BlockAIOCBCoroutine *acb = opaque;
     BlockDriverState *bs = acb->common.bs;

-    acb->req.error = bdrv_co_pdiscard(bs, acb->req.sector << BDRV_SECTOR_BITS,
-                                      acb->req.nb_sectors << BDRV_SECTOR_BITS);
+    acb->req.error = bdrv_co_pdiscard(bs, acb->req.offset, acb->req.bytes);
     bdrv_co_complete(acb);
 }

@@ -2215,8 +2213,8 @@ BlockAIOCB *bdrv_aio_discard(BlockDriverState *bs,
     acb = qemu_aio_get(&bdrv_em_co_aiocb_info, bs, cb, opaque);
     acb->need_bh = true;
     acb->req.error = -EINPROGRESS;
-    acb->req.sector = sector_num;
-    acb->req.nb_sectors = nb_sectors;
+    acb->req.offset = sector_num << BDRV_SECTOR_BITS;
+    acb->req.bytes = nb_sectors << BDRV_SECTOR_BITS;
     co = qemu_coroutine_create(bdrv_aio_discard_co_entry, acb);
     qemu_coroutine_enter(co);

-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 04/19] block: Convert bdrv_aio_discard() to byte-based
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
                   ` (2 preceding siblings ...)
  2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 03/19] block: Switch BlockRequest " Eric Blake
@ 2016-07-15 23:22 ` Eric Blake
  2016-07-15 23:22   ` Eric Blake
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:22 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, pbonzini, kwolf, stefanha, Max Reitz, Fam Zheng

Another step towards byte-based interfaces everywhere.  Replace
the sector-based bdrv_aio_discard() with a new byte-based
bdrv_aio_pdiscard(), which silently ignores any unaligned head
or tail.  Driver callbacks will be converted in followup patches.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v2: rebase to master and to qiov change in previous patch
---
 include/block/block.h |  6 +++---
 block/block-backend.c |  3 ++-
 block/io.c            | 15 +++++++--------
 block/trace-events    |  2 +-
 4 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 94cabbb..11c162d 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -316,9 +316,9 @@ BlockAIOCB *bdrv_aio_writev(BdrvChild *child, int64_t sector_num,
                             BlockCompletionFunc *cb, void *opaque);
 BlockAIOCB *bdrv_aio_flush(BlockDriverState *bs,
                            BlockCompletionFunc *cb, void *opaque);
-BlockAIOCB *bdrv_aio_discard(BlockDriverState *bs,
-                             int64_t sector_num, int nb_sectors,
-                             BlockCompletionFunc *cb, void *opaque);
+BlockAIOCB *bdrv_aio_pdiscard(BlockDriverState *bs,
+                              int64_t offset, int count,
+                              BlockCompletionFunc *cb, void *opaque);
 void bdrv_aio_cancel(BlockAIOCB *acb);
 void bdrv_aio_cancel_async(BlockAIOCB *acb);

diff --git a/block/block-backend.c b/block/block-backend.c
index 83b6407..8b16b95 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1074,7 +1074,8 @@ BlockAIOCB *blk_aio_discard(BlockBackend *blk,
         return blk_abort_aio_request(blk, cb, opaque, ret);
     }

-    return bdrv_aio_discard(blk_bs(blk), sector_num, nb_sectors, cb, opaque);
+    return bdrv_aio_pdiscard(blk_bs(blk), sector_num << BDRV_SECTOR_BITS,
+                             nb_sectors << BDRV_SECTOR_BITS, cb, opaque);
 }

 void blk_aio_cancel(BlockAIOCB *acb)
diff --git a/block/io.c b/block/io.c
index 478aade..3babbdc 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2192,7 +2192,7 @@ BlockAIOCB *bdrv_aio_flush(BlockDriverState *bs,
     return &acb->common;
 }

-static void coroutine_fn bdrv_aio_discard_co_entry(void *opaque)
+static void coroutine_fn bdrv_aio_pdiscard_co_entry(void *opaque)
 {
     BlockAIOCBCoroutine *acb = opaque;
     BlockDriverState *bs = acb->common.bs;
@@ -2201,21 +2201,20 @@ static void coroutine_fn bdrv_aio_discard_co_entry(void *opaque)
     bdrv_co_complete(acb);
 }

-BlockAIOCB *bdrv_aio_discard(BlockDriverState *bs,
-        int64_t sector_num, int nb_sectors,
-        BlockCompletionFunc *cb, void *opaque)
+BlockAIOCB *bdrv_aio_pdiscard(BlockDriverState *bs, int64_t offset, int count,
+                              BlockCompletionFunc *cb, void *opaque)
 {
     Coroutine *co;
     BlockAIOCBCoroutine *acb;

-    trace_bdrv_aio_discard(bs, sector_num, nb_sectors, opaque);
+    trace_bdrv_aio_pdiscard(bs, offset, count, opaque);

     acb = qemu_aio_get(&bdrv_em_co_aiocb_info, bs, cb, opaque);
     acb->need_bh = true;
     acb->req.error = -EINPROGRESS;
-    acb->req.offset = sector_num << BDRV_SECTOR_BITS;
-    acb->req.bytes = nb_sectors << BDRV_SECTOR_BITS;
-    co = qemu_coroutine_create(bdrv_aio_discard_co_entry, acb);
+    acb->req.offset = offset;
+    acb->req.bytes = count;
+    co = qemu_coroutine_create(bdrv_aio_pdiscard_co_entry, acb);
     qemu_coroutine_enter(co);

     bdrv_co_maybe_schedule_bh(acb);
diff --git a/block/trace-events b/block/trace-events
index 354967e..90d618a 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -9,7 +9,7 @@ blk_co_preadv(void *blk, void *bs, int64_t offset, unsigned int bytes, int flags
 blk_co_pwritev(void *blk, void *bs, int64_t offset, unsigned int bytes, int flags) "blk %p bs %p offset %"PRId64" bytes %u flags %x"

 # block/io.c
-bdrv_aio_discard(void *bs, int64_t sector_num, int nb_sectors, void *opaque) "bs %p sector_num %"PRId64" nb_sectors %d opaque %p"
+bdrv_aio_pdiscard(void *bs, int64_t offset, int count, void *opaque) "bs %p offset %"PRId64" count %d opaque %p"
 bdrv_aio_flush(void *bs, void *opaque) "bs %p opaque %p"
 bdrv_aio_readv(void *bs, int64_t sector_num, int nb_sectors, void *opaque) "bs %p sector_num %"PRId64" nb_sectors %d opaque %p"
 bdrv_aio_writev(void *bs, int64_t sector_num, int nb_sectors, void *opaque) "bs %p sector_num %"PRId64" nb_sectors %d opaque %p"
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 05/19] block: Convert BB interface to byte-based discards
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
@ 2016-07-15 23:22   ` Eric Blake
  2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 02/19] block: Convert bdrv_discard() " Eric Blake
                     ` (18 subsequent siblings)
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:22 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, pbonzini, kwolf, stefanha, Max Reitz, Jeff Cody,
	Stefano Stabellini, Anthony Perard, John Snow, open list:X86

Change sector-based blk_discard(), blk_co_discard(), and
blk_aio_discard() to instead be byte-based blk_pdiscard(),
blk_co_pdiscard(), and blk_aio_pdiscard().  NBD gets a lot
simpler now that ignoring the unaligned portion of a
byte-based discard request is handled under the hood by
the block layer.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

---
v2: tweak commit message for grep'ability
---
 include/sysemu/block-backend.h |  9 ++++-----
 block/block-backend.c          | 25 +++++++++++--------------
 block/mirror.c                 |  5 +++--
 hw/block/xen_disk.c            |  7 ++++---
 hw/ide/core.c                  |  6 ++++--
 hw/scsi/scsi-disk.c            |  8 ++++----
 nbd/server.c                   | 19 +++++--------------
 qemu-io-cmds.c                 |  3 +--
 8 files changed, 36 insertions(+), 46 deletions(-)

diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 3c3e82f..2da4905 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -139,15 +139,14 @@ BlockAIOCB *blk_aio_pwritev(BlockBackend *blk, int64_t offset,
                             BlockCompletionFunc *cb, void *opaque);
 BlockAIOCB *blk_aio_flush(BlockBackend *blk,
                           BlockCompletionFunc *cb, void *opaque);
-BlockAIOCB *blk_aio_discard(BlockBackend *blk,
-                            int64_t sector_num, int nb_sectors,
-                            BlockCompletionFunc *cb, void *opaque);
+BlockAIOCB *blk_aio_pdiscard(BlockBackend *blk, int64_t offset, int count,
+                             BlockCompletionFunc *cb, void *opaque);
 void blk_aio_cancel(BlockAIOCB *acb);
 void blk_aio_cancel_async(BlockAIOCB *acb);
 int blk_ioctl(BlockBackend *blk, unsigned long int req, void *buf);
 BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
                           BlockCompletionFunc *cb, void *opaque);
-int blk_co_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors);
+int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int count);
 int blk_co_flush(BlockBackend *blk);
 int blk_flush(BlockBackend *blk);
 int blk_flush_all(void);
@@ -207,7 +206,7 @@ int coroutine_fn blk_co_pwrite_zeroes(BlockBackend *blk, int64_t offset,
 int blk_write_compressed(BlockBackend *blk, int64_t sector_num,
                          const uint8_t *buf, int nb_sectors);
 int blk_truncate(BlockBackend *blk, int64_t offset);
-int blk_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors);
+int blk_pdiscard(BlockBackend *blk, int64_t offset, int count);
 int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
                      int64_t pos, int size);
 int blk_load_vmstate(BlockBackend *blk, uint8_t *buf, int64_t pos, int size);
diff --git a/block/block-backend.c b/block/block-backend.c
index 8b16b95..effa038 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1065,17 +1065,16 @@ BlockAIOCB *blk_aio_flush(BlockBackend *blk,
     return bdrv_aio_flush(blk_bs(blk), cb, opaque);
 }

-BlockAIOCB *blk_aio_discard(BlockBackend *blk,
-                            int64_t sector_num, int nb_sectors,
-                            BlockCompletionFunc *cb, void *opaque)
+BlockAIOCB *blk_aio_pdiscard(BlockBackend *blk,
+                             int64_t offset, int count,
+                             BlockCompletionFunc *cb, void *opaque)
 {
-    int ret = blk_check_request(blk, sector_num, nb_sectors);
+    int ret = blk_check_byte_request(blk, offset, count);
     if (ret < 0) {
         return blk_abort_aio_request(blk, cb, opaque, ret);
     }

-    return bdrv_aio_pdiscard(blk_bs(blk), sector_num << BDRV_SECTOR_BITS,
-                             nb_sectors << BDRV_SECTOR_BITS, cb, opaque);
+    return bdrv_aio_pdiscard(blk_bs(blk), offset, count, cb, opaque);
 }

 void blk_aio_cancel(BlockAIOCB *acb)
@@ -1107,15 +1106,14 @@ BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
     return bdrv_aio_ioctl(blk_bs(blk), req, buf, cb, opaque);
 }

-int blk_co_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors)
+int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int count)
 {
-    int ret = blk_check_request(blk, sector_num, nb_sectors);
+    int ret = blk_check_byte_request(blk, offset, count);
     if (ret < 0) {
         return ret;
     }

-    return bdrv_co_pdiscard(blk_bs(blk), sector_num << BDRV_SECTOR_BITS,
-                            nb_sectors << BDRV_SECTOR_BITS);
+    return bdrv_co_pdiscard(blk_bs(blk), offset, count);
 }

 int blk_co_flush(BlockBackend *blk)
@@ -1506,15 +1504,14 @@ int blk_truncate(BlockBackend *blk, int64_t offset)
     return bdrv_truncate(blk_bs(blk), offset);
 }

-int blk_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors)
+int blk_pdiscard(BlockBackend *blk, int64_t offset, int count)
 {
-    int ret = blk_check_request(blk, sector_num, nb_sectors);
+    int ret = blk_check_byte_request(blk, offset, count);
     if (ret < 0) {
         return ret;
     }

-    return bdrv_pdiscard(blk_bs(blk), sector_num << BDRV_SECTOR_BITS,
-                         nb_sectors << BDRV_SECTOR_BITS);
+    return bdrv_pdiscard(blk_bs(blk), offset, count);
 }

 int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
diff --git a/block/mirror.c b/block/mirror.c
index b1e633e..617bb18 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -303,8 +303,9 @@ static void mirror_do_zero_or_discard(MirrorBlockJob *s,
     s->in_flight++;
     s->sectors_in_flight += nb_sectors;
     if (is_discard) {
-        blk_aio_discard(s->target, sector_num, op->nb_sectors,
-                        mirror_write_complete, op);
+        blk_aio_pdiscard(s->target, sector_num << BDRV_SECTOR_BITS,
+                         op->nb_sectors << BDRV_SECTOR_BITS,
+                         mirror_write_complete, op);
     } else {
         blk_aio_pwrite_zeroes(s->target, sector_num * BDRV_SECTOR_SIZE,
                               op->nb_sectors * BDRV_SECTOR_SIZE,
diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 90aca73..3b8ad33 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -574,9 +574,10 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
     {
         struct blkif_request_discard *discard_req = (void *)&ioreq->req;
         ioreq->aio_inflight++;
-        blk_aio_discard(blkdev->blk,
-                        discard_req->sector_number, discard_req->nr_sectors,
-                        qemu_aio_complete, ioreq);
+        blk_aio_pdiscard(blkdev->blk,
+                         discard_req->sector_number << BDRV_SECTOR_BITS,
+                         discard_req->nr_sectors << BDRV_SECTOR_BITS,
+                         qemu_aio_complete, ioreq);
         break;
     }
     default:
diff --git a/hw/ide/core.c b/hw/ide/core.c
index f2d131b..894c4c8 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -423,8 +423,10 @@ static void ide_issue_trim_cb(void *opaque, int ret)
                 }

                 /* Got an entry! Submit and exit.  */
-                iocb->aiocb = blk_aio_discard(iocb->blk, sector, count,
-                                              ide_issue_trim_cb, opaque);
+                iocb->aiocb = blk_aio_pdiscard(iocb->blk,
+                                               sector << BDRV_SECTOR_BITS,
+                                               count << BDRV_SECTOR_BITS,
+                                               ide_issue_trim_cb, opaque);
                 return;
             }

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 8dbfc10..836a155 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -1609,10 +1609,10 @@ static void scsi_unmap_complete_noio(UnmapCBData *data, int ret)
             goto done;
         }

-        r->req.aiocb = blk_aio_discard(s->qdev.conf.blk,
-                                       sector_num * (s->qdev.blocksize / 512),
-                                       nb_sectors * (s->qdev.blocksize / 512),
-                                       scsi_unmap_complete, data);
+        r->req.aiocb = blk_aio_pdiscard(s->qdev.conf.blk,
+                                        sector_num * s->qdev.blocksize,
+                                        nb_sectors * s->qdev.blocksize,
+                                        scsi_unmap_complete, data);
         data->count--;
         data->inbuf += 16;
         return;
diff --git a/nbd/server.c b/nbd/server.c
index fbc82de..29e2099 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1182,20 +1182,11 @@ static void nbd_trip(void *opaque)
         break;
     case NBD_CMD_TRIM:
         TRACE("Request type is TRIM");
-        /* Ignore unaligned head or tail, until block layer adds byte
-         * interface */
-        if (request.len >= BDRV_SECTOR_SIZE) {
-            request.len -= (request.from + request.len) % BDRV_SECTOR_SIZE;
-            ret = blk_co_discard(exp->blk,
-                                 DIV_ROUND_UP(request.from + exp->dev_offset,
-                                              BDRV_SECTOR_SIZE),
-                                 request.len / BDRV_SECTOR_SIZE);
-            if (ret < 0) {
-                LOG("discard failed");
-                reply.error = -ret;
-            }
-        } else {
-            TRACE("trim request too small, ignoring");
+        ret = blk_co_pdiscard(exp->blk, request.from + exp->dev_offset,
+                              request.len);
+        if (ret < 0) {
+            LOG("discard failed");
+            reply.error = -ret;
         }
         if (nbd_co_send_reply(req, &reply, 0) < 0) {
             goto out;
diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 6e29edb..25954f5 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -1696,8 +1696,7 @@ static int discard_f(BlockBackend *blk, int argc, char **argv)
     }

     gettimeofday(&t1, NULL);
-    ret = blk_discard(blk, offset >> BDRV_SECTOR_BITS,
-                      count >> BDRV_SECTOR_BITS);
+    ret = blk_pdiscard(blk, offset, count);
     gettimeofday(&t2, NULL);

     if (ret < 0) {
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 05/19] block: Convert BB interface to byte-based discards
@ 2016-07-15 23:22   ` Eric Blake
  0 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:22 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, Stefano Stabellini, qemu-block, Jeff Cody, Max Reitz,
	open list:X86, stefanha, Anthony Perard, pbonzini, John Snow

Change sector-based blk_discard(), blk_co_discard(), and
blk_aio_discard() to instead be byte-based blk_pdiscard(),
blk_co_pdiscard(), and blk_aio_pdiscard().  NBD gets a lot
simpler now that ignoring the unaligned portion of a
byte-based discard request is handled under the hood by
the block layer.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

---
v2: tweak commit message for grep'ability
---
 include/sysemu/block-backend.h |  9 ++++-----
 block/block-backend.c          | 25 +++++++++++--------------
 block/mirror.c                 |  5 +++--
 hw/block/xen_disk.c            |  7 ++++---
 hw/ide/core.c                  |  6 ++++--
 hw/scsi/scsi-disk.c            |  8 ++++----
 nbd/server.c                   | 19 +++++--------------
 qemu-io-cmds.c                 |  3 +--
 8 files changed, 36 insertions(+), 46 deletions(-)

diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 3c3e82f..2da4905 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -139,15 +139,14 @@ BlockAIOCB *blk_aio_pwritev(BlockBackend *blk, int64_t offset,
                             BlockCompletionFunc *cb, void *opaque);
 BlockAIOCB *blk_aio_flush(BlockBackend *blk,
                           BlockCompletionFunc *cb, void *opaque);
-BlockAIOCB *blk_aio_discard(BlockBackend *blk,
-                            int64_t sector_num, int nb_sectors,
-                            BlockCompletionFunc *cb, void *opaque);
+BlockAIOCB *blk_aio_pdiscard(BlockBackend *blk, int64_t offset, int count,
+                             BlockCompletionFunc *cb, void *opaque);
 void blk_aio_cancel(BlockAIOCB *acb);
 void blk_aio_cancel_async(BlockAIOCB *acb);
 int blk_ioctl(BlockBackend *blk, unsigned long int req, void *buf);
 BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
                           BlockCompletionFunc *cb, void *opaque);
-int blk_co_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors);
+int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int count);
 int blk_co_flush(BlockBackend *blk);
 int blk_flush(BlockBackend *blk);
 int blk_flush_all(void);
@@ -207,7 +206,7 @@ int coroutine_fn blk_co_pwrite_zeroes(BlockBackend *blk, int64_t offset,
 int blk_write_compressed(BlockBackend *blk, int64_t sector_num,
                          const uint8_t *buf, int nb_sectors);
 int blk_truncate(BlockBackend *blk, int64_t offset);
-int blk_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors);
+int blk_pdiscard(BlockBackend *blk, int64_t offset, int count);
 int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
                      int64_t pos, int size);
 int blk_load_vmstate(BlockBackend *blk, uint8_t *buf, int64_t pos, int size);
diff --git a/block/block-backend.c b/block/block-backend.c
index 8b16b95..effa038 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1065,17 +1065,16 @@ BlockAIOCB *blk_aio_flush(BlockBackend *blk,
     return bdrv_aio_flush(blk_bs(blk), cb, opaque);
 }

-BlockAIOCB *blk_aio_discard(BlockBackend *blk,
-                            int64_t sector_num, int nb_sectors,
-                            BlockCompletionFunc *cb, void *opaque)
+BlockAIOCB *blk_aio_pdiscard(BlockBackend *blk,
+                             int64_t offset, int count,
+                             BlockCompletionFunc *cb, void *opaque)
 {
-    int ret = blk_check_request(blk, sector_num, nb_sectors);
+    int ret = blk_check_byte_request(blk, offset, count);
     if (ret < 0) {
         return blk_abort_aio_request(blk, cb, opaque, ret);
     }

-    return bdrv_aio_pdiscard(blk_bs(blk), sector_num << BDRV_SECTOR_BITS,
-                             nb_sectors << BDRV_SECTOR_BITS, cb, opaque);
+    return bdrv_aio_pdiscard(blk_bs(blk), offset, count, cb, opaque);
 }

 void blk_aio_cancel(BlockAIOCB *acb)
@@ -1107,15 +1106,14 @@ BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
     return bdrv_aio_ioctl(blk_bs(blk), req, buf, cb, opaque);
 }

-int blk_co_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors)
+int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int count)
 {
-    int ret = blk_check_request(blk, sector_num, nb_sectors);
+    int ret = blk_check_byte_request(blk, offset, count);
     if (ret < 0) {
         return ret;
     }

-    return bdrv_co_pdiscard(blk_bs(blk), sector_num << BDRV_SECTOR_BITS,
-                            nb_sectors << BDRV_SECTOR_BITS);
+    return bdrv_co_pdiscard(blk_bs(blk), offset, count);
 }

 int blk_co_flush(BlockBackend *blk)
@@ -1506,15 +1504,14 @@ int blk_truncate(BlockBackend *blk, int64_t offset)
     return bdrv_truncate(blk_bs(blk), offset);
 }

-int blk_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors)
+int blk_pdiscard(BlockBackend *blk, int64_t offset, int count)
 {
-    int ret = blk_check_request(blk, sector_num, nb_sectors);
+    int ret = blk_check_byte_request(blk, offset, count);
     if (ret < 0) {
         return ret;
     }

-    return bdrv_pdiscard(blk_bs(blk), sector_num << BDRV_SECTOR_BITS,
-                         nb_sectors << BDRV_SECTOR_BITS);
+    return bdrv_pdiscard(blk_bs(blk), offset, count);
 }

 int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
diff --git a/block/mirror.c b/block/mirror.c
index b1e633e..617bb18 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -303,8 +303,9 @@ static void mirror_do_zero_or_discard(MirrorBlockJob *s,
     s->in_flight++;
     s->sectors_in_flight += nb_sectors;
     if (is_discard) {
-        blk_aio_discard(s->target, sector_num, op->nb_sectors,
-                        mirror_write_complete, op);
+        blk_aio_pdiscard(s->target, sector_num << BDRV_SECTOR_BITS,
+                         op->nb_sectors << BDRV_SECTOR_BITS,
+                         mirror_write_complete, op);
     } else {
         blk_aio_pwrite_zeroes(s->target, sector_num * BDRV_SECTOR_SIZE,
                               op->nb_sectors * BDRV_SECTOR_SIZE,
diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 90aca73..3b8ad33 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -574,9 +574,10 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
     {
         struct blkif_request_discard *discard_req = (void *)&ioreq->req;
         ioreq->aio_inflight++;
-        blk_aio_discard(blkdev->blk,
-                        discard_req->sector_number, discard_req->nr_sectors,
-                        qemu_aio_complete, ioreq);
+        blk_aio_pdiscard(blkdev->blk,
+                         discard_req->sector_number << BDRV_SECTOR_BITS,
+                         discard_req->nr_sectors << BDRV_SECTOR_BITS,
+                         qemu_aio_complete, ioreq);
         break;
     }
     default:
diff --git a/hw/ide/core.c b/hw/ide/core.c
index f2d131b..894c4c8 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -423,8 +423,10 @@ static void ide_issue_trim_cb(void *opaque, int ret)
                 }

                 /* Got an entry! Submit and exit.  */
-                iocb->aiocb = blk_aio_discard(iocb->blk, sector, count,
-                                              ide_issue_trim_cb, opaque);
+                iocb->aiocb = blk_aio_pdiscard(iocb->blk,
+                                               sector << BDRV_SECTOR_BITS,
+                                               count << BDRV_SECTOR_BITS,
+                                               ide_issue_trim_cb, opaque);
                 return;
             }

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 8dbfc10..836a155 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -1609,10 +1609,10 @@ static void scsi_unmap_complete_noio(UnmapCBData *data, int ret)
             goto done;
         }

-        r->req.aiocb = blk_aio_discard(s->qdev.conf.blk,
-                                       sector_num * (s->qdev.blocksize / 512),
-                                       nb_sectors * (s->qdev.blocksize / 512),
-                                       scsi_unmap_complete, data);
+        r->req.aiocb = blk_aio_pdiscard(s->qdev.conf.blk,
+                                        sector_num * s->qdev.blocksize,
+                                        nb_sectors * s->qdev.blocksize,
+                                        scsi_unmap_complete, data);
         data->count--;
         data->inbuf += 16;
         return;
diff --git a/nbd/server.c b/nbd/server.c
index fbc82de..29e2099 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1182,20 +1182,11 @@ static void nbd_trip(void *opaque)
         break;
     case NBD_CMD_TRIM:
         TRACE("Request type is TRIM");
-        /* Ignore unaligned head or tail, until block layer adds byte
-         * interface */
-        if (request.len >= BDRV_SECTOR_SIZE) {
-            request.len -= (request.from + request.len) % BDRV_SECTOR_SIZE;
-            ret = blk_co_discard(exp->blk,
-                                 DIV_ROUND_UP(request.from + exp->dev_offset,
-                                              BDRV_SECTOR_SIZE),
-                                 request.len / BDRV_SECTOR_SIZE);
-            if (ret < 0) {
-                LOG("discard failed");
-                reply.error = -ret;
-            }
-        } else {
-            TRACE("trim request too small, ignoring");
+        ret = blk_co_pdiscard(exp->blk, request.from + exp->dev_offset,
+                              request.len);
+        if (ret < 0) {
+            LOG("discard failed");
+            reply.error = -ret;
         }
         if (nbd_co_send_reply(req, &reply, 0) < 0) {
             goto out;
diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 6e29edb..25954f5 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -1696,8 +1696,7 @@ static int discard_f(BlockBackend *blk, int argc, char **argv)
     }

     gettimeofday(&t1, NULL);
-    ret = blk_discard(blk, offset >> BDRV_SECTOR_BITS,
-                      count >> BDRV_SECTOR_BITS);
+    ret = blk_pdiscard(blk, offset, count);
     gettimeofday(&t2, NULL);

     if (ret < 0) {
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 06/19] raw-posix: Switch paio_submit() to byte-based
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
                   ` (4 preceding siblings ...)
  2016-07-15 23:22   ` Eric Blake
@ 2016-07-15 23:22 ` Eric Blake
  2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 07/19] rbd: Switch rbd_start_aio() " Eric Blake
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:22 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, pbonzini, kwolf, stefanha, Max Reitz

The only remaining uses of paio_submit() were flush (with no
offset or count) and discard (which we are switching to byte-based);
furthermore, the similarly named paio_submit_co() is already
byte-based.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/raw-posix.c  | 14 ++++++++------
 block/raw-win32.c  | 19 +++++++++++--------
 block/trace-events |  2 +-
 3 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index d1c3bd8..6ed329d 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1287,7 +1287,7 @@ static int paio_submit_co(BlockDriverState *bs, int fd,
 }

 static BlockAIOCB *paio_submit(BlockDriverState *bs, int fd,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+        int64_t offset, QEMUIOVector *qiov, int count,
         BlockCompletionFunc *cb, void *opaque, int type)
 {
     RawPosixAIOData *acb = g_new(RawPosixAIOData, 1);
@@ -1297,8 +1297,8 @@ static BlockAIOCB *paio_submit(BlockDriverState *bs, int fd,
     acb->aio_type = type;
     acb->aio_fildes = fd;

-    acb->aio_nbytes = nb_sectors * BDRV_SECTOR_SIZE;
-    acb->aio_offset = sector_num * BDRV_SECTOR_SIZE;
+    acb->aio_nbytes = count;
+    acb->aio_offset = offset;

     if (qiov) {
         acb->aio_iov = qiov->iov;
@@ -1306,7 +1306,7 @@ static BlockAIOCB *paio_submit(BlockDriverState *bs, int fd,
         assert(qiov->size == acb->aio_nbytes);
     }

-    trace_paio_submit(acb, opaque, sector_num, nb_sectors, type);
+    trace_paio_submit(acb, opaque, offset, count, type);
     pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
     return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
 }
@@ -1871,7 +1871,8 @@ static coroutine_fn BlockAIOCB *raw_aio_discard(BlockDriverState *bs,
 {
     BDRVRawState *s = bs->opaque;

-    return paio_submit(bs, s->fd, sector_num, NULL, nb_sectors,
+    return paio_submit(bs, s->fd, sector_num << BDRV_SECTOR_BITS, NULL,
+                       nb_sectors << BDRV_SECTOR_BITS,
                        cb, opaque, QEMU_AIO_DISCARD);
 }

@@ -2294,7 +2295,8 @@ static coroutine_fn BlockAIOCB *hdev_aio_discard(BlockDriverState *bs,
     if (fd_open(bs) < 0) {
         return NULL;
     }
-    return paio_submit(bs, s->fd, sector_num, NULL, nb_sectors,
+    return paio_submit(bs, s->fd, sector_num << BDRV_SECTOR_BITS, NULL,
+                       nb_sectors << BDRV_SECTOR_BITS,
                        cb, opaque, QEMU_AIO_DISCARD|QEMU_AIO_BLKDEV);
 }

diff --git a/block/raw-win32.c b/block/raw-win32.c
index 62edb1a..3ff53d6 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -142,7 +142,7 @@ static int aio_worker(void *arg)
 }

 static BlockAIOCB *paio_submit(BlockDriverState *bs, HANDLE hfile,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+        int64_t offset, QEMUIOVector *qiov, int count,
         BlockCompletionFunc *cb, void *opaque, int type)
 {
     RawWin32AIOData *acb = g_new(RawWin32AIOData, 1);
@@ -155,11 +155,12 @@ static BlockAIOCB *paio_submit(BlockDriverState *bs, HANDLE hfile,
     if (qiov) {
         acb->aio_iov = qiov->iov;
         acb->aio_niov = qiov->niov;
+        assert(qiov->size == count);
     }
-    acb->aio_nbytes = nb_sectors * 512;
-    acb->aio_offset = sector_num * 512;
+    acb->aio_nbytes = count;
+    acb->aio_offset = offset;

-    trace_paio_submit(acb, opaque, sector_num, nb_sectors, type);
+    trace_paio_submit(acb, opaque, offset, count, type);
     pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
     return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
 }
@@ -378,9 +379,10 @@ static BlockAIOCB *raw_aio_readv(BlockDriverState *bs,
     BDRVRawState *s = bs->opaque;
     if (s->aio) {
         return win32_aio_submit(bs, s->aio, s->hfile, sector_num, qiov,
-                                nb_sectors, cb, opaque, QEMU_AIO_READ); 
+                                nb_sectors, cb, opaque, QEMU_AIO_READ);
     } else {
-        return paio_submit(bs, s->hfile, sector_num, qiov, nb_sectors,
+        return paio_submit(bs, s->hfile, sector_num << BDRV_SECTOR_BITS, qiov,
+                           nb_sectors << BDRV_SECTOR_BITS,
                            cb, opaque, QEMU_AIO_READ);
     }
 }
@@ -392,9 +394,10 @@ static BlockAIOCB *raw_aio_writev(BlockDriverState *bs,
     BDRVRawState *s = bs->opaque;
     if (s->aio) {
         return win32_aio_submit(bs, s->aio, s->hfile, sector_num, qiov,
-                                nb_sectors, cb, opaque, QEMU_AIO_WRITE); 
+                                nb_sectors, cb, opaque, QEMU_AIO_WRITE);
     } else {
-        return paio_submit(bs, s->hfile, sector_num, qiov, nb_sectors,
+        return paio_submit(bs, s->hfile, sector_num << BDRV_SECTOR_BITS, qiov,
+                           nb_sectors << BDRV_SECTOR_BITS,
                            cb, opaque, QEMU_AIO_WRITE);
     }
 }
diff --git a/block/trace-events b/block/trace-events
index 90d618a..978ef4f 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -58,7 +58,7 @@ qmp_block_stream(void *bs, void *job) "bs %p job %p"
 # block/raw-win32.c
 # block/raw-posix.c
 paio_submit_co(int64_t offset, int count, int type) "offset %"PRId64" count %d type %d"
-paio_submit(void *acb, void *opaque, int64_t sector_num, int nb_sectors, int type) "acb %p opaque %p sector_num %"PRId64" nb_sectors %d type %d"
+paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) "acb %p opaque %p offset %"PRId64" count %d type %d"

 # block/qcow2.c
 qcow2_writev_start_req(void *co, int64_t offset, int bytes) "co %p offset %" PRIx64 " bytes %d"
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 07/19] rbd: Switch rbd_start_aio() to byte-based
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
                   ` (5 preceding siblings ...)
  2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 06/19] raw-posix: Switch paio_submit() to byte-based Eric Blake
@ 2016-07-15 23:22 ` Eric Blake
  2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 08/19] block: Convert .bdrv_aio_discard() " Eric Blake
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:22 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, pbonzini, kwolf, stefanha, Josh Durgin, Jeff Cody, Max Reitz

The internal function converts to byte-based before calling into
RBD code; hoist the conversion to the callers so that callers
can then be switched to byte-based themselves.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/rbd.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index 0a5840d..01cbb63 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -649,9 +649,9 @@ static int rbd_aio_flush_wrapper(rbd_image_t image,
 }

 static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
-                                 int64_t sector_num,
+                                 int64_t off,
                                  QEMUIOVector *qiov,
-                                 int nb_sectors,
+                                 int64_t size,
                                  BlockCompletionFunc *cb,
                                  void *opaque,
                                  RBDAIOCmd cmd)
@@ -659,7 +659,6 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
     RBDAIOCB *acb;
     RADOSCB *rcb = NULL;
     rbd_completion_t c;
-    int64_t off, size;
     char *buf;
     int r;

@@ -668,6 +667,7 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
     acb = qemu_aio_get(&rbd_aiocb_info, bs, cb, opaque);
     acb->cmd = cmd;
     acb->qiov = qiov;
+    assert(!qiov || qiov->size == size);
     if (cmd == RBD_AIO_DISCARD || cmd == RBD_AIO_FLUSH) {
         acb->bounce = NULL;
     } else {
@@ -687,9 +687,6 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,

     buf = acb->bounce;

-    off = sector_num * BDRV_SECTOR_SIZE;
-    size = nb_sectors * BDRV_SECTOR_SIZE;
-
     rcb = g_new(RADOSCB, 1);
     rcb->acb = acb;
     rcb->buf = buf;
@@ -739,7 +736,8 @@ static BlockAIOCB *qemu_rbd_aio_readv(BlockDriverState *bs,
                                       BlockCompletionFunc *cb,
                                       void *opaque)
 {
-    return rbd_start_aio(bs, sector_num, qiov, nb_sectors, cb, opaque,
+    return rbd_start_aio(bs, sector_num << BDRV_SECTOR_BITS, qiov,
+                         nb_sectors << BDRV_SECTOR_BITS, cb, opaque,
                          RBD_AIO_READ);
 }

@@ -750,7 +748,8 @@ static BlockAIOCB *qemu_rbd_aio_writev(BlockDriverState *bs,
                                        BlockCompletionFunc *cb,
                                        void *opaque)
 {
-    return rbd_start_aio(bs, sector_num, qiov, nb_sectors, cb, opaque,
+    return rbd_start_aio(bs, sector_num << BDRV_SECTOR_BITS, qiov,
+                         nb_sectors << BDRV_SECTOR_BITS, cb, opaque,
                          RBD_AIO_WRITE);
 }

@@ -937,7 +936,8 @@ static BlockAIOCB* qemu_rbd_aio_discard(BlockDriverState *bs,
                                         BlockCompletionFunc *cb,
                                         void *opaque)
 {
-    return rbd_start_aio(bs, sector_num, NULL, nb_sectors, cb, opaque,
+    return rbd_start_aio(bs, sector_num << BDRV_SECTOR_BITS, NULL,
+                         nb_sectors << BDRV_SECTOR_BITS, cb, opaque,
                          RBD_AIO_DISCARD);
 }
 #endif
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 08/19] block: Convert .bdrv_aio_discard() to byte-based
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
                   ` (6 preceding siblings ...)
  2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 07/19] rbd: Switch rbd_start_aio() " Eric Blake
@ 2016-07-15 23:22 ` Eric Blake
  2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 09/19] block: Add .bdrv_co_pdiscard() driver callback Eric Blake
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:22 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, pbonzini, kwolf, stefanha, Fam Zheng, Max Reitz,
	Josh Durgin, Jeff Cody

Another step towards byte-based interfaces everywhere.  Replace
the sector-based driver callback .bdrv_aio_discard() with a new
byte-based .bdrv_aio_pdiscard().  Only raw-posix and RBD drivers
are affected, so it was not worth splitting into multiple patches.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/block/block_int.h |  4 ++--
 block/io.c                |  7 +++----
 block/raw-posix.c         | 18 ++++++++----------
 block/rbd.c               | 15 +++++++--------
 4 files changed, 20 insertions(+), 24 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8054146..0cbe250 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -142,8 +142,8 @@ struct BlockDriver {
         BlockCompletionFunc *cb, void *opaque);
     BlockAIOCB *(*bdrv_aio_flush)(BlockDriverState *bs,
         BlockCompletionFunc *cb, void *opaque);
-    BlockAIOCB *(*bdrv_aio_discard)(BlockDriverState *bs,
-        int64_t sector_num, int nb_sectors,
+    BlockAIOCB *(*bdrv_aio_pdiscard)(BlockDriverState *bs,
+        int64_t offset, int count,
         BlockCompletionFunc *cb, void *opaque);

     int coroutine_fn (*bdrv_co_readv)(BlockDriverState *bs,
diff --git a/block/io.c b/block/io.c
index 3babbdc..40d8444 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2403,7 +2403,7 @@ int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset,
         return 0;
     }

-    if (!bs->drv->bdrv_co_discard && !bs->drv->bdrv_aio_discard) {
+    if (!bs->drv->bdrv_co_discard && !bs->drv->bdrv_aio_pdiscard) {
         return 0;
     }

@@ -2444,9 +2444,8 @@ int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset,
                 .coroutine = qemu_coroutine_self(),
             };

-            acb = bs->drv->bdrv_aio_discard(bs, offset >> BDRV_SECTOR_BITS,
-                                            num >> BDRV_SECTOR_BITS,
-                                            bdrv_co_io_em_complete, &co);
+            acb = bs->drv->bdrv_aio_pdiscard(bs, offset, num,
+                                             bdrv_co_io_em_complete, &co);
             if (acb == NULL) {
                 ret = -EIO;
                 goto out;
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 6ed329d..2c98cab 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1865,14 +1865,13 @@ static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
     return ret | BDRV_BLOCK_OFFSET_VALID | start;
 }

-static coroutine_fn BlockAIOCB *raw_aio_discard(BlockDriverState *bs,
-    int64_t sector_num, int nb_sectors,
+static coroutine_fn BlockAIOCB *raw_aio_pdiscard(BlockDriverState *bs,
+    int64_t offset, int count,
     BlockCompletionFunc *cb, void *opaque)
 {
     BDRVRawState *s = bs->opaque;

-    return paio_submit(bs, s->fd, sector_num << BDRV_SECTOR_BITS, NULL,
-                       nb_sectors << BDRV_SECTOR_BITS,
+    return paio_submit(bs, s->fd, offset, NULL, count,
                        cb, opaque, QEMU_AIO_DISCARD);
 }

@@ -1944,7 +1943,7 @@ BlockDriver bdrv_file = {
     .bdrv_co_preadv         = raw_co_preadv,
     .bdrv_co_pwritev        = raw_co_pwritev,
     .bdrv_aio_flush = raw_aio_flush,
-    .bdrv_aio_discard = raw_aio_discard,
+    .bdrv_aio_pdiscard = raw_aio_pdiscard,
     .bdrv_refresh_limits = raw_refresh_limits,
     .bdrv_io_plug = raw_aio_plug,
     .bdrv_io_unplug = raw_aio_unplug,
@@ -2286,8 +2285,8 @@ static int fd_open(BlockDriverState *bs)
     return -EIO;
 }

-static coroutine_fn BlockAIOCB *hdev_aio_discard(BlockDriverState *bs,
-    int64_t sector_num, int nb_sectors,
+static coroutine_fn BlockAIOCB *hdev_aio_pdiscard(BlockDriverState *bs,
+    int64_t offset, int count,
     BlockCompletionFunc *cb, void *opaque)
 {
     BDRVRawState *s = bs->opaque;
@@ -2295,8 +2294,7 @@ static coroutine_fn BlockAIOCB *hdev_aio_discard(BlockDriverState *bs,
     if (fd_open(bs) < 0) {
         return NULL;
     }
-    return paio_submit(bs, s->fd, sector_num << BDRV_SECTOR_BITS, NULL,
-                       nb_sectors << BDRV_SECTOR_BITS,
+    return paio_submit(bs, s->fd, offset, NULL, count,
                        cb, opaque, QEMU_AIO_DISCARD|QEMU_AIO_BLKDEV);
 }

@@ -2391,7 +2389,7 @@ static BlockDriver bdrv_host_device = {
     .bdrv_co_preadv         = raw_co_preadv,
     .bdrv_co_pwritev        = raw_co_pwritev,
     .bdrv_aio_flush	= raw_aio_flush,
-    .bdrv_aio_discard   = hdev_aio_discard,
+    .bdrv_aio_pdiscard   = hdev_aio_pdiscard,
     .bdrv_refresh_limits = raw_refresh_limits,
     .bdrv_io_plug = raw_aio_plug,
     .bdrv_io_unplug = raw_aio_unplug,
diff --git a/block/rbd.c b/block/rbd.c
index 01cbb63..0106fea 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -930,14 +930,13 @@ static int qemu_rbd_snap_list(BlockDriverState *bs,
 }

 #ifdef LIBRBD_SUPPORTS_DISCARD
-static BlockAIOCB* qemu_rbd_aio_discard(BlockDriverState *bs,
-                                        int64_t sector_num,
-                                        int nb_sectors,
-                                        BlockCompletionFunc *cb,
-                                        void *opaque)
+static BlockAIOCB *qemu_rbd_aio_pdiscard(BlockDriverState *bs,
+                                         int64_t offset,
+                                         int count,
+                                         BlockCompletionFunc *cb,
+                                         void *opaque)
 {
-    return rbd_start_aio(bs, sector_num << BDRV_SECTOR_BITS, NULL,
-                         nb_sectors << BDRV_SECTOR_BITS, cb, opaque,
+    return rbd_start_aio(bs, offset, NULL, count, cb, opaque,
                          RBD_AIO_DISCARD);
 }
 #endif
@@ -1001,7 +1000,7 @@ static BlockDriver bdrv_rbd = {
 #endif

 #ifdef LIBRBD_SUPPORTS_DISCARD
-    .bdrv_aio_discard       = qemu_rbd_aio_discard,
+    .bdrv_aio_pdiscard      = qemu_rbd_aio_pdiscard,
 #endif

     .bdrv_snapshot_create   = qemu_rbd_snap_create,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 09/19] block: Add .bdrv_co_pdiscard() driver callback
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
                   ` (7 preceding siblings ...)
  2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 08/19] block: Convert .bdrv_aio_discard() " Eric Blake
@ 2016-07-15 23:22 ` Eric Blake
  2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 10/19] blkreplay: Switch .bdrv_co_discard() to byte-based Eric Blake
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:22 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, pbonzini, kwolf, stefanha, Fam Zheng, Max Reitz

There's enough drivers with a sector-based callback that it will
be easier to switch one at a time.  This patch adds a byte-based
callback, and then after all drivers are swapped, we'll drop the
sector-based callback.

[checkpatch doesn't like the space after coroutine_fn in
block_int.h, but it's consistent with the rest of the file]

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/block/block_int.h | 2 ++
 block/io.c                | 7 +++++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 0cbe250..b4d4cd2 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -167,6 +167,8 @@ struct BlockDriver {
         int64_t offset, int count, BdrvRequestFlags flags);
     int coroutine_fn (*bdrv_co_discard)(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors);
+    int coroutine_fn (*bdrv_co_pdiscard)(BlockDriverState *bs,
+        int64_t offset, int count);
     int64_t coroutine_fn (*bdrv_co_get_block_status)(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, int *pnum,
         BlockDriverState **file);
diff --git a/block/io.c b/block/io.c
index 40d8444..ee87fbf 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2403,7 +2403,8 @@ int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset,
         return 0;
     }

-    if (!bs->drv->bdrv_co_discard && !bs->drv->bdrv_aio_pdiscard) {
+    if (!bs->drv->bdrv_co_discard && !bs->drv->bdrv_co_pdiscard &&
+        !bs->drv->bdrv_aio_pdiscard) {
         return 0;
     }

@@ -2435,7 +2436,9 @@ int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset,
         int ret;
         int num = MIN(count, max_pdiscard);

-        if (bs->drv->bdrv_co_discard) {
+        if (bs->drv->bdrv_co_pdiscard) {
+            ret = bs->drv->bdrv_co_pdiscard(bs, offset, num);
+        } else if (bs->drv->bdrv_co_discard) {
             ret = bs->drv->bdrv_co_discard(bs, offset >> BDRV_SECTOR_BITS,
                                            num >> BDRV_SECTOR_BITS);
         } else {
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 10/19] blkreplay: Switch .bdrv_co_discard() to byte-based
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
                   ` (8 preceding siblings ...)
  2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 09/19] block: Add .bdrv_co_pdiscard() driver callback Eric Blake
@ 2016-07-15 23:22 ` Eric Blake
  2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 11/19] gluster: " Eric Blake
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:22 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, pbonzini, kwolf, stefanha, Max Reitz

Another step towards killing off sector-based block APIs.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/blkreplay.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/block/blkreplay.c b/block/blkreplay.c
index c69e5a5..30f9d5f 100755
--- a/block/blkreplay.c
+++ b/block/blkreplay.c
@@ -114,12 +114,11 @@ static int coroutine_fn blkreplay_co_pwrite_zeroes(BlockDriverState *bs,
     return ret;
 }

-static int coroutine_fn blkreplay_co_discard(BlockDriverState *bs,
-    int64_t sector_num, int nb_sectors)
+static int coroutine_fn blkreplay_co_pdiscard(BlockDriverState *bs,
+                                              int64_t offset, int count)
 {
     uint64_t reqid = request_id++;
-    int ret = bdrv_co_pdiscard(bs->file->bs, sector_num << BDRV_SECTOR_BITS,
-                               nb_sectors << BDRV_SECTOR_BITS);
+    int ret = bdrv_co_pdiscard(bs->file->bs, offset, count);
     block_request_create(reqid, bs, qemu_coroutine_self());
     qemu_coroutine_yield();

@@ -149,7 +148,7 @@ static BlockDriver bdrv_blkreplay = {
     .bdrv_co_pwritev        = blkreplay_co_pwritev,

     .bdrv_co_pwrite_zeroes  = blkreplay_co_pwrite_zeroes,
-    .bdrv_co_discard        = blkreplay_co_discard,
+    .bdrv_co_pdiscard       = blkreplay_co_pdiscard,
     .bdrv_co_flush          = blkreplay_co_flush,
 };

-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 11/19] gluster: Switch .bdrv_co_discard() to byte-based
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
                   ` (9 preceding siblings ...)
  2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 10/19] blkreplay: Switch .bdrv_co_discard() to byte-based Eric Blake
@ 2016-07-15 23:23 ` Eric Blake
  2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 12/19] iscsi: " Eric Blake
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:23 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, pbonzini, kwolf, stefanha, Jeff Cody, Max Reitz

Another step towards killing off sector-based block APIs.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/gluster.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/block/gluster.c b/block/gluster.c
index 406c1e6..ef3b0de 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -724,14 +724,12 @@ error:
 }

 #ifdef CONFIG_GLUSTERFS_DISCARD
-static coroutine_fn int qemu_gluster_co_discard(BlockDriverState *bs,
-        int64_t sector_num, int nb_sectors)
+static coroutine_fn int qemu_gluster_co_pdiscard(BlockDriverState *bs,
+                                                 int64_t offset, int size)
 {
     int ret;
     GlusterAIOCB acb;
     BDRVGlusterState *s = bs->opaque;
-    size_t size = nb_sectors * BDRV_SECTOR_SIZE;
-    off_t offset = sector_num * BDRV_SECTOR_SIZE;

     acb.size = 0;
     acb.ret = 0;
@@ -976,7 +974,7 @@ static BlockDriver bdrv_gluster = {
     .bdrv_co_flush_to_disk        = qemu_gluster_co_flush_to_disk,
     .bdrv_has_zero_init           = qemu_gluster_has_zero_init,
 #ifdef CONFIG_GLUSTERFS_DISCARD
-    .bdrv_co_discard              = qemu_gluster_co_discard,
+    .bdrv_co_pdiscard             = qemu_gluster_co_pdiscard,
 #endif
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_pwrite_zeroes        = qemu_gluster_co_pwrite_zeroes,
@@ -1004,7 +1002,7 @@ static BlockDriver bdrv_gluster_tcp = {
     .bdrv_co_flush_to_disk        = qemu_gluster_co_flush_to_disk,
     .bdrv_has_zero_init           = qemu_gluster_has_zero_init,
 #ifdef CONFIG_GLUSTERFS_DISCARD
-    .bdrv_co_discard              = qemu_gluster_co_discard,
+    .bdrv_co_pdiscard             = qemu_gluster_co_pdiscard,
 #endif
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_pwrite_zeroes        = qemu_gluster_co_pwrite_zeroes,
@@ -1032,7 +1030,7 @@ static BlockDriver bdrv_gluster_unix = {
     .bdrv_co_flush_to_disk        = qemu_gluster_co_flush_to_disk,
     .bdrv_has_zero_init           = qemu_gluster_has_zero_init,
 #ifdef CONFIG_GLUSTERFS_DISCARD
-    .bdrv_co_discard              = qemu_gluster_co_discard,
+    .bdrv_co_pdiscard             = qemu_gluster_co_pdiscard,
 #endif
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_pwrite_zeroes        = qemu_gluster_co_pwrite_zeroes,
@@ -1060,7 +1058,7 @@ static BlockDriver bdrv_gluster_rdma = {
     .bdrv_co_flush_to_disk        = qemu_gluster_co_flush_to_disk,
     .bdrv_has_zero_init           = qemu_gluster_has_zero_init,
 #ifdef CONFIG_GLUSTERFS_DISCARD
-    .bdrv_co_discard              = qemu_gluster_co_discard,
+    .bdrv_co_pdiscard             = qemu_gluster_co_pdiscard,
 #endif
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
     .bdrv_co_pwrite_zeroes        = qemu_gluster_co_pwrite_zeroes,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 12/19] iscsi: Switch .bdrv_co_discard() to byte-based
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
                   ` (10 preceding siblings ...)
  2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 11/19] gluster: " Eric Blake
@ 2016-07-15 23:23 ` Eric Blake
  2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 13/19] nbd: " Eric Blake
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:23 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, pbonzini, kwolf, stefanha, Ronnie Sahlberg,
	Peter Lieven, Max Reitz

Another step towards killing off sector-based block APIs.

Unlike write_zeroes, where we can be handed unaligned requests
and must fail gracefully with -ENOTSUP for a fallback, we are
guaranteed that discard requests are always aligned because the
block layer already ignored unaligned head/tail.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/iscsi.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index bdc7ade..da509df 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -923,29 +923,26 @@ iscsi_getlength(BlockDriverState *bs)
 }

 static int
-coroutine_fn iscsi_co_discard(BlockDriverState *bs, int64_t sector_num,
-                                   int nb_sectors)
+coroutine_fn iscsi_co_pdiscard(BlockDriverState *bs, int64_t offset, int count)
 {
     IscsiLun *iscsilun = bs->opaque;
     struct IscsiTask iTask;
     struct unmap_list list;

-    if (!is_sector_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
-        return -EINVAL;
-    }
+    assert(is_byte_request_lun_aligned(offset, count, iscsilun));

     if (!iscsilun->lbp.lbpu) {
         /* UNMAP is not supported by the target */
         return 0;
     }

-    list.lba = sector_qemu2lun(sector_num, iscsilun);
-    list.num = sector_qemu2lun(nb_sectors, iscsilun);
+    list.lba = offset / iscsilun->block_size;
+    list.num = count / iscsilun->block_size;

     iscsi_co_init_iscsitask(iscsilun, &iTask);
 retry:
     if (iscsi_unmap_task(iscsilun->iscsi, iscsilun->lun, 0, 0, &list, 1,
-                     iscsi_co_generic_cb, &iTask) == NULL) {
+                         iscsi_co_generic_cb, &iTask) == NULL) {
         return -ENOMEM;
     }

@@ -975,7 +972,8 @@ retry:
         return iTask.err_code;
     }

-    iscsi_allocationmap_clear(iscsilun, sector_num, nb_sectors);
+    iscsi_allocationmap_clear(iscsilun, offset >> BDRV_SECTOR_BITS,
+                              count >> BDRV_SECTOR_BITS);

     return 0;
 }
@@ -1862,7 +1860,7 @@ static BlockDriver bdrv_iscsi = {
     .bdrv_refresh_limits = iscsi_refresh_limits,

     .bdrv_co_get_block_status = iscsi_co_get_block_status,
-    .bdrv_co_discard      = iscsi_co_discard,
+    .bdrv_co_pdiscard      = iscsi_co_pdiscard,
     .bdrv_co_pwrite_zeroes = iscsi_co_pwrite_zeroes,
     .bdrv_co_readv         = iscsi_co_readv,
     .bdrv_co_writev_flags  = iscsi_co_writev_flags,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 13/19] nbd: Switch .bdrv_co_discard() to byte-based
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
                   ` (11 preceding siblings ...)
  2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 12/19] iscsi: " Eric Blake
@ 2016-07-15 23:23 ` Eric Blake
  2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 14/19] qcow2: " Eric Blake
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:23 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, pbonzini, kwolf, stefanha, Max Reitz

Another step towards killing off sector-based block APIs.

While at it, call directly into nbd-client.c instead of having
a pointless trivial wrapper in nbd.c.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/nbd-client.h |  3 +--
 block/nbd-client.c | 11 ++++++-----
 block/nbd.c        | 12 +++---------
 3 files changed, 10 insertions(+), 16 deletions(-)

diff --git a/block/nbd-client.h b/block/nbd-client.h
index c618dad..62dec33 100644
--- a/block/nbd-client.h
+++ b/block/nbd-client.h
@@ -44,8 +44,7 @@ int nbd_client_init(BlockDriverState *bs,
                     Error **errp);
 void nbd_client_close(BlockDriverState *bs);

-int nbd_client_co_discard(BlockDriverState *bs, int64_t sector_num,
-                          int nb_sectors);
+int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int count);
 int nbd_client_co_flush(BlockDriverState *bs);
 int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
                          int nb_sectors, QEMUIOVector *qiov, int flags);
diff --git a/block/nbd-client.c b/block/nbd-client.c
index f184844..d22feea 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -295,19 +295,20 @@ int nbd_client_co_flush(BlockDriverState *bs)
     return -reply.error;
 }

-int nbd_client_co_discard(BlockDriverState *bs, int64_t sector_num,
-                          int nb_sectors)
+int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int count)
 {
     NbdClientSession *client = nbd_get_client_session(bs);
-    struct nbd_request request = { .type = NBD_CMD_TRIM };
+    struct nbd_request request = {
+        .type = NBD_CMD_TRIM,
+        .from = offset,
+        .len = count,
+    };
     struct nbd_reply reply;
     ssize_t ret;

     if (!(client->nbdflags & NBD_FLAG_SEND_TRIM)) {
         return 0;
     }
-    request.from = sector_num * 512;
-    request.len = nb_sectors * 512;

     nbd_coroutine_start(client, &request);
     ret = nbd_co_send_request(bs, &request, NULL);
diff --git a/block/nbd.c b/block/nbd.c
index 8a13078..42cae0e 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -360,12 +360,6 @@ static void nbd_refresh_limits(BlockDriverState *bs, Error **errp)
     bs->bl.max_transfer = NBD_MAX_BUFFER_SIZE;
 }

-static int nbd_co_discard(BlockDriverState *bs, int64_t sector_num,
-                          int nb_sectors)
-{
-    return nbd_client_co_discard(bs, sector_num, nb_sectors);
-}
-
 static void nbd_close(BlockDriverState *bs)
 {
     nbd_client_close(bs);
@@ -448,7 +442,7 @@ static BlockDriver bdrv_nbd = {
     .bdrv_co_writev_flags       = nbd_client_co_writev,
     .bdrv_close                 = nbd_close,
     .bdrv_co_flush_to_os        = nbd_co_flush,
-    .bdrv_co_discard            = nbd_co_discard,
+    .bdrv_co_pdiscard           = nbd_client_co_pdiscard,
     .bdrv_refresh_limits        = nbd_refresh_limits,
     .bdrv_getlength             = nbd_getlength,
     .bdrv_detach_aio_context    = nbd_detach_aio_context,
@@ -466,7 +460,7 @@ static BlockDriver bdrv_nbd_tcp = {
     .bdrv_co_writev_flags       = nbd_client_co_writev,
     .bdrv_close                 = nbd_close,
     .bdrv_co_flush_to_os        = nbd_co_flush,
-    .bdrv_co_discard            = nbd_co_discard,
+    .bdrv_co_pdiscard           = nbd_client_co_pdiscard,
     .bdrv_refresh_limits        = nbd_refresh_limits,
     .bdrv_getlength             = nbd_getlength,
     .bdrv_detach_aio_context    = nbd_detach_aio_context,
@@ -484,7 +478,7 @@ static BlockDriver bdrv_nbd_unix = {
     .bdrv_co_writev_flags       = nbd_client_co_writev,
     .bdrv_close                 = nbd_close,
     .bdrv_co_flush_to_os        = nbd_co_flush,
-    .bdrv_co_discard            = nbd_co_discard,
+    .bdrv_co_pdiscard           = nbd_client_co_pdiscard,
     .bdrv_refresh_limits        = nbd_refresh_limits,
     .bdrv_getlength             = nbd_getlength,
     .bdrv_detach_aio_context    = nbd_detach_aio_context,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 14/19] qcow2: Switch .bdrv_co_discard() to byte-based
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
                   ` (12 preceding siblings ...)
  2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 13/19] nbd: " Eric Blake
@ 2016-07-15 23:23 ` Eric Blake
  2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 15/19] raw_bsd: " Eric Blake
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:23 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, pbonzini, kwolf, stefanha, Max Reitz

Another step towards killing off sector-based block APIs.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/qcow2.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index a6bca73..d620d0a 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2479,15 +2479,15 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
     return ret;
 }

-static coroutine_fn int qcow2_co_discard(BlockDriverState *bs,
-    int64_t sector_num, int nb_sectors)
+static coroutine_fn int qcow2_co_pdiscard(BlockDriverState *bs,
+                                          int64_t offset, int count)
 {
     int ret;
     BDRVQcow2State *s = bs->opaque;

     qemu_co_mutex_lock(&s->lock);
-    ret = qcow2_discard_clusters(bs, sector_num << BDRV_SECTOR_BITS,
-        nb_sectors, QCOW2_DISCARD_REQUEST, false);
+    ret = qcow2_discard_clusters(bs, offset, count >> BDRV_SECTOR_BITS,
+                                 QCOW2_DISCARD_REQUEST, false);
     qemu_co_mutex_unlock(&s->lock);
     return ret;
 }
@@ -3410,7 +3410,7 @@ BlockDriver bdrv_qcow2 = {
     .bdrv_co_flush_to_os    = qcow2_co_flush_to_os,

     .bdrv_co_pwrite_zeroes  = qcow2_co_pwrite_zeroes,
-    .bdrv_co_discard        = qcow2_co_discard,
+    .bdrv_co_pdiscard       = qcow2_co_pdiscard,
     .bdrv_truncate          = qcow2_truncate,
     .bdrv_write_compressed  = qcow2_write_compressed,
     .bdrv_make_empty        = qcow2_make_empty,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 15/19] raw_bsd: Switch .bdrv_co_discard() to byte-based
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
                   ` (13 preceding siblings ...)
  2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 14/19] qcow2: " Eric Blake
@ 2016-07-15 23:23 ` Eric Blake
  2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 16/19] sheepdog: " Eric Blake
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:23 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, pbonzini, kwolf, stefanha, Max Reitz

Another step towards killing off sector-based block APIs.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/raw_bsd.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index 68f0a91..961aa13 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -134,11 +134,10 @@ static int coroutine_fn raw_co_pwrite_zeroes(BlockDriverState *bs,
     return bdrv_co_pwrite_zeroes(bs->file, offset, count, flags);
 }

-static int coroutine_fn raw_co_discard(BlockDriverState *bs,
-                                       int64_t sector_num, int nb_sectors)
+static int coroutine_fn raw_co_pdiscard(BlockDriverState *bs,
+                                        int64_t offset, int count)
 {
-    return bdrv_co_pdiscard(bs->file->bs, sector_num << BDRV_SECTOR_BITS,
-                            nb_sectors << BDRV_SECTOR_BITS);
+    return bdrv_co_pdiscard(bs->file->bs, offset, count);
 }

 static int64_t raw_getlength(BlockDriverState *bs)
@@ -244,7 +243,7 @@ BlockDriver bdrv_raw = {
     .bdrv_co_readv        = &raw_co_readv,
     .bdrv_co_writev_flags = &raw_co_writev_flags,
     .bdrv_co_pwrite_zeroes = &raw_co_pwrite_zeroes,
-    .bdrv_co_discard      = &raw_co_discard,
+    .bdrv_co_pdiscard     = &raw_co_pdiscard,
     .bdrv_co_get_block_status = &raw_co_get_block_status,
     .bdrv_truncate        = &raw_truncate,
     .bdrv_getlength       = &raw_getlength,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 16/19] sheepdog: Switch .bdrv_co_discard() to byte-based
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
                   ` (14 preceding siblings ...)
  2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 15/19] raw_bsd: " Eric Blake
@ 2016-07-15 23:23 ` Eric Blake
  2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 17/19] block: Kill .bdrv_co_discard() Eric Blake
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:23 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, pbonzini, kwolf, stefanha, Hitoshi Mitake, Liu Yuan,
	Jeff Cody, Max Reitz, open list:Sheepdog

Another step towards killing off sector-based block APIs.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/sheepdog.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index e739c56..66e1cb2 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -2800,8 +2800,8 @@ static int sd_load_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
 }


-static coroutine_fn int sd_co_discard(BlockDriverState *bs, int64_t sector_num,
-                                      int nb_sectors)
+static coroutine_fn int sd_co_pdiscard(BlockDriverState *bs, int64_t offset,
+                                      int count)
 {
     SheepdogAIOCB *acb;
     BDRVSheepdogState *s = bs->opaque;
@@ -2811,7 +2811,7 @@ static coroutine_fn int sd_co_discard(BlockDriverState *bs, int64_t sector_num,
     uint32_t zero = 0;

     if (!s->discard_supported) {
-            return 0;
+        return 0;
     }

     memset(&discard_iov, 0, sizeof(discard_iov));
@@ -2820,7 +2820,10 @@ static coroutine_fn int sd_co_discard(BlockDriverState *bs, int64_t sector_num,
     iov.iov_len = sizeof(zero);
     discard_iov.iov = &iov;
     discard_iov.niov = 1;
-    acb = sd_aio_setup(bs, &discard_iov, sector_num, nb_sectors);
+    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+    assert((count & (BDRV_SECTOR_SIZE - 1)) == 0);
+    acb = sd_aio_setup(bs, &discard_iov, offset >> BDRV_SECTOR_BITS,
+                       count >> BDRV_SECTOR_BITS);
     acb->aiocb_type = AIOCB_DISCARD_OBJ;
     acb->aio_done_func = sd_finish_aiocb;

@@ -2954,7 +2957,7 @@ static BlockDriver bdrv_sheepdog = {
     .bdrv_co_readv  = sd_co_readv,
     .bdrv_co_writev = sd_co_writev,
     .bdrv_co_flush_to_disk  = sd_co_flush_to_disk,
-    .bdrv_co_discard = sd_co_discard,
+    .bdrv_co_pdiscard = sd_co_pdiscard,
     .bdrv_co_get_block_status = sd_co_get_block_status,

     .bdrv_snapshot_create   = sd_snapshot_create,
@@ -2990,7 +2993,7 @@ static BlockDriver bdrv_sheepdog_tcp = {
     .bdrv_co_readv  = sd_co_readv,
     .bdrv_co_writev = sd_co_writev,
     .bdrv_co_flush_to_disk  = sd_co_flush_to_disk,
-    .bdrv_co_discard = sd_co_discard,
+    .bdrv_co_pdiscard = sd_co_pdiscard,
     .bdrv_co_get_block_status = sd_co_get_block_status,

     .bdrv_snapshot_create   = sd_snapshot_create,
@@ -3026,7 +3029,7 @@ static BlockDriver bdrv_sheepdog_unix = {
     .bdrv_co_readv  = sd_co_readv,
     .bdrv_co_writev = sd_co_writev,
     .bdrv_co_flush_to_disk  = sd_co_flush_to_disk,
-    .bdrv_co_discard = sd_co_discard,
+    .bdrv_co_pdiscard = sd_co_pdiscard,
     .bdrv_co_get_block_status = sd_co_get_block_status,

     .bdrv_snapshot_create   = sd_snapshot_create,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 17/19] block: Kill .bdrv_co_discard()
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
                   ` (15 preceding siblings ...)
  2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 16/19] sheepdog: " Eric Blake
@ 2016-07-15 23:23 ` Eric Blake
  2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 18/19] nbd: Convert to byte-based interface Eric Blake
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:23 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, pbonzini, kwolf, stefanha, Fam Zheng, Max Reitz

Now that all drivers have a byte-based .bdrv_co_pdiscard(), we
no longer need to worry about the sector-based version.  We can
also relax our minimum alignment to 1 for drivers that support it.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/block/block_int.h | 2 --
 block/io.c                | 9 ++-------
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index b4d4cd2..42bbed4 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -165,8 +165,6 @@ struct BlockDriver {
      */
     int coroutine_fn (*bdrv_co_pwrite_zeroes)(BlockDriverState *bs,
         int64_t offset, int count, BdrvRequestFlags flags);
-    int coroutine_fn (*bdrv_co_discard)(BlockDriverState *bs,
-        int64_t sector_num, int nb_sectors);
     int coroutine_fn (*bdrv_co_pdiscard)(BlockDriverState *bs,
         int64_t offset, int count);
     int64_t coroutine_fn (*bdrv_co_get_block_status)(BlockDriverState *bs,
diff --git a/block/io.c b/block/io.c
index ee87fbf..b81d1fc 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2403,14 +2403,12 @@ int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset,
         return 0;
     }

-    if (!bs->drv->bdrv_co_discard && !bs->drv->bdrv_co_pdiscard &&
-        !bs->drv->bdrv_aio_pdiscard) {
+    if (!bs->drv->bdrv_co_pdiscard && !bs->drv->bdrv_aio_pdiscard) {
         return 0;
     }

     /* Discard is advisory, so ignore any unaligned head or tail */
-    align = MAX(BDRV_SECTOR_SIZE,
-                MAX(bs->bl.pdiscard_alignment, bs->bl.request_alignment));
+    align = MAX(bs->bl.pdiscard_alignment, bs->bl.request_alignment);
     assert(is_power_of_2(align));
     head = MIN(count, -offset & (align - 1));
     if (head) {
@@ -2438,9 +2436,6 @@ int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset,

         if (bs->drv->bdrv_co_pdiscard) {
             ret = bs->drv->bdrv_co_pdiscard(bs, offset, num);
-        } else if (bs->drv->bdrv_co_discard) {
-            ret = bs->drv->bdrv_co_discard(bs, offset >> BDRV_SECTOR_BITS,
-                                           num >> BDRV_SECTOR_BITS);
         } else {
             BlockAIOCB *acb;
             CoroutineIOCompletion co = {
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 18/19] nbd: Convert to byte-based interface
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
                   ` (16 preceding siblings ...)
  2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 17/19] block: Kill .bdrv_co_discard() Eric Blake
@ 2016-07-15 23:23 ` Eric Blake
  2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 19/19] raw_bsd: " Eric Blake
  2016-07-19 16:12 ` [Qemu-devel] [Qemu-block] [PATCH for-2.7 v2 00/19] byte-based block discard Stefan Hajnoczi
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:23 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, pbonzini, kwolf, stefanha, Max Reitz

The NBD protocol doesn't have any notion of sectors, so it is
a fairly easy conversion to use byte-based read and write.

Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>

---
v2: fix typo in commit message
---
 block/nbd-client.h  |  8 ++++----
 include/block/nbd.h |  1 -
 block/nbd-client.c  | 30 +++++++++++++++++-------------
 block/nbd.c         | 12 ++++++------
 4 files changed, 27 insertions(+), 24 deletions(-)

diff --git a/block/nbd-client.h b/block/nbd-client.h
index 62dec33..fa9817b 100644
--- a/block/nbd-client.h
+++ b/block/nbd-client.h
@@ -46,10 +46,10 @@ void nbd_client_close(BlockDriverState *bs);

 int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int count);
 int nbd_client_co_flush(BlockDriverState *bs);
-int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
-                         int nb_sectors, QEMUIOVector *qiov, int flags);
-int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
-                        int nb_sectors, QEMUIOVector *qiov);
+int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
+                          uint64_t bytes, QEMUIOVector *qiov, int flags);
+int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
+                         uint64_t bytes, QEMUIOVector *qiov, int flags);

 void nbd_client_detach_aio_context(BlockDriverState *bs);
 void nbd_client_attach_aio_context(BlockDriverState *bs,
diff --git a/include/block/nbd.h b/include/block/nbd.h
index 503f514..cb91820 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -77,7 +77,6 @@ enum {

 /* Maximum size of a single READ/WRITE data buffer */
 #define NBD_MAX_BUFFER_SIZE (32 * 1024 * 1024)
-#define NBD_MAX_SECTORS (NBD_MAX_BUFFER_SIZE / BDRV_SECTOR_SIZE)

 /* Maximum size of an export name. The NBD spec requires 256 and
  * suggests that servers support up to 4096, but we stick to only the
diff --git a/block/nbd-client.c b/block/nbd-client.c
index d22feea..2cf3237 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -218,17 +218,20 @@ static void nbd_coroutine_end(NbdClientSession *s,
     }
 }

-int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
-                        int nb_sectors, QEMUIOVector *qiov)
+int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
+                         uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
     NbdClientSession *client = nbd_get_client_session(bs);
-    struct nbd_request request = { .type = NBD_CMD_READ };
+    struct nbd_request request = {
+        .type = NBD_CMD_READ,
+        .from = offset,
+        .len = bytes,
+    };
     struct nbd_reply reply;
     ssize_t ret;

-    assert(nb_sectors <= NBD_MAX_SECTORS);
-    request.from = sector_num * 512;
-    request.len = nb_sectors * 512;
+    assert(bytes <= NBD_MAX_BUFFER_SIZE);
+    assert(!flags);

     nbd_coroutine_start(client, &request);
     ret = nbd_co_send_request(bs, &request, NULL);
@@ -239,14 +242,17 @@ int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
     }
     nbd_coroutine_end(client, &request);
     return -reply.error;
-
 }

-int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
-                         int nb_sectors, QEMUIOVector *qiov, int flags)
+int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
+                          uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
     NbdClientSession *client = nbd_get_client_session(bs);
-    struct nbd_request request = { .type = NBD_CMD_WRITE };
+    struct nbd_request request = {
+        .type = NBD_CMD_WRITE,
+        .from = offset,
+        .len = bytes,
+    };
     struct nbd_reply reply;
     ssize_t ret;

@@ -255,9 +261,7 @@ int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
         request.type |= NBD_CMD_FLAG_FUA;
     }

-    assert(nb_sectors <= NBD_MAX_SECTORS);
-    request.from = sector_num * 512;
-    request.len = nb_sectors * 512;
+    assert(bytes <= NBD_MAX_BUFFER_SIZE);

     nbd_coroutine_start(client, &request);
     ret = nbd_co_send_request(bs, &request, qiov);
diff --git a/block/nbd.c b/block/nbd.c
index 42cae0e..8d57220 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -438,8 +438,8 @@ static BlockDriver bdrv_nbd = {
     .instance_size              = sizeof(BDRVNBDState),
     .bdrv_parse_filename        = nbd_parse_filename,
     .bdrv_file_open             = nbd_open,
-    .bdrv_co_readv              = nbd_client_co_readv,
-    .bdrv_co_writev_flags       = nbd_client_co_writev,
+    .bdrv_co_preadv             = nbd_client_co_preadv,
+    .bdrv_co_pwritev            = nbd_client_co_pwritev,
     .bdrv_close                 = nbd_close,
     .bdrv_co_flush_to_os        = nbd_co_flush,
     .bdrv_co_pdiscard           = nbd_client_co_pdiscard,
@@ -456,8 +456,8 @@ static BlockDriver bdrv_nbd_tcp = {
     .instance_size              = sizeof(BDRVNBDState),
     .bdrv_parse_filename        = nbd_parse_filename,
     .bdrv_file_open             = nbd_open,
-    .bdrv_co_readv              = nbd_client_co_readv,
-    .bdrv_co_writev_flags       = nbd_client_co_writev,
+    .bdrv_co_preadv             = nbd_client_co_preadv,
+    .bdrv_co_pwritev            = nbd_client_co_pwritev,
     .bdrv_close                 = nbd_close,
     .bdrv_co_flush_to_os        = nbd_co_flush,
     .bdrv_co_pdiscard           = nbd_client_co_pdiscard,
@@ -474,8 +474,8 @@ static BlockDriver bdrv_nbd_unix = {
     .instance_size              = sizeof(BDRVNBDState),
     .bdrv_parse_filename        = nbd_parse_filename,
     .bdrv_file_open             = nbd_open,
-    .bdrv_co_readv              = nbd_client_co_readv,
-    .bdrv_co_writev_flags       = nbd_client_co_writev,
+    .bdrv_co_preadv             = nbd_client_co_preadv,
+    .bdrv_co_pwritev            = nbd_client_co_pwritev,
     .bdrv_close                 = nbd_close,
     .bdrv_co_flush_to_os        = nbd_co_flush,
     .bdrv_co_pdiscard           = nbd_client_co_pdiscard,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH v2 19/19] raw_bsd: Convert to byte-based interface
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
                   ` (17 preceding siblings ...)
  2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 18/19] nbd: Convert to byte-based interface Eric Blake
@ 2016-07-15 23:23 ` Eric Blake
  2016-07-19 16:12 ` [Qemu-devel] [Qemu-block] [PATCH for-2.7 v2 00/19] byte-based block discard Stefan Hajnoczi
  19 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2016-07-15 23:23 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, pbonzini, kwolf, stefanha, Max Reitz

Since the raw format driver is just passing things through, we can
do byte-based read and write if the underlying protocol does
likewise.

There's one tricky part - if we probed the image format, we document
that we restrict operations on the initial sector.  It's easiest to
keep this guarantee by enforcing read-modify-write on sub-sector
operations (yes, this partially reverts commit ad82be2f).

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v2: Rather than reject sub-sector write to sector 0, enforce a
RMW by setting request_alignment [Paolo]
---
 block/raw_bsd.c | 45 ++++++++++++++++++++++++++-------------------
 1 file changed, 26 insertions(+), 19 deletions(-)

diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index 961aa13..588d408 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -50,33 +50,30 @@ static int raw_reopen_prepare(BDRVReopenState *reopen_state,
     return 0;
 }

-static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num,
-                                     int nb_sectors, QEMUIOVector *qiov)
+static int coroutine_fn raw_co_preadv(BlockDriverState *bs, uint64_t offset,
+                                      uint64_t bytes, QEMUIOVector *qiov,
+                                      int flags)
 {
     BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
-    return bdrv_co_readv(bs->file, sector_num, nb_sectors, qiov);
+    return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
 }

-static int coroutine_fn
-raw_co_writev_flags(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
-                    QEMUIOVector *qiov, int flags)
+static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, uint64_t offset,
+                                       uint64_t bytes, QEMUIOVector *qiov,
+                                       int flags)
 {
     void *buf = NULL;
     BlockDriver *drv;
     QEMUIOVector local_qiov;
     int ret;

-    if (bs->probed && sector_num == 0) {
-        /* As long as these conditions are true, we can't get partial writes to
-         * the probe buffer and can just directly check the request. */
+    if (bs->probed && offset < BLOCK_PROBE_BUF_SIZE && bytes) {
+        /* Handling partial writes would be a pain - so we just
+         * require that guests have 512-byte request alignment if
+         * probing occurred */
         QEMU_BUILD_BUG_ON(BLOCK_PROBE_BUF_SIZE != 512);
         QEMU_BUILD_BUG_ON(BDRV_SECTOR_SIZE != 512);
-
-        if (nb_sectors == 0) {
-            /* qemu_iovec_to_buf() would fail, but we want to return success
-             * instead of -EINVAL in this case. */
-            return 0;
-        }
+        assert(offset == 0 && bytes >= BLOCK_PROBE_BUF_SIZE);

         buf = qemu_try_blockalign(bs->file->bs, 512);
         if (!buf) {
@@ -105,8 +102,7 @@ raw_co_writev_flags(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
     }

     BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
-    ret = bdrv_co_pwritev(bs->file, sector_num * BDRV_SECTOR_SIZE,
-                          nb_sectors * BDRV_SECTOR_SIZE, qiov, flags);
+    ret = bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);

 fail:
     if (qiov == &local_qiov) {
@@ -150,6 +146,16 @@ static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
     return bdrv_get_info(bs->file->bs, bdi);
 }

+static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    if (bs->probed) {
+        /* To make it easier to protect the first sector, any probed
+         * image is restricted to read-modify-write on sub-sector
+         * operations. */
+        bs->bl.request_alignment = BDRV_SECTOR_SIZE;
+    }
+}
+
 static int raw_truncate(BlockDriverState *bs, int64_t offset)
 {
     return bdrv_truncate(bs->file->bs, offset);
@@ -240,8 +246,8 @@ BlockDriver bdrv_raw = {
     .bdrv_open            = &raw_open,
     .bdrv_close           = &raw_close,
     .bdrv_create          = &raw_create,
-    .bdrv_co_readv        = &raw_co_readv,
-    .bdrv_co_writev_flags = &raw_co_writev_flags,
+    .bdrv_co_preadv       = &raw_co_preadv,
+    .bdrv_co_pwritev      = &raw_co_pwritev,
     .bdrv_co_pwrite_zeroes = &raw_co_pwrite_zeroes,
     .bdrv_co_pdiscard     = &raw_co_pdiscard,
     .bdrv_co_get_block_status = &raw_co_get_block_status,
@@ -249,6 +255,7 @@ BlockDriver bdrv_raw = {
     .bdrv_getlength       = &raw_getlength,
     .has_variable_length  = true,
     .bdrv_get_info        = &raw_get_info,
+    .bdrv_refresh_limits  = &raw_refresh_limits,
     .bdrv_probe_blocksizes = &raw_probe_blocksizes,
     .bdrv_probe_geometry  = &raw_probe_geometry,
     .bdrv_media_changed   = &raw_media_changed,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH for-2.7 v2 00/19] byte-based block discard
  2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
                   ` (18 preceding siblings ...)
  2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 19/19] raw_bsd: " Eric Blake
@ 2016-07-19 16:12 ` Stefan Hajnoczi
  19 siblings, 0 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2016-07-19 16:12 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, kwolf, pbonzini, stefanha, qemu-block

[-- Attachment #1: Type: text/plain, Size: 4954 bytes --]

On Fri, Jul 15, 2016 at 05:22:49PM -0600, Eric Blake wrote:
> Allow NBD to pass a byte-aligned discard request over the wire.
> 
> Prerequisite: Kevin's block branch merged with current qemu.git master,
> plus my work on auto-fragmenting (v3 at the moment):
> https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg03550.html
> 
> Also available as a tag at:
> git fetch git://repo.or.cz/qemu/ericb.git nbd-discard-v2
> 
> This is a merge of two series both posted before soft freeze, both
> of which had initial positive review with only a few changes needed:
> byte-based block discard
> https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg06491.html
> Switch raw NBD to byte-based
> https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg07030.html
> 
> I'd still hoping for one other series to make it into 2.7, based
> on being posted prior to soft freeze:
> nbd: efficient write zeroes
> https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg07463.html
> 
> Changes since v1:
> - rebase to latest master (several contextual conflicts)
> - fix bug with aio_pdiscard using a stack variable qiov after
> its scope ends
> - tweak raw_bsd to remain sector-based if probing occurred
> 
> 001/19:[----] [--] 'block: Convert bdrv_co_discard() to byte-based'
> 002/19:[0004] [FC] 'block: Convert bdrv_discard() to byte-based'
> 003/19:[0034] [FC] 'block: Switch BlockRequest to byte-based'
> 004/19:[0008] [FC] 'block: Convert bdrv_aio_discard() to byte-based'
> 005/19:[----] [--] 'block: Convert BB interface to byte-based discards'
> 006/19:[----] [--] 'raw-posix: Switch paio_submit() to byte-based'
> 007/19:[----] [--] 'rbd: Switch rbd_start_aio() to byte-based'
> 008/19:[----] [--] 'block: Convert .bdrv_aio_discard() to byte-based'
> 009/19:[----] [--] 'block: Add .bdrv_co_pdiscard() driver callback'
> 010/19:[----] [-C] 'blkreplay: Switch .bdrv_co_discard() to byte-based'
> 011/19:[----] [--] 'gluster: Switch .bdrv_co_discard() to byte-based'
> 012/19:[----] [--] 'iscsi: Switch .bdrv_co_discard() to byte-based'
> 013/19:[----] [--] 'nbd: Switch .bdrv_co_discard() to byte-based'
> 014/19:[----] [--] 'qcow2: Switch .bdrv_co_discard() to byte-based'
> 015/19:[----] [-C] 'raw_bsd: Switch .bdrv_co_discard() to byte-based'
> 016/19:[----] [--] 'sheepdog: Switch .bdrv_co_discard() to byte-based'
> 017/19:[----] [--] 'block: Kill .bdrv_co_discard()'
> 018/19:[----] [--] 'nbd: Convert to byte-based interface'
> 019/19:[0028] [FC] 'raw_bsd: Convert to byte-based interface'
> 
> Eric Blake (19):
>   block: Convert bdrv_co_discard() to byte-based
>   block: Convert bdrv_discard() to byte-based
>   block: Switch BlockRequest to byte-based
>   block: Convert bdrv_aio_discard() to byte-based
>   block: Convert BB interface to byte-based discards
>   raw-posix: Switch paio_submit() to byte-based
>   rbd: Switch rbd_start_aio() to byte-based
>   block: Convert .bdrv_aio_discard() to byte-based
>   block: Add .bdrv_co_pdiscard() driver callback
>   blkreplay: Switch .bdrv_co_discard() to byte-based
>   gluster: Switch .bdrv_co_discard() to byte-based
>   iscsi: Switch .bdrv_co_discard() to byte-based
>   nbd: Switch .bdrv_co_discard() to byte-based
>   qcow2: Switch .bdrv_co_discard() to byte-based
>   raw_bsd: Switch .bdrv_co_discard() to byte-based
>   sheepdog: Switch .bdrv_co_discard() to byte-based
>   block: Kill .bdrv_co_discard()
>   nbd: Convert to byte-based interface
>   raw_bsd: Convert to byte-based interface
> 
>  block/nbd-client.h             |  11 ++-
>  include/block/block.h          |  10 +--
>  include/block/block_int.h      |   8 +--
>  include/block/nbd.h            |   1 -
>  include/sysemu/block-backend.h |   9 ++-
>  block/blkreplay.c              |   8 +--
>  block/block-backend.c          |  22 +++---
>  block/gluster.c                |  14 ++--
>  block/io.c                     | 154 ++++++++++++++++++++---------------------
>  block/iscsi.c                  |  18 +++--
>  block/mirror.c                 |   5 +-
>  block/nbd-client.c             |  41 ++++++-----
>  block/nbd.c                    |  24 +++----
>  block/qcow2-refcount.c         |   4 +-
>  block/qcow2.c                  |  10 +--
>  block/raw-posix.c              |  24 +++----
>  block/raw-win32.c              |  19 ++---
>  block/raw_bsd.c                |  53 ++++++++------
>  block/rbd.c                    |  29 ++++----
>  block/sheepdog.c               |  17 +++--
>  hw/block/xen_disk.c            |   7 +-
>  hw/ide/core.c                  |   6 +-
>  hw/scsi/scsi-disk.c            |   8 +--
>  nbd/server.c                   |  19 ++---
>  qemu-io-cmds.c                 |   3 +-
>  block/trace-events             |   4 +-
>  26 files changed, 260 insertions(+), 268 deletions(-)
> 
> -- 
> 2.5.5
> 
> 

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2016-07-19 16:12 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-15 23:22 [Qemu-devel] [PATCH for-2.7 v2 00/19] byte-based block discard Eric Blake
2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 01/19] block: Convert bdrv_co_discard() to byte-based Eric Blake
2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 02/19] block: Convert bdrv_discard() " Eric Blake
2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 03/19] block: Switch BlockRequest " Eric Blake
2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 04/19] block: Convert bdrv_aio_discard() " Eric Blake
2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 05/19] block: Convert BB interface to byte-based discards Eric Blake
2016-07-15 23:22   ` Eric Blake
2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 06/19] raw-posix: Switch paio_submit() to byte-based Eric Blake
2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 07/19] rbd: Switch rbd_start_aio() " Eric Blake
2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 08/19] block: Convert .bdrv_aio_discard() " Eric Blake
2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 09/19] block: Add .bdrv_co_pdiscard() driver callback Eric Blake
2016-07-15 23:22 ` [Qemu-devel] [PATCH v2 10/19] blkreplay: Switch .bdrv_co_discard() to byte-based Eric Blake
2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 11/19] gluster: " Eric Blake
2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 12/19] iscsi: " Eric Blake
2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 13/19] nbd: " Eric Blake
2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 14/19] qcow2: " Eric Blake
2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 15/19] raw_bsd: " Eric Blake
2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 16/19] sheepdog: " Eric Blake
2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 17/19] block: Kill .bdrv_co_discard() Eric Blake
2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 18/19] nbd: Convert to byte-based interface Eric Blake
2016-07-15 23:23 ` [Qemu-devel] [PATCH v2 19/19] raw_bsd: " Eric Blake
2016-07-19 16:12 ` [Qemu-devel] [Qemu-block] [PATCH for-2.7 v2 00/19] byte-based block discard Stefan Hajnoczi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.