All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements
@ 2013-09-24 13:34 Peter Lieven
  2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 01/20] block: make BdrvRequestFlags public Peter Lieven
                   ` (20 more replies)
  0 siblings, 21 replies; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

this patch adds the ability for targets to stay sparse during
block migration (if the zero_blocks capability is set) and qemu-img convert
even if the target does not have has_zero_init = 1.

the series was especially developed for iSCSI, but it should also work
with other drivers with little or no adjustments. these adjustments
should be limited to providing block provisioning information through
get_block_info and/or honouring BDRV_REQ_MAY_UNMAP on writing zeroes.

there are also 4 patches which are fixes/enhancements for the get_block_status API
discovered during development of this series.

v2->v3:
 - fix merge conflict in block/qcow2_cluster.c
 - changed return type of bdrv_has_discard_zeroes and
   bdrv_has_discard_write_zeroes to bool.
 - moved alignment and limits info to a BlockLimits struct (Paolo).
 - added magic constanst for default maximum in bdrv_co_do_write_zeroes
   and bdrv_co_discard (Eric).
 - bdrv_co_do_write_zeroes: allocating the bounce buffer only once (Eric),
   fixed bounce iov_len in the fall back path.
 - bdrv_zeroize: added inline docu (Eric) and do not mask flags passed
   to bdrv_write_zeroes (Eric).
 - qemu-img: changed the default hint for -S (min_sparse) in the usage
   help to 4k. not changing the default as it is unclear why this default
   was set. size suffixes are already supported (Eric).

v1->v2:
 - moved block max_discard and max_write_zeroes to BlockDriverState
 - added discard_alignment and write_zeroes_alignment to BlockDriverState
 - added bdrv_has_discard_zeroes() and bdrv_has_discard_write_zeroes()
 - added logic to bdrv_co_discard and bdrv_co_do_write_zeroes to honour
   limit and alignment info.
 - added support for -S 0 in qemu-img convert.

Peter Lieven (20):
  block: make BdrvRequestFlags public
  block: add flags to bdrv_*_write_zeroes
  block: introduce BDRV_REQ_MAY_UNMAP request flag
  block: introduce bdrv_has_discard_zeroes and
    bdrv_has_discard_write_zeroes
  block/raw: add bdrv_has_discard_zeroes and
    bdrv_has_discard_write_zeroes
  block: add BlockLimits structure to BlockDriverState
  block: honour BlockLimits in bdrv_co_do_write_zeroes
  block: honour BlockLimits in bdrv_co_discard
  iscsi: simplify iscsi_co_discard
  iscsi: set limits in BlockDriverState
  iscsi: add bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes
  iscsi: add bdrv_co_write_zeroes
  block: introduce bdrv_zeroize
  block/get_block_status: set *pnum = 0 on error
  block/get_block_status: avoid segfault if there is no backing_hd
  block/get_block_status: avoid redundant callouts on raw devices
  block/get_block_status: fix BDRV_BLOCK_ZERO for unallocated blocks
  qemu-img: add support for fully allocated images
  qemu-img: conditionally zero out target on convert
  block/raw: copy BlockLimits on raw_open

 block-migration.c         |    3 +-
 block.c                   |  205 +++++++++++++++++++++++++++++++++++++--------
 block/backup.c            |    3 +-
 block/iscsi.c             |  152 ++++++++++++++++++++++++---------
 block/qcow2-cluster.c     |    2 +-
 block/qcow2.c             |    2 +-
 block/qed.c               |    3 +-
 block/raw_bsd.c           |   62 ++++++++------
 block/vmdk.c              |    3 +-
 include/block/block.h     |   19 ++++-
 include/block/block_int.h |   32 ++++++-
 qemu-img.c                |   18 +++-
 qemu-io-cmds.c            |    2 +-
 13 files changed, 393 insertions(+), 113 deletions(-)

-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 01/20] block: make BdrvRequestFlags public
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
@ 2013-09-24 13:34 ` Peter Lieven
  2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 02/20] block: add flags to bdrv_*_write_zeroes Peter Lieven
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block.c               |    5 -----
 include/block/block.h |    5 +++++
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index ea4956d..8eefb97 100644
--- a/block.c
+++ b/block.c
@@ -51,11 +51,6 @@
 
 #define NOT_DONE 0x7fffffff /* used while emulated sync operation in progress */
 
-typedef enum {
-    BDRV_REQ_COPY_ON_READ = 0x1,
-    BDRV_REQ_ZERO_WRITE   = 0x2,
-} BdrvRequestFlags;
-
 static void bdrv_dev_change_media_cb(BlockDriverState *bs, bool load);
 static BlockDriverAIOCB *bdrv_aio_readv_em(BlockDriverState *bs,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
diff --git a/include/block/block.h b/include/block/block.h
index f808550..9fb0c3d 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -62,6 +62,11 @@ typedef struct BlockDevOps {
     void (*resize_cb)(void *opaque);
 } BlockDevOps;
 
+typedef enum {
+    BDRV_REQ_COPY_ON_READ = 0x1,
+    BDRV_REQ_ZERO_WRITE   = 0x2,
+} BdrvRequestFlags;
+
 #define BDRV_O_RDWR        0x0002
 #define BDRV_O_SNAPSHOT    0x0008 /* open the file read only and save writes in a snapshot */
 #define BDRV_O_NOCACHE     0x0020 /* do not use the host page cache */
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 02/20] block: add flags to bdrv_*_write_zeroes
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
  2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 01/20] block: make BdrvRequestFlags public Peter Lieven
@ 2013-09-24 13:34 ` Peter Lieven
  2013-10-02 15:32   ` Eric Blake
  2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 03/20] block: introduce BDRV_REQ_MAY_UNMAP request flag Peter Lieven
                   ` (18 subsequent siblings)
  20 siblings, 1 reply; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block-migration.c         |    2 +-
 block.c                   |   20 +++++++++++---------
 block/backup.c            |    3 ++-
 block/qcow2-cluster.c     |    2 +-
 block/qcow2.c             |    2 +-
 block/qed.c               |    3 ++-
 block/raw_bsd.c           |    5 +++--
 block/vmdk.c              |    3 ++-
 include/block/block.h     |    4 ++--
 include/block/block_int.h |    2 +-
 qemu-io-cmds.c            |    2 +-
 11 files changed, 27 insertions(+), 21 deletions(-)

diff --git a/block-migration.c b/block-migration.c
index daf9ec1..713a8e3 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -780,7 +780,7 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
             }
 
             if (flags & BLK_MIG_FLAG_ZERO_BLOCK) {
-                ret = bdrv_write_zeroes(bs, addr, nr_sectors);
+                ret = bdrv_write_zeroes(bs, addr, nr_sectors, 0);
             } else {
                 buf = g_malloc(BLOCK_SIZE);
                 qemu_get_buffer(f, buf, BLOCK_SIZE);
diff --git a/block.c b/block.c
index 8eefb97..c453ec1 100644
--- a/block.c
+++ b/block.c
@@ -79,7 +79,7 @@ static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
                                                bool is_write);
 static void coroutine_fn bdrv_co_do_rw(void *opaque);
 static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
-    int64_t sector_num, int nb_sectors);
+    int64_t sector_num, int nb_sectors, BdrvRequestFlags flags);
 
 static QTAILQ_HEAD(, BlockDriverState) bdrv_states =
     QTAILQ_HEAD_INITIALIZER(bdrv_states);
@@ -2391,10 +2391,11 @@ int bdrv_writev(BlockDriverState *bs, int64_t sector_num, QEMUIOVector *qiov)
     return bdrv_rwv_co(bs, sector_num, qiov, true, 0);
 }
 
-int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num, int nb_sectors)
+int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num,
+                      int nb_sectors, BdrvRequestFlags flags)
 {
     return bdrv_rw_co(bs, sector_num, NULL, nb_sectors, true,
-                      BDRV_REQ_ZERO_WRITE);
+                      BDRV_REQ_ZERO_WRITE | flags);
 }
 
 int bdrv_pread(BlockDriverState *bs, int64_t offset,
@@ -2576,7 +2577,7 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BlockDriverState *bs,
     if (drv->bdrv_co_write_zeroes &&
         buffer_is_zero(bounce_buffer, iov.iov_len)) {
         ret = bdrv_co_do_write_zeroes(bs, cluster_sector_num,
-                                      cluster_nb_sectors);
+                                      cluster_nb_sectors, 0);
     } else {
         /* This does not change the data on the disk, it is not necessary
          * to flush even in cache=writethrough mode.
@@ -2710,7 +2711,7 @@ int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs,
 }
 
 static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
-    int64_t sector_num, int nb_sectors)
+    int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
 {
     BlockDriver *drv = bs->drv;
     QEMUIOVector qiov;
@@ -2722,7 +2723,7 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
 
     /* First try the efficient write zeroes operation */
     if (drv->bdrv_co_write_zeroes) {
-        ret = drv->bdrv_co_write_zeroes(bs, sector_num, nb_sectors);
+        ret = drv->bdrv_co_write_zeroes(bs, sector_num, nb_sectors, flags);
         if (ret != -ENOTSUP) {
             return ret;
         }
@@ -2777,7 +2778,7 @@ static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs,
     if (ret < 0) {
         /* Do nothing, write notifier decided to fail this request */
     } else if (flags & BDRV_REQ_ZERO_WRITE) {
-        ret = bdrv_co_do_write_zeroes(bs, sector_num, nb_sectors);
+        ret = bdrv_co_do_write_zeroes(bs, sector_num, nb_sectors, flags);
     } else {
         ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov);
     }
@@ -2811,12 +2812,13 @@ int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num,
 }
 
 int coroutine_fn bdrv_co_write_zeroes(BlockDriverState *bs,
-                                      int64_t sector_num, int nb_sectors)
+                                      int64_t sector_num, int nb_sectors,
+                                      BdrvRequestFlags flags)
 {
     trace_bdrv_co_write_zeroes(bs, sector_num, nb_sectors);
 
     return bdrv_co_do_writev(bs, sector_num, nb_sectors, NULL,
-                             BDRV_REQ_ZERO_WRITE);
+                             BDRV_REQ_ZERO_WRITE | flags);
 }
 
 /**
diff --git a/block/backup.c b/block/backup.c
index 04c4b5c..5f6a642 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -138,7 +138,8 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs,
 
         if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
             ret = bdrv_co_write_zeroes(job->target,
-                                       start * BACKUP_SECTORS_PER_CLUSTER, n);
+                                       start * BACKUP_SECTORS_PER_CLUSTER,
+                                       n, 0);
         } else {
             ret = bdrv_co_writev(job->target,
                                  start * BACKUP_SECTORS_PER_CLUSTER, n,
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 738ff73..5215f5b 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1601,7 +1601,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
             }
 
             ret = bdrv_write_zeroes(bs->file, offset / BDRV_SECTOR_SIZE,
-                                    s->cluster_sectors);
+                                    s->cluster_sectors, 0);
             if (ret < 0) {
                 qcow2_free_clusters(bs, offset, s->cluster_size,
                         QCOW2_DISCARD_ALWAYS);
diff --git a/block/qcow2.c b/block/qcow2.c
index 318d95d..6cb3e44 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1600,7 +1600,7 @@ static int qcow2_make_empty(BlockDriverState *bs)
 }
 
 static coroutine_fn int qcow2_co_write_zeroes(BlockDriverState *bs,
-    int64_t sector_num, int nb_sectors)
+    int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
 {
     int ret;
     BDRVQcowState *s = bs->opaque;
diff --git a/block/qed.c b/block/qed.c
index 6c0cba0..adc2736 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -1397,7 +1397,8 @@ static void coroutine_fn qed_co_write_zeroes_cb(void *opaque, int ret)
 
 static int coroutine_fn bdrv_qed_co_write_zeroes(BlockDriverState *bs,
                                                  int64_t sector_num,
-                                                 int nb_sectors)
+                                                 int nb_sectors,
+                                                 BdrvRequestFlags flags)
 {
     BlockDriverAIOCB *blockacb;
     BDRVQEDState *s = bs->opaque;
diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index d4ace60..d5ab295 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -66,9 +66,10 @@ static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
 }
 
 static int coroutine_fn raw_co_write_zeroes(BlockDriverState *bs,
-                                            int64_t sector_num, int nb_sectors)
+                                            int64_t sector_num, int nb_sectors,
+                                            BdrvRequestFlags flags)
 {
-    return bdrv_co_write_zeroes(bs->file, sector_num, nb_sectors);
+    return bdrv_co_write_zeroes(bs->file, sector_num, nb_sectors, flags);
 }
 
 static int coroutine_fn raw_co_discard(BlockDriverState *bs,
diff --git a/block/vmdk.c b/block/vmdk.c
index 96ef1b5..debf974 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1391,7 +1391,8 @@ static coroutine_fn int vmdk_co_write(BlockDriverState *bs, int64_t sector_num,
 
 static int coroutine_fn vmdk_co_write_zeroes(BlockDriverState *bs,
                                              int64_t sector_num,
-                                             int nb_sectors)
+                                             int nb_sectors,
+                                             BdrvRequestFlags flags)
 {
     int ret;
     BDRVVmdkState *s = bs->opaque;
diff --git a/include/block/block.h b/include/block/block.h
index 9fb0c3d..47c6475 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -188,7 +188,7 @@ int bdrv_read_unthrottled(BlockDriverState *bs, int64_t sector_num,
 int bdrv_write(BlockDriverState *bs, int64_t sector_num,
                const uint8_t *buf, int nb_sectors);
 int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num,
-               int nb_sectors);
+               int nb_sectors, BdrvRequestFlags flags);
 int bdrv_writev(BlockDriverState *bs, int64_t sector_num, QEMUIOVector *qiov);
 int bdrv_pread(BlockDriverState *bs, int64_t offset,
                void *buf, int count);
@@ -210,7 +210,7 @@ int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num,
  * because it may allocate memory for the entire region.
  */
 int coroutine_fn bdrv_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
-    int nb_sectors);
+    int nb_sectors, BdrvRequestFlags flags);
 BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
     const char *backing_file);
 int bdrv_get_backing_file_depth(BlockDriverState *bs);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 3eeb6fe..56e1994 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -118,7 +118,7 @@ struct BlockDriver {
      * instead.
      */
     int coroutine_fn (*bdrv_co_write_zeroes)(BlockDriverState *bs,
-        int64_t sector_num, int nb_sectors);
+        int64_t sector_num, int nb_sectors, BdrvRequestFlags flags);
     int coroutine_fn (*bdrv_co_discard)(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors);
     int64_t coroutine_fn (*bdrv_co_get_block_status)(BlockDriverState *bs,
diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 8565d49..66c0b9e 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -441,7 +441,7 @@ static void coroutine_fn co_write_zeroes_entry(void *opaque)
     CoWriteZeroes *data = opaque;
 
     data->ret = bdrv_co_write_zeroes(data->bs, data->offset / BDRV_SECTOR_SIZE,
-                                     data->count / BDRV_SECTOR_SIZE);
+                                     data->count / BDRV_SECTOR_SIZE, 0);
     data->done = true;
     if (data->ret < 0) {
         *data->total = data->ret;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 03/20] block: introduce BDRV_REQ_MAY_UNMAP request flag
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
  2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 01/20] block: make BdrvRequestFlags public Peter Lieven
  2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 02/20] block: add flags to bdrv_*_write_zeroes Peter Lieven
@ 2013-09-24 13:34 ` Peter Lieven
  2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 04/20] block: introduce bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes Peter Lieven
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block-migration.c     |    3 ++-
 block.c               |    4 ++++
 block/backup.c        |    2 +-
 include/block/block.h |    7 +++++++
 4 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/block-migration.c b/block-migration.c
index 713a8e3..fc4ef93 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -780,7 +780,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
             }
 
             if (flags & BLK_MIG_FLAG_ZERO_BLOCK) {
-                ret = bdrv_write_zeroes(bs, addr, nr_sectors, 0);
+                ret = bdrv_write_zeroes(bs, addr, nr_sectors,
+                                        BDRV_REQ_MAY_UNMAP);
             } else {
                 buf = g_malloc(BLOCK_SIZE);
                 qemu_get_buffer(f, buf, BLOCK_SIZE);
diff --git a/block.c b/block.c
index c453ec1..e7bf6af 100644
--- a/block.c
+++ b/block.c
@@ -2817,6 +2817,10 @@ int coroutine_fn bdrv_co_write_zeroes(BlockDriverState *bs,
 {
     trace_bdrv_co_write_zeroes(bs, sector_num, nb_sectors);
 
+    if (!(bs->open_flags & BDRV_O_UNMAP)) {
+        flags &= ~BDRV_REQ_MAY_UNMAP;
+    }
+
     return bdrv_co_do_writev(bs, sector_num, nb_sectors, NULL,
                              BDRV_REQ_ZERO_WRITE | flags);
 }
diff --git a/block/backup.c b/block/backup.c
index 5f6a642..1ab080d 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -139,7 +139,7 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs,
         if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
             ret = bdrv_co_write_zeroes(job->target,
                                        start * BACKUP_SECTORS_PER_CLUSTER,
-                                       n, 0);
+                                       n, BDRV_REQ_MAY_UNMAP);
         } else {
             ret = bdrv_co_writev(job->target,
                                  start * BACKUP_SECTORS_PER_CLUSTER, n,
diff --git a/include/block/block.h b/include/block/block.h
index 47c6475..b72c81a 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -65,6 +65,13 @@ typedef struct BlockDevOps {
 typedef enum {
     BDRV_REQ_COPY_ON_READ = 0x1,
     BDRV_REQ_ZERO_WRITE   = 0x2,
+    /* The BDRV_REQ_MAY_UNMAP flag is used to indicate that the block driver
+     * is allowed to optimize a write zeroes request by unmapping (discarding)
+     * blocks if it is guaranteed that the result will read back as
+     * zeroes. The flag is only passed to the driver if the block device is
+     * opened with BDRV_O_UNMAP.
+     */
+    BDRV_REQ_MAY_UNMAP    = 0x4,
 } BdrvRequestFlags;
 
 #define BDRV_O_RDWR        0x0002
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 04/20] block: introduce bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
                   ` (2 preceding siblings ...)
  2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 03/20] block: introduce BDRV_REQ_MAY_UNMAP request flag Peter Lieven
@ 2013-09-24 13:34 ` Peter Lieven
  2013-10-02 15:41   ` Eric Blake
  2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 05/20] block/raw: add " Peter Lieven
                   ` (16 subsequent siblings)
  20 siblings, 1 reply; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block.c                   |   29 +++++++++++++++++++++++++++++
 include/block/block.h     |    2 ++
 include/block/block_int.h |   13 +++++++++++++
 3 files changed, 44 insertions(+)

diff --git a/block.c b/block.c
index e7bf6af..ac35cb5 100644
--- a/block.c
+++ b/block.c
@@ -3101,6 +3101,35 @@ int bdrv_has_zero_init(BlockDriverState *bs)
     return 0;
 }
 
+bool bdrv_has_discard_zeroes(BlockDriverState *bs)
+{
+    assert(bs->drv);
+
+    if (bs->backing_hd) {
+        return false;
+    }
+    if (bs->drv->bdrv_has_discard_zeroes) {
+        return bs->drv->bdrv_has_discard_zeroes(bs);
+    }
+
+    return false;
+}
+
+bool bdrv_has_discard_write_zeroes(BlockDriverState *bs)
+{
+    assert(bs->drv);
+
+    if (bs->backing_hd || !(bs->open_flags & BDRV_O_UNMAP)) {
+        return false;
+    }
+
+    if (bs->drv->bdrv_has_discard_write_zeroes) {
+        return bs->drv->bdrv_has_discard_write_zeroes(bs);
+    }
+
+    return false;
+}
+
 typedef struct BdrvCoGetBlockStatusData {
     BlockDriverState *bs;
     BlockDriverState *base;
diff --git a/include/block/block.h b/include/block/block.h
index b72c81a..2de226f 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -310,6 +310,8 @@ int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
 int bdrv_co_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
 int bdrv_has_zero_init_1(BlockDriverState *bs);
 int bdrv_has_zero_init(BlockDriverState *bs);
+bool bdrv_has_discard_zeroes(BlockDriverState *bs);
+bool bdrv_has_discard_write_zeroes(BlockDriverState *bs);
 int64_t bdrv_get_block_status(BlockDriverState *bs, int64_t sector_num,
                               int nb_sectors, int *pnum);
 int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 56e1994..cf78c3f 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -211,6 +211,19 @@ struct BlockDriver {
      */
     int (*bdrv_has_zero_init)(BlockDriverState *bs);
 
+    /*
+     * Returns true if discarded blocks read back as zeroes.
+     */
+    bool (*bdrv_has_discard_zeroes)(BlockDriverState *bs);
+
+    /*
+     * Returns true if the driver can optimize writing zeroes by discarding
+     * sectors. It is additionally required that the block device is
+     * opened with BDRV_O_UNMAP and the that write zeroes request carries
+     * the BDRV_REQ_MAY_UNMAP flag for this to work.
+     */
+    bool (*bdrv_has_discard_write_zeroes)(BlockDriverState *bs);
+
     QLIST_ENTRY(BlockDriver) list;
 };
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 05/20] block/raw: add bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
                   ` (3 preceding siblings ...)
  2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 04/20] block: introduce bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes Peter Lieven
@ 2013-09-24 13:34 ` Peter Lieven
  2013-10-02 15:43   ` Eric Blake
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 06/20] block: add BlockLimits structure to BlockDriverState Peter Lieven
                   ` (15 subsequent siblings)
  20 siblings, 1 reply; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block/raw_bsd.c |   56 +++++++++++++++++++++++++++++++++----------------------
 1 file changed, 34 insertions(+), 22 deletions(-)

diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index d5ab295..8dc7bba 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -131,6 +131,16 @@ static int raw_has_zero_init(BlockDriverState *bs)
     return bdrv_has_zero_init(bs->file);
 }
 
+static bool raw_has_discard_zeroes(BlockDriverState *bs)
+{
+    return bdrv_has_discard_zeroes(bs->file);
+}
+
+static bool raw_has_discard_write_zeroes(BlockDriverState *bs)
+{
+    return bdrv_has_discard_write_zeroes(bs->file);
+}
+
 static int raw_create(const char *filename, QEMUOptionParameter *options,
                       Error **errp)
 {
@@ -165,28 +175,30 @@ static int raw_probe(const uint8_t *buf, int buf_size, const char *filename)
 }
 
 static BlockDriver bdrv_raw = {
-    .format_name          = "raw",
-    .bdrv_probe           = &raw_probe,
-    .bdrv_reopen_prepare  = &raw_reopen_prepare,
-    .bdrv_open            = &raw_open,
-    .bdrv_close           = &raw_close,
-    .bdrv_create          = &raw_create,
-    .bdrv_co_readv        = &raw_co_readv,
-    .bdrv_co_writev       = &raw_co_writev,
-    .bdrv_co_write_zeroes = &raw_co_write_zeroes,
-    .bdrv_co_discard      = &raw_co_discard,
-    .bdrv_co_get_block_status = &raw_co_get_block_status,
-    .bdrv_truncate        = &raw_truncate,
-    .bdrv_getlength       = &raw_getlength,
-    .bdrv_get_info        = &raw_get_info,
-    .bdrv_is_inserted     = &raw_is_inserted,
-    .bdrv_media_changed   = &raw_media_changed,
-    .bdrv_eject           = &raw_eject,
-    .bdrv_lock_medium     = &raw_lock_medium,
-    .bdrv_ioctl           = &raw_ioctl,
-    .bdrv_aio_ioctl       = &raw_aio_ioctl,
-    .create_options       = &raw_create_options[0],
-    .bdrv_has_zero_init   = &raw_has_zero_init
+    .format_name                   = "raw",
+    .bdrv_probe                    = &raw_probe,
+    .bdrv_reopen_prepare           = &raw_reopen_prepare,
+    .bdrv_open                     = &raw_open,
+    .bdrv_close                    = &raw_close,
+    .bdrv_create                   = &raw_create,
+    .bdrv_co_readv                 = &raw_co_readv,
+    .bdrv_co_writev                = &raw_co_writev,
+    .bdrv_co_write_zeroes          = &raw_co_write_zeroes,
+    .bdrv_co_discard               = &raw_co_discard,
+    .bdrv_co_get_block_status      = &raw_co_get_block_status,
+    .bdrv_truncate                 = &raw_truncate,
+    .bdrv_getlength                = &raw_getlength,
+    .bdrv_get_info                 = &raw_get_info,
+    .bdrv_is_inserted              = &raw_is_inserted,
+    .bdrv_media_changed            = &raw_media_changed,
+    .bdrv_eject                    = &raw_eject,
+    .bdrv_lock_medium              = &raw_lock_medium,
+    .bdrv_ioctl                    = &raw_ioctl,
+    .bdrv_aio_ioctl                = &raw_aio_ioctl,
+    .create_options                = &raw_create_options[0],
+    .bdrv_has_zero_init            = &raw_has_zero_init,
+    .bdrv_has_discard_zeroes       = &raw_has_discard_zeroes,
+    .bdrv_has_discard_write_zeroes = &raw_has_discard_write_zeroes,
 };
 
 static void bdrv_raw_init(void)
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 06/20] block: add BlockLimits structure to BlockDriverState
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
                   ` (4 preceding siblings ...)
  2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 05/20] block/raw: add " Peter Lieven
@ 2013-09-24 13:35 ` Peter Lieven
  2013-10-02 15:53   ` Eric Blake
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 07/20] block: honour BlockLimits in bdrv_co_do_write_zeroes Peter Lieven
                   ` (14 subsequent siblings)
  20 siblings, 1 reply; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

this patch adds BlockLimits which introduces discard and write_zeroes
limits and alignment information to the BlockDriverState.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 include/block/block_int.h |   17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index cf78c3f..cda99f9 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -227,6 +227,20 @@ struct BlockDriver {
     QLIST_ENTRY(BlockDriver) list;
 };
 
+struct BlockLimits {
+    /* maximum number of sectors that can be discarded at once */
+    int max_discard;
+
+    /* optimal alignment for discard requests in sectors */
+    int64_t discard_alignment;
+
+    /* maximum number of sectors that can zeroized at once */
+    int max_write_zeroes;
+
+    /* optimal alignment for write zeroes requests in sectors */
+    int64_t write_zeroes_alignment;
+};
+
 /*
  * Note: the function bdrv_append() copies and swaps contents of
  * BlockDriverStates, so if you add new fields to this struct, please
@@ -280,6 +294,9 @@ struct BlockDriverState {
     uint64_t total_time_ns[BDRV_MAX_IOTYPE];
     uint64_t wr_highest_sector;
 
+    /* I/O Limits */
+    struct BlockLimits bl;
+
     /* Whether the disk can expand beyond total_sectors */
     int growable;
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 07/20] block: honour BlockLimits in bdrv_co_do_write_zeroes
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
                   ` (5 preceding siblings ...)
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 06/20] block: add BlockLimits structure to BlockDriverState Peter Lieven
@ 2013-09-24 13:35 ` Peter Lieven
  2013-10-02 16:37   ` Eric Blake
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 08/20] block: honour BlockLimits in bdrv_co_discard Peter Lieven
                   ` (13 subsequent siblings)
  20 siblings, 1 reply; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block.c |   65 +++++++++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 49 insertions(+), 16 deletions(-)

diff --git a/block.c b/block.c
index ac35cb5..580570a 100644
--- a/block.c
+++ b/block.c
@@ -2710,32 +2710,65 @@ int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs,
                             BDRV_REQ_COPY_ON_READ);
 }
 
+/* if no limit is specified in the BlockLimits use a default
+ * of 32768 512-byte sectors (16 MiB) per request.
+ */
+#define MAX_WRITE_ZEROES_DEFAULT 32768
+
 static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
     int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
 {
     BlockDriver *drv = bs->drv;
     QEMUIOVector qiov;
-    struct iovec iov;
-    int ret;
+    struct iovec iov = {0};
+    int ret = 0;
 
-    /* TODO Emulate only part of misaligned requests instead of letting block
-     * drivers return -ENOTSUP and emulate everything */
+    int max_write_zeroes = bs->bl.max_write_zeroes ?
+                           bs->bl.max_write_zeroes : MAX_WRITE_ZEROES_DEFAULT;
 
-    /* First try the efficient write zeroes operation */
-    if (drv->bdrv_co_write_zeroes) {
-        ret = drv->bdrv_co_write_zeroes(bs, sector_num, nb_sectors, flags);
-        if (ret != -ENOTSUP) {
-            return ret;
+    while (nb_sectors > 0 && !ret) {
+        int num = nb_sectors;
+
+        /* align request */
+        if (bs->bl.write_zeroes_alignment &&
+            num >= bs->bl.write_zeroes_alignment &&
+            sector_num % bs->bl.write_zeroes_alignment) {
+            if (num > bs->bl.write_zeroes_alignment) {
+                num = bs->bl.write_zeroes_alignment;
+            }
+            num -= sector_num % bs->bl.write_zeroes_alignment;
         }
-    }
 
-    /* Fall back to bounce buffer if write zeroes is unsupported */
-    iov.iov_len  = nb_sectors * BDRV_SECTOR_SIZE;
-    iov.iov_base = qemu_blockalign(bs, iov.iov_len);
-    memset(iov.iov_base, 0, iov.iov_len);
-    qemu_iovec_init_external(&qiov, &iov, 1);
+        /* limit request size */
+        if (num > max_write_zeroes) {
+            num = max_write_zeroes;
+        }
+
+        ret = -ENOTSUP;
+        /* First try the efficient write zeroes operation */
+        if (drv->bdrv_co_write_zeroes) {
+            ret = drv->bdrv_co_write_zeroes(bs, sector_num, num, flags);
+        }
+
+        if (ret == -ENOTSUP) {
+            /* Fall back to bounce buffer if write zeroes is unsupported */
+            iov.iov_len  = num * BDRV_SECTOR_SIZE;
+            if (iov.iov_base == NULL) {
+                /* allocate bounce buffer only once and ensure that it
+                 * is big enough for this and all future requests.
+                 */
+                size_t bufsize = num <= nb_sectors ? num : max_write_zeroes;
+                iov.iov_base = qemu_blockalign(bs, bufsize * BDRV_SECTOR_SIZE);
+                memset(iov.iov_base, 0, bufsize * BDRV_SECTOR_SIZE);
+            }
+            qemu_iovec_init_external(&qiov, &iov, 1);
 
-    ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, &qiov);
+            ret = drv->bdrv_co_writev(bs, sector_num, num, &qiov);
+        }
+
+        sector_num += num;
+        nb_sectors -= num;
+    }
 
     qemu_vfree(iov.iov_base);
     return ret;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 08/20] block: honour BlockLimits in bdrv_co_discard
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
                   ` (6 preceding siblings ...)
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 07/20] block: honour BlockLimits in bdrv_co_do_write_zeroes Peter Lieven
@ 2013-09-24 13:35 ` Peter Lieven
  2013-10-02 16:41   ` Eric Blake
  2013-10-07  8:29   ` Stefan Hajnoczi
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 09/20] iscsi: simplify iscsi_co_discard Peter Lieven
                   ` (12 subsequent siblings)
  20 siblings, 2 replies; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block.c |   37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 580570a..784e545 100644
--- a/block.c
+++ b/block.c
@@ -4224,6 +4224,11 @@ static void coroutine_fn bdrv_discard_co_entry(void *opaque)
     rwco->ret = bdrv_co_discard(rwco->bs, rwco->sector_num, rwco->nb_sectors);
 }
 
+/* if no limit is specified in the BlockLimits use a default
+ * of 32768 512-byte sectors (16 MiB) per request.
+ */
+#define MAX_DISCARD_DEFAULT 32768
+
 int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
                                  int nb_sectors)
 {
@@ -4245,7 +4250,37 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
     }
 
     if (bs->drv->bdrv_co_discard) {
-        return bs->drv->bdrv_co_discard(bs, sector_num, nb_sectors);
+        int max_discard = bs->bl.max_discard ?
+                          bs->bl.max_discard : MAX_DISCARD_DEFAULT;
+
+        while (nb_sectors > 0) {
+            int ret;
+            int num = nb_sectors;
+
+            /* align request */
+            if (bs->bl.discard_alignment &&
+                num >= bs->bl.discard_alignment &&
+                sector_num % bs->bl.discard_alignment) {
+                if (num > bs->bl.discard_alignment) {
+                    num = bs->bl.discard_alignment;
+                }
+                num -= sector_num % bs->bl.discard_alignment;
+            }
+
+            /* limit request size */
+            if (num > max_discard) {
+                num = max_discard;
+            }
+
+            ret = bs->drv->bdrv_co_discard(bs, sector_num, num);
+            if (ret) {
+                return ret;
+            }
+
+            sector_num += num;
+            nb_sectors -= num;
+        }
+        return 0;
     } else if (bs->drv->bdrv_aio_discard) {
         BlockDriverAIOCB *acb;
         CoroutineIOCompletion co = {
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 09/20] iscsi: simplify iscsi_co_discard
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
                   ` (7 preceding siblings ...)
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 08/20] block: honour BlockLimits in bdrv_co_discard Peter Lieven
@ 2013-09-24 13:35 ` Peter Lieven
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 10/20] iscsi: set limits in BlockDriverState Peter Lieven
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

now that bdrv_co_discard can handle limits we do not need
the request split logic here anymore.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block/iscsi.c |   67 +++++++++++++++++++++------------------------------------
 1 file changed, 25 insertions(+), 42 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 4460382..ce8823d 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -87,7 +87,6 @@ typedef struct IscsiAIOCB {
 #define NOP_INTERVAL 5000
 #define MAX_NOP_FAILURES 3
 #define ISCSI_CMD_RETRIES 5
-#define ISCSI_MAX_UNMAP 131072
 
 static void
 iscsi_bh_cb(void *p)
@@ -912,8 +911,6 @@ coroutine_fn iscsi_co_discard(BlockDriverState *bs, int64_t sector_num,
     IscsiLun *iscsilun = bs->opaque;
     struct IscsiTask iTask;
     struct unmap_list list;
-    uint32_t nb_blocks;
-    uint32_t max_unmap;
 
     if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
         return -EINVAL;
@@ -925,52 +922,38 @@ coroutine_fn iscsi_co_discard(BlockDriverState *bs, int64_t sector_num,
     }
 
     list.lba = sector_qemu2lun(sector_num, iscsilun);
-    nb_blocks = sector_qemu2lun(nb_sectors, iscsilun);
+    list.num = sector_qemu2lun(nb_sectors, iscsilun);
 
-    max_unmap = iscsilun->bl.max_unmap;
-    if (max_unmap == 0xffffffff) {
-        max_unmap = ISCSI_MAX_UNMAP;
-    }
-
-    while (nb_blocks > 0) {
-        iscsi_co_init_iscsitask(iscsilun, &iTask);
-        list.num = nb_blocks;
-        if (list.num > max_unmap) {
-            list.num = max_unmap;
-        }
+    iscsi_co_init_iscsitask(iscsilun, &iTask);
 retry:
-        if (iscsi_unmap_task(iscsilun->iscsi, iscsilun->lun, 0, 0, &list, 1,
-                         iscsi_co_generic_cb, &iTask) == NULL) {
-            return -EIO;
-        }
-
-        while (!iTask.complete) {
-            iscsi_set_events(iscsilun);
-            qemu_coroutine_yield();
-        }
+    if (iscsi_unmap_task(iscsilun->iscsi, iscsilun->lun, 0, 0, &list, 1,
+                     iscsi_co_generic_cb, &iTask) == NULL) {
+        return -EIO;
+    }
 
-        if (iTask.task != NULL) {
-            scsi_free_scsi_task(iTask.task);
-            iTask.task = NULL;
-        }
+    while (!iTask.complete) {
+        iscsi_set_events(iscsilun);
+        qemu_coroutine_yield();
+    }
 
-        if (iTask.do_retry) {
-            goto retry;
-        }
+    if (iTask.task != NULL) {
+        scsi_free_scsi_task(iTask.task);
+        iTask.task = NULL;
+    }
 
-        if (iTask.status == SCSI_STATUS_CHECK_CONDITION) {
-            /* the target might fail with a check condition if it
-               is not happy with the alignment of the UNMAP request
-               we silently fail in this case */
-            return 0;
-        }
+    if (iTask.do_retry) {
+        goto retry;
+    }
 
-        if (iTask.status != SCSI_STATUS_GOOD) {
-            return -EIO;
-        }
+    if (iTask.status == SCSI_STATUS_CHECK_CONDITION) {
+        /* the target might fail with a check condition if it
+           is not happy with the alignment of the UNMAP request
+           we silently fail in this case */
+        return 0;
+    }
 
-        list.lba += list.num;
-        nb_blocks -= list.num;
+    if (iTask.status != SCSI_STATUS_GOOD) {
+        return -EIO;
     }
 
     return 0;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 10/20] iscsi: set limits in BlockDriverState
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
                   ` (8 preceding siblings ...)
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 09/20] iscsi: simplify iscsi_co_discard Peter Lieven
@ 2013-09-24 13:35 ` Peter Lieven
  2013-10-02 16:43   ` Eric Blake
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 11/20] iscsi: add bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes Peter Lieven
                   ` (10 subsequent siblings)
  20 siblings, 1 reply; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block/iscsi.c |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/block/iscsi.c b/block/iscsi.c
index ce8823d..90dc7c2 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1367,6 +1367,20 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags,
                sizeof(struct scsi_inquiry_block_limits));
         scsi_free_scsi_task(task);
         task = NULL;
+
+        if (iscsilun->bl.max_unmap < 0xffffffff) {
+            bs->bl.max_discard = sector_lun2qemu(iscsilun->bl.max_unmap,
+                                                 iscsilun);
+        }
+        bs->bl.discard_alignment = sector_lun2qemu(iscsilun->bl.opt_unmap_gran,
+                                                   iscsilun);
+
+        if (iscsilun->bl.max_ws_len < 0xffffffff) {
+            bs->bl.max_write_zeroes = sector_lun2qemu(iscsilun->bl.max_ws_len,
+                                                      iscsilun);
+        }
+        bs->bl.write_zeroes_alignment = sector_lun2qemu(iscsilun->bl.opt_unmap_gran,
+                                                        iscsilun);
     }
 
 #if defined(LIBISCSI_FEATURE_NOP_COUNTER)
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 11/20] iscsi: add bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
                   ` (9 preceding siblings ...)
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 10/20] iscsi: set limits in BlockDriverState Peter Lieven
@ 2013-09-24 13:35 ` Peter Lieven
  2013-10-02 16:45   ` Eric Blake
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 12/20] iscsi: add bdrv_co_write_zeroes Peter Lieven
                   ` (9 subsequent siblings)
  20 siblings, 1 reply; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block/iscsi.c |   16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 90dc7c2..f6cd322 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1449,6 +1449,18 @@ static int iscsi_has_zero_init(BlockDriverState *bs)
     return 0;
 }
 
+static bool iscsi_has_discard_zeroes(BlockDriverState *bs)
+{
+    IscsiLun *iscsilun = bs->opaque;
+    return !!iscsilun->lbprz;
+}
+
+static bool iscsi_has_discard_write_zeroes(BlockDriverState *bs)
+{
+    IscsiLun *iscsilun = bs->opaque;
+    return iscsilun->lbprz && iscsilun->lbp.lbpws;
+}
+
 static int iscsi_create(const char *filename, QEMUOptionParameter *options,
                         Error **errp)
 {
@@ -1534,7 +1546,9 @@ static BlockDriver bdrv_iscsi = {
     .bdrv_aio_writev = iscsi_aio_writev,
     .bdrv_aio_flush  = iscsi_aio_flush,
 
-    .bdrv_has_zero_init = iscsi_has_zero_init,
+    .bdrv_has_zero_init            = iscsi_has_zero_init,
+    .bdrv_has_discard_zeroes       = iscsi_has_discard_zeroes,
+    .bdrv_has_discard_write_zeroes = iscsi_has_discard_write_zeroes,
 
 #ifdef __linux__
     .bdrv_ioctl       = iscsi_ioctl,
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 12/20] iscsi: add bdrv_co_write_zeroes
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
                   ` (10 preceding siblings ...)
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 11/20] iscsi: add bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes Peter Lieven
@ 2013-09-24 13:35 ` Peter Lieven
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 13/20] block: introduce bdrv_zeroize Peter Lieven
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block/iscsi.c |   59 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/block/iscsi.c b/block/iscsi.c
index f6cd322..8f52ef8 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -56,6 +56,7 @@ typedef struct IscsiLun {
     uint8_t lbprz;
     struct scsi_inquiry_logical_block_provisioning lbp;
     struct scsi_inquiry_block_limits bl;
+    unsigned char *zeroblock;
 } IscsiLun;
 
 typedef struct IscsiTask {
@@ -959,6 +960,62 @@ retry:
     return 0;
 }
 
+
+static int
+coroutine_fn iscsi_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
+                                   int nb_sectors, BdrvRequestFlags flags)
+{
+    IscsiLun *iscsilun = bs->opaque;
+    struct IscsiTask iTask;
+    uint64_t lba;
+    uint32_t nb_blocks;
+
+    if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
+        return -EINVAL;
+    }
+
+    if (!iscsilun->lbp.lbpws) {
+        /* WRITE SAME is not supported by the target */
+        return -ENOTSUP;
+    }
+
+    lba = sector_qemu2lun(sector_num, iscsilun);
+    nb_blocks = sector_qemu2lun(nb_sectors, iscsilun);
+
+    if (iscsilun->zeroblock == NULL) {
+        iscsilun->zeroblock = g_malloc0(iscsilun->block_size);
+    }
+
+    iscsi_co_init_iscsitask(iscsilun, &iTask);
+retry:
+    if (iscsi_writesame16_task(iscsilun->iscsi, iscsilun->lun, lba,
+                               iscsilun->zeroblock, iscsilun->block_size,
+                               nb_blocks, 0, !!(flags & BDRV_REQ_MAY_UNMAP),
+                               0, 0, iscsi_co_generic_cb, &iTask) == NULL) {
+        return -EIO;
+    }
+
+    while (!iTask.complete) {
+        iscsi_set_events(iscsilun);
+        qemu_coroutine_yield();
+    }
+
+    if (iTask.task != NULL) {
+        scsi_free_scsi_task(iTask.task);
+        iTask.task = NULL;
+    }
+
+    if (iTask.do_retry) {
+        goto retry;
+    }
+
+    if (iTask.status != SCSI_STATUS_GOOD) {
+        return -EIO;
+    }
+
+    return 0;
+}
+
 static int parse_chap(struct iscsi_context *iscsi, const char *target)
 {
     QemuOptsList *list;
@@ -1421,6 +1478,7 @@ static void iscsi_close(BlockDriverState *bs)
     }
     qemu_aio_set_fd_handler(iscsi_get_fd(iscsi), NULL, NULL, NULL);
     iscsi_destroy_context(iscsi);
+    g_free(iscsilun->zeroblock);
     memset(iscsilun, 0, sizeof(IscsiLun));
 }
 
@@ -1541,6 +1599,7 @@ static BlockDriver bdrv_iscsi = {
     .bdrv_co_get_block_status = iscsi_co_get_block_status,
 #endif
     .bdrv_co_discard      = iscsi_co_discard,
+    .bdrv_co_write_zeroes = iscsi_co_write_zeroes,
 
     .bdrv_aio_readv  = iscsi_aio_readv,
     .bdrv_aio_writev = iscsi_aio_writev,
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 13/20] block: introduce bdrv_zeroize
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
                   ` (11 preceding siblings ...)
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 12/20] iscsi: add bdrv_co_write_zeroes Peter Lieven
@ 2013-09-24 13:35 ` Peter Lieven
  2013-10-02 16:51   ` Eric Blake
  2013-10-07  8:34   ` Stefan Hajnoczi
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 14/20] block/get_block_status: set *pnum = 0 on error Peter Lieven
                   ` (7 subsequent siblings)
  20 siblings, 2 replies; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

this patch adds a call to completely zero out a block device.
the operation is sped up by checking the block status and
only writing zeroes to the device if they currently do not
return zeroes. optionally the zero writing can be sped up
by setting the flag BDRV_REQ_MAY_UNMAP to emulate the zero
write by unmapping if the driver supports it.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block.c               |   37 +++++++++++++++++++++++++++++++++++++
 include/block/block.h |    1 +
 2 files changed, 38 insertions(+)

diff --git a/block.c b/block.c
index 784e545..66b9eae 100644
--- a/block.c
+++ b/block.c
@@ -2398,6 +2398,43 @@ int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num,
                       BDRV_REQ_ZERO_WRITE | flags);
 }
 
+/*
+ * Completely zero out a block device with the help of bdrv_write_zeroes.
+ * The operation is sped up by checking the block status and only writing
+ * zeroes to the device if they currently do not return zeroes. Optional
+ * flags are passed through to bdrv_write_zeroes (e.g. BDRV_REQ_MAY_UNMAP).
+ *
+ * Returns < 0 on error, 0 on success. For error codes see bdrv_write().
+ */
+int bdrv_zeroize(BlockDriverState *bs, BdrvRequestFlags flags)
+{
+    int64_t target_size = bdrv_getlength(bs) / BDRV_SECTOR_SIZE;
+    int64_t ret, nb_sectors, sector_num = 0;
+    int n;
+
+    for (;;) {
+        nb_sectors = target_size - sector_num;
+        if (nb_sectors <= 0) {
+            return 0;
+        }
+        if (nb_sectors > INT_MAX) {
+            nb_sectors = INT_MAX;
+        }
+        ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &n);
+        if (ret & BDRV_BLOCK_ZERO) {
+            sector_num += n;
+            continue;
+        }
+        ret = bdrv_write_zeroes(bs, sector_num, n, flags);
+        if (ret < 0) {
+            error_report("error writing zeroes at sector %" PRId64 ": %s",
+                         sector_num, strerror(-ret));
+            return ret;
+        }
+        sector_num += n;
+    }
+}
+
 int bdrv_pread(BlockDriverState *bs, int64_t offset,
                void *buf, int count1)
 {
diff --git a/include/block/block.h b/include/block/block.h
index 2de226f..e059188 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -196,6 +196,7 @@ int bdrv_write(BlockDriverState *bs, int64_t sector_num,
                const uint8_t *buf, int nb_sectors);
 int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num,
                int nb_sectors, BdrvRequestFlags flags);
+int bdrv_zeroize(BlockDriverState *bs, BdrvRequestFlags flags);
 int bdrv_writev(BlockDriverState *bs, int64_t sector_num, QEMUIOVector *qiov);
 int bdrv_pread(BlockDriverState *bs, int64_t offset,
                void *buf, int count);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 14/20] block/get_block_status: set *pnum = 0 on error
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
                   ` (12 preceding siblings ...)
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 13/20] block: introduce bdrv_zeroize Peter Lieven
@ 2013-09-24 13:35 ` Peter Lieven
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 15/20] block/get_block_status: avoid segfault if there is no backing_hd Peter Lieven
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

if the call is invoked through bdrv_is_allocated the caller might
expect *pnum = 0 on error. however, a new implementation of
bdrv_get_block_status might only return a negative exit value on
error while keeping *pnum untouched.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/block.c b/block.c
index 66b9eae..3499c90 100644
--- a/block.c
+++ b/block.c
@@ -3259,6 +3259,7 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
 
     ret = bs->drv->bdrv_co_get_block_status(bs, sector_num, nb_sectors, pnum);
     if (ret < 0) {
+        *pnum = 0;
         return ret;
     }
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 15/20] block/get_block_status: avoid segfault if there is no backing_hd
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
                   ` (13 preceding siblings ...)
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 14/20] block/get_block_status: set *pnum = 0 on error Peter Lieven
@ 2013-09-24 13:35 ` Peter Lieven
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 16/20] block/get_block_status: avoid redundant callouts on raw devices Peter Lieven
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 3499c90..3bf1163 100644
--- a/block.c
+++ b/block.c
@@ -3266,7 +3266,7 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
     if (!(ret & BDRV_BLOCK_DATA)) {
         if (bdrv_has_zero_init(bs)) {
             ret |= BDRV_BLOCK_ZERO;
-        } else {
+        } else if (bs->backing_hd) {
             BlockDriverState *bs2 = bs->backing_hd;
             int64_t length2 = bdrv_getlength(bs2);
             if (length2 >= 0 && sector_num >= (length2 >> BDRV_SECTOR_BITS)) {
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 16/20] block/get_block_status: avoid redundant callouts on raw devices
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
                   ` (14 preceding siblings ...)
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 15/20] block/get_block_status: avoid segfault if there is no backing_hd Peter Lieven
@ 2013-09-24 13:35 ` Peter Lieven
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 17/20] block/get_block_status: fix BDRV_BLOCK_ZERO for unallocated blocks Peter Lieven
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

if a raw device like an iscsi target or host device is used
the current implementation makes a second call out to get
the block status of bs->file. however, the raw driver already
has called bdrv_get_block_status on bs->file.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 3bf1163..f56660d 100644
--- a/block.c
+++ b/block.c
@@ -3277,7 +3277,8 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
 
     if (bs->file &&
         (ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) &&
-        (ret & BDRV_BLOCK_OFFSET_VALID)) {
+        (ret & BDRV_BLOCK_OFFSET_VALID) &&
+        strcmp(bs->drv->format_name, "raw")) {
         ret2 = bdrv_co_get_block_status(bs->file, ret >> BDRV_SECTOR_BITS,
                                         *pnum, pnum);
         if (ret2 >= 0) {
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 17/20] block/get_block_status: fix BDRV_BLOCK_ZERO for unallocated blocks
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
                   ` (15 preceding siblings ...)
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 16/20] block/get_block_status: avoid redundant callouts on raw devices Peter Lieven
@ 2013-09-24 13:35 ` Peter Lieven
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 18/20] qemu-img: add support for fully allocated images Peter Lieven
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

this patch does 2 things:
a) only do additional call outs if BDRV_BLOCK_ZERO is not already set.
b) use the newly introduced bdrv_has_discard_zeroes() to return the
   zero state of an unallocated block. the used callout to
   bdrv_has_zero_init() is only valid right after bdrv_create.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index f56660d..032809a 100644
--- a/block.c
+++ b/block.c
@@ -3263,8 +3263,8 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
         return ret;
     }
 
-    if (!(ret & BDRV_BLOCK_DATA)) {
-        if (bdrv_has_zero_init(bs)) {
+    if (!(ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO)) {
+        if (bdrv_has_discard_zeroes(bs)) {
             ret |= BDRV_BLOCK_ZERO;
         } else if (bs->backing_hd) {
             BlockDriverState *bs2 = bs->backing_hd;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 18/20] qemu-img: add support for fully allocated images
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
                   ` (16 preceding siblings ...)
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 17/20] block/get_block_status: fix BDRV_BLOCK_ZERO for unallocated blocks Peter Lieven
@ 2013-09-24 13:35 ` Peter Lieven
  2013-10-02 17:01   ` Eric Blake
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 19/20] qemu-img: conditionally zero out target on convert Peter Lieven
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 qemu-img.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 926f0a0..c6eff15 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -100,8 +100,10 @@ static void help(void)
            "  '-h' with or without a command shows this help and lists the supported formats\n"
            "  '-p' show progress of command (only certain commands)\n"
            "  '-q' use Quiet mode - do not print any output (except errors)\n"
-           "  '-S' indicates the consecutive number of bytes that must contain only zeros\n"
-           "       for qemu-img to create a sparse image during conversion\n"
+           "  '-S' indicates the consecutive number of bytes (defaults to 4k) that must\n"
+           "       contain only zeros for qemu-img to create a sparse image during\n"
+           "       conversion. if the number of bytes is 0 sparse files are disabled and\n"
+           "       images will always be fully allocated\n"
            "  '--output' takes the format in which the output must be done (human or json)\n"
            "  '-n' skips the target volume creation (useful if the volume is created\n"
            "       prior to running qemu-img)\n"
@@ -1465,7 +1467,7 @@ static int img_convert(int argc, char **argv)
         /* signal EOF to align */
         bdrv_write_compressed(out_bs, 0, NULL, 0);
     } else {
-        int has_zero_init = bdrv_has_zero_init(out_bs);
+        int has_zero_init = min_sparse ? bdrv_has_zero_init(out_bs) : 0;
 
         sector_num = 0; // total number of sectors converted so far
         nb_sectors = total_sectors - sector_num;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 19/20] qemu-img: conditionally zero out target on convert
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
                   ` (17 preceding siblings ...)
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 18/20] qemu-img: add support for fully allocated images Peter Lieven
@ 2013-09-24 13:35 ` Peter Lieven
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 20/20] block/raw: copy BlockLimits on raw_open Peter Lieven
  2013-10-07  8:42 ` [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Stefan Hajnoczi
  20 siblings, 0 replies; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

If the target has_zero_init = 0, but supports efficiently
writing zeroes by unmapping we call bdrv_zeroize to
avoid fully allocating the target. Ihis currently
is designed especially for iscsi.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
---
 qemu-img.c |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/qemu-img.c b/qemu-img.c
index c6eff15..2a41942 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1353,7 +1353,7 @@ static int img_convert(int argc, char **argv)
         }
     }
 
-    flags = BDRV_O_RDWR;
+    flags = min_sparse ? (BDRV_O_RDWR | BDRV_O_UNMAP) : BDRV_O_RDWR;
     ret = bdrv_parse_cache_flags(cache, &flags);
     if (ret < 0) {
         error_report("Invalid cache option: %s", cache);
@@ -1469,6 +1469,14 @@ static int img_convert(int argc, char **argv)
     } else {
         int has_zero_init = min_sparse ? bdrv_has_zero_init(out_bs) : 0;
 
+        if (!has_zero_init && bdrv_has_discard_write_zeroes(out_bs)) {
+            ret = bdrv_zeroize(out_bs, BDRV_REQ_MAY_UNMAP);
+            if (ret < 0) {
+                goto out;
+            }
+            has_zero_init = 1;
+        }
+
         sector_num = 0; // total number of sectors converted so far
         nb_sectors = total_sectors - sector_num;
         if (nb_sectors != 0) {
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [Qemu-devel] [PATCHv3 20/20] block/raw: copy BlockLimits on raw_open
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
                   ` (18 preceding siblings ...)
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 19/20] qemu-img: conditionally zero out target on convert Peter Lieven
@ 2013-09-24 13:35 ` Peter Lieven
  2013-10-02 17:11   ` Eric Blake
  2013-10-07  8:42 ` [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Stefan Hajnoczi
  20 siblings, 1 reply; 47+ messages in thread
From: Peter Lieven @ 2013-09-24 13:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block/raw_bsd.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index 8dc7bba..7af26ad 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -159,6 +159,7 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
                     Error **errp)
 {
     bs->sg = bs->file->sg;
+    memcpy(&bs->bl, &bs->file->bl, sizeof(struct BlockLimits));
     return 0;
 }
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 02/20] block: add flags to bdrv_*_write_zeroes
  2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 02/20] block: add flags to bdrv_*_write_zeroes Peter Lieven
@ 2013-10-02 15:32   ` Eric Blake
  0 siblings, 0 replies; 47+ messages in thread
From: Eric Blake @ 2013-10-02 15:32 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 780 bytes --]

On 09/24/2013 07:34 AM, Peter Lieven wrote:
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block-migration.c         |    2 +-
>  block.c                   |   20 +++++++++++---------
>  block/backup.c            |    3 ++-
>  block/qcow2-cluster.c     |    2 +-
>  block/qcow2.c             |    2 +-
>  block/qed.c               |    3 ++-
>  block/raw_bsd.c           |    5 +++--
>  block/vmdk.c              |    3 ++-
>  include/block/block.h     |    4 ++--
>  include/block/block_int.h |    2 +-
>  qemu-io-cmds.c            |    2 +-
>  11 files changed, 27 insertions(+), 21 deletions(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 04/20] block: introduce bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes
  2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 04/20] block: introduce bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes Peter Lieven
@ 2013-10-02 15:41   ` Eric Blake
  0 siblings, 0 replies; 47+ messages in thread
From: Eric Blake @ 2013-10-02 15:41 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 454 bytes --]

On 09/24/2013 07:34 AM, Peter Lieven wrote:
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block.c                   |   29 +++++++++++++++++++++++++++++
>  include/block/block.h     |    2 ++
>  include/block/block_int.h |   13 +++++++++++++
>  3 files changed, 44 insertions(+)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 05/20] block/raw: add bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes
  2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 05/20] block/raw: add " Peter Lieven
@ 2013-10-02 15:43   ` Eric Blake
  0 siblings, 0 replies; 47+ messages in thread
From: Eric Blake @ 2013-10-02 15:43 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 395 bytes --]

On 09/24/2013 07:34 AM, Peter Lieven wrote:
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block/raw_bsd.c |   56 +++++++++++++++++++++++++++++++++----------------------
>  1 file changed, 34 insertions(+), 22 deletions(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 06/20] block: add BlockLimits structure to BlockDriverState
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 06/20] block: add BlockLimits structure to BlockDriverState Peter Lieven
@ 2013-10-02 15:53   ` Eric Blake
  2013-10-07  8:10     ` Stefan Hajnoczi
  0 siblings, 1 reply; 47+ messages in thread
From: Eric Blake @ 2013-10-02 15:53 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 1229 bytes --]

On 09/24/2013 07:35 AM, Peter Lieven wrote:
> this patch adds BlockLimits which introduces discard and write_zeroes
> limits and alignment information to the BlockDriverState.
> 
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  include/block/block_int.h |   17 +++++++++++++++++
>  1 file changed, 17 insertions(+)

>  
> +struct BlockLimits {

Should this be a typedef?  Either in include/qemu/typedefs.h (where
BlockDriverState is listed), or locally (see BdrvTrackedRequest in the
same file for an example)?

> @@ -280,6 +294,9 @@ struct BlockDriverState {
>      uint64_t total_time_ns[BDRV_MAX_IOTYPE];
>      uint64_t wr_highest_sector;
>  
> +    /* I/O Limits */
> +    struct BlockLimits bl;
> +

All other struct/pointer-to-struct members in BlockDriverState are
listed by typedef name, rather than calling out 'struct foo'.

My question is one of style/consistency, not of C correctness; so unless
a maintainer actually agrees that a typedef change is needed so that you
comply with project coding standards, feel free to add:

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 07/20] block: honour BlockLimits in bdrv_co_do_write_zeroes
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 07/20] block: honour BlockLimits in bdrv_co_do_write_zeroes Peter Lieven
@ 2013-10-02 16:37   ` Eric Blake
  0 siblings, 0 replies; 47+ messages in thread
From: Eric Blake @ 2013-10-02 16:37 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 870 bytes --]

On 09/24/2013 07:35 AM, Peter Lieven wrote:
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block.c |   65 +++++++++++++++++++++++++++++++++++++++++++++++----------------
>  1 file changed, 49 insertions(+), 16 deletions(-)
> 
> diff --git a/block.c b/block.c
> index ac35cb5..580570a 100644
> --- a/block.c
> +++ b/block.c
> @@ -2710,32 +2710,65 @@ int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs,
>                              BDRV_REQ_COPY_ON_READ);
>  }
>  


> +
> +        if (ret == -ENOTSUP) {
> +            /* Fall back to bounce buffer if write zeroes is unsupported */
> +            iov.iov_len  = num * BDRV_SECTOR_SIZE;

Why 2 spaces?

But that's trivial.

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 08/20] block: honour BlockLimits in bdrv_co_discard
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 08/20] block: honour BlockLimits in bdrv_co_discard Peter Lieven
@ 2013-10-02 16:41   ` Eric Blake
  2013-10-07  8:29   ` Stefan Hajnoczi
  1 sibling, 0 replies; 47+ messages in thread
From: Eric Blake @ 2013-10-02 16:41 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 367 bytes --]

On 09/24/2013 07:35 AM, Peter Lieven wrote:
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block.c |   37 ++++++++++++++++++++++++++++++++++++-
>  1 file changed, 36 insertions(+), 1 deletion(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 10/20] iscsi: set limits in BlockDriverState
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 10/20] iscsi: set limits in BlockDriverState Peter Lieven
@ 2013-10-02 16:43   ` Eric Blake
  0 siblings, 0 replies; 47+ messages in thread
From: Eric Blake @ 2013-10-02 16:43 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 331 bytes --]

On 09/24/2013 07:35 AM, Peter Lieven wrote:
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block/iscsi.c |   14 ++++++++++++++
>  1 file changed, 14 insertions(+)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 11/20] iscsi: add bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 11/20] iscsi: add bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes Peter Lieven
@ 2013-10-02 16:45   ` Eric Blake
  0 siblings, 0 replies; 47+ messages in thread
From: Eric Blake @ 2013-10-02 16:45 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 352 bytes --]

On 09/24/2013 07:35 AM, Peter Lieven wrote:
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block/iscsi.c |   16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 13/20] block: introduce bdrv_zeroize
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 13/20] block: introduce bdrv_zeroize Peter Lieven
@ 2013-10-02 16:51   ` Eric Blake
  2013-10-07  8:34   ` Stefan Hajnoczi
  1 sibling, 0 replies; 47+ messages in thread
From: Eric Blake @ 2013-10-02 16:51 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 766 bytes --]

On 09/24/2013 07:35 AM, Peter Lieven wrote:
> this patch adds a call to completely zero out a block device.
> the operation is sped up by checking the block status and
> only writing zeroes to the device if they currently do not
> return zeroes. optionally the zero writing can be sped up
> by setting the flag BDRV_REQ_MAY_UNMAP to emulate the zero
> write by unmapping if the driver supports it.
> 
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block.c               |   37 +++++++++++++++++++++++++++++++++++++
>  include/block/block.h |    1 +
>  2 files changed, 38 insertions(+)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 18/20] qemu-img: add support for fully allocated images
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 18/20] qemu-img: add support for fully allocated images Peter Lieven
@ 2013-10-02 17:01   ` Eric Blake
  0 siblings, 0 replies; 47+ messages in thread
From: Eric Blake @ 2013-10-02 17:01 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 1334 bytes --]

On 09/24/2013 07:35 AM, Peter Lieven wrote:
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  qemu-img.c |    8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)

Missing a counterpart change to qemu-img.texi.

> 
> diff --git a/qemu-img.c b/qemu-img.c
> index 926f0a0..c6eff15 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -100,8 +100,10 @@ static void help(void)
>             "  '-h' with or without a command shows this help and lists the supported formats\n"
>             "  '-p' show progress of command (only certain commands)\n"
>             "  '-q' use Quiet mode - do not print any output (except errors)\n"
> -           "  '-S' indicates the consecutive number of bytes that must contain only zeros\n"
> -           "       for qemu-img to create a sparse image during conversion\n"
> +           "  '-S' indicates the consecutive number of bytes (defaults to 4k) that must\n"
> +           "       contain only zeros for qemu-img to create a sparse image during\n"
> +           "       conversion. if the number of bytes is 0 sparse files are disabled and\n"
> +           "       images will always be fully allocated\n"

the texi page should have similar wording.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 20/20] block/raw: copy BlockLimits on raw_open
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 20/20] block/raw: copy BlockLimits on raw_open Peter Lieven
@ 2013-10-02 17:11   ` Eric Blake
  2013-10-07  8:38     ` Stefan Hajnoczi
  0 siblings, 1 reply; 47+ messages in thread
From: Eric Blake @ 2013-10-02 17:11 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 1155 bytes --]

On 09/24/2013 07:35 AM, Peter Lieven wrote:
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block/raw_bsd.c |    1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/block/raw_bsd.c b/block/raw_bsd.c
> index 8dc7bba..7af26ad 100644
> --- a/block/raw_bsd.c
> +++ b/block/raw_bsd.c
> @@ -159,6 +159,7 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
>                      Error **errp)
>  {
>      bs->sg = bs->file->sg;
> +    memcpy(&bs->bl, &bs->file->bl, sizeof(struct BlockLimits));

Personally, I think that sizeof(var) is more robust, because if the
declaration of var is ever changed, you don't have to remember to also
touch up the memcpy.  As in:

memcpy(&bs->bl, &bs->file->bl, sizeof(bs->bl));

But there's plenty of examples of sizeof(type) in the codebase even when
a var is handy, so you are not alone, and that's not a reason for me to
request a respin.

On the other hand, why use memcpy() at all?

bs->bl = bs->file->bl;

should do the same trick, with less typing.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 06/20] block: add BlockLimits structure to BlockDriverState
  2013-10-02 15:53   ` Eric Blake
@ 2013-10-07  8:10     ` Stefan Hajnoczi
  0 siblings, 0 replies; 47+ messages in thread
From: Stefan Hajnoczi @ 2013-10-07  8:10 UTC (permalink / raw)
  To: Eric Blake
  Cc: kwolf, stefanha, Peter Lieven, qemu-devel, anthony, pbonzini,
	ronniesahlberg

On Wed, Oct 02, 2013 at 09:53:01AM -0600, Eric Blake wrote:
> On 09/24/2013 07:35 AM, Peter Lieven wrote:
> > this patch adds BlockLimits which introduces discard and write_zeroes
> > limits and alignment information to the BlockDriverState.
> > 
> > Signed-off-by: Peter Lieven <pl@kamp.de>
> > ---
> >  include/block/block_int.h |   17 +++++++++++++++++
> >  1 file changed, 17 insertions(+)
> 
> >  
> > +struct BlockLimits {
> 
> Should this be a typedef?  Either in include/qemu/typedefs.h (where
> BlockDriverState is listed), or locally (see BdrvTrackedRequest in the
> same file for an example)?
> 
> > @@ -280,6 +294,9 @@ struct BlockDriverState {
> >      uint64_t total_time_ns[BDRV_MAX_IOTYPE];
> >      uint64_t wr_highest_sector;
> >  
> > +    /* I/O Limits */
> > +    struct BlockLimits bl;
> > +
> 
> All other struct/pointer-to-struct members in BlockDriverState are
> listed by typedef name, rather than calling out 'struct foo'.
> 
> My question is one of style/consistency, not of C correctness; so unless
> a maintainer actually agrees that a typedef change is needed so that you
> comply with project coding standards, feel free to add:

QEMU coding style requires use of typedef:

typedef struct {
    ...
} BlockLimits;

Stefan

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 08/20] block: honour BlockLimits in bdrv_co_discard
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 08/20] block: honour BlockLimits in bdrv_co_discard Peter Lieven
  2013-10-02 16:41   ` Eric Blake
@ 2013-10-07  8:29   ` Stefan Hajnoczi
  2013-10-07  8:36     ` Peter Lieven
  1 sibling, 1 reply; 47+ messages in thread
From: Stefan Hajnoczi @ 2013-10-07  8:29 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, stefanha, qemu-devel, anthony, pbonzini, ronniesahlberg

On Tue, Sep 24, 2013 at 03:35:02PM +0200, Peter Lieven wrote:
> @@ -4245,7 +4250,37 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
>      }
>  
>      if (bs->drv->bdrv_co_discard) {
> -        return bs->drv->bdrv_co_discard(bs, sector_num, nb_sectors);
> +        int max_discard = bs->bl.max_discard ?
> +                          bs->bl.max_discard : MAX_DISCARD_DEFAULT;
> +
> +        while (nb_sectors > 0) {
> +            int ret;
> +            int num = nb_sectors;
> +
> +            /* align request */
> +            if (bs->bl.discard_alignment &&
> +                num >= bs->bl.discard_alignment &&
> +                sector_num % bs->bl.discard_alignment) {
> +                if (num > bs->bl.discard_alignment) {
> +                    num = bs->bl.discard_alignment;
> +                }
> +                num -= sector_num % bs->bl.discard_alignment;
> +            }

Is it always possible to discard at arbitrary sector offsets?

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 13/20] block: introduce bdrv_zeroize
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 13/20] block: introduce bdrv_zeroize Peter Lieven
  2013-10-02 16:51   ` Eric Blake
@ 2013-10-07  8:34   ` Stefan Hajnoczi
  2013-10-07  8:39     ` Peter Lieven
  1 sibling, 1 reply; 47+ messages in thread
From: Stefan Hajnoczi @ 2013-10-07  8:34 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, stefanha, qemu-devel, anthony, pbonzini, ronniesahlberg

On Tue, Sep 24, 2013 at 03:35:07PM +0200, Peter Lieven wrote:
> this patch adds a call to completely zero out a block device.
> the operation is sped up by checking the block status and
> only writing zeroes to the device if they currently do not
> return zeroes. optionally the zero writing can be sped up
> by setting the flag BDRV_REQ_MAY_UNMAP to emulate the zero
> write by unmapping if the driver supports it.
> 
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block.c               |   37 +++++++++++++++++++++++++++++++++++++
>  include/block/block.h |    1 +
>  2 files changed, 38 insertions(+)
> 
> diff --git a/block.c b/block.c
> index 784e545..66b9eae 100644
> --- a/block.c
> +++ b/block.c
> @@ -2398,6 +2398,43 @@ int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num,
>                        BDRV_REQ_ZERO_WRITE | flags);
>  }
>  
> +/*
> + * Completely zero out a block device with the help of bdrv_write_zeroes.
> + * The operation is sped up by checking the block status and only writing
> + * zeroes to the device if they currently do not return zeroes. Optional
> + * flags are passed through to bdrv_write_zeroes (e.g. BDRV_REQ_MAY_UNMAP).
> + *
> + * Returns < 0 on error, 0 on success. For error codes see bdrv_write().
> + */
> +int bdrv_zeroize(BlockDriverState *bs, BdrvRequestFlags flags)

BlockDriver->bdrv_make_empty() implements zeroing the entire disk for
image formats.  Please extend that function prototype instead of adding
a new interface.

Stefan

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 08/20] block: honour BlockLimits in bdrv_co_discard
  2013-10-07  8:29   ` Stefan Hajnoczi
@ 2013-10-07  8:36     ` Peter Lieven
  0 siblings, 0 replies; 47+ messages in thread
From: Peter Lieven @ 2013-10-07  8:36 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kwolf, stefanha, qemu-devel, anthony, pbonzini, ronniesahlberg


Am 07.10.2013 um 10:29 schrieb Stefan Hajnoczi <stefanha@gmail.com>:

> On Tue, Sep 24, 2013 at 03:35:02PM +0200, Peter Lieven wrote:
>> @@ -4245,7 +4250,37 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
>>     }
>> 
>>     if (bs->drv->bdrv_co_discard) {
>> -        return bs->drv->bdrv_co_discard(bs, sector_num, nb_sectors);
>> +        int max_discard = bs->bl.max_discard ?
>> +                          bs->bl.max_discard : MAX_DISCARD_DEFAULT;
>> +
>> +        while (nb_sectors > 0) {
>> +            int ret;
>> +            int num = nb_sectors;
>> +
>> +            /* align request */
>> +            if (bs->bl.discard_alignment &&
>> +                num >= bs->bl.discard_alignment &&
>> +                sector_num % bs->bl.discard_alignment) {
>> +                if (num > bs->bl.discard_alignment) {
>> +                    num = bs->bl.discard_alignment;
>> +                }
>> +                num -= sector_num % bs->bl.discard_alignment;
>> +            }
> 
> Is it always possible to discard at arbitrary sector offsets?

It should be. This just makes sure that the unaligned part of a request gets executed first (eventually resulting in an -ENOTSUP) and
consecutive calls are aligned.

For example. The alignment is 1MB (2048 sectors) and max_discard its 2048 sectors as well.. Lets say you want to discard 4096 sectors starting at offset 1024.

Then there will be 3 calls
1) discard at 1024, len 1024
2) discard at 2048, len 2048
3) discard at 4096, len 1024.

Peter

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 20/20] block/raw: copy BlockLimits on raw_open
  2013-10-02 17:11   ` Eric Blake
@ 2013-10-07  8:38     ` Stefan Hajnoczi
  2013-10-07  8:40       ` Peter Lieven
  0 siblings, 1 reply; 47+ messages in thread
From: Stefan Hajnoczi @ 2013-10-07  8:38 UTC (permalink / raw)
  To: Eric Blake
  Cc: kwolf, stefanha, Peter Lieven, qemu-devel, anthony, pbonzini,
	ronniesahlberg

On Wed, Oct 02, 2013 at 11:11:05AM -0600, Eric Blake wrote:
> On 09/24/2013 07:35 AM, Peter Lieven wrote:
> > Signed-off-by: Peter Lieven <pl@kamp.de>
> > ---
> >  block/raw_bsd.c |    1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/block/raw_bsd.c b/block/raw_bsd.c
> > index 8dc7bba..7af26ad 100644
> > --- a/block/raw_bsd.c
> > +++ b/block/raw_bsd.c
> > @@ -159,6 +159,7 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
> >                      Error **errp)
> >  {
> >      bs->sg = bs->file->sg;
> > +    memcpy(&bs->bl, &bs->file->bl, sizeof(struct BlockLimits));
> 
> Personally, I think that sizeof(var) is more robust, because if the
> declaration of var is ever changed, you don't have to remember to also
> touch up the memcpy.  As in:
> 
> memcpy(&bs->bl, &bs->file->bl, sizeof(bs->bl));
> 
> But there's plenty of examples of sizeof(type) in the codebase even when
> a var is handy, so you are not alone, and that's not a reason for me to
> request a respin.
> 
> On the other hand, why use memcpy() at all?
> 
> bs->bl = bs->file->bl;
> 
> should do the same trick, with less typing.

Yes, please use struct assignment.

Stefan

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 13/20] block: introduce bdrv_zeroize
  2013-10-07  8:34   ` Stefan Hajnoczi
@ 2013-10-07  8:39     ` Peter Lieven
  0 siblings, 0 replies; 47+ messages in thread
From: Peter Lieven @ 2013-10-07  8:39 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kwolf, stefanha, qemu-devel, anthony, pbonzini, ronniesahlberg


Am 07.10.2013 um 10:34 schrieb Stefan Hajnoczi <stefanha@gmail.com>:

> On Tue, Sep 24, 2013 at 03:35:07PM +0200, Peter Lieven wrote:
>> this patch adds a call to completely zero out a block device.
>> the operation is sped up by checking the block status and
>> only writing zeroes to the device if they currently do not
>> return zeroes. optionally the zero writing can be sped up
>> by setting the flag BDRV_REQ_MAY_UNMAP to emulate the zero
>> write by unmapping if the driver supports it.
>> 
>> Signed-off-by: Peter Lieven <pl@kamp.de>
>> ---
>> block.c               |   37 +++++++++++++++++++++++++++++++++++++
>> include/block/block.h |    1 +
>> 2 files changed, 38 insertions(+)
>> 
>> diff --git a/block.c b/block.c
>> index 784e545..66b9eae 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -2398,6 +2398,43 @@ int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num,
>>                       BDRV_REQ_ZERO_WRITE | flags);
>> }
>> 
>> +/*
>> + * Completely zero out a block device with the help of bdrv_write_zeroes.
>> + * The operation is sped up by checking the block status and only writing
>> + * zeroes to the device if they currently do not return zeroes. Optional
>> + * flags are passed through to bdrv_write_zeroes (e.g. BDRV_REQ_MAY_UNMAP).
>> + *
>> + * Returns < 0 on error, 0 on success. For error codes see bdrv_write().
>> + */
>> +int bdrv_zeroize(BlockDriverState *bs, BdrvRequestFlags flags)
> 
> BlockDriver->bdrv_make_empty() implements zeroing the entire disk for
> image formats.  Please extend that function prototype instead of adding
> a new interface.

You mean, adding bdrv_make_empty as a function to block.c calling drv->bdrv_make_empty if it is defined?

Peter

> 
> Stefan

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 20/20] block/raw: copy BlockLimits on raw_open
  2013-10-07  8:38     ` Stefan Hajnoczi
@ 2013-10-07  8:40       ` Peter Lieven
  0 siblings, 0 replies; 47+ messages in thread
From: Peter Lieven @ 2013-10-07  8:40 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kwolf, stefanha, ronniesahlberg, qemu-devel, anthony, pbonzini


Am 07.10.2013 um 10:38 schrieb Stefan Hajnoczi <stefanha@gmail.com>:

> On Wed, Oct 02, 2013 at 11:11:05AM -0600, Eric Blake wrote:
>> On 09/24/2013 07:35 AM, Peter Lieven wrote:
>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>> ---
>>> block/raw_bsd.c |    1 +
>>> 1 file changed, 1 insertion(+)
>>> 
>>> diff --git a/block/raw_bsd.c b/block/raw_bsd.c
>>> index 8dc7bba..7af26ad 100644
>>> --- a/block/raw_bsd.c
>>> +++ b/block/raw_bsd.c
>>> @@ -159,6 +159,7 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
>>>                     Error **errp)
>>> {
>>>     bs->sg = bs->file->sg;
>>> +    memcpy(&bs->bl, &bs->file->bl, sizeof(struct BlockLimits));
>> 
>> Personally, I think that sizeof(var) is more robust, because if the
>> declaration of var is ever changed, you don't have to remember to also
>> touch up the memcpy.  As in:
>> 
>> memcpy(&bs->bl, &bs->file->bl, sizeof(bs->bl));
>> 
>> But there's plenty of examples of sizeof(type) in the codebase even when
>> a var is handy, so you are not alone, and that's not a reason for me to
>> request a respin.
>> 
>> On the other hand, why use memcpy() at all?
>> 
>> bs->bl = bs->file->bl;
>> 
>> should do the same trick, with less typing.
> 
> Yes, please use struct assignment.

Okay, I was unsure because when looking at bdrv_move_feature_fields I found that there memcpy was used.

Peter

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements
  2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
                   ` (19 preceding siblings ...)
  2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 20/20] block/raw: copy BlockLimits on raw_open Peter Lieven
@ 2013-10-07  8:42 ` Stefan Hajnoczi
  2013-10-07  8:47   ` Peter Lieven
  2013-10-07  9:42   ` Paolo Bonzini
  20 siblings, 2 replies; 47+ messages in thread
From: Stefan Hajnoczi @ 2013-10-07  8:42 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, stefanha, qemu-devel, anthony, pbonzini, ronniesahlberg

On Tue, Sep 24, 2013 at 03:34:54PM +0200, Peter Lieven wrote:
> this patch adds the ability for targets to stay sparse during
> block migration (if the zero_blocks capability is set) and qemu-img convert
> even if the target does not have has_zero_init = 1.
> 
> the series was especially developed for iSCSI, but it should also work
> with other drivers with little or no adjustments. these adjustments
> should be limited to providing block provisioning information through
> get_block_info and/or honouring BDRV_REQ_MAY_UNMAP on writing zeroes.

Could you make bdrv_co_write_zeroes() always use UNMAP, if possible, and
avoid adding the new BDRV_REQ_MAY_UNMAP flag?  While reading the first
few patches in this series I wondered why there is a need to expose
flags at all...

Sometimes it is useful to distinguish between zeroing at the image
format level from discarding at the device level, but I don't think we
make use of that yet.  I'd prefer to keep the interface simple for now
and add flags later, if necessary.

Or maybe I just missed something ;)

Stefan

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements
  2013-10-07  8:42 ` [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Stefan Hajnoczi
@ 2013-10-07  8:47   ` Peter Lieven
  2013-10-07  9:42   ` Paolo Bonzini
  1 sibling, 0 replies; 47+ messages in thread
From: Peter Lieven @ 2013-10-07  8:47 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kwolf, stefanha, qemu-devel, anthony, pbonzini, ronniesahlberg


Am 07.10.2013 um 10:42 schrieb Stefan Hajnoczi <stefanha@gmail.com>:

> On Tue, Sep 24, 2013 at 03:34:54PM +0200, Peter Lieven wrote:
>> this patch adds the ability for targets to stay sparse during
>> block migration (if the zero_blocks capability is set) and qemu-img convert
>> even if the target does not have has_zero_init = 1.
>> 
>> the series was especially developed for iSCSI, but it should also work
>> with other drivers with little or no adjustments. these adjustments
>> should be limited to providing block provisioning information through
>> get_block_info and/or honouring BDRV_REQ_MAY_UNMAP on writing zeroes.
> 
> Could you make bdrv_co_write_zeroes() always use UNMAP, if possible, and
> avoid adding the new BDRV_REQ_MAY_UNMAP flag?  While reading the first
> few patches in this series I wondered why there is a need to expose
> flags at all...
> 
> Sometimes it is useful to distinguish between zeroing at the image
> format level from discarding at the device level, but I don't think we
> make use of that yet.  I'd prefer to keep the interface simple for now
> and add flags later, if necessary.
> 
> Or maybe I just missed something ;)
> 

There was a big discussing going on on this flags some time ago with Paolo and Kevin and they wanted to have it.
(thread was "qemu-img: conditionally discard target on convert"). For my use case I don't need the flag.

Peter

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements
  2013-10-07  8:42 ` [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Stefan Hajnoczi
  2013-10-07  8:47   ` Peter Lieven
@ 2013-10-07  9:42   ` Paolo Bonzini
  2013-10-08  7:02     ` Stefan Hajnoczi
  1 sibling, 1 reply; 47+ messages in thread
From: Paolo Bonzini @ 2013-10-07  9:42 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kwolf, anthony, Peter Lieven, qemu-devel, stefanha, ronniesahlberg

Il 07/10/2013 10:42, Stefan Hajnoczi ha scritto:
> Could you make bdrv_co_write_zeroes() always use UNMAP, if possible, and
> avoid adding the new BDRV_REQ_MAY_UNMAP flag?  While reading the first
> few patches in this series I wondered why there is a need to expose
> flags at all...
> 
> Sometimes it is useful to distinguish between zeroing at the image
> format level from discarding at the device level, but I don't think we
> make use of that yet.  I'd prefer to keep the interface simple for now
> and add flags later, if necessary.
> 
> Or maybe I just missed something ;)

The flag is needed to implement the right semantics for the SCSI WRITE
SAME command, which are:

- if the UNMAP bit is off, always write the sectors (that's
bdrv_aio_write_zeroes without BDRV_REQ_MAY_UNMAP if the payload is zero,
otherwise it's emulated with bdrv_aio_writev)

- if the target can "discard and write the specified payload", you can
discard, else you must write the sectors with the correct payload
(that's bdrv_aio_write_zeroes with BDRV_REQ_MAY_UNMAP).

Contrast this with the UNMAP command, which does not make any guarantee
on the content of the sectors after the command is completed (a few
months ago we agreed that, even if you have discard_zeroes=true in the
target, it is fine for UNMAP to do nothing).

Paolo

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements
  2013-10-07  9:42   ` Paolo Bonzini
@ 2013-10-08  7:02     ` Stefan Hajnoczi
  2013-10-08  8:01       ` Peter Lieven
  0 siblings, 1 reply; 47+ messages in thread
From: Stefan Hajnoczi @ 2013-10-08  7:02 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Anthony Liguori, Peter Lieven, qemu-devel,
	Stefan Hajnoczi, ronnie sahlberg

On Mon, Oct 7, 2013 at 11:42 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 07/10/2013 10:42, Stefan Hajnoczi ha scritto:
>> Could you make bdrv_co_write_zeroes() always use UNMAP, if possible, and
>> avoid adding the new BDRV_REQ_MAY_UNMAP flag?  While reading the first
>> few patches in this series I wondered why there is a need to expose
>> flags at all...
>>
>> Sometimes it is useful to distinguish between zeroing at the image
>> format level from discarding at the device level, but I don't think we
>> make use of that yet.  I'd prefer to keep the interface simple for now
>> and add flags later, if necessary.
>>
>> Or maybe I just missed something ;)
>
> The flag is needed to implement the right semantics for the SCSI WRITE
> SAME command, which are:
>
> - if the UNMAP bit is off, always write the sectors (that's
> bdrv_aio_write_zeroes without BDRV_REQ_MAY_UNMAP if the payload is zero,
> otherwise it's emulated with bdrv_aio_writev)
>
> - if the target can "discard and write the specified payload", you can
> discard, else you must write the sectors with the correct payload
> (that's bdrv_aio_write_zeroes with BDRV_REQ_MAY_UNMAP).
>
> Contrast this with the UNMAP command, which does not make any guarantee
> on the content of the sectors after the command is completed (a few
> months ago we agreed that, even if you have discard_zeroes=true in the
> target, it is fine for UNMAP to do nothing).

Okay, then let's keep the patches to expose the flag.

Stefan

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements
  2013-10-08  7:02     ` Stefan Hajnoczi
@ 2013-10-08  8:01       ` Peter Lieven
  2013-10-08  8:59         ` Stefan Hajnoczi
  0 siblings, 1 reply; 47+ messages in thread
From: Peter Lieven @ 2013-10-08  8:01 UTC (permalink / raw)
  To: Stefan Hajnoczi, Paolo Bonzini
  Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, qemu-devel,
	ronnie sahlberg

On 08.10.2013 09:02, Stefan Hajnoczi wrote:
> On Mon, Oct 7, 2013 at 11:42 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> Il 07/10/2013 10:42, Stefan Hajnoczi ha scritto:
>>> Could you make bdrv_co_write_zeroes() always use UNMAP, if possible, and
>>> avoid adding the new BDRV_REQ_MAY_UNMAP flag?  While reading the first
>>> few patches in this series I wondered why there is a need to expose
>>> flags at all...
>>>
>>> Sometimes it is useful to distinguish between zeroing at the image
>>> format level from discarding at the device level, but I don't think we
>>> make use of that yet.  I'd prefer to keep the interface simple for now
>>> and add flags later, if necessary.
>>>
>>> Or maybe I just missed something ;)
>> The flag is needed to implement the right semantics for the SCSI WRITE
>> SAME command, which are:
>>
>> - if the UNMAP bit is off, always write the sectors (that's
>> bdrv_aio_write_zeroes without BDRV_REQ_MAY_UNMAP if the payload is zero,
>> otherwise it's emulated with bdrv_aio_writev)
>>
>> - if the target can "discard and write the specified payload", you can
>> discard, else you must write the sectors with the correct payload
>> (that's bdrv_aio_write_zeroes with BDRV_REQ_MAY_UNMAP).
>>
>> Contrast this with the UNMAP command, which does not make any guarantee
>> on the content of the sectors after the command is completed (a few
>> months ago we agreed that, even if you have discard_zeroes=true in the
>> target, it is fine for UNMAP to do nothing).
> Okay, then let's keep the patches to expose the flag.
Okay, then I can keep those.

Can you give a short hint if my approach with brdv_make_empty is what
you want? I would like to not change the parameters, so use BDRV_REQ_MAY_UNMAP
unconditionally.

int bdrv_make_empty(BlockDriverState *bs)
{
     int64_t target_size = bdrv_getlength(bs) / BDRV_SECTOR_SIZE;
     int64_t ret, nb_sectors, sector_num = 0;
     int n;

     if (bs->drv->bdrv_make_empty) {
         return bs->drv->bdrv_make_empty(bs);
     }

     for (;;) {
         nb_sectors = target_size - sector_num;
         if (nb_sectors <= 0) {
             return 0;
         }
         if (nb_sectors > INT_MAX) {
             nb_sectors = INT_MAX;
         }
         ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &n);
         if (ret & BDRV_BLOCK_ZERO) {
             sector_num += n;
             continue;
         }
         ret = bdrv_write_zeroes(bs, sector_num, n, BDRV_REQ_MAY_UNMAP);
         if (ret < 0) {
             error_report("error writing zeroes at sector %" PRId64 ": %s",
                          sector_num, strerror(-ret));
             return ret;
         }
         sector_num += n;
     }
}

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements
  2013-10-08  8:01       ` Peter Lieven
@ 2013-10-08  8:59         ` Stefan Hajnoczi
  2013-10-08  9:12           ` Peter Lieven
  0 siblings, 1 reply; 47+ messages in thread
From: Stefan Hajnoczi @ 2013-10-08  8:59 UTC (permalink / raw)
  To: Peter Lieven
  Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, Anthony Liguori,
	Paolo Bonzini, ronnie sahlberg

On Tue, Oct 8, 2013 at 10:01 AM, Peter Lieven <pl@kamp.de> wrote:
> On 08.10.2013 09:02, Stefan Hajnoczi wrote:
>>
>> On Mon, Oct 7, 2013 at 11:42 AM, Paolo Bonzini <pbonzini@redhat.com>
>> wrote:
>>>
>>> Il 07/10/2013 10:42, Stefan Hajnoczi ha scritto:
>>>>
>>>> Could you make bdrv_co_write_zeroes() always use UNMAP, if possible, and
>>>> avoid adding the new BDRV_REQ_MAY_UNMAP flag?  While reading the first
>>>> few patches in this series I wondered why there is a need to expose
>>>> flags at all...
>>>>
>>>> Sometimes it is useful to distinguish between zeroing at the image
>>>> format level from discarding at the device level, but I don't think we
>>>> make use of that yet.  I'd prefer to keep the interface simple for now
>>>> and add flags later, if necessary.
>>>>
>>>> Or maybe I just missed something ;)
>>>
>>> The flag is needed to implement the right semantics for the SCSI WRITE
>>> SAME command, which are:
>>>
>>> - if the UNMAP bit is off, always write the sectors (that's
>>> bdrv_aio_write_zeroes without BDRV_REQ_MAY_UNMAP if the payload is zero,
>>> otherwise it's emulated with bdrv_aio_writev)
>>>
>>> - if the target can "discard and write the specified payload", you can
>>> discard, else you must write the sectors with the correct payload
>>> (that's bdrv_aio_write_zeroes with BDRV_REQ_MAY_UNMAP).
>>>
>>> Contrast this with the UNMAP command, which does not make any guarantee
>>> on the content of the sectors after the command is completed (a few
>>> months ago we agreed that, even if you have discard_zeroes=true in the
>>> target, it is fine for UNMAP to do nothing).
>>
>> Okay, then let's keep the patches to expose the flag.
>
> Okay, then I can keep those.
>
> Can you give a short hint if my approach with brdv_make_empty is what
> you want? I would like to not change the parameters, so use
> BDRV_REQ_MAY_UNMAP
> unconditionally.
>
> int bdrv_make_empty(BlockDriverState *bs)

The semantics of bdrv_make_empty() today are: deallocate all data in
the top layer of the image file.  If there is a backing file, reads
will fall back to the backing file.

The semantics that you want are zeroing the entire disk image
(efficiently, when possible).

A flags argument is needed to support both of sets of semantics.  If
you don't like that, then I suggest creating a new function called
bdrv_make_zero().

Stefan

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements
  2013-10-08  8:59         ` Stefan Hajnoczi
@ 2013-10-08  9:12           ` Peter Lieven
  2013-10-08  9:26             ` Stefan Hajnoczi
  0 siblings, 1 reply; 47+ messages in thread
From: Peter Lieven @ 2013-10-08  9:12 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, Anthony Liguori,
	Paolo Bonzini, ronnie sahlberg

On 08.10.2013 10:59, Stefan Hajnoczi wrote:
> On Tue, Oct 8, 2013 at 10:01 AM, Peter Lieven <pl@kamp.de> wrote:
>> On 08.10.2013 09:02, Stefan Hajnoczi wrote:
>>> On Mon, Oct 7, 2013 at 11:42 AM, Paolo Bonzini <pbonzini@redhat.com>
>>> wrote:
>>>> Il 07/10/2013 10:42, Stefan Hajnoczi ha scritto:
>>>>> Could you make bdrv_co_write_zeroes() always use UNMAP, if possible, and
>>>>> avoid adding the new BDRV_REQ_MAY_UNMAP flag?  While reading the first
>>>>> few patches in this series I wondered why there is a need to expose
>>>>> flags at all...
>>>>>
>>>>> Sometimes it is useful to distinguish between zeroing at the image
>>>>> format level from discarding at the device level, but I don't think we
>>>>> make use of that yet.  I'd prefer to keep the interface simple for now
>>>>> and add flags later, if necessary.
>>>>>
>>>>> Or maybe I just missed something ;)
>>>> The flag is needed to implement the right semantics for the SCSI WRITE
>>>> SAME command, which are:
>>>>
>>>> - if the UNMAP bit is off, always write the sectors (that's
>>>> bdrv_aio_write_zeroes without BDRV_REQ_MAY_UNMAP if the payload is zero,
>>>> otherwise it's emulated with bdrv_aio_writev)
>>>>
>>>> - if the target can "discard and write the specified payload", you can
>>>> discard, else you must write the sectors with the correct payload
>>>> (that's bdrv_aio_write_zeroes with BDRV_REQ_MAY_UNMAP).
>>>>
>>>> Contrast this with the UNMAP command, which does not make any guarantee
>>>> on the content of the sectors after the command is completed (a few
>>>> months ago we agreed that, even if you have discard_zeroes=true in the
>>>> target, it is fine for UNMAP to do nothing).
>>> Okay, then let's keep the patches to expose the flag.
>> Okay, then I can keep those.
>>
>> Can you give a short hint if my approach with brdv_make_empty is what
>> you want? I would like to not change the parameters, so use
>> BDRV_REQ_MAY_UNMAP
>> unconditionally.
>>
>> int bdrv_make_empty(BlockDriverState *bs)
> The semantics of bdrv_make_empty() today are: deallocate all data in
> the top layer of the image file.  If there is a backing file, reads
> will fall back to the backing file.
>
> The semantics that you want are zeroing the entire disk image
> (efficiently, when possible).
>
> A flags argument is needed to support both of sets of semantics.  If
> you don't like that, then I suggest creating a new function called
> bdrv_make_zero().
Ok, that is what I would like to do. In this case I only have to rename
bdrv_zeroize to bdrv_make_zero. Ok ?

Peter

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements
  2013-10-08  9:12           ` Peter Lieven
@ 2013-10-08  9:26             ` Stefan Hajnoczi
  0 siblings, 0 replies; 47+ messages in thread
From: Stefan Hajnoczi @ 2013-10-08  9:26 UTC (permalink / raw)
  To: Peter Lieven
  Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, Anthony Liguori,
	Paolo Bonzini, ronnie sahlberg

On Tue, Oct 8, 2013 at 11:12 AM, Peter Lieven <pl@kamp.de> wrote:
> On 08.10.2013 10:59, Stefan Hajnoczi wrote:
>>
>> On Tue, Oct 8, 2013 at 10:01 AM, Peter Lieven <pl@kamp.de> wrote:
>>>
>>> On 08.10.2013 09:02, Stefan Hajnoczi wrote:
>>>>
>>>> On Mon, Oct 7, 2013 at 11:42 AM, Paolo Bonzini <pbonzini@redhat.com>
>>>> wrote:
>>>>>
>>>>> Il 07/10/2013 10:42, Stefan Hajnoczi ha scritto:
>>>>>>
>>>>>> Could you make bdrv_co_write_zeroes() always use UNMAP, if possible,
>>>>>> and
>>>>>> avoid adding the new BDRV_REQ_MAY_UNMAP flag?  While reading the first
>>>>>> few patches in this series I wondered why there is a need to expose
>>>>>> flags at all...
>>>>>>
>>>>>> Sometimes it is useful to distinguish between zeroing at the image
>>>>>> format level from discarding at the device level, but I don't think we
>>>>>> make use of that yet.  I'd prefer to keep the interface simple for now
>>>>>> and add flags later, if necessary.
>>>>>>
>>>>>> Or maybe I just missed something ;)
>>>>>
>>>>> The flag is needed to implement the right semantics for the SCSI WRITE
>>>>> SAME command, which are:
>>>>>
>>>>> - if the UNMAP bit is off, always write the sectors (that's
>>>>> bdrv_aio_write_zeroes without BDRV_REQ_MAY_UNMAP if the payload is
>>>>> zero,
>>>>> otherwise it's emulated with bdrv_aio_writev)
>>>>>
>>>>> - if the target can "discard and write the specified payload", you can
>>>>> discard, else you must write the sectors with the correct payload
>>>>> (that's bdrv_aio_write_zeroes with BDRV_REQ_MAY_UNMAP).
>>>>>
>>>>> Contrast this with the UNMAP command, which does not make any guarantee
>>>>> on the content of the sectors after the command is completed (a few
>>>>> months ago we agreed that, even if you have discard_zeroes=true in the
>>>>> target, it is fine for UNMAP to do nothing).
>>>>
>>>> Okay, then let's keep the patches to expose the flag.
>>>
>>> Okay, then I can keep those.
>>>
>>> Can you give a short hint if my approach with brdv_make_empty is what
>>> you want? I would like to not change the parameters, so use
>>> BDRV_REQ_MAY_UNMAP
>>> unconditionally.
>>>
>>> int bdrv_make_empty(BlockDriverState *bs)
>>
>> The semantics of bdrv_make_empty() today are: deallocate all data in
>> the top layer of the image file.  If there is a backing file, reads
>> will fall back to the backing file.
>>
>> The semantics that you want are zeroing the entire disk image
>> (efficiently, when possible).
>>
>> A flags argument is needed to support both of sets of semantics.  If
>> you don't like that, then I suggest creating a new function called
>> bdrv_make_zero().
>
> Ok, that is what I would like to do. In this case I only have to rename
> bdrv_zeroize to bdrv_make_zero. Ok ?

Yes, please.

Stefan

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2013-10-08  9:26 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-09-24 13:34 [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Peter Lieven
2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 01/20] block: make BdrvRequestFlags public Peter Lieven
2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 02/20] block: add flags to bdrv_*_write_zeroes Peter Lieven
2013-10-02 15:32   ` Eric Blake
2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 03/20] block: introduce BDRV_REQ_MAY_UNMAP request flag Peter Lieven
2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 04/20] block: introduce bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes Peter Lieven
2013-10-02 15:41   ` Eric Blake
2013-09-24 13:34 ` [Qemu-devel] [PATCHv3 05/20] block/raw: add " Peter Lieven
2013-10-02 15:43   ` Eric Blake
2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 06/20] block: add BlockLimits structure to BlockDriverState Peter Lieven
2013-10-02 15:53   ` Eric Blake
2013-10-07  8:10     ` Stefan Hajnoczi
2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 07/20] block: honour BlockLimits in bdrv_co_do_write_zeroes Peter Lieven
2013-10-02 16:37   ` Eric Blake
2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 08/20] block: honour BlockLimits in bdrv_co_discard Peter Lieven
2013-10-02 16:41   ` Eric Blake
2013-10-07  8:29   ` Stefan Hajnoczi
2013-10-07  8:36     ` Peter Lieven
2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 09/20] iscsi: simplify iscsi_co_discard Peter Lieven
2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 10/20] iscsi: set limits in BlockDriverState Peter Lieven
2013-10-02 16:43   ` Eric Blake
2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 11/20] iscsi: add bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes Peter Lieven
2013-10-02 16:45   ` Eric Blake
2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 12/20] iscsi: add bdrv_co_write_zeroes Peter Lieven
2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 13/20] block: introduce bdrv_zeroize Peter Lieven
2013-10-02 16:51   ` Eric Blake
2013-10-07  8:34   ` Stefan Hajnoczi
2013-10-07  8:39     ` Peter Lieven
2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 14/20] block/get_block_status: set *pnum = 0 on error Peter Lieven
2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 15/20] block/get_block_status: avoid segfault if there is no backing_hd Peter Lieven
2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 16/20] block/get_block_status: avoid redundant callouts on raw devices Peter Lieven
2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 17/20] block/get_block_status: fix BDRV_BLOCK_ZERO for unallocated blocks Peter Lieven
2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 18/20] qemu-img: add support for fully allocated images Peter Lieven
2013-10-02 17:01   ` Eric Blake
2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 19/20] qemu-img: conditionally zero out target on convert Peter Lieven
2013-09-24 13:35 ` [Qemu-devel] [PATCHv3 20/20] block/raw: copy BlockLimits on raw_open Peter Lieven
2013-10-02 17:11   ` Eric Blake
2013-10-07  8:38     ` Stefan Hajnoczi
2013-10-07  8:40       ` Peter Lieven
2013-10-07  8:42 ` [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements Stefan Hajnoczi
2013-10-07  8:47   ` Peter Lieven
2013-10-07  9:42   ` Paolo Bonzini
2013-10-08  7:02     ` Stefan Hajnoczi
2013-10-08  8:01       ` Peter Lieven
2013-10-08  8:59         ` Stefan Hajnoczi
2013-10-08  9:12           ` Peter Lieven
2013-10-08  9:26             ` Stefan Hajnoczi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.