All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 00/12] block: logical block provisioning enhancements
@ 2013-09-13 10:24 Peter Lieven
  2013-09-13 10:24 ` [Qemu-devel] [PATCH 01/12] block: make BdrvRequestFlags public Peter Lieven
                   ` (11 more replies)
  0 siblings, 12 replies; 39+ messages in thread
From: Peter Lieven @ 2013-09-13 10:24 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

this patch adds the ability for targets to stay sparse during
block migration (if the zero_blocks capability is set) and qemu-img convert
even if the target does not have has_zero_init = 1.

the series was especially developed for iSCSI, but it should also work
with other drivers with little or no adjustments. these adjustments
should be limited to providing block provisioning information through
get_block_info and/or honouring BDRV_REQ_MAY_UNMAP on writing zeroes.

the last 4 patches are fixes/enhancements for the get_block_status API
discovered during development of this series.

Peter Lieven (12):
  block: make BdrvRequestFlags public
  block: add flags to bdrv_*_write_zeroes
  block: introduce BDRV_REQ_MAY_UNMAP in bdrv_co_write_zeroes
  iscsi: add .bdrv_co_write_zeroes
  block: add logical block provisioning information to BlockDriverInfo
  iscsi: add .bdrv_get_info
  block: introduce bdrv_zeroize
  qemu-img: conditionally zero out target on convert
  block/get_block_status: set *pnum = 0 on error
  block/get_block_status: avoid segfault if there is no backing_hd
  block/get_block_status: avoid redundant callouts on raw devices
  block/get_block_status: fix BDRV_BLOCK_ZERO for unallocated blocks

 block-migration.c         |    3 +-
 block.c                   |   86 ++++++++++++++++++++++++++++++--------
 block/backup.c            |    3 +-
 block/iscsi.c             |  102 +++++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.c             |    2 +-
 block/qed.c               |    3 +-
 block/raw_bsd.c           |    5 ++-
 block/vmdk.c              |    3 +-
 include/block/block.h     |   20 ++++++++-
 include/block/block_int.h |    2 +-
 qemu-img.c                |   22 +++++++---
 qemu-io-cmds.c            |    2 +-
 12 files changed, 218 insertions(+), 35 deletions(-)

-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 01/12] block: make BdrvRequestFlags public
  2013-09-13 10:24 [Qemu-devel] [PATCH 00/12] block: logical block provisioning enhancements Peter Lieven
@ 2013-09-13 10:24 ` Peter Lieven
  2013-09-13 16:29   ` Eric Blake
  2013-09-13 10:24 ` [Qemu-devel] [PATCH 02/12] block: add flags to bdrv_*_write_zeroes Peter Lieven
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 39+ messages in thread
From: Peter Lieven @ 2013-09-13 10:24 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block.c               |    5 -----
 include/block/block.h |    5 +++++
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index a325efc..878f365 100644
--- a/block.c
+++ b/block.c
@@ -51,11 +51,6 @@
 
 #define NOT_DONE 0x7fffffff /* used while emulated sync operation in progress */
 
-typedef enum {
-    BDRV_REQ_COPY_ON_READ = 0x1,
-    BDRV_REQ_ZERO_WRITE   = 0x2,
-} BdrvRequestFlags;
-
 static void bdrv_dev_change_media_cb(BlockDriverState *bs, bool load);
 static BlockDriverAIOCB *bdrv_aio_readv_em(BlockDriverState *bs,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
diff --git a/include/block/block.h b/include/block/block.h
index 728ec1a..3249261 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -62,6 +62,11 @@ typedef struct BlockDevOps {
     void (*resize_cb)(void *opaque);
 } BlockDevOps;
 
+typedef enum {
+    BDRV_REQ_COPY_ON_READ = 0x1,
+    BDRV_REQ_ZERO_WRITE   = 0x2,
+} BdrvRequestFlags;
+
 #define BDRV_O_RDWR        0x0002
 #define BDRV_O_SNAPSHOT    0x0008 /* open the file read only and save writes in a snapshot */
 #define BDRV_O_NOCACHE     0x0020 /* do not use the host page cache */
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 02/12] block: add flags to bdrv_*_write_zeroes
  2013-09-13 10:24 [Qemu-devel] [PATCH 00/12] block: logical block provisioning enhancements Peter Lieven
  2013-09-13 10:24 ` [Qemu-devel] [PATCH 01/12] block: make BdrvRequestFlags public Peter Lieven
@ 2013-09-13 10:24 ` Peter Lieven
  2013-09-13 16:58   ` Eric Blake
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 03/12] block: introduce BDRV_REQ_MAY_UNMAP in bdrv_co_write_zeroes Peter Lieven
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 39+ messages in thread
From: Peter Lieven @ 2013-09-13 10:24 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block-migration.c         |    2 +-
 block.c                   |   20 +++++++++++---------
 block/backup.c            |    3 ++-
 block/qcow2.c             |    2 +-
 block/qed.c               |    3 ++-
 block/raw_bsd.c           |    5 +++--
 block/vmdk.c              |    3 ++-
 include/block/block.h     |    4 ++--
 include/block/block_int.h |    2 +-
 qemu-io-cmds.c            |    2 +-
 10 files changed, 26 insertions(+), 20 deletions(-)

diff --git a/block-migration.c b/block-migration.c
index daf9ec1..713a8e3 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -780,7 +780,7 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
             }
 
             if (flags & BLK_MIG_FLAG_ZERO_BLOCK) {
-                ret = bdrv_write_zeroes(bs, addr, nr_sectors);
+                ret = bdrv_write_zeroes(bs, addr, nr_sectors, 0);
             } else {
                 buf = g_malloc(BLOCK_SIZE);
                 qemu_get_buffer(f, buf, BLOCK_SIZE);
diff --git a/block.c b/block.c
index 878f365..e5918a7 100644
--- a/block.c
+++ b/block.c
@@ -79,7 +79,7 @@ static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
                                                bool is_write);
 static void coroutine_fn bdrv_co_do_rw(void *opaque);
 static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
-    int64_t sector_num, int nb_sectors);
+    int64_t sector_num, int nb_sectors, BdrvRequestFlags flags);
 
 static QTAILQ_HEAD(, BlockDriverState) bdrv_states =
     QTAILQ_HEAD_INITIALIZER(bdrv_states);
@@ -2335,10 +2335,11 @@ int bdrv_writev(BlockDriverState *bs, int64_t sector_num, QEMUIOVector *qiov)
     return bdrv_rwv_co(bs, sector_num, qiov, true, 0);
 }
 
-int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num, int nb_sectors)
+int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num,
+                      int nb_sectors, BdrvRequestFlags flags)
 {
     return bdrv_rw_co(bs, sector_num, NULL, nb_sectors, true,
-                      BDRV_REQ_ZERO_WRITE);
+                      BDRV_REQ_ZERO_WRITE | flags);
 }
 
 int bdrv_pread(BlockDriverState *bs, int64_t offset,
@@ -2520,7 +2521,7 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BlockDriverState *bs,
     if (drv->bdrv_co_write_zeroes &&
         buffer_is_zero(bounce_buffer, iov.iov_len)) {
         ret = bdrv_co_do_write_zeroes(bs, cluster_sector_num,
-                                      cluster_nb_sectors);
+                                      cluster_nb_sectors, 0);
     } else {
         /* This does not change the data on the disk, it is not necessary
          * to flush even in cache=writethrough mode.
@@ -2654,7 +2655,7 @@ int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs,
 }
 
 static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
-    int64_t sector_num, int nb_sectors)
+    int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
 {
     BlockDriver *drv = bs->drv;
     QEMUIOVector qiov;
@@ -2666,7 +2667,7 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
 
     /* First try the efficient write zeroes operation */
     if (drv->bdrv_co_write_zeroes) {
-        ret = drv->bdrv_co_write_zeroes(bs, sector_num, nb_sectors);
+        ret = drv->bdrv_co_write_zeroes(bs, sector_num, nb_sectors, flags);
         if (ret != -ENOTSUP) {
             return ret;
         }
@@ -2721,7 +2722,7 @@ static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs,
     if (ret < 0) {
         /* Do nothing, write notifier decided to fail this request */
     } else if (flags & BDRV_REQ_ZERO_WRITE) {
-        ret = bdrv_co_do_write_zeroes(bs, sector_num, nb_sectors);
+        ret = bdrv_co_do_write_zeroes(bs, sector_num, nb_sectors, flags);
     } else {
         ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov);
     }
@@ -2755,12 +2756,13 @@ int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num,
 }
 
 int coroutine_fn bdrv_co_write_zeroes(BlockDriverState *bs,
-                                      int64_t sector_num, int nb_sectors)
+                                      int64_t sector_num, int nb_sectors,
+                                      BdrvRequestFlags flags)
 {
     trace_bdrv_co_write_zeroes(bs, sector_num, nb_sectors);
 
     return bdrv_co_do_writev(bs, sector_num, nb_sectors, NULL,
-                             BDRV_REQ_ZERO_WRITE);
+                             BDRV_REQ_ZERO_WRITE | flags);
 }
 
 /**
diff --git a/block/backup.c b/block/backup.c
index 04c4b5c..5f6a642 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -138,7 +138,8 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs,
 
         if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
             ret = bdrv_co_write_zeroes(job->target,
-                                       start * BACKUP_SECTORS_PER_CLUSTER, n);
+                                       start * BACKUP_SECTORS_PER_CLUSTER,
+                                       n, 0);
         } else {
             ret = bdrv_co_writev(job->target,
                                  start * BACKUP_SECTORS_PER_CLUSTER, n,
diff --git a/block/qcow2.c b/block/qcow2.c
index 578792f..2cb59a2 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1555,7 +1555,7 @@ static int qcow2_make_empty(BlockDriverState *bs)
 }
 
 static coroutine_fn int qcow2_co_write_zeroes(BlockDriverState *bs,
-    int64_t sector_num, int nb_sectors)
+    int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
 {
     int ret;
     BDRVQcowState *s = bs->opaque;
diff --git a/block/qed.c b/block/qed.c
index 49b3a37..d6e42f3 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -1389,7 +1389,8 @@ static void coroutine_fn qed_co_write_zeroes_cb(void *opaque, int ret)
 
 static int coroutine_fn bdrv_qed_co_write_zeroes(BlockDriverState *bs,
                                                  int64_t sector_num,
-                                                 int nb_sectors)
+                                                 int nb_sectors,
+                                                 BdrvRequestFlags flags)
 {
     BlockDriverAIOCB *blockacb;
     BDRVQEDState *s = bs->opaque;
diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index a9060ca..bd4811b 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -66,9 +66,10 @@ static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
 }
 
 static int coroutine_fn raw_co_write_zeroes(BlockDriverState *bs,
-                                            int64_t sector_num, int nb_sectors)
+                                            int64_t sector_num, int nb_sectors,
+                                            BdrvRequestFlags flags)
 {
-    return bdrv_co_write_zeroes(bs->file, sector_num, nb_sectors);
+    return bdrv_co_write_zeroes(bs->file, sector_num, nb_sectors, flags);
 }
 
 static int coroutine_fn raw_co_discard(BlockDriverState *bs,
diff --git a/block/vmdk.c b/block/vmdk.c
index fb5b529..914a9b7 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1386,7 +1386,8 @@ static coroutine_fn int vmdk_co_write(BlockDriverState *bs, int64_t sector_num,
 
 static int coroutine_fn vmdk_co_write_zeroes(BlockDriverState *bs,
                                              int64_t sector_num,
-                                             int nb_sectors)
+                                             int nb_sectors,
+                                             BdrvRequestFlags flags)
 {
     int ret;
     BDRVVmdkState *s = bs->opaque;
diff --git a/include/block/block.h b/include/block/block.h
index 3249261..ddb6870 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -187,7 +187,7 @@ int bdrv_read_unthrottled(BlockDriverState *bs, int64_t sector_num,
 int bdrv_write(BlockDriverState *bs, int64_t sector_num,
                const uint8_t *buf, int nb_sectors);
 int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num,
-               int nb_sectors);
+               int nb_sectors, BdrvRequestFlags flags);
 int bdrv_writev(BlockDriverState *bs, int64_t sector_num, QEMUIOVector *qiov);
 int bdrv_pread(BlockDriverState *bs, int64_t offset,
                void *buf, int count);
@@ -209,7 +209,7 @@ int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num,
  * because it may allocate memory for the entire region.
  */
 int coroutine_fn bdrv_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
-    int nb_sectors);
+    int nb_sectors, BdrvRequestFlags flags);
 BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
     const char *backing_file);
 int bdrv_get_backing_file_depth(BlockDriverState *bs);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 7c35198..26bad36 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -115,7 +115,7 @@ struct BlockDriver {
      * instead.
      */
     int coroutine_fn (*bdrv_co_write_zeroes)(BlockDriverState *bs,
-        int64_t sector_num, int nb_sectors);
+        int64_t sector_num, int nb_sectors, BdrvRequestFlags flags);
     int coroutine_fn (*bdrv_co_discard)(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors);
     int64_t coroutine_fn (*bdrv_co_get_block_status)(BlockDriverState *bs,
diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 8565d49..66c0b9e 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -441,7 +441,7 @@ static void coroutine_fn co_write_zeroes_entry(void *opaque)
     CoWriteZeroes *data = opaque;
 
     data->ret = bdrv_co_write_zeroes(data->bs, data->offset / BDRV_SECTOR_SIZE,
-                                     data->count / BDRV_SECTOR_SIZE);
+                                     data->count / BDRV_SECTOR_SIZE, 0);
     data->done = true;
     if (data->ret < 0) {
         *data->total = data->ret;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 03/12] block: introduce BDRV_REQ_MAY_UNMAP in bdrv_co_write_zeroes
  2013-09-13 10:24 [Qemu-devel] [PATCH 00/12] block: logical block provisioning enhancements Peter Lieven
  2013-09-13 10:24 ` [Qemu-devel] [PATCH 01/12] block: make BdrvRequestFlags public Peter Lieven
  2013-09-13 10:24 ` [Qemu-devel] [PATCH 02/12] block: add flags to bdrv_*_write_zeroes Peter Lieven
@ 2013-09-13 10:25 ` Peter Lieven
  2013-09-13 18:07   ` Eric Blake
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 04/12] iscsi: add .bdrv_co_write_zeroes Peter Lieven
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 39+ messages in thread
From: Peter Lieven @ 2013-09-13 10:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

the BDRV_REQ_MAY_UNMAP flag is used to indicate that block driver
is allowed to optimze a write zeroes request by unmapping (discarding)
blocks if it is guaranteed that the result will read back as
zeroes.

the flag is only passed to the driver if the block device is
opened with BDRV_O_UNMAP.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block-migration.c     |    3 ++-
 block.c               |    4 ++++
 block/backup.c        |    2 +-
 include/block/block.h |    1 +
 4 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/block-migration.c b/block-migration.c
index 713a8e3..fc4ef93 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -780,7 +780,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
             }
 
             if (flags & BLK_MIG_FLAG_ZERO_BLOCK) {
-                ret = bdrv_write_zeroes(bs, addr, nr_sectors, 0);
+                ret = bdrv_write_zeroes(bs, addr, nr_sectors,
+                                        BDRV_REQ_MAY_UNMAP);
             } else {
                 buf = g_malloc(BLOCK_SIZE);
                 qemu_get_buffer(f, buf, BLOCK_SIZE);
diff --git a/block.c b/block.c
index e5918a7..6f498fc 100644
--- a/block.c
+++ b/block.c
@@ -2761,6 +2761,10 @@ int coroutine_fn bdrv_co_write_zeroes(BlockDriverState *bs,
 {
     trace_bdrv_co_write_zeroes(bs, sector_num, nb_sectors);
 
+    if (!(bs->open_flags & BDRV_O_UNMAP)) {
+        flags &= ~BDRV_REQ_MAY_UNMAP;
+    }
+
     return bdrv_co_do_writev(bs, sector_num, nb_sectors, NULL,
                              BDRV_REQ_ZERO_WRITE | flags);
 }
diff --git a/block/backup.c b/block/backup.c
index 5f6a642..1ab080d 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -139,7 +139,7 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs,
         if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
             ret = bdrv_co_write_zeroes(job->target,
                                        start * BACKUP_SECTORS_PER_CLUSTER,
-                                       n, 0);
+                                       n, BDRV_REQ_MAY_UNMAP);
         } else {
             ret = bdrv_co_writev(job->target,
                                  start * BACKUP_SECTORS_PER_CLUSTER, n,
diff --git a/include/block/block.h b/include/block/block.h
index ddb6870..599de7d 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -65,6 +65,7 @@ typedef struct BlockDevOps {
 typedef enum {
     BDRV_REQ_COPY_ON_READ = 0x1,
     BDRV_REQ_ZERO_WRITE   = 0x2,
+    BDRV_REQ_MAY_UNMAP    = 0x4,
 } BdrvRequestFlags;
 
 #define BDRV_O_RDWR        0x0002
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 04/12] iscsi: add .bdrv_co_write_zeroes
  2013-09-13 10:24 [Qemu-devel] [PATCH 00/12] block: logical block provisioning enhancements Peter Lieven
                   ` (2 preceding siblings ...)
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 03/12] block: introduce BDRV_REQ_MAY_UNMAP in bdrv_co_write_zeroes Peter Lieven
@ 2013-09-13 10:25 ` Peter Lieven
  2013-09-13 18:10   ` Eric Blake
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 05/12] block: add logical block provisioning information to BlockDriverInfo Peter Lieven
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 39+ messages in thread
From: Peter Lieven @ 2013-09-13 10:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block/iscsi.c |   78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

diff --git a/block/iscsi.c b/block/iscsi.c
index 68f99d3..3d7961d 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -56,6 +56,7 @@ typedef struct IscsiLun {
     uint8_t lbprz;
     struct scsi_inquiry_logical_block_provisioning lbp;
     struct scsi_inquiry_block_limits bl;
+    unsigned char *zeroblock;
 } IscsiLun;
 
 typedef struct IscsiTask {
@@ -88,6 +89,7 @@ typedef struct IscsiAIOCB {
 #define MAX_NOP_FAILURES 3
 #define ISCSI_CMD_RETRIES 5
 #define ISCSI_MAX_UNMAP 131072
+#define ISCSI_MAX_WRITESAME 131072
 
 static void
 iscsi_bh_cb(void *p)
@@ -972,6 +974,80 @@ retry:
     return 0;
 }
 
+static int
+coroutine_fn iscsi_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
+                                   int nb_sectors, BdrvRequestFlags flags)
+{
+    IscsiLun *iscsilun = bs->opaque;
+    struct IscsiTask iTask;
+    uint64_t lba;
+    uint32_t nb_blocks, max_ws_len;
+
+    if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
+        return -EINVAL;
+    }
+
+    if (!iscsilun->lbp.lbpws) {
+        /* WRITE SAME is not supported by the target */
+        return -ENOTSUP;
+    }
+
+    lba = sector_qemu2lun(sector_num, iscsilun);
+    nb_blocks = sector_qemu2lun(nb_sectors, iscsilun);
+
+    max_ws_len = iscsilun->bl.max_ws_len;
+    if (max_ws_len == 0xffffffff) {
+        max_ws_len = ISCSI_MAX_WRITESAME;
+    }
+
+    if (iscsilun->zeroblock == NULL) {
+        iscsilun->zeroblock = g_malloc(iscsilun->block_size);
+        if (iscsilun->zeroblock == NULL) {
+            return -ENOMEM;
+        }
+        memset(iscsilun->zeroblock, 0x00, iscsilun->block_size);
+    }
+
+    while (nb_blocks > 0) {
+        uint32_t num;
+        iscsi_co_init_iscsitask(iscsilun, &iTask);
+        num = nb_blocks;
+        if (num > max_ws_len) {
+            num = max_ws_len;
+        }
+retry:
+        if (iscsi_writesame16_task(iscsilun->iscsi, iscsilun->lun, lba,
+                                   iscsilun->zeroblock, iscsilun->block_size,
+                                   num, 0, !!(flags & BDRV_REQ_MAY_UNMAP),
+                                   0, 0, iscsi_co_generic_cb, &iTask) == NULL) {
+            return -EIO;
+        }
+
+        while (!iTask.complete) {
+            iscsi_set_events(iscsilun);
+            qemu_coroutine_yield();
+        }
+
+        if (iTask.task != NULL) {
+            scsi_free_scsi_task(iTask.task);
+            iTask.task = NULL;
+        }
+
+        if (iTask.do_retry) {
+            goto retry;
+        }
+
+        if (iTask.status != SCSI_STATUS_GOOD) {
+            return -EIO;
+        }
+
+        lba += num;
+        nb_blocks -= num;
+    }
+
+    return 0;
+}
+
 static int parse_chap(struct iscsi_context *iscsi, const char *target)
 {
     QemuOptsList *list;
@@ -1419,6 +1495,7 @@ static void iscsi_close(BlockDriverState *bs)
     }
     qemu_aio_set_fd_handler(iscsi_get_fd(iscsi), NULL, NULL, NULL);
     iscsi_destroy_context(iscsi);
+    g_free(iscsilun->zeroblock);
     memset(iscsilun, 0, sizeof(IscsiLun));
 }
 
@@ -1524,6 +1601,7 @@ static BlockDriver bdrv_iscsi = {
 
     .bdrv_co_get_block_status = iscsi_co_get_block_status,
     .bdrv_co_discard      = iscsi_co_discard,
+    .bdrv_co_write_zeroes = iscsi_co_write_zeroes,
 
     .bdrv_aio_readv  = iscsi_aio_readv,
     .bdrv_aio_writev = iscsi_aio_writev,
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 05/12] block: add logical block provisioning information to BlockDriverInfo
  2013-09-13 10:24 [Qemu-devel] [PATCH 00/12] block: logical block provisioning enhancements Peter Lieven
                   ` (3 preceding siblings ...)
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 04/12] iscsi: add .bdrv_co_write_zeroes Peter Lieven
@ 2013-09-13 10:25 ` Peter Lieven
  2013-09-13 10:34   ` Paolo Bonzini
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 06/12] iscsi: add .bdrv_get_info Peter Lieven
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 39+ messages in thread
From: Peter Lieven @ 2013-09-13 10:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 include/block/block.h |    9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index 599de7d..ee17048 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -18,6 +18,15 @@ typedef struct BlockDriverInfo {
     /* offset at which the VM state can be saved (0 if not possible) */
     int64_t vm_state_offset;
     bool is_dirty;
+    /* do discarded blocks read back as zeroes? */
+    bool discard_zeroes;
+    /* is write zeroes optimized by a discard/unmap operation?
+     * this requires support for the BDRV_REQ_MAY_UNMAP flag. */
+    bool discard_write_zeroes;
+    /* maximum number of sectors that can be discarded at once */
+    int max_discard;
+    /* maximum number of sectors that can zeroized at once */
+    int max_write_zeroes;
 } BlockDriverInfo;
 
 typedef struct BlockFragInfo {
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 06/12] iscsi: add .bdrv_get_info
  2013-09-13 10:24 [Qemu-devel] [PATCH 00/12] block: logical block provisioning enhancements Peter Lieven
                   ` (4 preceding siblings ...)
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 05/12] block: add logical block provisioning information to BlockDriverInfo Peter Lieven
@ 2013-09-13 10:25 ` Peter Lieven
  2013-09-13 18:19   ` Eric Blake
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 07/12] block: introduce bdrv_zeroize Peter Lieven
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 39+ messages in thread
From: Peter Lieven @ 2013-09-13 10:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block/iscsi.c |   24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/block/iscsi.c b/block/iscsi.c
index 3d7961d..438d45e 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1577,6 +1577,29 @@ out:
     return ret;
 }
 
+static int iscsi_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+{
+    IscsiLun *iscsilun = bs->opaque;
+    int max_discard = iscsilun->bl.max_unmap;
+    int max_write_zeroes = iscsilun->bl.max_ws_len;
+
+    if (max_discard == 0xffffffff) {
+        max_discard = ISCSI_MAX_UNMAP;
+    }
+    bdi->max_discard = sector_lun2qemu(max_discard, iscsilun);
+
+    if (max_write_zeroes == 0xffffffff) {
+        max_write_zeroes = ISCSI_MAX_WRITESAME;
+    }
+    bdi->max_write_zeroes = sector_lun2qemu(max_write_zeroes, iscsilun);
+
+    bdi->discard_zeroes =  iscsilun->lbprz;
+    bdi->discard_write_zeroes =  (bs->open_flags & BDRV_O_UNMAP) &&
+                                 iscsilun->lbprz && iscsilun->lbp.lbpws;
+
+    return 0;
+}
+
 static QEMUOptionParameter iscsi_create_options[] = {
     {
         .name = BLOCK_OPT_SIZE,
@@ -1597,6 +1620,7 @@ static BlockDriver bdrv_iscsi = {
     .create_options  = iscsi_create_options,
 
     .bdrv_getlength  = iscsi_getlength,
+    .bdrv_get_info   = iscsi_get_info,
     .bdrv_truncate   = iscsi_truncate,
 
     .bdrv_co_get_block_status = iscsi_co_get_block_status,
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 07/12] block: introduce bdrv_zeroize
  2013-09-13 10:24 [Qemu-devel] [PATCH 00/12] block: logical block provisioning enhancements Peter Lieven
                   ` (5 preceding siblings ...)
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 06/12] iscsi: add .bdrv_get_info Peter Lieven
@ 2013-09-13 10:25 ` Peter Lieven
  2013-09-13 10:47   ` Peter Lieven
  2013-09-13 18:23   ` Eric Blake
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 08/12] qemu-img: conditionally zero out target on convert Peter Lieven
                   ` (4 subsequent siblings)
  11 siblings, 2 replies; 39+ messages in thread
From: Peter Lieven @ 2013-09-13 10:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

this patch adds a call to completely zero out a block device.
the operation is speed up by checking the block status and
only writing zeroes to the device if they currently do not
return zeroes. optionally the zero writing can be speed up
by setting the flag BDRV_REQ_MAY_UNMAP to emulate the zero
write by unmapping if the driver supports it.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block.c               |   45 +++++++++++++++++++++++++++++++++++++++++++++
 include/block/block.h |    1 +
 2 files changed, 46 insertions(+)

diff --git a/block.c b/block.c
index 6f498fc..76a6621 100644
--- a/block.c
+++ b/block.c
@@ -2342,6 +2342,51 @@ int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num,
                       BDRV_REQ_ZERO_WRITE | flags);
 }
 
+int bdrv_zeroize(BlockDriverState *bs, BdrvRequestFlags flags)
+{
+    BlockDriverInfo bdi;
+    int64_t target_size = bdrv_getlength(bs) / BDRV_SECTOR_SIZE;
+    int64_t ret, nb_sectors, sector_num = 0;
+    int n;
+    /* split the write requests into 1MB chunks if the driver
+     * does not return a maximal size via bdi */
+    int max_write_zeroes = 1 << (20 - BDRV_SECTOR_BITS);
+    if (bdrv_get_info(bs, &bdi) == 0 &&
+        bdi.max_write_zeroes > 0) {
+        max_write_zeroes = bdi.max_write_zeroes;
+    }
+    for (;;) {
+        nb_sectors = target_size - sector_num;
+        if (nb_sectors <= 0) {
+            return 0;
+        }
+        if (nb_sectors > INT_MAX) {
+            nb_sectors = INT_MAX;
+        }
+        ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &n);
+        if (ret & BDRV_BLOCK_ZERO) {
+            sector_num += n;
+            continue;
+        }
+        while (n > 0) {
+            nb_sectors = n;
+            if (nb_sectors > max_write_zeroes) {
+                nb_sectors = max_write_zeroes;
+            }
+            ret = bdrv_write_zeroes(bs, sector_num, nb_sectors,
+                                    flags & BDRV_REQ_MAY_UNMAP);
+            if (ret < 0) {
+                error_report("error writing zeroes at sector %" PRId64 ": %s",
+                             sector_num, strerror(-ret));
+                return -EIO;
+            }
+            sector_num += nb_sectors;
+            n -= nb_sectors;
+        }
+    }
+}
+
+
 int bdrv_pread(BlockDriverState *bs, int64_t offset,
                void *buf, int count1)
 {
diff --git a/include/block/block.h b/include/block/block.h
index ee17048..71e9fa1 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -198,6 +198,7 @@ int bdrv_write(BlockDriverState *bs, int64_t sector_num,
                const uint8_t *buf, int nb_sectors);
 int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num,
                int nb_sectors, BdrvRequestFlags flags);
+int bdrv_zeroize(BlockDriverState *bs, BdrvRequestFlags flags);
 int bdrv_writev(BlockDriverState *bs, int64_t sector_num, QEMUIOVector *qiov);
 int bdrv_pread(BlockDriverState *bs, int64_t offset,
                void *buf, int count);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 08/12] qemu-img: conditionally zero out target on convert
  2013-09-13 10:24 [Qemu-devel] [PATCH 00/12] block: logical block provisioning enhancements Peter Lieven
                   ` (6 preceding siblings ...)
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 07/12] block: introduce bdrv_zeroize Peter Lieven
@ 2013-09-13 10:25 ` Peter Lieven
  2013-09-13 10:36   ` Paolo Bonzini
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 09/12] block/get_block_status: set *pnum = 0 on error Peter Lieven
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 39+ messages in thread
From: Peter Lieven @ 2013-09-13 10:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

if the target has_zero_init = 0, but supports efficiently
writing zeroes by unmapping we call bdrv_zeroize to
avoid fully allocating the target. this currently
is designed especially for iscsi.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 qemu-img.c |   22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 3e5e388..6eaddc6 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1354,7 +1354,8 @@ static int img_convert(int argc, char **argv)
         }
     }
 
-    flags = BDRV_O_RDWR;
+    flags = BDRV_O_RDWR | BDRV_O_UNMAP;
+
     ret = bdrv_parse_cache_flags(cache, &flags);
     if (ret < 0) {
         error_report("Invalid cache option: %s", cache);
@@ -1386,12 +1387,13 @@ static int img_convert(int argc, char **argv)
         }
     }
 
+    ret = bdrv_get_info(out_bs, &bdi);
+    if (ret < 0) {
+        error_report("could not get block driver info");
+        goto out;
+    }
+
     if (compress) {
-        ret = bdrv_get_info(out_bs, &bdi);
-        if (ret < 0) {
-            error_report("could not get block driver info");
-            goto out;
-        }
         cluster_size = bdi.cluster_size;
         if (cluster_size <= 0 || cluster_size > IO_BUF_SIZE) {
             error_report("invalid cluster size");
@@ -1470,6 +1472,14 @@ static int img_convert(int argc, char **argv)
     } else {
         int has_zero_init = bdrv_has_zero_init(out_bs);
 
+        if (!has_zero_init && !out_baseimg && bdi.discard_write_zeroes) {
+            ret = bdrv_zeroize(out_bs, BDRV_REQ_MAY_UNMAP);
+            if (ret < 0) {
+                goto out;
+            }
+            has_zero_init = 1;
+        }
+
         sector_num = 0; // total number of sectors converted so far
         nb_sectors = total_sectors - sector_num;
         if (nb_sectors != 0) {
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 09/12] block/get_block_status: set *pnum = 0 on error
  2013-09-13 10:24 [Qemu-devel] [PATCH 00/12] block: logical block provisioning enhancements Peter Lieven
                   ` (7 preceding siblings ...)
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 08/12] qemu-img: conditionally zero out target on convert Peter Lieven
@ 2013-09-13 10:25 ` Peter Lieven
  2013-09-13 18:26   ` Eric Blake
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 10/12] block/get_block_status: avoid segfault if there is no backing_hd Peter Lieven
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 39+ messages in thread
From: Peter Lieven @ 2013-09-13 10:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

if the call is invoked through bdrv_is_allocated the caller might
expect *pnum = 0 on error. however, a new implementation of
bdrv_get_block_status might only return a negative exit value on
error while keeping *pnum untouched.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/block.c b/block.c
index 76a6621..4922fb9 100644
--- a/block.c
+++ b/block.c
@@ -3149,6 +3149,7 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
 
     ret = bs->drv->bdrv_co_get_block_status(bs, sector_num, nb_sectors, pnum);
     if (ret < 0) {
+        *pnum = 0;
         return ret;
     }
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 10/12] block/get_block_status: avoid segfault if there is no backing_hd
  2013-09-13 10:24 [Qemu-devel] [PATCH 00/12] block: logical block provisioning enhancements Peter Lieven
                   ` (8 preceding siblings ...)
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 09/12] block/get_block_status: set *pnum = 0 on error Peter Lieven
@ 2013-09-13 10:25 ` Peter Lieven
  2013-09-13 18:26   ` Eric Blake
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 11/12] block/get_block_status: avoid redundant callouts on raw devices Peter Lieven
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 12/12] block/get_block_status: fix BDRV_BLOCK_ZERO for unallocated blocks Peter Lieven
  11 siblings, 1 reply; 39+ messages in thread
From: Peter Lieven @ 2013-09-13 10:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 4922fb9..cf9db62 100644
--- a/block.c
+++ b/block.c
@@ -3156,7 +3156,7 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
     if (!(ret & BDRV_BLOCK_DATA)) {
         if (bdrv_has_zero_init(bs)) {
             ret |= BDRV_BLOCK_ZERO;
-        } else {
+        } else if (bs->backing_hd) {
             BlockDriverState *bs2 = bs->backing_hd;
             int64_t length2 = bdrv_getlength(bs2);
             if (length2 >= 0 && sector_num >= (length2 >> BDRV_SECTOR_BITS)) {
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 11/12] block/get_block_status: avoid redundant callouts on raw devices
  2013-09-13 10:24 [Qemu-devel] [PATCH 00/12] block: logical block provisioning enhancements Peter Lieven
                   ` (9 preceding siblings ...)
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 10/12] block/get_block_status: avoid segfault if there is no backing_hd Peter Lieven
@ 2013-09-13 10:25 ` Peter Lieven
  2013-09-13 18:28   ` Eric Blake
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 12/12] block/get_block_status: fix BDRV_BLOCK_ZERO for unallocated blocks Peter Lieven
  11 siblings, 1 reply; 39+ messages in thread
From: Peter Lieven @ 2013-09-13 10:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

if a raw device like an iscsi target or host device is used
the current implementation makes a second call out to get
the block status of bs->file. however, the raw driver already
has called bdrv_get_block_status on bs->file.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index cf9db62..282943b 100644
--- a/block.c
+++ b/block.c
@@ -3167,7 +3167,8 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
 
     if (bs->file &&
         (ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) &&
-        (ret & BDRV_BLOCK_OFFSET_VALID)) {
+        (ret & BDRV_BLOCK_OFFSET_VALID) &&
+        strcmp(bs->drv->format_name, "raw")) {
         ret2 = bdrv_co_get_block_status(bs->file, ret >> BDRV_SECTOR_BITS,
                                         *pnum, pnum);
         if (ret2 >= 0) {
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 12/12] block/get_block_status: fix BDRV_BLOCK_ZERO for unallocated blocks
  2013-09-13 10:24 [Qemu-devel] [PATCH 00/12] block: logical block provisioning enhancements Peter Lieven
                   ` (10 preceding siblings ...)
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 11/12] block/get_block_status: avoid redundant callouts on raw devices Peter Lieven
@ 2013-09-13 10:25 ` Peter Lieven
  2013-09-13 13:53   ` Eric Blake
  2013-09-16  5:47   ` Peter Lieven
  11 siblings, 2 replies; 39+ messages in thread
From: Peter Lieven @ 2013-09-13 10:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, ronniesahlberg, Peter Lieven, stefanha, anthony, pbonzini

this patch does 2 things:
a) only do additional call outs if BDRV_BLOCK_ZERO is not already set.
b) use bdi.discard_zeroes to return the zero state of an unallocated block.
   the callout to bdrv_has_zero_init() is only valid right after bdrv_create.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 282943b..6698c10 100644
--- a/block.c
+++ b/block.c
@@ -3153,8 +3153,10 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
         return ret;
     }
 
-    if (!(ret & BDRV_BLOCK_DATA)) {
-        if (bdrv_has_zero_init(bs)) {
+    if (!(ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO)) {
+        BlockDriverInfo bdi;
+        if (bdrv_get_info(bs, &bdi) == 0 &&
+            bdi.discard_zeroes) {
             ret |= BDRV_BLOCK_ZERO;
         } else if (bs->backing_hd) {
             BlockDriverState *bs2 = bs->backing_hd;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 05/12] block: add logical block provisioning information to BlockDriverInfo
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 05/12] block: add logical block provisioning information to BlockDriverInfo Peter Lieven
@ 2013-09-13 10:34   ` Paolo Bonzini
  2013-09-13 10:44     ` Peter Lieven
  0 siblings, 1 reply; 39+ messages in thread
From: Paolo Bonzini @ 2013-09-13 10:34 UTC (permalink / raw)
  To: Peter Lieven; +Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony

Il 13/09/2013 12:25, Peter Lieven ha scritto:
> +    /* maximum number of sectors that can be discarded at once */
> +    int max_discard;
> +    /* maximum number of sectors that can zeroized at once */
> +    int max_write_zeroes;

These should not be needed outside the driver.

If you want to make them private between block.c and block/iscsi.c, you
can add them to BlockDriverState.

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 08/12] qemu-img: conditionally zero out target on convert
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 08/12] qemu-img: conditionally zero out target on convert Peter Lieven
@ 2013-09-13 10:36   ` Paolo Bonzini
  2013-09-13 10:46     ` Peter Lieven
  2013-09-13 18:25     ` Eric Blake
  0 siblings, 2 replies; 39+ messages in thread
From: Paolo Bonzini @ 2013-09-13 10:36 UTC (permalink / raw)
  To: Peter Lieven; +Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony

Il 13/09/2013 12:25, Peter Lieven ha scritto:
> if the target has_zero_init = 0, but supports efficiently
> writing zeroes by unmapping we call bdrv_zeroize to
> avoid fully allocating the target. this currently
> is designed especially for iscsi.
> 
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  qemu-img.c |   22 ++++++++++++++++------
>  1 file changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/qemu-img.c b/qemu-img.c
> index 3e5e388..6eaddc6 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -1354,7 +1354,8 @@ static int img_convert(int argc, char **argv)
>          }
>      }
>  
> -    flags = BDRV_O_RDWR;
> +    flags = BDRV_O_RDWR | BDRV_O_UNMAP;

I think this should be a new command-line flag.

Paolo

> +
>      ret = bdrv_parse_cache_flags(cache, &flags);
>      if (ret < 0) {
>          error_report("Invalid cache option: %s", cache);
> @@ -1386,12 +1387,13 @@ static int img_convert(int argc, char **argv)
>          }
>      }
>  
> +    ret = bdrv_get_info(out_bs, &bdi);
> +    if (ret < 0) {
> +        error_report("could not get block driver info");
> +        goto out;
> +    }
> +
>      if (compress) {
> -        ret = bdrv_get_info(out_bs, &bdi);
> -        if (ret < 0) {
> -            error_report("could not get block driver info");
> -            goto out;
> -        }
>          cluster_size = bdi.cluster_size;
>          if (cluster_size <= 0 || cluster_size > IO_BUF_SIZE) {
>              error_report("invalid cluster size");
> @@ -1470,6 +1472,14 @@ static int img_convert(int argc, char **argv)
>      } else {
>          int has_zero_init = bdrv_has_zero_init(out_bs);
>  
> +        if (!has_zero_init && !out_baseimg && bdi.discard_write_zeroes) {
> +            ret = bdrv_zeroize(out_bs, BDRV_REQ_MAY_UNMAP);
> +            if (ret < 0) {
> +                goto out;
> +            }
> +            has_zero_init = 1;
> +        }
> +
>          sector_num = 0; // total number of sectors converted so far
>          nb_sectors = total_sectors - sector_num;
>          if (nb_sectors != 0) {
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 05/12] block: add logical block provisioning information to BlockDriverInfo
  2013-09-13 10:34   ` Paolo Bonzini
@ 2013-09-13 10:44     ` Peter Lieven
  2013-09-13 11:45       ` Paolo Bonzini
  0 siblings, 1 reply; 39+ messages in thread
From: Peter Lieven @ 2013-09-13 10:44 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony

On 13.09.2013 12:34, Paolo Bonzini wrote:
> Il 13/09/2013 12:25, Peter Lieven ha scritto:
>> +    /* maximum number of sectors that can be discarded at once */
>> +    int max_discard;
>> +    /* maximum number of sectors that can zeroized at once */
>> +    int max_write_zeroes;
> These should not be needed outside the driver.
>
> If you want to make them private between block.c and block/iscsi.c, you
> can add them to BlockDriverState.
The question is, if the discard_zeroes or discard_write_zeroes is needed
outside the driver as well?

I can put the max_* information in the block driver state. I also thought
to add alignment and granularity information even if they are currently
not yet used.

Peter

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 08/12] qemu-img: conditionally zero out target on convert
  2013-09-13 10:36   ` Paolo Bonzini
@ 2013-09-13 10:46     ` Peter Lieven
  2013-09-13 11:35       ` Paolo Bonzini
  2013-09-13 18:25     ` Eric Blake
  1 sibling, 1 reply; 39+ messages in thread
From: Peter Lieven @ 2013-09-13 10:46 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony

On 13.09.2013 12:36, Paolo Bonzini wrote:
> Il 13/09/2013 12:25, Peter Lieven ha scritto:
>> if the target has_zero_init = 0, but supports efficiently
>> writing zeroes by unmapping we call bdrv_zeroize to
>> avoid fully allocating the target. this currently
>> is designed especially for iscsi.
>>
>> Signed-off-by: Peter Lieven <pl@kamp.de>
>> ---
>>   qemu-img.c |   22 ++++++++++++++++------
>>   1 file changed, 16 insertions(+), 6 deletions(-)
>>
>> diff --git a/qemu-img.c b/qemu-img.c
>> index 3e5e388..6eaddc6 100644
>> --- a/qemu-img.c
>> +++ b/qemu-img.c
>> @@ -1354,7 +1354,8 @@ static int img_convert(int argc, char **argv)
>>           }
>>       }
>>   
>> -    flags = BDRV_O_RDWR;
>> +    flags = BDRV_O_RDWR | BDRV_O_UNMAP;
> I think this should be a new command-line flag.
In an earlier version there where no objections. I think it would
make the usage of qemu-img convert more complicated. For
most targets has_zero_init is 1 anyway.

Peter

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 07/12] block: introduce bdrv_zeroize
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 07/12] block: introduce bdrv_zeroize Peter Lieven
@ 2013-09-13 10:47   ` Peter Lieven
  2013-09-13 18:23   ` Eric Blake
  1 sibling, 0 replies; 39+ messages in thread
From: Peter Lieven @ 2013-09-13 10:47 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, stefanha, qemu-devel, ronniesahlberg, anthony, pbonzini

On 13.09.2013 12:25, Peter Lieven wrote:
> this patch adds a call to completely zero out a block device.
> the operation is speed up by checking the block status and
> only writing zeroes to the device if they currently do not
> return zeroes. optionally the zero writing can be speed up
> by setting the flag BDRV_REQ_MAY_UNMAP to emulate the zero
> write by unmapping if the driver supports it.
>
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>   block.c               |   45 +++++++++++++++++++++++++++++++++++++++++++++
>   include/block/block.h |    1 +
>   2 files changed, 46 insertions(+)
>
> diff --git a/block.c b/block.c
> index 6f498fc..76a6621 100644
> --- a/block.c
> +++ b/block.c
> @@ -2342,6 +2342,51 @@ int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num,
>                         BDRV_REQ_ZERO_WRITE | flags);
>   }
>   
> +int bdrv_zeroize(BlockDriverState *bs, BdrvRequestFlags flags)
> +{
> +    BlockDriverInfo bdi;
> +    int64_t target_size = bdrv_getlength(bs) / BDRV_SECTOR_SIZE;
> +    int64_t ret, nb_sectors, sector_num = 0;
> +    int n;
> +    /* split the write requests into 1MB chunks if the driver
> +     * does not return a maximal size via bdi */
> +    int max_write_zeroes = 1 << (20 - BDRV_SECTOR_BITS);
> +    if (bdrv_get_info(bs, &bdi) == 0 &&
> +        bdi.max_write_zeroes > 0) {
> +        max_write_zeroes = bdi.max_write_zeroes;
> +    }
> +    for (;;) {
> +        nb_sectors = target_size - sector_num;
> +        if (nb_sectors <= 0) {
> +            return 0;
> +        }
> +        if (nb_sectors > INT_MAX) {
> +            nb_sectors = INT_MAX;
> +        }
> +        ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &n);
forgot to check for ret < 0 here.

Peter

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 08/12] qemu-img: conditionally zero out target on convert
  2013-09-13 10:46     ` Peter Lieven
@ 2013-09-13 11:35       ` Paolo Bonzini
  0 siblings, 0 replies; 39+ messages in thread
From: Paolo Bonzini @ 2013-09-13 11:35 UTC (permalink / raw)
  To: Peter Lieven; +Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony

Il 13/09/2013 12:46, Peter Lieven ha scritto:
>>>
>>>   -    flags = BDRV_O_RDWR;
>>> +    flags = BDRV_O_RDWR | BDRV_O_UNMAP;
>> I think this should be a new command-line flag.
> In an earlier version there where no objections. I think it would
> make the usage of qemu-img convert more complicated. For
> most targets has_zero_init is 1 anyway.

If Kevin is okay with having BDRV_O_UNMAP as the default that's fine.
However there should be an option to disable it.  Perhaps it could be
"-S 0" with something like this added:

diff --git a/qemu-img.c b/qemu-img.c
index 3e5e388..f42c90c 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -769,6 +769,10 @@
     int ret;
     int num_checked, num_used;

+    if (!min) {
+        return 1;
+    }
+
     if (n < min) {
         min = n;
     }


Paolo

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 05/12] block: add logical block provisioning information to BlockDriverInfo
  2013-09-13 10:44     ` Peter Lieven
@ 2013-09-13 11:45       ` Paolo Bonzini
  2013-09-13 12:22         ` Peter Lieven
  2013-09-16 11:30         ` Peter Lieven
  0 siblings, 2 replies; 39+ messages in thread
From: Paolo Bonzini @ 2013-09-13 11:45 UTC (permalink / raw)
  To: Peter Lieven; +Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony

Il 13/09/2013 12:44, Peter Lieven ha scritto:
> On 13.09.2013 12:34, Paolo Bonzini wrote:
>> Il 13/09/2013 12:25, Peter Lieven ha scritto:
>>> +    /* maximum number of sectors that can be discarded at once */
>>> +    int max_discard;
>>> +    /* maximum number of sectors that can zeroized at once */
>>> +    int max_write_zeroes;
>> These should not be needed outside the driver.
>>
>> If you want to make them private between block.c and block/iscsi.c, you
>> can add them to BlockDriverState.
> The question is, if the discard_zeroes or discard_write_zeroes is needed
> outside the driver as well?
> 
> I can put the max_* information in the block driver state. I also thought
> to add alignment and granularity information even if they are currently
> not yet used.

Yeah, in fact bdrv_write_zeroes and bdrv_discard can be taught to split
requests according to these parameters instead of introducing a new
function bdrv_zeroize.  You don't need bdrv_zeroize I think; you can
simply use bdrv_write_zeroes.  This is why I don't like this information
in BlockDriverInfo.

On the contrary, discard_write_zeroes is useful to "generic" clients,
and your qemu-img patch shows why.

Discard_zeroes is somewhere in the middle.  You only use it in
bdrv_get_block_status, but it is not something that should be hidden to
users of the high-level block.c API.  So it is fine to leave it in
BlockDriverInfo.

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 05/12] block: add logical block provisioning information to BlockDriverInfo
  2013-09-13 11:45       ` Paolo Bonzini
@ 2013-09-13 12:22         ` Peter Lieven
  2013-09-13 12:58           ` Paolo Bonzini
  2013-09-16 11:30         ` Peter Lieven
  1 sibling, 1 reply; 39+ messages in thread
From: Peter Lieven @ 2013-09-13 12:22 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony

Am 13.09.2013 13:45, schrieb Paolo Bonzini:
> Il 13/09/2013 12:44, Peter Lieven ha scritto:
>> On 13.09.2013 12:34, Paolo Bonzini wrote:
>>> Il 13/09/2013 12:25, Peter Lieven ha scritto:
>>>> +    /* maximum number of sectors that can be discarded at once */
>>>> +    int max_discard;
>>>> +    /* maximum number of sectors that can zeroized at once */
>>>> +    int max_write_zeroes;
>>> These should not be needed outside the driver.
>>>
>>> If you want to make them private between block.c and block/iscsi.c, you
>>> can add them to BlockDriverState.
>> The question is, if the discard_zeroes or discard_write_zeroes is needed
>> outside the driver as well?
>>
>> I can put the max_* information in the block driver state. I also thought
>> to add alignment and granularity information even if they are currently
>> not yet used.
> Yeah, in fact bdrv_write_zeroes and bdrv_discard can be taught to split
> requests according to these parameters instead of introducing a new
> function bdrv_zeroize.  You don't need bdrv_zeroize I think; you can
> simply use bdrv_write_zeroes.  This is why I don't like this information
> in BlockDriverInfo.
bdrv_zeroize has one big advantage over a bdrv_write_zeroes over
the whole device: it checks the block status before it sends requests.
this can be a great performance benefit if a lot of blocks are already
unmapped. so i would like to keep it in, but simplifiy it (see below).

>
> On the contrary, discard_write_zeroes is useful to "generic" clients,
> and your qemu-img patch shows why.
>
> Discard_zeroes is somewhere in the middle.  You only use it in
> bdrv_get_block_status, but it is not something that should be hidden to
> users of the high-level block.c API.  So it is fine to leave it in
> BlockDriverInfo.
okay, i will leave them in and put

max_discard
discard_alignment
max_write_zeroes
write_zeroes_alignment

into the BlockDriverState. I would then like to call out to all
driver developers to set these values for their drivers
to good values.

For now I can factor out the request split logic out of
iscsi_co_discard, iscsi_co_write_zeroes and bdrv_sanitize
and put them in bdrv_co_discard and bdrv_co_write_zeroes.

I would like to leave the misalignment logic to a later patch.

What would you think are reasonable default values for
max_discard and max_write_zeroes?

Peter

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 05/12] block: add logical block provisioning information to BlockDriverInfo
  2013-09-13 12:22         ` Peter Lieven
@ 2013-09-13 12:58           ` Paolo Bonzini
  0 siblings, 0 replies; 39+ messages in thread
From: Paolo Bonzini @ 2013-09-13 12:58 UTC (permalink / raw)
  To: Peter Lieven; +Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony

Il 13/09/2013 14:22, Peter Lieven ha scritto:
> Am 13.09.2013 13:45, schrieb Paolo Bonzini:
>> Il 13/09/2013 12:44, Peter Lieven ha scritto:
>>> On 13.09.2013 12:34, Paolo Bonzini wrote:
>>>> Il 13/09/2013 12:25, Peter Lieven ha scritto:
>>>>> +    /* maximum number of sectors that can be discarded at once */
>>>>> +    int max_discard;
>>>>> +    /* maximum number of sectors that can zeroized at once */
>>>>> +    int max_write_zeroes;
>>>> These should not be needed outside the driver.
>>>>
>>>> If you want to make them private between block.c and block/iscsi.c, you
>>>> can add them to BlockDriverState.
>>> The question is, if the discard_zeroes or discard_write_zeroes is needed
>>> outside the driver as well?
>>>
>>> I can put the max_* information in the block driver state. I also thought
>>> to add alignment and granularity information even if they are currently
>>> not yet used.
>> Yeah, in fact bdrv_write_zeroes and bdrv_discard can be taught to split
>> requests according to these parameters instead of introducing a new
>> function bdrv_zeroize.  You don't need bdrv_zeroize I think; you can
>> simply use bdrv_write_zeroes.  This is why I don't like this information
>> in BlockDriverInfo.
> 
> bdrv_zeroize has one big advantage over a bdrv_write_zeroes over
> the whole device: it checks the block status before it sends requests.
> this can be a great performance benefit if a lot of blocks are already
> unmapped. so i would like to keep it in, but simplifiy it (see below).

Good idea.

> For now I can factor out the request split logic out of
> iscsi_co_discard, iscsi_co_write_zeroes and bdrv_sanitize
> and put them in bdrv_co_discard and bdrv_co_write_zeroes.
> 
> I would like to leave the misalignment logic to a later patch.

Sure.

> What would you think are reasonable default values for
> max_discard and max_write_zeroes?

32768 (16 MB) is the highest power-of-two amount that fits in 16 bits
assuming 512-byte sectors.  But you could also take a look at what the
Linux kernel does in drivers/scsi/sd.c.

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 12/12] block/get_block_status: fix BDRV_BLOCK_ZERO for unallocated blocks
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 12/12] block/get_block_status: fix BDRV_BLOCK_ZERO for unallocated blocks Peter Lieven
@ 2013-09-13 13:53   ` Eric Blake
  2013-09-16  5:47   ` Peter Lieven
  1 sibling, 0 replies; 39+ messages in thread
From: Eric Blake @ 2013-09-13 13:53 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 1388 bytes --]

On 09/13/2013 04:25 AM, Peter Lieven wrote:
> this patch does 2 things:
> a) only do additional call outs if BDRV_BLOCK_ZERO is not already set.
> b) use bdi.discard_zeroes to return the zero state of an unallocated block.
>    the callout to bdrv_has_zero_init() is only valid right after bdrv_create.
> 
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 282943b..6698c10 100644
> --- a/block.c
> +++ b/block.c
> @@ -3153,8 +3153,10 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
>          return ret;
>      }
>  
> -    if (!(ret & BDRV_BLOCK_DATA)) {
> -        if (bdrv_has_zero_init(bs)) {
> +    if (!(ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO)) {

Why not avoid the &&:

if ((ret & (BDRV_BLOCK_DATA | BDRV_BLOCK_ZERO)) == 0) {

Then again, any decent compiler will probably already do that
optimization on your behalf.

> +        BlockDriverInfo bdi;
> +        if (bdrv_get_info(bs, &bdi) == 0 &&
> +            bdi.discard_zeroes) {
>              ret |= BDRV_BLOCK_ZERO;
>          } else if (bs->backing_hd) {
>              BlockDriverState *bs2 = bs->backing_hd;
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 01/12] block: make BdrvRequestFlags public
  2013-09-13 10:24 ` [Qemu-devel] [PATCH 01/12] block: make BdrvRequestFlags public Peter Lieven
@ 2013-09-13 16:29   ` Eric Blake
  0 siblings, 0 replies; 39+ messages in thread
From: Eric Blake @ 2013-09-13 16:29 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 385 bytes --]

On 09/13/2013 04:24 AM, Peter Lieven wrote:
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block.c               |    5 -----
>  include/block/block.h |    5 +++++
>  2 files changed, 5 insertions(+), 5 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 02/12] block: add flags to bdrv_*_write_zeroes
  2013-09-13 10:24 ` [Qemu-devel] [PATCH 02/12] block: add flags to bdrv_*_write_zeroes Peter Lieven
@ 2013-09-13 16:58   ` Eric Blake
  0 siblings, 0 replies; 39+ messages in thread
From: Eric Blake @ 2013-09-13 16:58 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 1436 bytes --]

On 09/13/2013 04:24 AM, Peter Lieven wrote:
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block-migration.c         |    2 +-
>  block.c                   |   20 +++++++++++---------
>  block/backup.c            |    3 ++-
>  block/qcow2.c             |    2 +-
>  block/qed.c               |    3 ++-
>  block/raw_bsd.c           |    5 +++--
>  block/vmdk.c              |    3 ++-
>  include/block/block.h     |    4 ++--
>  include/block/block_int.h |    2 +-
>  qemu-io-cmds.c            |    2 +-
>  10 files changed, 26 insertions(+), 20 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

>  
> -int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num, int nb_sectors)
> +int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num,
> +                      int nb_sectors, BdrvRequestFlags flags)

Technically, you are allowing any bitwise combination of
BdrvRequestFlags, including combinations that are NOT enumerated values
within BdrvRequestFlags.  There are some type-strict languages where
you'd fail to compile when passing in a non-enum value.  But C allows
this (ab)use of enum types in place of an integer type, and I find it
better documented this way, so I'm fine with leaving it as-is.  That,
and you're just copying existing style already in qemu :)

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 03/12] block: introduce BDRV_REQ_MAY_UNMAP in bdrv_co_write_zeroes
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 03/12] block: introduce BDRV_REQ_MAY_UNMAP in bdrv_co_write_zeroes Peter Lieven
@ 2013-09-13 18:07   ` Eric Blake
  0 siblings, 0 replies; 39+ messages in thread
From: Eric Blake @ 2013-09-13 18:07 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 947 bytes --]

On 09/13/2013 04:25 AM, Peter Lieven wrote:
> the BDRV_REQ_MAY_UNMAP flag is used to indicate that block driver
> is allowed to optimze a write zeroes request by unmapping (discarding)

s/optimze/optimize/

> blocks if it is guaranteed that the result will read back as
> zeroes.
> 
> the flag is only passed to the driver if the block device is
> opened with BDRV_O_UNMAP.
> 
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---

> +++ b/include/block/block.h
> @@ -65,6 +65,7 @@ typedef struct BlockDevOps {
>  typedef enum {
>      BDRV_REQ_COPY_ON_READ = 0x1,
>      BDRV_REQ_ZERO_WRITE   = 0x2,
> +    BDRV_REQ_MAY_UNMAP    = 0x4,

Now that this enum is public, it would be useful to have inline
documentation, rather than making people dig up the git history of when
it was introduced to learn its semantics.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 04/12] iscsi: add .bdrv_co_write_zeroes
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 04/12] iscsi: add .bdrv_co_write_zeroes Peter Lieven
@ 2013-09-13 18:10   ` Eric Blake
  0 siblings, 0 replies; 39+ messages in thread
From: Eric Blake @ 2013-09-13 18:10 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 776 bytes --]

On 09/13/2013 04:25 AM, Peter Lieven wrote:
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block/iscsi.c |   78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 78 insertions(+)
> 

> +    if (iscsilun->zeroblock == NULL) {
> +        iscsilun->zeroblock = g_malloc(iscsilun->block_size);
> +        if (iscsilun->zeroblock == NULL) {
> +            return -ENOMEM;

g_malloc cannot return NULL (it aborts the program on OOM instead), so
this is dead code.

> +        }
> +        memset(iscsilun->zeroblock, 0x00, iscsilun->block_size);

Furthermore, if you had used g_malloc0(), you wouldn't need this memset.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 06/12] iscsi: add .bdrv_get_info
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 06/12] iscsi: add .bdrv_get_info Peter Lieven
@ 2013-09-13 18:19   ` Eric Blake
  0 siblings, 0 replies; 39+ messages in thread
From: Eric Blake @ 2013-09-13 18:19 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 569 bytes --]

On 09/13/2013 04:25 AM, Peter Lieven wrote:
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block/iscsi.c |   24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
> 

> +
> +    bdi->discard_zeroes =  iscsilun->lbprz;
> +    bdi->discard_write_zeroes =  (bs->open_flags & BDRV_O_UNMAP) &&

Why two spaces after = (two instances)?

But that's cosmetic, so you can fix before adding:

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 07/12] block: introduce bdrv_zeroize
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 07/12] block: introduce bdrv_zeroize Peter Lieven
  2013-09-13 10:47   ` Peter Lieven
@ 2013-09-13 18:23   ` Eric Blake
  1 sibling, 0 replies; 39+ messages in thread
From: Eric Blake @ 2013-09-13 18:23 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 763 bytes --]

On 09/13/2013 04:25 AM, Peter Lieven wrote:
> this patch adds a call to completely zero out a block device.
> the operation is speed up by checking the block status and

s/speed/sped/

> only writing zeroes to the device if they currently do not
> return zeroes. optionally the zero writing can be speed up

and again

> by setting the flag BDRV_REQ_MAY_UNMAP to emulate the zero
> write by unmapping if the driver supports it.
> 
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block.c               |   45 +++++++++++++++++++++++++++++++++++++++++++++
>  include/block/block.h |    1 +
>  2 files changed, 46 insertions(+)
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 08/12] qemu-img: conditionally zero out target on convert
  2013-09-13 10:36   ` Paolo Bonzini
  2013-09-13 10:46     ` Peter Lieven
@ 2013-09-13 18:25     ` Eric Blake
  2013-09-13 19:48       ` Peter Lieven
  1 sibling, 1 reply; 39+ messages in thread
From: Eric Blake @ 2013-09-13 18:25 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kwolf, anthony, Peter Lieven, qemu-devel, ronniesahlberg, stefanha

[-- Attachment #1: Type: text/plain, Size: 1081 bytes --]

On 09/13/2013 04:36 AM, Paolo Bonzini wrote:
> Il 13/09/2013 12:25, Peter Lieven ha scritto:
>> if the target has_zero_init = 0, but supports efficiently
>> writing zeroes by unmapping we call bdrv_zeroize to
>> avoid fully allocating the target. this currently
>> is designed especially for iscsi.
>>
>> Signed-off-by: Peter Lieven <pl@kamp.de>
>> ---
>>  qemu-img.c |   22 ++++++++++++++++------
>>  1 file changed, 16 insertions(+), 6 deletions(-)
>>
>> diff --git a/qemu-img.c b/qemu-img.c
>> index 3e5e388..6eaddc6 100644
>> --- a/qemu-img.c
>> +++ b/qemu-img.c
>> @@ -1354,7 +1354,8 @@ static int img_convert(int argc, char **argv)
>>          }
>>      }
>>  
>> -    flags = BDRV_O_RDWR;
>> +    flags = BDRV_O_RDWR | BDRV_O_UNMAP;
> 
> I think this should be a new command-line flag.

I agree - while 'sparse by default' may be reasonable, it is also
feasible to want a mode that guarantees expansion rather than unmapped
or sparse.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 09/12] block/get_block_status: set *pnum = 0 on error
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 09/12] block/get_block_status: set *pnum = 0 on error Peter Lieven
@ 2013-09-13 18:26   ` Eric Blake
  0 siblings, 0 replies; 39+ messages in thread
From: Eric Blake @ 2013-09-13 18:26 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 938 bytes --]

On 09/13/2013 04:25 AM, Peter Lieven wrote:
> if the call is invoked through bdrv_is_allocated the caller might
> expect *pnum = 0 on error. however, a new implementation of
> bdrv_get_block_status might only return a negative exit value on
> error while keeping *pnum untouched.
> 
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block.c |    1 +
>  1 file changed, 1 insertion(+)

Reviewed-by: Eric Blake <eblake@redhat.com>

> 
> diff --git a/block.c b/block.c
> index 76a6621..4922fb9 100644
> --- a/block.c
> +++ b/block.c
> @@ -3149,6 +3149,7 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
>  
>      ret = bs->drv->bdrv_co_get_block_status(bs, sector_num, nb_sectors, pnum);
>      if (ret < 0) {
> +        *pnum = 0;
>          return ret;
>      }
>  
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 10/12] block/get_block_status: avoid segfault if there is no backing_hd
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 10/12] block/get_block_status: avoid segfault if there is no backing_hd Peter Lieven
@ 2013-09-13 18:26   ` Eric Blake
  0 siblings, 0 replies; 39+ messages in thread
From: Eric Blake @ 2013-09-13 18:26 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 915 bytes --]

On 09/13/2013 04:25 AM, Peter Lieven wrote:
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

> 
> diff --git a/block.c b/block.c
> index 4922fb9..cf9db62 100644
> --- a/block.c
> +++ b/block.c
> @@ -3156,7 +3156,7 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
>      if (!(ret & BDRV_BLOCK_DATA)) {
>          if (bdrv_has_zero_init(bs)) {
>              ret |= BDRV_BLOCK_ZERO;
> -        } else {
> +        } else if (bs->backing_hd) {
>              BlockDriverState *bs2 = bs->backing_hd;
>              int64_t length2 = bdrv_getlength(bs2);
>              if (length2 >= 0 && sector_num >= (length2 >> BDRV_SECTOR_BITS)) {
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 11/12] block/get_block_status: avoid redundant callouts on raw devices
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 11/12] block/get_block_status: avoid redundant callouts on raw devices Peter Lieven
@ 2013-09-13 18:28   ` Eric Blake
  0 siblings, 0 replies; 39+ messages in thread
From: Eric Blake @ 2013-09-13 18:28 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, pbonzini

[-- Attachment #1: Type: text/plain, Size: 1188 bytes --]

On 09/13/2013 04:25 AM, Peter Lieven wrote:
> if a raw device like an iscsi target or host device is used
> the current implementation makes a second call out to get
> the block status of bs->file. however, the raw driver already
> has called bdrv_get_block_status on bs->file.
> 
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

> 
> diff --git a/block.c b/block.c
> index cf9db62..282943b 100644
> --- a/block.c
> +++ b/block.c
> @@ -3167,7 +3167,8 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
>  
>      if (bs->file &&
>          (ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) &&
> -        (ret & BDRV_BLOCK_OFFSET_VALID)) {
> +        (ret & BDRV_BLOCK_OFFSET_VALID) &&
> +        strcmp(bs->drv->format_name, "raw")) {
>          ret2 = bdrv_co_get_block_status(bs->file, ret >> BDRV_SECTOR_BITS,
>                                          *pnum, pnum);
>          if (ret2 >= 0) {
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 08/12] qemu-img: conditionally zero out target on convert
  2013-09-13 18:25     ` Eric Blake
@ 2013-09-13 19:48       ` Peter Lieven
  2013-09-13 19:52         ` Eric Blake
  2013-09-16 11:01         ` Paolo Bonzini
  0 siblings, 2 replies; 39+ messages in thread
From: Peter Lieven @ 2013-09-13 19:48 UTC (permalink / raw)
  To: Eric Blake
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, Paolo Bonzini

Am 13.09.2013 20:25, schrieb Eric Blake:
> On 09/13/2013 04:36 AM, Paolo Bonzini wrote:
>> Il 13/09/2013 12:25, Peter Lieven ha scritto:
>>> if the target has_zero_init = 0, but supports efficiently
>>> writing zeroes by unmapping we call bdrv_zeroize to
>>> avoid fully allocating the target. this currently
>>> is designed especially for iscsi.
>>>
>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>> ---
>>>  qemu-img.c |   22 ++++++++++++++++------
>>>  1 file changed, 16 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/qemu-img.c b/qemu-img.c
>>> index 3e5e388..6eaddc6 100644
>>> --- a/qemu-img.c
>>> +++ b/qemu-img.c
>>> @@ -1354,7 +1354,8 @@ static int img_convert(int argc, char **argv)
>>>          }
>>>      }
>>>  
>>> -    flags = BDRV_O_RDWR;
>>> +    flags = BDRV_O_RDWR | BDRV_O_UNMAP;
>> I think this should be a new command-line flag.
> I agree - while 'sparse by default' may be reasonable, it is also
> feasible to want a mode that guarantees expansion rather than unmapped
> or sparse.
>
Ok, so do you find the proposed -S 0 bei Paolo a good choice?
If this is supplied I would go as far as completly setting
has_zero_init = 0 also for targets which default to 1. This
would guaranteed exspansion and full allocation for all drivers.

Peter

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 08/12] qemu-img: conditionally zero out target on convert
  2013-09-13 19:48       ` Peter Lieven
@ 2013-09-13 19:52         ` Eric Blake
  2013-09-16 11:01         ` Paolo Bonzini
  1 sibling, 0 replies; 39+ messages in thread
From: Eric Blake @ 2013-09-13 19:52 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony, Paolo Bonzini

[-- Attachment #1: Type: text/plain, Size: 752 bytes --]

On 09/13/2013 01:48 PM, Peter Lieven wrote:
>>>> +    flags = BDRV_O_RDWR | BDRV_O_UNMAP;
>>> I think this should be a new command-line flag.
>> I agree - while 'sparse by default' may be reasonable, it is also
>> feasible to want a mode that guarantees expansion rather than unmapped
>> or sparse.
>>
> Ok, so do you find the proposed -S 0 bei Paolo a good choice?
> If this is supplied I would go as far as completly setting
> has_zero_init = 0 also for targets which default to 1. This
> would guaranteed exspansion and full allocation for all drivers.

Yeah, works for me - an optional mode that guarantees full allocation.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 12/12] block/get_block_status: fix BDRV_BLOCK_ZERO for unallocated blocks
  2013-09-13 10:25 ` [Qemu-devel] [PATCH 12/12] block/get_block_status: fix BDRV_BLOCK_ZERO for unallocated blocks Peter Lieven
  2013-09-13 13:53   ` Eric Blake
@ 2013-09-16  5:47   ` Peter Lieven
  1 sibling, 0 replies; 39+ messages in thread
From: Peter Lieven @ 2013-09-16  5:47 UTC (permalink / raw)
  To: Peter Lieven
  Cc: kwolf, stefanha, qemu-devel, ronniesahlberg, anthony, pbonzini

On 13.09.2013 12:25, Peter Lieven wrote:
> this patch does 2 things:
> a) only do additional call outs if BDRV_BLOCK_ZERO is not already set.
> b) use bdi.discard_zeroes to return the zero state of an unallocated block.
>     the callout to bdrv_has_zero_init() is only valid right after bdrv_create.
>
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>   block.c |    6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/block.c b/block.c
> index 282943b..6698c10 100644
> --- a/block.c
> +++ b/block.c
> @@ -3153,8 +3153,10 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
>           return ret;
>       }
>   
> -    if (!(ret & BDRV_BLOCK_DATA)) {
> -        if (bdrv_has_zero_init(bs)) {
> +    if (!(ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO)) {
> +        BlockDriverInfo bdi;
> +        if (bdrv_get_info(bs, &bdi) == 0 &&
> +            bdi.discard_zeroes) {
>               ret |= BDRV_BLOCK_ZERO;
>           } else if (bs->backing_hd) {
>               BlockDriverState *bs2 = bs->backing_hd;
Kevin, according to your comment that we need a separate function like
bdrv_has_discard_zeroes. It should also be ok to change the order and
first check for bs->backing_hd here or not?

Peter

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 08/12] qemu-img: conditionally zero out target on convert
  2013-09-13 19:48       ` Peter Lieven
  2013-09-13 19:52         ` Eric Blake
@ 2013-09-16 11:01         ` Paolo Bonzini
  1 sibling, 0 replies; 39+ messages in thread
From: Paolo Bonzini @ 2013-09-16 11:01 UTC (permalink / raw)
  To: Peter Lieven; +Cc: kwolf, anthony, stefanha, qemu-devel, ronniesahlberg

Il 13/09/2013 21:48, Peter Lieven ha scritto:
> Am 13.09.2013 20:25, schrieb Eric Blake:
>> On 09/13/2013 04:36 AM, Paolo Bonzini wrote:
>>> Il 13/09/2013 12:25, Peter Lieven ha scritto:
>>>> if the target has_zero_init = 0, but supports efficiently
>>>> writing zeroes by unmapping we call bdrv_zeroize to
>>>> avoid fully allocating the target. this currently
>>>> is designed especially for iscsi.
>>>>
>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>>> ---
>>>>  qemu-img.c |   22 ++++++++++++++++------
>>>>  1 file changed, 16 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/qemu-img.c b/qemu-img.c
>>>> index 3e5e388..6eaddc6 100644
>>>> --- a/qemu-img.c
>>>> +++ b/qemu-img.c
>>>> @@ -1354,7 +1354,8 @@ static int img_convert(int argc, char **argv)
>>>>          }
>>>>      }
>>>>  
>>>> -    flags = BDRV_O_RDWR;
>>>> +    flags = BDRV_O_RDWR | BDRV_O_UNMAP;
>>> I think this should be a new command-line flag.
>> I agree - while 'sparse by default' may be reasonable, it is also
>> feasible to want a mode that guarantees expansion rather than unmapped
>> or sparse.
>>
> Ok, so do you find the proposed -S 0 bei Paolo a good choice?
> If this is supplied I would go as far as completly setting
> has_zero_init = 0 also for targets which default to 1. This
> would guaranteed exspansion and full allocation for all drivers.

Sounds good.

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 05/12] block: add logical block provisioning information to BlockDriverInfo
  2013-09-13 11:45       ` Paolo Bonzini
  2013-09-13 12:22         ` Peter Lieven
@ 2013-09-16 11:30         ` Peter Lieven
  2013-09-16 11:37           ` Paolo Bonzini
  1 sibling, 1 reply; 39+ messages in thread
From: Peter Lieven @ 2013-09-16 11:30 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony

On 13.09.2013 13:45, Paolo Bonzini wrote:
> Il 13/09/2013 12:44, Peter Lieven ha scritto:
>> On 13.09.2013 12:34, Paolo Bonzini wrote:
>>> Il 13/09/2013 12:25, Peter Lieven ha scritto:
>>>> +    /* maximum number of sectors that can be discarded at once */
>>>> +    int max_discard;
>>>> +    /* maximum number of sectors that can zeroized at once */
>>>> +    int max_write_zeroes;
>>> These should not be needed outside the driver.
>>>
>>> If you want to make them private between block.c and block/iscsi.c, you
>>> can add them to BlockDriverState.
>> The question is, if the discard_zeroes or discard_write_zeroes is needed
>> outside the driver as well?
>>
>> I can put the max_* information in the block driver state. I also thought
>> to add alignment and granularity information even if they are currently
>> not yet used.
> Yeah, in fact bdrv_write_zeroes and bdrv_discard can be taught to split
> requests according to these parameters instead of introducing a new
> function bdrv_zeroize.  You don't need bdrv_zeroize I think; you can
> simply use bdrv_write_zeroes.  This is why I don't like this information
> in BlockDriverInfo.
>
> On the contrary, discard_write_zeroes is useful to "generic" clients,
> and your qemu-img patch shows why.
>
> Discard_zeroes is somewhere in the middle.  You only use it in
> bdrv_get_block_status, but it is not something that should be hidden to
> users of the high-level block.c API.  So it is fine to leave it in
> BlockDriverInfo.

Would you also be ok to introduce bdrv_has_discard_zeroes()
and bdrv_has_discard_write_zeroes() as Kevin suggested to avoid the need to
add the logic to return 0 if there is a bs->backing_hd everywhere.

This would also make the use of it easier as it avoids the steps
necessary to invoke bdrv_get_info().

Peter

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 05/12] block: add logical block provisioning information to BlockDriverInfo
  2013-09-16 11:30         ` Peter Lieven
@ 2013-09-16 11:37           ` Paolo Bonzini
  0 siblings, 0 replies; 39+ messages in thread
From: Paolo Bonzini @ 2013-09-16 11:37 UTC (permalink / raw)
  To: Peter Lieven; +Cc: kwolf, ronniesahlberg, stefanha, qemu-devel, anthony

Il 16/09/2013 13:30, Peter Lieven ha scritto:
> On 13.09.2013 13:45, Paolo Bonzini wrote:
>> Il 13/09/2013 12:44, Peter Lieven ha scritto:
>>> On 13.09.2013 12:34, Paolo Bonzini wrote:
>>>> Il 13/09/2013 12:25, Peter Lieven ha scritto:
>>>>> +    /* maximum number of sectors that can be discarded at once */
>>>>> +    int max_discard;
>>>>> +    /* maximum number of sectors that can zeroized at once */
>>>>> +    int max_write_zeroes;
>>>> These should not be needed outside the driver.
>>>>
>>>> If you want to make them private between block.c and block/iscsi.c, you
>>>> can add them to BlockDriverState.
>>> The question is, if the discard_zeroes or discard_write_zeroes is needed
>>> outside the driver as well?
>>>
>>> I can put the max_* information in the block driver state. I also
>>> thought
>>> to add alignment and granularity information even if they are currently
>>> not yet used.
>> Yeah, in fact bdrv_write_zeroes and bdrv_discard can be taught to split
>> requests according to these parameters instead of introducing a new
>> function bdrv_zeroize.  You don't need bdrv_zeroize I think; you can
>> simply use bdrv_write_zeroes.  This is why I don't like this information
>> in BlockDriverInfo.
>>
>> On the contrary, discard_write_zeroes is useful to "generic" clients,
>> and your qemu-img patch shows why.
>>
>> Discard_zeroes is somewhere in the middle.  You only use it in
>> bdrv_get_block_status, but it is not something that should be hidden to
>> users of the high-level block.c API.  So it is fine to leave it in
>> BlockDriverInfo.
> 
> Would you also be ok to introduce bdrv_has_discard_zeroes()
> and bdrv_has_discard_write_zeroes() as Kevin suggested to avoid the need to
> add the logic to return 0 if there is a bs->backing_hd everywhere.

If Kevin says it, I agree. :)

> This would also make the use of it easier as it avoids the steps
> necessary to invoke bdrv_get_info().

Another possibility could be to put the "info" in a member of
BlockDriverState, since it is static.  Then bdrv_get_info() could be simply

    return &bs->info;

which is faster and removes the need for a BlockDriverInfo temporary.

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2013-09-16 11:38 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-09-13 10:24 [Qemu-devel] [PATCH 00/12] block: logical block provisioning enhancements Peter Lieven
2013-09-13 10:24 ` [Qemu-devel] [PATCH 01/12] block: make BdrvRequestFlags public Peter Lieven
2013-09-13 16:29   ` Eric Blake
2013-09-13 10:24 ` [Qemu-devel] [PATCH 02/12] block: add flags to bdrv_*_write_zeroes Peter Lieven
2013-09-13 16:58   ` Eric Blake
2013-09-13 10:25 ` [Qemu-devel] [PATCH 03/12] block: introduce BDRV_REQ_MAY_UNMAP in bdrv_co_write_zeroes Peter Lieven
2013-09-13 18:07   ` Eric Blake
2013-09-13 10:25 ` [Qemu-devel] [PATCH 04/12] iscsi: add .bdrv_co_write_zeroes Peter Lieven
2013-09-13 18:10   ` Eric Blake
2013-09-13 10:25 ` [Qemu-devel] [PATCH 05/12] block: add logical block provisioning information to BlockDriverInfo Peter Lieven
2013-09-13 10:34   ` Paolo Bonzini
2013-09-13 10:44     ` Peter Lieven
2013-09-13 11:45       ` Paolo Bonzini
2013-09-13 12:22         ` Peter Lieven
2013-09-13 12:58           ` Paolo Bonzini
2013-09-16 11:30         ` Peter Lieven
2013-09-16 11:37           ` Paolo Bonzini
2013-09-13 10:25 ` [Qemu-devel] [PATCH 06/12] iscsi: add .bdrv_get_info Peter Lieven
2013-09-13 18:19   ` Eric Blake
2013-09-13 10:25 ` [Qemu-devel] [PATCH 07/12] block: introduce bdrv_zeroize Peter Lieven
2013-09-13 10:47   ` Peter Lieven
2013-09-13 18:23   ` Eric Blake
2013-09-13 10:25 ` [Qemu-devel] [PATCH 08/12] qemu-img: conditionally zero out target on convert Peter Lieven
2013-09-13 10:36   ` Paolo Bonzini
2013-09-13 10:46     ` Peter Lieven
2013-09-13 11:35       ` Paolo Bonzini
2013-09-13 18:25     ` Eric Blake
2013-09-13 19:48       ` Peter Lieven
2013-09-13 19:52         ` Eric Blake
2013-09-16 11:01         ` Paolo Bonzini
2013-09-13 10:25 ` [Qemu-devel] [PATCH 09/12] block/get_block_status: set *pnum = 0 on error Peter Lieven
2013-09-13 18:26   ` Eric Blake
2013-09-13 10:25 ` [Qemu-devel] [PATCH 10/12] block/get_block_status: avoid segfault if there is no backing_hd Peter Lieven
2013-09-13 18:26   ` Eric Blake
2013-09-13 10:25 ` [Qemu-devel] [PATCH 11/12] block/get_block_status: avoid redundant callouts on raw devices Peter Lieven
2013-09-13 18:28   ` Eric Blake
2013-09-13 10:25 ` [Qemu-devel] [PATCH 12/12] block/get_block_status: fix BDRV_BLOCK_ZERO for unallocated blocks Peter Lieven
2013-09-13 13:53   ` Eric Blake
2013-09-16  5:47   ` Peter Lieven

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.