All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PULL 00/46] Block patches
@ 2012-04-05 15:51 Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 01/46] trace-events: Rename 'next' argument Kevin Wolf
                   ` (39 more replies)
  0 siblings, 40 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

The following changes since commit 8f8d364f2447e58768132fc10f48a67af371ee38:

  Merge branch 's390-for-upstream' of git://repo.or.cz/qemu/agraf (2012-04-04 20:45:03 +0000)

are available in the git repository at:

  git://repo.or.cz/qemu/kevin.git for-anthony

Benoît Canet (7):
      block: Add new BDRV_O_INCOMING flag to notice incoming live migration
      block: add a function to clear incoming live migration flags
      blockdev: open images with BDRV_O_INCOMING on incoming live migration
      qed: add bdrv_invalidate_cache to be called after incoming live migration
      migration: clear BDRV_O_INCOMING flags on end of incoming live migration
      qed: honor BDRV_O_INCOMING for incoming live migration
      qed: remove incoming live migration blocker

David Gibson (1):
      Use DMADirection type for dma_bdrv_io

Dong Xu Wang (4):
      qemu-img: add image fragmentation statistics
      qed: image fragmentation statistics
      qemu-img: add dirty flag status
      qed: track dirty flag status

Floris Bos (3):
      ide: Add "model=s" qdev option
      ide: Change serial number strncpy() to pstrcpy()
      ide: Adds wwn=hex qdev option

Jeff Cody (1):
      block: bdrv_append() fixes

Kevin Wolf (6):
      trace-events: Rename 'next' argument
      tracetool: Forbid argument name 'next'
      qcow2: Remove unused parameter in get_cluster_table()
      ide: IDENTIFY word 86 bit 14 is reserved
      qemu-iotests: qcow2.py
      qemu-iotests: Test unknown qcow2 header extensions

Liu Yuan (2):
      sheepdog: implement SD_OP_FLUSH_VDI operation
      sheepdog: fix send req helpers

Marcelo Tosatti (1):
      block stream: close unused files and update ->backing_hd

Paolo Bonzini (13):
      block: push recursive flushing up from drivers
      aio: move BlockDriverAIOCB to qemu-aio.h
      vdi: basic conversion to coroutines
      vdi: move end-of-I/O handling at the end
      vdi: merge aio_read_cb and aio_write_cb into callers
      vdi: move aiocb fields to locals
      vdi: leave bounce buffering to block layer
      vdi: do not create useless iovecs
      vdi: change goto to loop
      block: cancel jobs when a device is ready to go away
      block: fix streaming/closing race
      block: set job->speed in block_set_speed
      block: document job API

Stefan Hajnoczi (4):
      qemu-io: add option to enable tracing
      qerror: fix QERR_PROPERTY_VALUE_OUT_OF_RANGE description
      qdev: add blocksize property type
      block: enforce constraints on block size properties

Stefan Weil (2):
      qemu-iotests: Fix call syntax for qemu-img
      qemu-iotests: Fix call syntax for qemu-io

Zhang Shengju (1):
      block/vpc: write checksum back to footer after check

Zhi Yong Wu (1):
      block: disable I/O throttling on sync api

 block.c                      |   88 ++++++++-
 block.h                      |   21 ++-
 block/blkdebug.c             |    7 -
 block/cow.c                  |    6 -
 block/qcow.c                 |    6 -
 block/qcow2-cluster.c        |   18 +-
 block/qcow2.c                |    6 -
 block/qed-check.c            |    9 +
 block/qed.c                  |   33 ++--
 block/qed.h                  |    2 -
 block/raw.c                  |    6 -
 block/sheepdog.c             |  144 +++++++++++++--
 block/stream.c               |   41 ++++-
 block/vdi.c                  |  429 +++++++++++-------------------------------
 block/vmdk.c                 |    4 +-
 block/vpc.c                  |    9 +-
 block_int.h                  |  135 +++++++++++--
 blockdev.c                   |   12 +-
 dma-helpers.c                |   21 ++-
 dma.h                        |   12 +-
 hw/ide/core.c                |   60 +++++--
 hw/ide/internal.h            |    7 +-
 hw/ide/macio.c               |    3 +-
 hw/ide/qdev.c                |    7 +-
 hw/lsi53c895a.c              |    1 -
 hw/qdev-properties.c         |   46 +++++
 hw/qdev.h                    |    3 +
 linux-aio.c                  |    1 -
 migration.c                  |    1 +
 qemu-aio.h                   |   21 ++
 qemu-img.c                   |   12 +-
 qemu-io.c                    |   10 +-
 qerror.c                     |    7 +-
 qerror.h                     |    4 +
 scripts/tracetool            |    4 +
 tests/qemu-iotests/009       |    4 +-
 tests/qemu-iotests/010       |    6 +-
 tests/qemu-iotests/011       |    2 +-
 tests/qemu-iotests/031       |   72 +++++++
 tests/qemu-iotests/031.out   |   76 ++++++++
 tests/qemu-iotests/common.rc |    9 +-
 tests/qemu-iotests/group     |    1 +
 tests/qemu-iotests/qcow2.py  |  207 ++++++++++++++++++++
 trace-events                 |    2 +-
 44 files changed, 1081 insertions(+), 494 deletions(-)
 create mode 100755 tests/qemu-iotests/031
 create mode 100644 tests/qemu-iotests/031.out
 create mode 100755 tests/qemu-iotests/qcow2.py

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 01/46] trace-events: Rename 'next' argument
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 02/46] tracetool: Forbid argument name 'next' Kevin Wolf
                   ` (38 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

'next' is a systemtap keyword, so it's a bad idea to use it as an
argument name.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 trace-events |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/trace-events b/trace-events
index db2cd39..a5f276d 100644
--- a/trace-events
+++ b/trace-events
@@ -562,7 +562,7 @@ qemu_coroutine_terminate(void *co) "self %p"
 
 # qemu-coroutine-lock.c
 qemu_co_queue_next_bh(void) ""
-qemu_co_queue_next(void *next) "next %p"
+qemu_co_queue_next(void *nxt) "next %p"
 qemu_co_mutex_lock_entry(void *mutex, void *self) "mutex %p self %p"
 qemu_co_mutex_lock_return(void *mutex, void *self) "mutex %p self %p"
 qemu_co_mutex_unlock_entry(void *mutex, void *self) "mutex %p self %p"
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 02/46] tracetool: Forbid argument name 'next'
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 01/46] trace-events: Rename 'next' argument Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 03/46] qcow2: Remove unused parameter in get_cluster_table() Kevin Wolf
                   ` (37 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

It has happened more than once that patches that look perfectly sane
and work with simpletrace broke systemtap because they use 'next' as an
argument name for a tracing function. However, 'next' is a keyword for
systemtap, so we shouldn't use it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 scripts/tracetool |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/scripts/tracetool b/scripts/tracetool
index 47389b6..7b1c142 100755
--- a/scripts/tracetool
+++ b/scripts/tracetool
@@ -81,6 +81,10 @@ get_args()
     args=${1#*\(}
     args=${args%%\)*}
     echo "$args"
+
+    if (echo "$args" | grep "[ *]next\($\|[, ]\)" > /dev/null 2>&1); then
+        echo -e "\n#error 'next' is a bad argument name (clash with systemtap keyword)\n "
+    fi
 }
 
 # Get the argument name list of a trace event
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 03/46] qcow2: Remove unused parameter in get_cluster_table()
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 01/46] trace-events: Rename 'next' argument Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 02/46] tracetool: Forbid argument name 'next' Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 04/46] qemu-io: add option to enable tracing Kevin Wolf
                   ` (36 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

Since everything goes through the cache, callers don't use the L2 table
offset any more.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 block/qcow2-cluster.c |   18 ++++++++----------
 1 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index e0fb907..cbd224d 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -466,7 +466,6 @@ out:
  */
 static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
                              uint64_t **new_l2_table,
-                             uint64_t *new_l2_offset,
                              int *new_l2_index)
 {
     BDRVQcowState *s = bs->opaque;
@@ -514,7 +513,6 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
     l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1);
 
     *new_l2_table = l2_table;
-    *new_l2_offset = l2_offset;
     *new_l2_index = l2_index;
 
     return 0;
@@ -539,11 +537,11 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
 {
     BDRVQcowState *s = bs->opaque;
     int l2_index, ret;
-    uint64_t l2_offset, *l2_table;
+    uint64_t *l2_table;
     int64_t cluster_offset;
     int nb_csectors;
 
-    ret = get_cluster_table(bs, offset, &l2_table, &l2_offset, &l2_index);
+    ret = get_cluster_table(bs, offset, &l2_table, &l2_index);
     if (ret < 0) {
         return 0;
     }
@@ -588,7 +586,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
 {
     BDRVQcowState *s = bs->opaque;
     int i, j = 0, l2_index, ret;
-    uint64_t *old_cluster, start_sect, l2_offset, *l2_table;
+    uint64_t *old_cluster, start_sect, *l2_table;
     uint64_t cluster_offset = m->alloc_offset;
     bool cow = false;
 
@@ -633,7 +631,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
     }
 
     qcow2_cache_set_dependency(bs, s->l2_table_cache, s->refcount_block_cache);
-    ret = get_cluster_table(bs, m->offset, &l2_table, &l2_offset, &l2_index);
+    ret = get_cluster_table(bs, m->offset, &l2_table, &l2_index);
     if (ret < 0) {
         goto err;
     }
@@ -817,7 +815,7 @@ int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
 {
     BDRVQcowState *s = bs->opaque;
     int l2_index, ret, sectors;
-    uint64_t l2_offset, *l2_table;
+    uint64_t *l2_table;
     unsigned int nb_clusters, keep_clusters;
     uint64_t cluster_offset;
 
@@ -825,7 +823,7 @@ int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
                                       n_start, n_end);
 
     /* Find L2 entry for the first involved cluster */
-    ret = get_cluster_table(bs, offset, &l2_table, &l2_offset, &l2_index);
+    ret = get_cluster_table(bs, offset, &l2_table, &l2_index);
     if (ret < 0) {
         return ret;
     }
@@ -1000,12 +998,12 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
     unsigned int nb_clusters)
 {
     BDRVQcowState *s = bs->opaque;
-    uint64_t l2_offset, *l2_table;
+    uint64_t *l2_table;
     int l2_index;
     int ret;
     int i;
 
-    ret = get_cluster_table(bs, offset, &l2_table, &l2_offset, &l2_index);
+    ret = get_cluster_table(bs, offset, &l2_table, &l2_index);
     if (ret < 0) {
         return ret;
     }
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 04/46] qemu-io: add option to enable tracing
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (2 preceding siblings ...)
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 03/46] qcow2: Remove unused parameter in get_cluster_table() Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 05/46] block: push recursive flushing up from drivers Kevin Wolf
                   ` (35 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

It can be useful to enable QEMU tracing when trying out block layer
interfaces via qemu-io.  Tracing can be enabled using the new -T FILE
option where the given file contains a list of trace events to enable
(just like the qemu --trace events=FILE option).

  $ echo qemu_vfree >my-events
  $ ./qemu-io -T my-events ...

Remember to use ./configure --enable-trace-backend=BACKEND when building
qemu-io.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 qemu-io.c |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/qemu-io.c b/qemu-io.c
index 3189530..e6fcd77 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -17,6 +17,7 @@
 #include "qemu-common.h"
 #include "block_int.h"
 #include "cmd.h"
+#include "trace/control.h"
 
 #define VERSION	"0.0.1"
 
@@ -1783,6 +1784,7 @@ static void usage(const char *name)
 "  -g, --growable       allow file to grow (only applies to protocols)\n"
 "  -m, --misalign       misalign allocations for O_DIRECT\n"
 "  -k, --native-aio     use kernel AIO implementation (on Linux only)\n"
+"  -T, --trace FILE     enable trace events listed in the given file\n"
 "  -h, --help           display this help and exit\n"
 "  -V, --version        output version information and exit\n"
 "\n",
@@ -1794,7 +1796,7 @@ int main(int argc, char **argv)
 {
     int readonly = 0;
     int growable = 0;
-    const char *sopt = "hVc:rsnmgk";
+    const char *sopt = "hVc:rsnmgkT:";
     const struct option lopt[] = {
         { "help", 0, NULL, 'h' },
         { "version", 0, NULL, 'V' },
@@ -1806,6 +1808,7 @@ int main(int argc, char **argv)
         { "misalign", 0, NULL, 'm' },
         { "growable", 0, NULL, 'g' },
         { "native-aio", 0, NULL, 'k' },
+        { "trace", 1, NULL, 'T' },
         { NULL, 0, NULL, 0 }
     };
     int c;
@@ -1837,6 +1840,11 @@ int main(int argc, char **argv)
         case 'k':
             flags |= BDRV_O_NATIVE_AIO;
             break;
+        case 'T':
+            if (!trace_backend_init(optarg, NULL)) {
+                exit(1); /* error message will have been printed */
+            }
+            break;
         case 'V':
             printf("%s version %s\n", progname, VERSION);
             exit(0);
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 05/46] block: push recursive flushing up from drivers
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (3 preceding siblings ...)
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 04/46] qemu-io: add option to enable tracing Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 07/46] ide: IDENTIFY word 86 bit 14 is reserved Kevin Wolf
                   ` (34 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c          |   22 ++++++++++++++--------
 block/blkdebug.c |    7 -------
 block/cow.c      |    6 ------
 block/qcow.c     |    6 ------
 block/qcow2.c    |    6 ------
 block/qed.c      |    8 --------
 block/raw.c      |    6 ------
 block/vdi.c      |    8 --------
 block/vmdk.c     |    4 ++--
 block/vpc.c      |    6 ------
 10 files changed, 16 insertions(+), 63 deletions(-)

diff --git a/block.c b/block.c
index b88ee90..8858be0 100644
--- a/block.c
+++ b/block.c
@@ -2331,9 +2331,7 @@ void bdrv_flush_all(void)
     BlockDriverState *bs;
 
     QTAILQ_FOREACH(bs, &bdrv_states, list) {
-        if (!bdrv_is_read_only(bs) && bdrv_is_inserted(bs)) {
-            bdrv_flush(bs);
-        }
+        bdrv_flush(bs);
     }
 }
 
@@ -3520,7 +3518,7 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 {
     int ret;
 
-    if (!bs->drv) {
+    if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
         return 0;
     }
 
@@ -3538,7 +3536,7 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
     }
 
     if (bs->drv->bdrv_co_flush_to_disk) {
-        return bs->drv->bdrv_co_flush_to_disk(bs);
+        ret = bs->drv->bdrv_co_flush_to_disk(bs);
     } else if (bs->drv->bdrv_aio_flush) {
         BlockDriverAIOCB *acb;
         CoroutineIOCompletion co = {
@@ -3547,10 +3545,10 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 
         acb = bs->drv->bdrv_aio_flush(bs, bdrv_co_io_em_complete, &co);
         if (acb == NULL) {
-            return -EIO;
+            ret = -EIO;
         } else {
             qemu_coroutine_yield();
-            return co.ret;
+            ret = co.ret;
         }
     } else {
         /*
@@ -3564,8 +3562,16 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
          *
          * Let's hope the user knows what he's doing.
          */
-        return 0;
+        ret = 0;
     }
+    if (ret < 0) {
+        return ret;
+    }
+
+    /* Now flush the underlying protocol.  It will also have BDRV_O_NO_FLUSH
+     * in the case of cache=unsafe, so there are no useless flushes.
+     */
+    return bdrv_co_flush(bs->file);
 }
 
 void bdrv_invalidate_cache(BlockDriverState *bs)
diff --git a/block/blkdebug.c b/block/blkdebug.c
index a251802..e56e37d 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -397,12 +397,6 @@ static void blkdebug_close(BlockDriverState *bs)
     }
 }
 
-static BlockDriverAIOCB *blkdebug_aio_flush(BlockDriverState *bs,
-    BlockDriverCompletionFunc *cb, void *opaque)
-{
-    return bdrv_aio_flush(bs->file, cb, opaque);
-}
-
 static void process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
     BlkdebugVars *old_vars)
 {
@@ -452,7 +446,6 @@ static BlockDriver bdrv_blkdebug = {
 
     .bdrv_aio_readv     = blkdebug_aio_readv,
     .bdrv_aio_writev    = blkdebug_aio_writev,
-    .bdrv_aio_flush     = blkdebug_aio_flush,
 
     .bdrv_debug_event   = blkdebug_debug_event,
 };
diff --git a/block/cow.c b/block/cow.c
index bb5927c..8d3c9f8 100644
--- a/block/cow.c
+++ b/block/cow.c
@@ -318,11 +318,6 @@ exit:
     return ret;
 }
 
-static coroutine_fn int cow_co_flush(BlockDriverState *bs)
-{
-    return bdrv_co_flush(bs->file);
-}
-
 static QEMUOptionParameter cow_create_options[] = {
     {
         .name = BLOCK_OPT_SIZE,
@@ -348,7 +343,6 @@ static BlockDriver bdrv_cow = {
 
     .bdrv_read              = cow_co_read,
     .bdrv_write             = cow_co_write,
-    .bdrv_co_flush_to_disk  = cow_co_flush,
     .bdrv_co_is_allocated   = cow_co_is_allocated,
 
     .create_options = cow_create_options,
diff --git a/block/qcow.c b/block/qcow.c
index b1cfe1f..35dff49 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -835,11 +835,6 @@ fail:
     return ret;
 }
 
-static coroutine_fn int qcow_co_flush(BlockDriverState *bs)
-{
-    return bdrv_co_flush(bs->file);
-}
-
 static int qcow_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
 {
     BDRVQcowState *s = bs->opaque;
@@ -877,7 +872,6 @@ static BlockDriver bdrv_qcow = {
 
     .bdrv_co_readv          = qcow_co_readv,
     .bdrv_co_writev         = qcow_co_writev,
-    .bdrv_co_flush_to_disk  = qcow_co_flush,
     .bdrv_co_is_allocated   = qcow_co_is_allocated,
 
     .bdrv_set_key           = qcow_set_key,
diff --git a/block/qcow2.c b/block/qcow2.c
index 7aece65..70d3141 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1253,11 +1253,6 @@ static coroutine_fn int qcow2_co_flush_to_os(BlockDriverState *bs)
     return 0;
 }
 
-static coroutine_fn int qcow2_co_flush_to_disk(BlockDriverState *bs)
-{
-    return bdrv_co_flush(bs->file);
-}
-
 static int64_t qcow2_vm_state_offset(BDRVQcowState *s)
 {
 	return (int64_t)s->l1_vm_state_index << (s->cluster_bits + s->l2_bits);
@@ -1377,7 +1372,6 @@ static BlockDriver bdrv_qcow2 = {
     .bdrv_co_readv          = qcow2_co_readv,
     .bdrv_co_writev         = qcow2_co_writev,
     .bdrv_co_flush_to_os    = qcow2_co_flush_to_os,
-    .bdrv_co_flush_to_disk  = qcow2_co_flush_to_disk,
 
     .bdrv_co_discard        = qcow2_co_discard,
     .bdrv_truncate          = qcow2_truncate,
diff --git a/block/qed.c b/block/qed.c
index a041d31..d6c1580 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -1350,13 +1350,6 @@ static BlockDriverAIOCB *bdrv_qed_aio_writev(BlockDriverState *bs,
                          opaque, QED_AIOCB_WRITE);
 }
 
-static BlockDriverAIOCB *bdrv_qed_aio_flush(BlockDriverState *bs,
-                                            BlockDriverCompletionFunc *cb,
-                                            void *opaque)
-{
-    return bdrv_aio_flush(bs->file, cb, opaque);
-}
-
 typedef struct {
     Coroutine *co;
     int ret;
@@ -1562,7 +1555,6 @@ static BlockDriver bdrv_qed = {
     .bdrv_make_empty          = bdrv_qed_make_empty,
     .bdrv_aio_readv           = bdrv_qed_aio_readv,
     .bdrv_aio_writev          = bdrv_qed_aio_writev,
-    .bdrv_aio_flush           = bdrv_qed_aio_flush,
     .bdrv_co_write_zeroes     = bdrv_qed_co_write_zeroes,
     .bdrv_truncate            = bdrv_qed_truncate,
     .bdrv_getlength           = bdrv_qed_getlength,
diff --git a/block/raw.c b/block/raw.c
index 1cdac0c..7086e31 100644
--- a/block/raw.c
+++ b/block/raw.c
@@ -25,11 +25,6 @@ static void raw_close(BlockDriverState *bs)
 {
 }
 
-static int coroutine_fn raw_co_flush(BlockDriverState *bs)
-{
-    return bdrv_co_flush(bs->file);
-}
-
 static int64_t raw_getlength(BlockDriverState *bs)
 {
     return bdrv_getlength(bs->file);
@@ -113,7 +108,6 @@ static BlockDriver bdrv_raw = {
 
     .bdrv_co_readv          = raw_co_readv,
     .bdrv_co_writev         = raw_co_writev,
-    .bdrv_co_flush_to_disk  = raw_co_flush,
     .bdrv_co_discard        = raw_co_discard,
 
     .bdrv_probe         = raw_probe,
diff --git a/block/vdi.c b/block/vdi.c
index 6a0011f..f8fa865 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -930,13 +930,6 @@ static void vdi_close(BlockDriverState *bs)
     error_free(s->migration_blocker);
 }
 
-static coroutine_fn int vdi_co_flush(BlockDriverState *bs)
-{
-    logout("\n");
-    return bdrv_co_flush(bs->file);
-}
-
-
 static QEMUOptionParameter vdi_create_options[] = {
     {
         .name = BLOCK_OPT_SIZE,
@@ -969,7 +962,6 @@ static BlockDriver bdrv_vdi = {
     .bdrv_open = vdi_open,
     .bdrv_close = vdi_close,
     .bdrv_create = vdi_create,
-    .bdrv_co_flush_to_disk = vdi_co_flush,
     .bdrv_co_is_allocated = vdi_co_is_allocated,
     .bdrv_make_empty = vdi_make_empty,
 
diff --git a/block/vmdk.c b/block/vmdk.c
index 45c003a..18e9b4c 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1525,10 +1525,10 @@ static void vmdk_close(BlockDriverState *bs)
 
 static coroutine_fn int vmdk_co_flush(BlockDriverState *bs)
 {
-    int i, ret, err;
     BDRVVmdkState *s = bs->opaque;
+    int i, err;
+    int ret = 0;
 
-    ret = bdrv_co_flush(bs->file);
     for (i = 0; i < s->num_extents; i++) {
         err = bdrv_co_flush(s->extents[i].file);
         if (err < 0) {
diff --git a/block/vpc.c b/block/vpc.c
index 6b4816f..706faf3 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -507,11 +507,6 @@ static coroutine_fn int vpc_co_write(BlockDriverState *bs, int64_t sector_num,
     return ret;
 }
 
-static coroutine_fn int vpc_co_flush(BlockDriverState *bs)
-{
-    return bdrv_co_flush(bs->file);
-}
-
 /*
  * Calculates the number of cylinders, heads and sectors per cylinder
  * based on a given number of sectors. This is the algorithm described
@@ -789,7 +784,6 @@ static BlockDriver bdrv_vpc = {
 
     .bdrv_read              = vpc_co_read,
     .bdrv_write             = vpc_co_write,
-    .bdrv_co_flush_to_disk  = vpc_co_flush,
 
     .create_options = vpc_create_options,
 };
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 07/46] ide: IDENTIFY word 86 bit 14 is reserved
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (4 preceding siblings ...)
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 05/46] block: push recursive flushing up from drivers Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 08/46] ide: Add "model=s" qdev option Kevin Wolf
                   ` (33 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

Reserved bits should be cleared to zero.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 hw/ide/core.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 6f06d28..771811c 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -150,7 +150,7 @@ static void ide_identify(IDEState *s)
     else
          put_le16(p + 85, (1 << 14) | 1);
     /* 13=flush_cache_ext,12=flush_cache,10=lba48 */
-    put_le16(p + 86, (1 << 14) | (1 << 13) | (1 <<12) | (1 << 10));
+    put_le16(p + 86, (1 << 13) | (1 <<12) | (1 << 10));
     /* 14=set to 1, 1=smart self test, 0=smart error logging */
     put_le16(p + 87, (1 << 14) | 0);
     put_le16(p + 88, 0x3f | (1 << 13)); /* udma5 set and supported */
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 08/46] ide: Add "model=s" qdev option
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (5 preceding siblings ...)
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 07/46] ide: IDENTIFY word 86 bit 14 is reserved Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 09/46] ide: Change serial number strncpy() to pstrcpy() Kevin Wolf
                   ` (32 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Floris Bos <bos@je-eigen-domein.nl>

Allow the user to override the default disk model name "QEMU HARDDISK".

Some Linux distributions use the /dev/disk/by-id/scsi-SATA_name-of-disk-
model_serial addressing scheme when refering to partitions in /etc/fstab
and elsewhere. This causes problems when starting a disk image taken from
an existing physical server under qemu, because when running under qemu
name-of-disk-model is always "QEMU HARDDISK".

This patch introduces a model=s option which in combination with the
existing serial=s option can be used to fake the disk the operating
system was previously on, allowing the OS to boot properly.

Cc: kwolf@redhat.com
Signed-off-by: Floris Bos <dev@noc-ps.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 hw/ide/core.c     |   27 ++++++++++++++++++++++-----
 hw/ide/internal.h |    4 +++-
 hw/ide/qdev.c     |    6 ++++--
 3 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 771811c..e38cace 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -100,7 +100,7 @@ static void ide_identify(IDEState *s)
     put_le16(p + 21, 512); /* cache size in sectors */
     put_le16(p + 22, 4); /* ecc bytes */
     padstr((char *)(p + 23), s->version, 8); /* firmware version */
-    padstr((char *)(p + 27), "QEMU HARDDISK", 40); /* model */
+    padstr((char *)(p + 27), s->drive_model_str, 40); /* model */
 #if MAX_MULT_SECTORS > 1
     put_le16(p + 47, 0x8000 | MAX_MULT_SECTORS);
 #endif
@@ -188,7 +188,7 @@ static void ide_atapi_identify(IDEState *s)
     put_le16(p + 21, 512); /* cache size in sectors */
     put_le16(p + 22, 4); /* ecc bytes */
     padstr((char *)(p + 23), s->version, 8); /* firmware version */
-    padstr((char *)(p + 27), "QEMU DVD-ROM", 40); /* model */
+    padstr((char *)(p + 27), s->drive_model_str, 40); /* model */
     put_le16(p + 48, 1); /* dword I/O (XXX: should not be set on CDROM) */
 #ifdef USE_DMA_CDROM
     put_le16(p + 49, 1 << 9 | 1 << 8); /* DMA and LBA supported */
@@ -245,7 +245,7 @@ static void ide_cfata_identify(IDEState *s)
     padstr((char *)(p + 10), s->drive_serial_str, 20); /* serial number */
     put_le16(p + 22, 0x0004);			/* ECC bytes */
     padstr((char *) (p + 23), s->version, 8);	/* Firmware Revision */
-    padstr((char *) (p + 27), "QEMU MICRODRIVE", 40);/* Model number */
+    padstr((char *) (p + 27), s->drive_model_str, 40);/* Model number */
 #if MAX_MULT_SECTORS > 1
     put_le16(p + 47, 0x8000 | MAX_MULT_SECTORS);
 #else
@@ -1833,7 +1833,7 @@ static const BlockDevOps ide_cd_block_ops = {
 };
 
 int ide_init_drive(IDEState *s, BlockDriverState *bs, IDEDriveKind kind,
-                   const char *version, const char *serial)
+                   const char *version, const char *serial, const char *model)
 {
     int cylinders, heads, secs;
     uint64_t nb_sectors;
@@ -1884,6 +1884,22 @@ int ide_init_drive(IDEState *s, BlockDriverState *bs, IDEDriveKind kind,
         snprintf(s->drive_serial_str, sizeof(s->drive_serial_str),
                  "QM%05d", s->drive_serial);
     }
+    if (model) {
+        pstrcpy(s->drive_model_str, sizeof(s->drive_model_str), model);
+    } else {
+        switch (kind) {
+        case IDE_CD:
+            strcpy(s->drive_model_str, "QEMU DVD-ROM");
+            break;
+        case IDE_CFATA:
+            strcpy(s->drive_model_str, "QEMU MICRODRIVE");
+            break;
+        default:
+            strcpy(s->drive_model_str, "QEMU HARDDISK");
+            break;
+        }
+    }
+
     if (version) {
         pstrcpy(s->version, sizeof(s->version), version);
     } else {
@@ -1976,7 +1992,8 @@ void ide_init2_with_non_qdev_drives(IDEBus *bus, DriveInfo *hd0,
         if (dinfo) {
             if (ide_init_drive(&bus->ifs[i], dinfo->bdrv,
                                dinfo->media_cd ? IDE_CD : IDE_HD, NULL,
-                               *dinfo->serial ? dinfo->serial : NULL) < 0) {
+                               *dinfo->serial ? dinfo->serial : NULL,
+                               NULL) < 0) {
                 error_report("Can't set up IDE drive %s", dinfo->id);
                 exit(1);
             }
diff --git a/hw/ide/internal.h b/hw/ide/internal.h
index c808a0d..b1319dc 100644
--- a/hw/ide/internal.h
+++ b/hw/ide/internal.h
@@ -348,6 +348,7 @@ struct IDEState {
     uint8_t identify_data[512];
     int drive_serial;
     char drive_serial_str[21];
+    char drive_model_str[41];
     /* ide regs */
     uint8_t feature;
     uint8_t error;
@@ -468,6 +469,7 @@ struct IDEDevice {
     BlockConf conf;
     char *version;
     char *serial;
+    char *model;
 };
 
 #define BM_STATUS_DMAING 0x01
@@ -534,7 +536,7 @@ void ide_data_writel(void *opaque, uint32_t addr, uint32_t val);
 uint32_t ide_data_readl(void *opaque, uint32_t addr);
 
 int ide_init_drive(IDEState *s, BlockDriverState *bs, IDEDriveKind kind,
-                   const char *version, const char *serial);
+                   const char *version, const char *serial, const char *model);
 void ide_init2(IDEBus *bus, qemu_irq irq);
 void ide_init2_with_non_qdev_drives(IDEBus *bus, DriveInfo *hd0,
                                     DriveInfo *hd1, qemu_irq irq);
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index f6a4896..07227c7 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -136,7 +136,8 @@ static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind kind)
         }
     }
 
-    if (ide_init_drive(s, dev->conf.bs, kind, dev->version, serial) < 0) {
+    if (ide_init_drive(s, dev->conf.bs, kind,
+                       dev->version, serial, dev->model) < 0) {
         return -1;
     }
 
@@ -173,7 +174,8 @@ static int ide_drive_initfn(IDEDevice *dev)
 #define DEFINE_IDE_DEV_PROPERTIES()                     \
     DEFINE_BLOCK_PROPERTIES(IDEDrive, dev.conf),        \
     DEFINE_PROP_STRING("ver",  IDEDrive, dev.version),  \
-    DEFINE_PROP_STRING("serial",  IDEDrive, dev.serial)
+    DEFINE_PROP_STRING("serial",  IDEDrive, dev.serial),\
+    DEFINE_PROP_STRING("model", IDEDrive, dev.model)
 
 static Property ide_hd_properties[] = {
     DEFINE_IDE_DEV_PROPERTIES(),
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 09/46] ide: Change serial number strncpy() to pstrcpy()
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (6 preceding siblings ...)
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 08/46] ide: Add "model=s" qdev option Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 10/46] ide: Adds wwn=hex qdev option Kevin Wolf
                   ` (31 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Floris Bos <bos@je-eigen-domein.nl>

strncpy may not null-terminate the destination string.

Cc: kwolf@redhat.com
Signed-off-by: Floris Bos <dev@noc-ps.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 blockdev.c    |    5 +++--
 hw/ide/core.c |    2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 1a500b8..f5e7dba 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -532,8 +532,9 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi)
     dinfo->unit = unit_id;
     dinfo->opts = opts;
     dinfo->refcount = 1;
-    if (serial)
-        strncpy(dinfo->serial, serial, sizeof(dinfo->serial) - 1);
+    if (serial) {
+        pstrcpy(dinfo->serial, sizeof(dinfo->serial), serial);
+    }
     QTAILQ_INSERT_TAIL(&drives, dinfo, next);
 
     bdrv_set_on_error(dinfo->bdrv, on_read_error, on_write_error);
diff --git a/hw/ide/core.c b/hw/ide/core.c
index e38cace..5647694 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -1879,7 +1879,7 @@ int ide_init_drive(IDEState *s, BlockDriverState *bs, IDEDriveKind kind,
         }
     }
     if (serial) {
-        strncpy(s->drive_serial_str, serial, sizeof(s->drive_serial_str));
+        pstrcpy(s->drive_serial_str, sizeof(s->drive_serial_str), serial);
     } else {
         snprintf(s->drive_serial_str, sizeof(s->drive_serial_str),
                  "QM%05d", s->drive_serial);
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 10/46] ide: Adds wwn=hex qdev option
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (7 preceding siblings ...)
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 09/46] ide: Change serial number strncpy() to pstrcpy() Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 11/46] block/vpc: write checksum back to footer after check Kevin Wolf
                   ` (30 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Floris Bos <bos@je-eigen-domein.nl>

Allow the user to specify a disk's World Wide Name.

Linux guests can address disks by their unique World Wide Name number
(e.g. /dev/disk/by-id/wwn-0x5001517959123522). This patch adds support
for assigning a World Wide Name number to a virtual IDE disk.

Cc: kwolf@redhat.com
Signed-off-by: Floris Bos <dev@noc-ps.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 hw/ide/core.c     |   29 +++++++++++++++++++++++------
 hw/ide/internal.h |    5 ++++-
 hw/ide/qdev.c     |    3 ++-
 3 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 5647694..6e25338 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -142,8 +142,12 @@ static void ide_identify(IDEState *s)
     put_le16(p + 82, (1 << 14) | (1 << 5) | 1);
     /* 13=flush_cache_ext,12=flush_cache,10=lba48 */
     put_le16(p + 83, (1 << 14) | (1 << 13) | (1 <<12) | (1 << 10));
-    /* 14=set to 1, 1=SMART self test, 0=SMART error logging */
-    put_le16(p + 84, (1 << 14) | 0);
+    /* 14=set to 1, 8=has WWN, 1=SMART self test, 0=SMART error logging */
+    if (s->wwn) {
+        put_le16(p + 84, (1 << 14) | (1 << 8) | 0);
+    } else {
+        put_le16(p + 84, (1 << 14) | 0);
+    }
     /* 14 = NOP supported, 5=WCACHE enabled, 0=SMART feature set enabled */
     if (bdrv_enable_write_cache(s->bs))
          put_le16(p + 85, (1 << 14) | (1 << 5) | 1);
@@ -151,8 +155,12 @@ static void ide_identify(IDEState *s)
          put_le16(p + 85, (1 << 14) | 1);
     /* 13=flush_cache_ext,12=flush_cache,10=lba48 */
     put_le16(p + 86, (1 << 13) | (1 <<12) | (1 << 10));
-    /* 14=set to 1, 1=smart self test, 0=smart error logging */
-    put_le16(p + 87, (1 << 14) | 0);
+    /* 14=set to 1, 8=has WWN, 1=SMART self test, 0=SMART error logging */
+    if (s->wwn) {
+        put_le16(p + 87, (1 << 14) | (1 << 8) | 0);
+    } else {
+        put_le16(p + 87, (1 << 14) | 0);
+    }
     put_le16(p + 88, 0x3f | (1 << 13)); /* udma5 set and supported */
     put_le16(p + 93, 1 | (1 << 14) | 0x2000);
     put_le16(p + 100, s->nb_sectors);
@@ -162,6 +170,13 @@ static void ide_identify(IDEState *s)
 
     if (dev && dev->conf.physical_block_size)
         put_le16(p + 106, 0x6000 | get_physical_block_exp(&dev->conf));
+    if (s->wwn) {
+        /* LE 16-bit words 111-108 contain 64-bit World Wide Name */
+        put_le16(p + 108, s->wwn >> 48);
+        put_le16(p + 109, s->wwn >> 32);
+        put_le16(p + 110, s->wwn >> 16);
+        put_le16(p + 111, s->wwn);
+    }
     if (dev && dev->conf.discard_granularity) {
         put_le16(p + 169, 1); /* TRIM support */
     }
@@ -1833,7 +1848,8 @@ static const BlockDevOps ide_cd_block_ops = {
 };
 
 int ide_init_drive(IDEState *s, BlockDriverState *bs, IDEDriveKind kind,
-                   const char *version, const char *serial, const char *model)
+                   const char *version, const char *serial, const char *model,
+                   uint64_t wwn)
 {
     int cylinders, heads, secs;
     uint64_t nb_sectors;
@@ -1859,6 +1875,7 @@ int ide_init_drive(IDEState *s, BlockDriverState *bs, IDEDriveKind kind,
     s->heads = heads;
     s->sectors = secs;
     s->nb_sectors = nb_sectors;
+    s->wwn = wwn;
     /* The SMART values should be preserved across power cycles
        but they aren't.  */
     s->smart_enabled = 1;
@@ -1993,7 +2010,7 @@ void ide_init2_with_non_qdev_drives(IDEBus *bus, DriveInfo *hd0,
             if (ide_init_drive(&bus->ifs[i], dinfo->bdrv,
                                dinfo->media_cd ? IDE_CD : IDE_HD, NULL,
                                *dinfo->serial ? dinfo->serial : NULL,
-                               NULL) < 0) {
+                               NULL, 0) < 0) {
                 error_report("Can't set up IDE drive %s", dinfo->id);
                 exit(1);
             }
diff --git a/hw/ide/internal.h b/hw/ide/internal.h
index b1319dc..100efd3 100644
--- a/hw/ide/internal.h
+++ b/hw/ide/internal.h
@@ -349,6 +349,7 @@ struct IDEState {
     int drive_serial;
     char drive_serial_str[21];
     char drive_model_str[41];
+    uint64_t wwn;
     /* ide regs */
     uint8_t feature;
     uint8_t error;
@@ -470,6 +471,7 @@ struct IDEDevice {
     char *version;
     char *serial;
     char *model;
+    uint64_t wwn;
 };
 
 #define BM_STATUS_DMAING 0x01
@@ -536,7 +538,8 @@ void ide_data_writel(void *opaque, uint32_t addr, uint32_t val);
 uint32_t ide_data_readl(void *opaque, uint32_t addr);
 
 int ide_init_drive(IDEState *s, BlockDriverState *bs, IDEDriveKind kind,
-                   const char *version, const char *serial, const char *model);
+                   const char *version, const char *serial, const char *model,
+                   uint64_t wwn);
 void ide_init2(IDEBus *bus, qemu_irq irq);
 void ide_init2_with_non_qdev_drives(IDEBus *bus, DriveInfo *hd0,
                                     DriveInfo *hd1, qemu_irq irq);
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 07227c7..a46578d 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -137,7 +137,7 @@ static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind kind)
     }
 
     if (ide_init_drive(s, dev->conf.bs, kind,
-                       dev->version, serial, dev->model) < 0) {
+                       dev->version, serial, dev->model, dev->wwn) < 0) {
         return -1;
     }
 
@@ -174,6 +174,7 @@ static int ide_drive_initfn(IDEDevice *dev)
 #define DEFINE_IDE_DEV_PROPERTIES()                     \
     DEFINE_BLOCK_PROPERTIES(IDEDrive, dev.conf),        \
     DEFINE_PROP_STRING("ver",  IDEDrive, dev.version),  \
+    DEFINE_PROP_HEX64("wwn",  IDEDrive, dev.wwn, 0),    \
     DEFINE_PROP_STRING("serial",  IDEDrive, dev.serial),\
     DEFINE_PROP_STRING("model", IDEDrive, dev.model)
 
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 11/46] block/vpc: write checksum back to footer after check
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (8 preceding siblings ...)
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 10/46] ide: Adds wwn=hex qdev option Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 12/46] qerror: fix QERR_PROPERTY_VALUE_OUT_OF_RANGE description Kevin Wolf
                   ` (29 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Zhang Shengju <zhangsju@gmail.com>

After validation check, the 'checksum' is not written back
to footer, which leave it with zero.

This results in errors while loadding it under Microsoft's
Hyper-V environment, and also errors from utilities like
Citrix's vhd-util.

Signed-off-by: Zhang Shengju <sean_zhang@trendmicro.com.cn>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vpc.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/block/vpc.c b/block/vpc.c
index 706faf3..5cd13d1 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -189,6 +189,9 @@ static int vpc_open(BlockDriverState *bs, int flags)
         fprintf(stderr, "block-vpc: The header checksum of '%s' is "
             "incorrect.\n", bs->filename);
 
+    /* Write 'checksum' back to footer, or else will leave it with zero. */
+    footer->checksum = be32_to_cpu(checksum);
+
     // The visible size of a image in Virtual PC depends on the geometry
     // rather than on the size stored in the footer (the size in the footer
     // is too large usually)
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 12/46] qerror: fix QERR_PROPERTY_VALUE_OUT_OF_RANGE description
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (9 preceding siblings ...)
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 11/46] block/vpc: write checksum back to footer after check Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 13/46] qdev: add blocksize property type Kevin Wolf
                   ` (28 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

Fix a typo in the description for QERR_PROPERTY_VALUE_OUT_OF_RANGE where
"'" was used instead of ")".

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 qerror.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/qerror.c b/qerror.c
index 41c729a..1f565fc 100644
--- a/qerror.c
+++ b/qerror.c
@@ -243,7 +243,7 @@ static const QErrorStringTable qerror_table[] = {
     {
         .error_fmt = QERR_PROPERTY_VALUE_OUT_OF_RANGE,
         .desc      = "Property '%(device).%(property)' doesn't take "
-                     "value %(value) (minimum: %(min), maximum: %(max)'",
+                     "value %(value) (minimum: %(min), maximum: %(max))",
     },
     {
         .error_fmt = QERR_QGA_COMMAND_FAILED,
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 13/46] qdev: add blocksize property type
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (10 preceding siblings ...)
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 12/46] qerror: fix QERR_PROPERTY_VALUE_OUT_OF_RANGE description Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 14/46] block: enforce constraints on block size properties Kevin Wolf
                   ` (27 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

Storage interfaces like virtio-blk can be configured with block size
information so that the guest can take advantage of efficient I/O
request sizes.

According to the SCSI Block Commands (SBC) standard a device's block
size is "almost always greater than one byte and may be a multiple of
512 bytes".  QEMU currently has a 512 byte minimum block size because
the block layer functions work at that granularity.  Furthermore, the
block size should be a power of 2 because QEMU calculates bitmasks from
the value.

Introduce a "blocksize" property type so devices can enforce these
constraints on block size values.  If the constraints are relaxed in the
future then this property can be updated.

Introduce the new PropertyValueNotPowerOf2 QError so QMP clients know
exactly why a block size value was rejected.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 hw/qdev-properties.c |   46 ++++++++++++++++++++++++++++++++++++++++++++++
 hw/qdev.h            |    3 +++
 qerror.c             |    5 +++++
 qerror.h             |    4 ++++
 4 files changed, 58 insertions(+), 0 deletions(-)

diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c
index bff9152..98dd06a 100644
--- a/hw/qdev-properties.c
+++ b/hw/qdev-properties.c
@@ -877,6 +877,52 @@ PropertyInfo qdev_prop_pci_devfn = {
     .max   = 0xFFFFFFFFULL,
 };
 
+/* --- blocksize --- */
+
+static void set_blocksize(Object *obj, Visitor *v, void *opaque,
+                          const char *name, Error **errp)
+{
+    DeviceState *dev = DEVICE(obj);
+    Property *prop = opaque;
+    int16_t *ptr = qdev_get_prop_ptr(dev, prop);
+    Error *local_err = NULL;
+    int64_t value;
+
+    if (dev->state != DEV_STATE_CREATED) {
+        error_set(errp, QERR_PERMISSION_DENIED);
+        return;
+    }
+
+    visit_type_int(v, &value, name, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+    if (value < prop->info->min || value > prop->info->max) {
+        error_set(errp, QERR_PROPERTY_VALUE_OUT_OF_RANGE,
+                  dev->id?:"", name, value, prop->info->min,
+                  prop->info->max);
+        return;
+    }
+
+    /* We rely on power-of-2 blocksizes for bitmasks */
+    if ((value & (value - 1)) != 0) {
+        error_set(errp, QERR_PROPERTY_VALUE_NOT_POWER_OF_2,
+                  dev->id?:"", name, value);
+        return;
+    }
+
+    *ptr = value;
+}
+
+PropertyInfo qdev_prop_blocksize = {
+    .name  = "blocksize",
+    .get   = get_int16,
+    .set   = set_blocksize,
+    .min   = 512,
+    .max   = 65024,
+};
+
 /* --- public helpers --- */
 
 static Property *qdev_prop_walk(Property *props, const char *name)
diff --git a/hw/qdev.h b/hw/qdev.h
index a8df42f..1c10960 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -223,6 +223,7 @@ extern PropertyInfo qdev_prop_drive;
 extern PropertyInfo qdev_prop_netdev;
 extern PropertyInfo qdev_prop_vlan;
 extern PropertyInfo qdev_prop_pci_devfn;
+extern PropertyInfo qdev_prop_blocksize;
 
 #define DEFINE_PROP(_name, _state, _field, _prop, _type) { \
         .name      = (_name),                                    \
@@ -284,6 +285,8 @@ extern PropertyInfo qdev_prop_pci_devfn;
 #define DEFINE_PROP_LOSTTICKPOLICY(_n, _s, _f, _d) \
     DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_losttickpolicy, \
                         LostTickPolicy)
+#define DEFINE_PROP_BLOCKSIZE(_n, _s, _f, _d) \
+    DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_blocksize, uint16_t)
 
 #define DEFINE_PROP_END_OF_LIST()               \
     {}
diff --git a/qerror.c b/qerror.c
index 1f565fc..96fbe71 100644
--- a/qerror.c
+++ b/qerror.c
@@ -241,6 +241,11 @@ static const QErrorStringTable qerror_table[] = {
         .desc      = "Property '%(device).%(property)' can't find value '%(value)'",
     },
     {
+        .error_fmt = QERR_PROPERTY_VALUE_NOT_POWER_OF_2,
+        .desc      = "Property '%(device).%(property)' doesn't take "
+                     "value '%(value)', it's not a power of 2",
+    },
+    {
         .error_fmt = QERR_PROPERTY_VALUE_OUT_OF_RANGE,
         .desc      = "Property '%(device).%(property)' doesn't take "
                      "value %(value) (minimum: %(min), maximum: %(max))",
diff --git a/qerror.h b/qerror.h
index e16f9c2..5c23c1f 100644
--- a/qerror.h
+++ b/qerror.h
@@ -202,6 +202,10 @@ QError *qobject_to_qerror(const QObject *obj);
 #define QERR_PROPERTY_VALUE_NOT_FOUND \
     "{ 'class': 'PropertyValueNotFound', 'data': { 'device': %s, 'property': %s, 'value': %s } }"
 
+#define QERR_PROPERTY_VALUE_NOT_POWER_OF_2 \
+    "{ 'class': 'PropertyValueNotPowerOf2', 'data': { " \
+    "'device': %s, 'property': %s, 'value': %"PRId64" } }"
+
 #define QERR_PROPERTY_VALUE_OUT_OF_RANGE \
     "{ 'class': 'PropertyValueOutOfRange', 'data': { 'device': %s, 'property': %s, 'value': %"PRId64", 'min': %"PRId64", 'max': %"PRId64" } }"
 
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 14/46] block: enforce constraints on block size properties
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (11 preceding siblings ...)
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 13/46] qdev: add blocksize property type Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 15/46] vdi: basic conversion to coroutines Kevin Wolf
                   ` (26 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

Nicolae Mogoreanu <mogo@google.com> noticed that I/O requests can lead
to QEMU crashes when the logical_block_size property is smaller than 512
bytes.

Using the new "blocksize" property we can properly enforce constraints
on the block size such that QEMU's block layer is able to operate
correctly.

Reported-by: Nicolae Mogoreanu <mogo@google.com>
Reported-by: Michael Halcrow <mhalcrow@google.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.h |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/block.h b/block.h
index e670606..c51ab16 100644
--- a/block.h
+++ b/block.h
@@ -435,10 +435,10 @@ static inline unsigned int get_physical_block_exp(BlockConf *conf)
 
 #define DEFINE_BLOCK_PROPERTIES(_state, _conf)                          \
     DEFINE_PROP_DRIVE("drive", _state, _conf.bs),                       \
-    DEFINE_PROP_UINT16("logical_block_size", _state,                    \
-                       _conf.logical_block_size, 512),                  \
-    DEFINE_PROP_UINT16("physical_block_size", _state,                   \
-                       _conf.physical_block_size, 512),                 \
+    DEFINE_PROP_BLOCKSIZE("logical_block_size", _state,                 \
+                          _conf.logical_block_size, 512),               \
+    DEFINE_PROP_BLOCKSIZE("physical_block_size", _state,                \
+                          _conf.physical_block_size, 512),              \
     DEFINE_PROP_UINT16("min_io_size", _state, _conf.min_io_size, 0),  \
     DEFINE_PROP_UINT32("opt_io_size", _state, _conf.opt_io_size, 0),    \
     DEFINE_PROP_INT32("bootindex", _state, _conf.bootindex, -1),        \
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 15/46] vdi: basic conversion to coroutines
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (12 preceding siblings ...)
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 14/46] block: enforce constraints on block size properties Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 16/46] vdi: move end-of-I/O handling at the end Kevin Wolf
                   ` (25 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Paolo Bonzini <pbonzini@redhat.com>

Even a basic conversion changing the bdrv_aio_readv/bdrv_aio_writev calls
to bdrv_co_readv/bdrv_co_writev, and callbacks to goto statements can
eliminate a lot of code.  This is because error handling is simplified
and indirections through bottom halves can go away.

After this patch, I/O to the underlying file already happens via
coroutines, but the code still looks a lot like if asynchronous I/O was
being used.

Acked-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vdi.c |  158 ++++++++++++++---------------------------------------------
 1 files changed, 37 insertions(+), 121 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index f8fa865..5fd8700 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -160,7 +160,6 @@ typedef struct {
     void *orig_buf;
     bool is_write;
     int header_modified;
-    BlockDriverAIOCB *hd_aiocb;
     struct iovec hd_iov;
     QEMUIOVector hd_qiov;
     QEMUBH *bh;
@@ -489,33 +488,19 @@ static int coroutine_fn vdi_co_is_allocated(BlockDriverState *bs,
     return VDI_IS_ALLOCATED(bmap_entry);
 }
 
-static void vdi_aio_cancel(BlockDriverAIOCB *blockacb)
-{
-    /* TODO: This code is untested. How can I get it executed? */
-    VdiAIOCB *acb = container_of(blockacb, VdiAIOCB, common);
-    logout("\n");
-    if (acb->hd_aiocb) {
-        bdrv_aio_cancel(acb->hd_aiocb);
-    }
-    qemu_aio_release(acb);
-}
-
 static AIOPool vdi_aio_pool = {
     .aiocb_size = sizeof(VdiAIOCB),
-    .cancel = vdi_aio_cancel,
 };
 
 static VdiAIOCB *vdi_aio_setup(BlockDriverState *bs, int64_t sector_num,
-        QEMUIOVector *qiov, int nb_sectors,
-        BlockDriverCompletionFunc *cb, void *opaque, int is_write)
+        QEMUIOVector *qiov, int nb_sectors, int is_write)
 {
     VdiAIOCB *acb;
 
     logout("%p, %" PRId64 ", %p, %d, %p, %p, %d\n",
-           bs, sector_num, qiov, nb_sectors, cb, opaque, is_write);
+           bs, sector_num, qiov, nb_sectors, is_write);
 
-    acb = qemu_aio_get(&vdi_aio_pool, bs, cb, opaque);
-    acb->hd_aiocb = NULL;
+    acb = qemu_aio_get(&vdi_aio_pool, bs, NULL, NULL);
     acb->sector_num = sector_num;
     acb->qiov = qiov;
     acb->is_write = is_write;
@@ -538,42 +523,7 @@ static VdiAIOCB *vdi_aio_setup(BlockDriverState *bs, int64_t sector_num,
     return acb;
 }
 
-static int vdi_schedule_bh(QEMUBHFunc *cb, VdiAIOCB *acb)
-{
-    logout("\n");
-
-    if (acb->bh) {
-        return -EIO;
-    }
-
-    acb->bh = qemu_bh_new(cb, acb);
-    if (!acb->bh) {
-        return -EIO;
-    }
-
-    qemu_bh_schedule(acb->bh);
-
-    return 0;
-}
-
-static void vdi_aio_read_cb(void *opaque, int ret);
-static void vdi_aio_write_cb(void *opaque, int ret);
-
-static void vdi_aio_rw_bh(void *opaque)
-{
-    VdiAIOCB *acb = opaque;
-    logout("\n");
-    qemu_bh_delete(acb->bh);
-    acb->bh = NULL;
-
-    if (acb->is_write) {
-        vdi_aio_write_cb(opaque, 0);
-    } else {
-        vdi_aio_read_cb(opaque, 0);
-    }
-}
-
-static void vdi_aio_read_cb(void *opaque, int ret)
+static int vdi_aio_read_cb(void *opaque, int ret)
 {
     VdiAIOCB *acb = opaque;
     BlockDriverState *bs = acb->common.bs;
@@ -585,12 +535,7 @@ static void vdi_aio_read_cb(void *opaque, int ret)
 
     logout("%u sectors read\n", acb->n_sectors);
 
-    acb->hd_aiocb = NULL;
-
-    if (ret < 0) {
-        goto done;
-    }
-
+restart:
     acb->nb_sectors -= acb->n_sectors;
 
     if (acb->nb_sectors == 0) {
@@ -618,10 +563,7 @@ static void vdi_aio_read_cb(void *opaque, int ret)
     if (!VDI_IS_ALLOCATED(bmap_entry)) {
         /* Block not allocated, return zeros, no need to wait. */
         memset(acb->buf, 0, n_sectors * SECTOR_SIZE);
-        ret = vdi_schedule_bh(vdi_aio_rw_bh, acb);
-        if (ret < 0) {
-            goto done;
-        }
+        ret = 0;
     } else {
         uint64_t offset = s->header.offset_data / SECTOR_SIZE +
                           (uint64_t)bmap_entry * s->block_sectors +
@@ -629,41 +571,34 @@ static void vdi_aio_read_cb(void *opaque, int ret)
         acb->hd_iov.iov_base = (void *)acb->buf;
         acb->hd_iov.iov_len = n_sectors * SECTOR_SIZE;
         qemu_iovec_init_external(&acb->hd_qiov, &acb->hd_iov, 1);
-        acb->hd_aiocb = bdrv_aio_readv(bs->file, offset, &acb->hd_qiov,
-                                       n_sectors, vdi_aio_read_cb, acb);
+        ret = bdrv_co_readv(bs->file, offset, n_sectors, &acb->hd_qiov);
+    }
+    if (ret >= 0) {
+        goto restart;
     }
-    return;
+
 done:
     if (acb->qiov->niov > 1) {
         qemu_iovec_from_buffer(acb->qiov, acb->orig_buf, acb->qiov->size);
         qemu_vfree(acb->orig_buf);
     }
-    acb->common.cb(acb->common.opaque, ret);
     qemu_aio_release(acb);
+    return ret;
 }
 
-static BlockDriverAIOCB *vdi_aio_readv(BlockDriverState *bs,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-        BlockDriverCompletionFunc *cb, void *opaque)
+static int vdi_co_readv(BlockDriverState *bs,
+        int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
 {
     VdiAIOCB *acb;
     int ret;
 
     logout("\n");
-    acb = vdi_aio_setup(bs, sector_num, qiov, nb_sectors, cb, opaque, 0);
-    ret = vdi_schedule_bh(vdi_aio_rw_bh, acb);
-    if (ret < 0) {
-        if (acb->qiov->niov > 1) {
-            qemu_vfree(acb->orig_buf);
-        }
-        qemu_aio_release(acb);
-        return NULL;
-    }
-
-    return &acb->common;
+    acb = vdi_aio_setup(bs, sector_num, qiov, nb_sectors, 0);
+    ret = vdi_aio_read_cb(acb, 0);
+    return ret;
 }
 
-static void vdi_aio_write_cb(void *opaque, int ret)
+static int vdi_aio_write_cb(void *opaque, int ret)
 {
     VdiAIOCB *acb = opaque;
     BlockDriverState *bs = acb->common.bs;
@@ -673,12 +608,7 @@ static void vdi_aio_write_cb(void *opaque, int ret)
     uint32_t sector_in_block;
     uint32_t n_sectors;
 
-    acb->hd_aiocb = NULL;
-
-    if (ret < 0) {
-        goto done;
-    }
-
+restart:
     acb->nb_sectors -= acb->n_sectors;
     acb->sector_num += acb->n_sectors;
     acb->buf += acb->n_sectors * SECTOR_SIZE;
@@ -686,6 +616,7 @@ static void vdi_aio_write_cb(void *opaque, int ret)
     if (acb->nb_sectors == 0) {
         logout("finished data write\n");
         acb->n_sectors = 0;
+        ret = 0;
         if (acb->header_modified) {
             VdiHeader *header = acb->block_buffer;
             logout("now writing modified header\n");
@@ -696,10 +627,9 @@ static void vdi_aio_write_cb(void *opaque, int ret)
             acb->hd_iov.iov_base = acb->block_buffer;
             acb->hd_iov.iov_len = SECTOR_SIZE;
             qemu_iovec_init_external(&acb->hd_qiov, &acb->hd_iov, 1);
-            acb->hd_aiocb = bdrv_aio_writev(bs->file, 0, &acb->hd_qiov, 1,
-                                            vdi_aio_write_cb, acb);
-            return;
-        } else if (VDI_IS_ALLOCATED(acb->bmap_first)) {
+            ret = bdrv_co_writev(bs->file, 0, 1, &acb->hd_qiov);
+        }
+        if (ret >= 0 && VDI_IS_ALLOCATED(acb->bmap_first)) {
             /* One or more new blocks were allocated. */
             uint64_t offset;
             uint32_t bmap_first;
@@ -722,11 +652,8 @@ static void vdi_aio_write_cb(void *opaque, int ret)
             qemu_iovec_init_external(&acb->hd_qiov, &acb->hd_iov, 1);
             logout("will write %u block map sectors starting from entry %u\n",
                    n_sectors, bmap_first);
-            acb->hd_aiocb = bdrv_aio_writev(bs->file, offset, &acb->hd_qiov,
-                                            n_sectors, vdi_aio_write_cb, acb);
-            return;
+            ret = bdrv_co_writev(bs->file, offset, n_sectors, &acb->hd_qiov);
         }
-        ret = 0;
         goto done;
     }
 
@@ -772,9 +699,7 @@ static void vdi_aio_write_cb(void *opaque, int ret)
         acb->hd_iov.iov_base = (void *)block;
         acb->hd_iov.iov_len = s->block_size;
         qemu_iovec_init_external(&acb->hd_qiov, &acb->hd_iov, 1);
-        acb->hd_aiocb = bdrv_aio_writev(bs->file, offset,
-                                        &acb->hd_qiov, s->block_sectors,
-                                        vdi_aio_write_cb, acb);
+        ret = bdrv_co_writev(bs->file, offset, s->block_sectors, &acb->hd_qiov);
     } else {
         uint64_t offset = s->header.offset_data / SECTOR_SIZE +
                           (uint64_t)bmap_entry * s->block_sectors +
@@ -782,39 +707,30 @@ static void vdi_aio_write_cb(void *opaque, int ret)
         acb->hd_iov.iov_base = (void *)acb->buf;
         acb->hd_iov.iov_len = n_sectors * SECTOR_SIZE;
         qemu_iovec_init_external(&acb->hd_qiov, &acb->hd_iov, 1);
-        acb->hd_aiocb = bdrv_aio_writev(bs->file, offset, &acb->hd_qiov,
-                                        n_sectors, vdi_aio_write_cb, acb);
+        ret = bdrv_co_writev(bs->file, offset, n_sectors, &acb->hd_qiov);
+    }
+    if (ret >= 0) {
+        goto restart;
     }
-
-    return;
 
 done:
     if (acb->qiov->niov > 1) {
         qemu_vfree(acb->orig_buf);
     }
-    acb->common.cb(acb->common.opaque, ret);
     qemu_aio_release(acb);
+    return ret;
 }
 
-static BlockDriverAIOCB *vdi_aio_writev(BlockDriverState *bs,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-        BlockDriverCompletionFunc *cb, void *opaque)
+static int vdi_co_writev(BlockDriverState *bs,
+        int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
 {
     VdiAIOCB *acb;
     int ret;
 
     logout("\n");
-    acb = vdi_aio_setup(bs, sector_num, qiov, nb_sectors, cb, opaque, 1);
-    ret = vdi_schedule_bh(vdi_aio_rw_bh, acb);
-    if (ret < 0) {
-        if (acb->qiov->niov > 1) {
-            qemu_vfree(acb->orig_buf);
-        }
-        qemu_aio_release(acb);
-        return NULL;
-    }
-
-    return &acb->common;
+    acb = vdi_aio_setup(bs, sector_num, qiov, nb_sectors, 1);
+    ret = vdi_aio_write_cb(acb, 0);
+    return ret;
 }
 
 static int vdi_create(const char *filename, QEMUOptionParameter *options)
@@ -965,9 +881,9 @@ static BlockDriver bdrv_vdi = {
     .bdrv_co_is_allocated = vdi_co_is_allocated,
     .bdrv_make_empty = vdi_make_empty,
 
-    .bdrv_aio_readv = vdi_aio_readv,
+    .bdrv_co_readv = vdi_co_readv,
 #if defined(CONFIG_VDI_WRITE)
-    .bdrv_aio_writev = vdi_aio_writev,
+    .bdrv_co_writev = vdi_co_writev,
 #endif
 
     .bdrv_get_info = vdi_get_info,
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 16/46] vdi: move end-of-I/O handling at the end
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (13 preceding siblings ...)
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 15/46] vdi: basic conversion to coroutines Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 17/46] vdi: merge aio_read_cb and aio_write_cb into callers Kevin Wolf
                   ` (24 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Paolo Bonzini <pbonzini@redhat.com>

The next step is to take code that only triggers after the first operation,
and move it at the end of vdi_aio_read_cb and vdi_aio_write_cb.

Acked-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vdi.c |  123 +++++++++++++++++++++++++++--------------------------------
 1 files changed, 56 insertions(+), 67 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index 5fd8700..df0f431 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -533,20 +533,7 @@ static int vdi_aio_read_cb(void *opaque, int ret)
     uint32_t sector_in_block;
     uint32_t n_sectors;
 
-    logout("%u sectors read\n", acb->n_sectors);
-
 restart:
-    acb->nb_sectors -= acb->n_sectors;
-
-    if (acb->nb_sectors == 0) {
-        /* request completed */
-        ret = 0;
-        goto done;
-    }
-
-    acb->sector_num += acb->n_sectors;
-    acb->buf += acb->n_sectors * SECTOR_SIZE;
-
     block_index = acb->sector_num / s->block_sectors;
     sector_in_block = acb->sector_num % s->block_sectors;
     n_sectors = s->block_sectors - sector_in_block;
@@ -573,11 +560,16 @@ restart:
         qemu_iovec_init_external(&acb->hd_qiov, &acb->hd_iov, 1);
         ret = bdrv_co_readv(bs->file, offset, n_sectors, &acb->hd_qiov);
     }
-    if (ret >= 0) {
+    logout("%u sectors read\n", acb->n_sectors);
+
+    acb->nb_sectors -= acb->n_sectors;
+    acb->sector_num += acb->n_sectors;
+    acb->buf += acb->n_sectors * SECTOR_SIZE;
+
+    if (ret >= 0 && acb->nb_sectors > 0) {
         goto restart;
     }
 
-done:
     if (acb->qiov->niov > 1) {
         qemu_iovec_from_buffer(acb->qiov, acb->orig_buf, acb->qiov->size);
         qemu_vfree(acb->orig_buf);
@@ -609,56 +601,6 @@ static int vdi_aio_write_cb(void *opaque, int ret)
     uint32_t n_sectors;
 
 restart:
-    acb->nb_sectors -= acb->n_sectors;
-    acb->sector_num += acb->n_sectors;
-    acb->buf += acb->n_sectors * SECTOR_SIZE;
-
-    if (acb->nb_sectors == 0) {
-        logout("finished data write\n");
-        acb->n_sectors = 0;
-        ret = 0;
-        if (acb->header_modified) {
-            VdiHeader *header = acb->block_buffer;
-            logout("now writing modified header\n");
-            assert(VDI_IS_ALLOCATED(acb->bmap_first));
-            *header = s->header;
-            vdi_header_to_le(header);
-            acb->header_modified = 0;
-            acb->hd_iov.iov_base = acb->block_buffer;
-            acb->hd_iov.iov_len = SECTOR_SIZE;
-            qemu_iovec_init_external(&acb->hd_qiov, &acb->hd_iov, 1);
-            ret = bdrv_co_writev(bs->file, 0, 1, &acb->hd_qiov);
-        }
-        if (ret >= 0 && VDI_IS_ALLOCATED(acb->bmap_first)) {
-            /* One or more new blocks were allocated. */
-            uint64_t offset;
-            uint32_t bmap_first;
-            uint32_t bmap_last;
-            g_free(acb->block_buffer);
-            acb->block_buffer = NULL;
-            bmap_first = acb->bmap_first;
-            bmap_last = acb->bmap_last;
-            logout("now writing modified block map entry %u...%u\n",
-                   bmap_first, bmap_last);
-            /* Write modified sectors from block map. */
-            bmap_first /= (SECTOR_SIZE / sizeof(uint32_t));
-            bmap_last /= (SECTOR_SIZE / sizeof(uint32_t));
-            n_sectors = bmap_last - bmap_first + 1;
-            offset = s->bmap_sector + bmap_first;
-            acb->bmap_first = VDI_UNALLOCATED;
-            acb->hd_iov.iov_base = (void *)((uint8_t *)&s->bmap[0] +
-                                            bmap_first * SECTOR_SIZE);
-            acb->hd_iov.iov_len = n_sectors * SECTOR_SIZE;
-            qemu_iovec_init_external(&acb->hd_qiov, &acb->hd_iov, 1);
-            logout("will write %u block map sectors starting from entry %u\n",
-                   n_sectors, bmap_first);
-            ret = bdrv_co_writev(bs->file, offset, n_sectors, &acb->hd_qiov);
-        }
-        goto done;
-    }
-
-    logout("%u sectors written\n", acb->n_sectors);
-
     block_index = acb->sector_num / s->block_sectors;
     sector_in_block = acb->sector_num % s->block_sectors;
     n_sectors = s->block_sectors - sector_in_block;
@@ -709,11 +651,58 @@ restart:
         qemu_iovec_init_external(&acb->hd_qiov, &acb->hd_iov, 1);
         ret = bdrv_co_writev(bs->file, offset, n_sectors, &acb->hd_qiov);
     }
-    if (ret >= 0) {
+
+    acb->nb_sectors -= acb->n_sectors;
+    acb->sector_num += acb->n_sectors;
+    acb->buf += acb->n_sectors * SECTOR_SIZE;
+
+    logout("%u sectors written\n", acb->n_sectors);
+    if (ret >= 0 && acb->nb_sectors > 0) {
         goto restart;
     }
 
-done:
+    logout("finished data write\n");
+    if (ret >= 0) {
+        ret = 0;
+        if (acb->header_modified) {
+            VdiHeader *header = acb->block_buffer;
+            logout("now writing modified header\n");
+            assert(VDI_IS_ALLOCATED(acb->bmap_first));
+            *header = s->header;
+            vdi_header_to_le(header);
+            acb->header_modified = 0;
+            acb->hd_iov.iov_base = acb->block_buffer;
+            acb->hd_iov.iov_len = SECTOR_SIZE;
+            qemu_iovec_init_external(&acb->hd_qiov, &acb->hd_iov, 1);
+            ret = bdrv_co_writev(bs->file, 0, 1, &acb->hd_qiov);
+        }
+        if (ret >= 0 && VDI_IS_ALLOCATED(acb->bmap_first)) {
+            /* One or more new blocks were allocated. */
+            uint64_t offset;
+            uint32_t bmap_first;
+            uint32_t bmap_last;
+            g_free(acb->block_buffer);
+            acb->block_buffer = NULL;
+            bmap_first = acb->bmap_first;
+            bmap_last = acb->bmap_last;
+            logout("now writing modified block map entry %u...%u\n",
+                   bmap_first, bmap_last);
+            /* Write modified sectors from block map. */
+            bmap_first /= (SECTOR_SIZE / sizeof(uint32_t));
+            bmap_last /= (SECTOR_SIZE / sizeof(uint32_t));
+            n_sectors = bmap_last - bmap_first + 1;
+            offset = s->bmap_sector + bmap_first;
+            acb->bmap_first = VDI_UNALLOCATED;
+            acb->hd_iov.iov_base = (void *)((uint8_t *)&s->bmap[0] +
+                                            bmap_first * SECTOR_SIZE);
+            acb->hd_iov.iov_len = n_sectors * SECTOR_SIZE;
+            qemu_iovec_init_external(&acb->hd_qiov, &acb->hd_iov, 1);
+            logout("will write %u block map sectors starting from entry %u\n",
+                   n_sectors, bmap_first);
+            ret = bdrv_co_writev(bs->file, offset, n_sectors, &acb->hd_qiov);
+        }
+    }
+
     if (acb->qiov->niov > 1) {
         qemu_vfree(acb->orig_buf);
     }
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 17/46] vdi: merge aio_read_cb and aio_write_cb into callers
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (14 preceding siblings ...)
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 16/46] vdi: move end-of-I/O handling at the end Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 18/46] vdi: move aiocb fields to locals Kevin Wolf
                   ` (23 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Paolo Bonzini <pbonzini@redhat.com>

Now inline the former AIO callbacks into vdi_co_readv and vdi_co_writev.
While many cleanups are possible, the code now really looks synchronous.

Acked-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vdi.c |   40 ++++++++++++----------------------------
 1 files changed, 12 insertions(+), 28 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index df0f431..407fccc 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -523,15 +523,19 @@ static VdiAIOCB *vdi_aio_setup(BlockDriverState *bs, int64_t sector_num,
     return acb;
 }
 
-static int vdi_aio_read_cb(void *opaque, int ret)
+static int vdi_co_readv(BlockDriverState *bs,
+        int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
 {
-    VdiAIOCB *acb = opaque;
-    BlockDriverState *bs = acb->common.bs;
+    VdiAIOCB *acb;
     BDRVVdiState *s = bs->opaque;
     uint32_t bmap_entry;
     uint32_t block_index;
     uint32_t sector_in_block;
     uint32_t n_sectors;
+    int ret;
+
+    logout("\n");
+    acb = vdi_aio_setup(bs, sector_num, qiov, nb_sectors, 0);
 
 restart:
     block_index = acb->sector_num / s->block_sectors;
@@ -578,27 +582,19 @@ restart:
     return ret;
 }
 
-static int vdi_co_readv(BlockDriverState *bs,
+static int vdi_co_writev(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
 {
     VdiAIOCB *acb;
-    int ret;
-
-    logout("\n");
-    acb = vdi_aio_setup(bs, sector_num, qiov, nb_sectors, 0);
-    ret = vdi_aio_read_cb(acb, 0);
-    return ret;
-}
-
-static int vdi_aio_write_cb(void *opaque, int ret)
-{
-    VdiAIOCB *acb = opaque;
-    BlockDriverState *bs = acb->common.bs;
     BDRVVdiState *s = bs->opaque;
     uint32_t bmap_entry;
     uint32_t block_index;
     uint32_t sector_in_block;
     uint32_t n_sectors;
+    int ret;
+
+    logout("\n");
+    acb = vdi_aio_setup(bs, sector_num, qiov, nb_sectors, 1);
 
 restart:
     block_index = acb->sector_num / s->block_sectors;
@@ -710,18 +706,6 @@ restart:
     return ret;
 }
 
-static int vdi_co_writev(BlockDriverState *bs,
-        int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
-{
-    VdiAIOCB *acb;
-    int ret;
-
-    logout("\n");
-    acb = vdi_aio_setup(bs, sector_num, qiov, nb_sectors, 1);
-    ret = vdi_aio_write_cb(acb, 0);
-    return ret;
-}
-
 static int vdi_create(const char *filename, QEMUOptionParameter *options)
 {
     int fd;
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 18/46] vdi: move aiocb fields to locals
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (15 preceding siblings ...)
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 17/46] vdi: merge aio_read_cb and aio_write_cb into callers Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 19/46] vdi: leave bounce buffering to block layer Kevin Wolf
                   ` (22 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Paolo Bonzini <pbonzini@redhat.com>

Most of the AIOCB really holds local variables that need to persist
across callback invocation.  It can go away now.

Acked-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vdi.c |  163 +++++++++++++++++++++++-----------------------------------
 1 files changed, 65 insertions(+), 98 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index 407fccc..0c1cff6 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -145,24 +145,8 @@ void uuid_unparse(const uuid_t uu, char *out)
 
 typedef struct {
     BlockDriverAIOCB common;
-    int64_t sector_num;
-    QEMUIOVector *qiov;
     uint8_t *buf;
-    /* Total number of sectors. */
-    int nb_sectors;
-    /* Number of sectors for current AIO. */
-    int n_sectors;
-    /* New allocated block map entry. */
-    uint32_t bmap_first;
-    uint32_t bmap_last;
-    /* Buffer for new allocated block. */
-    void *block_buffer;
     void *orig_buf;
-    bool is_write;
-    int header_modified;
-    struct iovec hd_iov;
-    QEMUIOVector hd_qiov;
-    QEMUBH *bh;
 } VdiAIOCB;
 
 typedef struct {
@@ -492,34 +476,20 @@ static AIOPool vdi_aio_pool = {
     .aiocb_size = sizeof(VdiAIOCB),
 };
 
-static VdiAIOCB *vdi_aio_setup(BlockDriverState *bs, int64_t sector_num,
-        QEMUIOVector *qiov, int nb_sectors, int is_write)
+static VdiAIOCB *vdi_aio_setup(BlockDriverState *bs, QEMUIOVector *qiov)
 {
     VdiAIOCB *acb;
 
-    logout("%p, %" PRId64 ", %p, %d, %p, %p, %d\n",
-           bs, sector_num, qiov, nb_sectors, is_write);
+    logout("%p, %p\n", bs, qiov);
 
     acb = qemu_aio_get(&vdi_aio_pool, bs, NULL, NULL);
-    acb->sector_num = sector_num;
-    acb->qiov = qiov;
-    acb->is_write = is_write;
 
     if (qiov->niov > 1) {
         acb->buf = qemu_blockalign(bs, qiov->size);
         acb->orig_buf = acb->buf;
-        if (is_write) {
-            qemu_iovec_to_buffer(qiov, acb->buf);
-        }
     } else {
         acb->buf = (uint8_t *)qiov->iov->iov_base;
     }
-    acb->nb_sectors = nb_sectors;
-    acb->n_sectors = 0;
-    acb->bmap_first = VDI_UNALLOCATED;
-    acb->bmap_last = VDI_UNALLOCATED;
-    acb->block_buffer = NULL;
-    acb->header_modified = 0;
     return acb;
 }
 
@@ -532,24 +502,25 @@ static int vdi_co_readv(BlockDriverState *bs,
     uint32_t block_index;
     uint32_t sector_in_block;
     uint32_t n_sectors;
+    struct iovec hd_iov;
+    QEMUIOVector hd_qiov;
     int ret;
 
     logout("\n");
-    acb = vdi_aio_setup(bs, sector_num, qiov, nb_sectors, 0);
+    acb = vdi_aio_setup(bs, qiov);
 
 restart:
-    block_index = acb->sector_num / s->block_sectors;
-    sector_in_block = acb->sector_num % s->block_sectors;
+    block_index = sector_num / s->block_sectors;
+    sector_in_block = sector_num % s->block_sectors;
     n_sectors = s->block_sectors - sector_in_block;
-    if (n_sectors > acb->nb_sectors) {
-        n_sectors = acb->nb_sectors;
+    if (n_sectors > nb_sectors) {
+        n_sectors = nb_sectors;
     }
 
     logout("will read %u sectors starting at sector %" PRIu64 "\n",
-           n_sectors, acb->sector_num);
+           n_sectors, sector_num);
 
     /* prepare next AIO request */
-    acb->n_sectors = n_sectors;
     bmap_entry = le32_to_cpu(s->bmap[block_index]);
     if (!VDI_IS_ALLOCATED(bmap_entry)) {
         /* Block not allocated, return zeros, no need to wait. */
@@ -559,23 +530,23 @@ restart:
         uint64_t offset = s->header.offset_data / SECTOR_SIZE +
                           (uint64_t)bmap_entry * s->block_sectors +
                           sector_in_block;
-        acb->hd_iov.iov_base = (void *)acb->buf;
-        acb->hd_iov.iov_len = n_sectors * SECTOR_SIZE;
-        qemu_iovec_init_external(&acb->hd_qiov, &acb->hd_iov, 1);
-        ret = bdrv_co_readv(bs->file, offset, n_sectors, &acb->hd_qiov);
+        hd_iov.iov_base = (void *)acb->buf;
+        hd_iov.iov_len = n_sectors * SECTOR_SIZE;
+        qemu_iovec_init_external(&hd_qiov, &hd_iov, 1);
+        ret = bdrv_co_readv(bs->file, offset, n_sectors, &hd_qiov);
     }
-    logout("%u sectors read\n", acb->n_sectors);
+    logout("%u sectors read\n", n_sectors);
 
-    acb->nb_sectors -= acb->n_sectors;
-    acb->sector_num += acb->n_sectors;
-    acb->buf += acb->n_sectors * SECTOR_SIZE;
+    nb_sectors -= n_sectors;
+    sector_num += n_sectors;
+    acb->buf += n_sectors * SECTOR_SIZE;
 
-    if (ret >= 0 && acb->nb_sectors > 0) {
+    if (ret >= 0 && nb_sectors > 0) {
         goto restart;
     }
 
-    if (acb->qiov->niov > 1) {
-        qemu_iovec_from_buffer(acb->qiov, acb->orig_buf, acb->qiov->size);
+    if (acb->orig_buf) {
+        qemu_iovec_from_buffer(qiov, acb->orig_buf, qiov->size);
         qemu_vfree(acb->orig_buf);
     }
     qemu_aio_release(acb);
@@ -591,96 +562,93 @@ static int vdi_co_writev(BlockDriverState *bs,
     uint32_t block_index;
     uint32_t sector_in_block;
     uint32_t n_sectors;
+    uint32_t bmap_first = VDI_UNALLOCATED;
+    uint32_t bmap_last = VDI_UNALLOCATED;
+    struct iovec hd_iov;
+    QEMUIOVector hd_qiov;
+    uint8_t *block = NULL;
     int ret;
 
     logout("\n");
-    acb = vdi_aio_setup(bs, sector_num, qiov, nb_sectors, 1);
+    acb = vdi_aio_setup(bs, qiov);
+    if (acb->orig_buf) {
+        qemu_iovec_to_buffer(qiov, acb->buf);
+    }
 
 restart:
-    block_index = acb->sector_num / s->block_sectors;
-    sector_in_block = acb->sector_num % s->block_sectors;
+    block_index = sector_num / s->block_sectors;
+    sector_in_block = sector_num % s->block_sectors;
     n_sectors = s->block_sectors - sector_in_block;
-    if (n_sectors > acb->nb_sectors) {
-        n_sectors = acb->nb_sectors;
+    if (n_sectors > nb_sectors) {
+        n_sectors = nb_sectors;
     }
 
     logout("will write %u sectors starting at sector %" PRIu64 "\n",
-           n_sectors, acb->sector_num);
+           n_sectors, sector_num);
 
     /* prepare next AIO request */
-    acb->n_sectors = n_sectors;
     bmap_entry = le32_to_cpu(s->bmap[block_index]);
     if (!VDI_IS_ALLOCATED(bmap_entry)) {
         /* Allocate new block and write to it. */
         uint64_t offset;
-        uint8_t *block;
         bmap_entry = s->header.blocks_allocated;
         s->bmap[block_index] = cpu_to_le32(bmap_entry);
         s->header.blocks_allocated++;
         offset = s->header.offset_data / SECTOR_SIZE +
                  (uint64_t)bmap_entry * s->block_sectors;
-        block = acb->block_buffer;
         if (block == NULL) {
             block = g_malloc(s->block_size);
-            acb->block_buffer = block;
-            acb->bmap_first = block_index;
-            assert(!acb->header_modified);
-            acb->header_modified = 1;
+            bmap_first = block_index;
         }
-        acb->bmap_last = block_index;
+        bmap_last = block_index;
         /* Copy data to be written to new block and zero unused parts. */
         memset(block, 0, sector_in_block * SECTOR_SIZE);
         memcpy(block + sector_in_block * SECTOR_SIZE,
                acb->buf, n_sectors * SECTOR_SIZE);
         memset(block + (sector_in_block + n_sectors) * SECTOR_SIZE, 0,
                (s->block_sectors - n_sectors - sector_in_block) * SECTOR_SIZE);
-        acb->hd_iov.iov_base = (void *)block;
-        acb->hd_iov.iov_len = s->block_size;
-        qemu_iovec_init_external(&acb->hd_qiov, &acb->hd_iov, 1);
-        ret = bdrv_co_writev(bs->file, offset, s->block_sectors, &acb->hd_qiov);
+        hd_iov.iov_base = (void *)block;
+        hd_iov.iov_len = s->block_size;
+        qemu_iovec_init_external(&hd_qiov, &hd_iov, 1);
+        ret = bdrv_co_writev(bs->file, offset, s->block_sectors, &hd_qiov);
     } else {
         uint64_t offset = s->header.offset_data / SECTOR_SIZE +
                           (uint64_t)bmap_entry * s->block_sectors +
                           sector_in_block;
-        acb->hd_iov.iov_base = (void *)acb->buf;
-        acb->hd_iov.iov_len = n_sectors * SECTOR_SIZE;
-        qemu_iovec_init_external(&acb->hd_qiov, &acb->hd_iov, 1);
-        ret = bdrv_co_writev(bs->file, offset, n_sectors, &acb->hd_qiov);
+        hd_iov.iov_base = (void *)acb->buf;
+        hd_iov.iov_len = n_sectors * SECTOR_SIZE;
+        qemu_iovec_init_external(&hd_qiov, &hd_iov, 1);
+        ret = bdrv_co_writev(bs->file, offset, n_sectors, &hd_qiov);
     }
 
-    acb->nb_sectors -= acb->n_sectors;
-    acb->sector_num += acb->n_sectors;
-    acb->buf += acb->n_sectors * SECTOR_SIZE;
+    nb_sectors -= n_sectors;
+    sector_num += n_sectors;
+    acb->buf += n_sectors * SECTOR_SIZE;
 
-    logout("%u sectors written\n", acb->n_sectors);
-    if (ret >= 0 && acb->nb_sectors > 0) {
+    logout("%u sectors written\n", n_sectors);
+    if (ret >= 0 && nb_sectors > 0) {
         goto restart;
     }
 
     logout("finished data write\n");
     if (ret >= 0) {
         ret = 0;
-        if (acb->header_modified) {
-            VdiHeader *header = acb->block_buffer;
+        if (block) {
+            VdiHeader *header = (VdiHeader *) block;
             logout("now writing modified header\n");
-            assert(VDI_IS_ALLOCATED(acb->bmap_first));
+            assert(VDI_IS_ALLOCATED(bmap_first));
             *header = s->header;
             vdi_header_to_le(header);
-            acb->header_modified = 0;
-            acb->hd_iov.iov_base = acb->block_buffer;
-            acb->hd_iov.iov_len = SECTOR_SIZE;
-            qemu_iovec_init_external(&acb->hd_qiov, &acb->hd_iov, 1);
-            ret = bdrv_co_writev(bs->file, 0, 1, &acb->hd_qiov);
+            hd_iov.iov_base = block;
+            hd_iov.iov_len = SECTOR_SIZE;
+            qemu_iovec_init_external(&hd_qiov, &hd_iov, 1);
+            ret = bdrv_co_writev(bs->file, 0, 1, &hd_qiov);
         }
-        if (ret >= 0 && VDI_IS_ALLOCATED(acb->bmap_first)) {
+        g_free(block);
+        block = NULL;
+        if (ret >= 0 && VDI_IS_ALLOCATED(bmap_first)) {
             /* One or more new blocks were allocated. */
             uint64_t offset;
-            uint32_t bmap_first;
-            uint32_t bmap_last;
-            g_free(acb->block_buffer);
-            acb->block_buffer = NULL;
-            bmap_first = acb->bmap_first;
-            bmap_last = acb->bmap_last;
             logout("now writing modified block map entry %u...%u\n",
                    bmap_first, bmap_last);
             /* Write modified sectors from block map. */
@@ -688,18 +656,17 @@ restart:
             bmap_last /= (SECTOR_SIZE / sizeof(uint32_t));
             n_sectors = bmap_last - bmap_first + 1;
             offset = s->bmap_sector + bmap_first;
-            acb->bmap_first = VDI_UNALLOCATED;
-            acb->hd_iov.iov_base = (void *)((uint8_t *)&s->bmap[0] +
+            hd_iov.iov_base = (void *)((uint8_t *)&s->bmap[0] +
                                             bmap_first * SECTOR_SIZE);
-            acb->hd_iov.iov_len = n_sectors * SECTOR_SIZE;
-            qemu_iovec_init_external(&acb->hd_qiov, &acb->hd_iov, 1);
+            hd_iov.iov_len = n_sectors * SECTOR_SIZE;
+            qemu_iovec_init_external(&hd_qiov, &hd_iov, 1);
             logout("will write %u block map sectors starting from entry %u\n",
                    n_sectors, bmap_first);
-            ret = bdrv_co_writev(bs->file, offset, n_sectors, &acb->hd_qiov);
+            ret = bdrv_co_writev(bs->file, offset, n_sectors, &hd_qiov);
         }
     }
 
-    if (acb->qiov->niov > 1) {
+    if (acb->orig_buf) {
         qemu_vfree(acb->orig_buf);
     }
     qemu_aio_release(acb);
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 19/46] vdi: leave bounce buffering to block layer
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (16 preceding siblings ...)
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 18/46] vdi: move aiocb fields to locals Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 20/46] vdi: do not create useless iovecs Kevin Wolf
                   ` (21 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Paolo Bonzini <pbonzini@redhat.com>

vdi.c really works as if it implemented bdrv_read and bdrv_write.  However,
because only vector I/O is supported by the asynchronous callbacks, it
went through extra pain to bounce-buffer the I/O.  This can be handled
by the block layer now that the format is coroutine-based.

Acked-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vdi.c |   67 ++++++++++------------------------------------------------
 1 files changed, 12 insertions(+), 55 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index 0c1cff6..790b61e 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -144,12 +144,6 @@ void uuid_unparse(const uuid_t uu, char *out)
 #endif
 
 typedef struct {
-    BlockDriverAIOCB common;
-    uint8_t *buf;
-    void *orig_buf;
-} VdiAIOCB;
-
-typedef struct {
     char text[0x40];
     uint32_t signature;
     uint32_t version;
@@ -472,31 +466,9 @@ static int coroutine_fn vdi_co_is_allocated(BlockDriverState *bs,
     return VDI_IS_ALLOCATED(bmap_entry);
 }
 
-static AIOPool vdi_aio_pool = {
-    .aiocb_size = sizeof(VdiAIOCB),
-};
-
-static VdiAIOCB *vdi_aio_setup(BlockDriverState *bs, QEMUIOVector *qiov)
+static int vdi_co_read(BlockDriverState *bs,
+        int64_t sector_num, uint8_t *buf, int nb_sectors)
 {
-    VdiAIOCB *acb;
-
-    logout("%p, %p\n", bs, qiov);
-
-    acb = qemu_aio_get(&vdi_aio_pool, bs, NULL, NULL);
-
-    if (qiov->niov > 1) {
-        acb->buf = qemu_blockalign(bs, qiov->size);
-        acb->orig_buf = acb->buf;
-    } else {
-        acb->buf = (uint8_t *)qiov->iov->iov_base;
-    }
-    return acb;
-}
-
-static int vdi_co_readv(BlockDriverState *bs,
-        int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
-{
-    VdiAIOCB *acb;
     BDRVVdiState *s = bs->opaque;
     uint32_t bmap_entry;
     uint32_t block_index;
@@ -507,7 +479,6 @@ static int vdi_co_readv(BlockDriverState *bs,
     int ret;
 
     logout("\n");
-    acb = vdi_aio_setup(bs, qiov);
 
 restart:
     block_index = sector_num / s->block_sectors;
@@ -524,13 +495,13 @@ restart:
     bmap_entry = le32_to_cpu(s->bmap[block_index]);
     if (!VDI_IS_ALLOCATED(bmap_entry)) {
         /* Block not allocated, return zeros, no need to wait. */
-        memset(acb->buf, 0, n_sectors * SECTOR_SIZE);
+        memset(buf, 0, n_sectors * SECTOR_SIZE);
         ret = 0;
     } else {
         uint64_t offset = s->header.offset_data / SECTOR_SIZE +
                           (uint64_t)bmap_entry * s->block_sectors +
                           sector_in_block;
-        hd_iov.iov_base = (void *)acb->buf;
+        hd_iov.iov_base = (void *)buf;
         hd_iov.iov_len = n_sectors * SECTOR_SIZE;
         qemu_iovec_init_external(&hd_qiov, &hd_iov, 1);
         ret = bdrv_co_readv(bs->file, offset, n_sectors, &hd_qiov);
@@ -539,24 +510,18 @@ restart:
 
     nb_sectors -= n_sectors;
     sector_num += n_sectors;
-    acb->buf += n_sectors * SECTOR_SIZE;
+    buf += n_sectors * SECTOR_SIZE;
 
     if (ret >= 0 && nb_sectors > 0) {
         goto restart;
     }
 
-    if (acb->orig_buf) {
-        qemu_iovec_from_buffer(qiov, acb->orig_buf, qiov->size);
-        qemu_vfree(acb->orig_buf);
-    }
-    qemu_aio_release(acb);
     return ret;
 }
 
-static int vdi_co_writev(BlockDriverState *bs,
-        int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
+static int vdi_co_write(BlockDriverState *bs,
+        int64_t sector_num, const uint8_t *buf, int nb_sectors)
 {
-    VdiAIOCB *acb;
     BDRVVdiState *s = bs->opaque;
     uint32_t bmap_entry;
     uint32_t block_index;
@@ -570,10 +535,6 @@ static int vdi_co_writev(BlockDriverState *bs,
     int ret;
 
     logout("\n");
-    acb = vdi_aio_setup(bs, qiov);
-    if (acb->orig_buf) {
-        qemu_iovec_to_buffer(qiov, acb->buf);
-    }
 
 restart:
     block_index = sector_num / s->block_sectors;
@@ -604,7 +565,7 @@ restart:
         /* Copy data to be written to new block and zero unused parts. */
         memset(block, 0, sector_in_block * SECTOR_SIZE);
         memcpy(block + sector_in_block * SECTOR_SIZE,
-               acb->buf, n_sectors * SECTOR_SIZE);
+               buf, n_sectors * SECTOR_SIZE);
         memset(block + (sector_in_block + n_sectors) * SECTOR_SIZE, 0,
                (s->block_sectors - n_sectors - sector_in_block) * SECTOR_SIZE);
         hd_iov.iov_base = (void *)block;
@@ -615,7 +576,7 @@ restart:
         uint64_t offset = s->header.offset_data / SECTOR_SIZE +
                           (uint64_t)bmap_entry * s->block_sectors +
                           sector_in_block;
-        hd_iov.iov_base = (void *)acb->buf;
+        hd_iov.iov_base = (void *)buf;
         hd_iov.iov_len = n_sectors * SECTOR_SIZE;
         qemu_iovec_init_external(&hd_qiov, &hd_iov, 1);
         ret = bdrv_co_writev(bs->file, offset, n_sectors, &hd_qiov);
@@ -623,7 +584,7 @@ restart:
 
     nb_sectors -= n_sectors;
     sector_num += n_sectors;
-    acb->buf += n_sectors * SECTOR_SIZE;
+    buf += n_sectors * SECTOR_SIZE;
 
     logout("%u sectors written\n", n_sectors);
     if (ret >= 0 && nb_sectors > 0) {
@@ -666,10 +627,6 @@ restart:
         }
     }
 
-    if (acb->orig_buf) {
-        qemu_vfree(acb->orig_buf);
-    }
-    qemu_aio_release(acb);
     return ret;
 }
 
@@ -821,9 +778,9 @@ static BlockDriver bdrv_vdi = {
     .bdrv_co_is_allocated = vdi_co_is_allocated,
     .bdrv_make_empty = vdi_make_empty,
 
-    .bdrv_co_readv = vdi_co_readv,
+    .bdrv_read = vdi_co_read,
 #if defined(CONFIG_VDI_WRITE)
-    .bdrv_co_writev = vdi_co_writev,
+    .bdrv_write = vdi_co_write,
 #endif
 
     .bdrv_get_info = vdi_get_info,
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 20/46] vdi: do not create useless iovecs
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (17 preceding siblings ...)
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 19/46] vdi: leave bounce buffering to block layer Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 21/46] vdi: change goto to loop Kevin Wolf
                   ` (20 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Paolo Bonzini <pbonzini@redhat.com>

Reads and writes to the underlying file can also occur with the simple
non-vectored I/O interfaces.

Acked-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vdi.c |   79 ++++++++++++++++++++++++----------------------------------
 1 files changed, 33 insertions(+), 46 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index 790b61e..bfeafea 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -474,8 +474,6 @@ static int vdi_co_read(BlockDriverState *bs,
     uint32_t block_index;
     uint32_t sector_in_block;
     uint32_t n_sectors;
-    struct iovec hd_iov;
-    QEMUIOVector hd_qiov;
     int ret;
 
     logout("\n");
@@ -501,10 +499,7 @@ restart:
         uint64_t offset = s->header.offset_data / SECTOR_SIZE +
                           (uint64_t)bmap_entry * s->block_sectors +
                           sector_in_block;
-        hd_iov.iov_base = (void *)buf;
-        hd_iov.iov_len = n_sectors * SECTOR_SIZE;
-        qemu_iovec_init_external(&hd_qiov, &hd_iov, 1);
-        ret = bdrv_co_readv(bs->file, offset, n_sectors, &hd_qiov);
+        ret = bdrv_read(bs->file, offset, buf, n_sectors);
     }
     logout("%u sectors read\n", n_sectors);
 
@@ -529,8 +524,6 @@ static int vdi_co_write(BlockDriverState *bs,
     uint32_t n_sectors;
     uint32_t bmap_first = VDI_UNALLOCATED;
     uint32_t bmap_last = VDI_UNALLOCATED;
-    struct iovec hd_iov;
-    QEMUIOVector hd_qiov;
     uint8_t *block = NULL;
     int ret;
 
@@ -568,18 +561,12 @@ restart:
                buf, n_sectors * SECTOR_SIZE);
         memset(block + (sector_in_block + n_sectors) * SECTOR_SIZE, 0,
                (s->block_sectors - n_sectors - sector_in_block) * SECTOR_SIZE);
-        hd_iov.iov_base = (void *)block;
-        hd_iov.iov_len = s->block_size;
-        qemu_iovec_init_external(&hd_qiov, &hd_iov, 1);
-        ret = bdrv_co_writev(bs->file, offset, s->block_sectors, &hd_qiov);
+        ret = bdrv_write(bs->file, offset, block, s->block_sectors);
     } else {
         uint64_t offset = s->header.offset_data / SECTOR_SIZE +
                           (uint64_t)bmap_entry * s->block_sectors +
                           sector_in_block;
-        hd_iov.iov_base = (void *)buf;
-        hd_iov.iov_len = n_sectors * SECTOR_SIZE;
-        qemu_iovec_init_external(&hd_qiov, &hd_iov, 1);
-        ret = bdrv_co_writev(bs->file, offset, n_sectors, &hd_qiov);
+        ret = bdrv_write(bs->file, offset, buf, n_sectors);
     }
 
     nb_sectors -= n_sectors;
@@ -592,39 +579,39 @@ restart:
     }
 
     logout("finished data write\n");
-    if (ret >= 0) {
-        ret = 0;
-        if (block) {
-            VdiHeader *header = (VdiHeader *) block;
-            logout("now writing modified header\n");
-            assert(VDI_IS_ALLOCATED(bmap_first));
-            *header = s->header;
-            vdi_header_to_le(header);
-            hd_iov.iov_base = block;
-            hd_iov.iov_len = SECTOR_SIZE;
-            qemu_iovec_init_external(&hd_qiov, &hd_iov, 1);
-            ret = bdrv_co_writev(bs->file, 0, 1, &hd_qiov);
-        }
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (block) {
+        /* One or more new blocks were allocated. */
+        VdiHeader *header = (VdiHeader *) block;
+        uint8_t *base;
+        uint64_t offset;
+
+        logout("now writing modified header\n");
+        assert(VDI_IS_ALLOCATED(bmap_first));
+        *header = s->header;
+        vdi_header_to_le(header);
+        ret = bdrv_write(bs->file, 0, block, 1);
         g_free(block);
         block = NULL;
-        if (ret >= 0 && VDI_IS_ALLOCATED(bmap_first)) {
-            /* One or more new blocks were allocated. */
-            uint64_t offset;
-            logout("now writing modified block map entry %u...%u\n",
-                   bmap_first, bmap_last);
-            /* Write modified sectors from block map. */
-            bmap_first /= (SECTOR_SIZE / sizeof(uint32_t));
-            bmap_last /= (SECTOR_SIZE / sizeof(uint32_t));
-            n_sectors = bmap_last - bmap_first + 1;
-            offset = s->bmap_sector + bmap_first;
-            hd_iov.iov_base = (void *)((uint8_t *)&s->bmap[0] +
-                                            bmap_first * SECTOR_SIZE);
-            hd_iov.iov_len = n_sectors * SECTOR_SIZE;
-            qemu_iovec_init_external(&hd_qiov, &hd_iov, 1);
-            logout("will write %u block map sectors starting from entry %u\n",
-                   n_sectors, bmap_first);
-            ret = bdrv_co_writev(bs->file, offset, n_sectors, &hd_qiov);
+
+        if (ret < 0) {
+            return ret;
         }
+
+        logout("now writing modified block map entry %u...%u\n",
+               bmap_first, bmap_last);
+        /* Write modified sectors from block map. */
+        bmap_first /= (SECTOR_SIZE / sizeof(uint32_t));
+        bmap_last /= (SECTOR_SIZE / sizeof(uint32_t));
+        n_sectors = bmap_last - bmap_first + 1;
+        offset = s->bmap_sector + bmap_first;
+        base = ((uint8_t *)&s->bmap[0]) + bmap_first * SECTOR_SIZE;
+        logout("will write %u block map sectors starting from entry %u\n",
+               n_sectors, bmap_first);
+        ret = bdrv_write(bs->file, offset, base, n_sectors);
     }
 
     return ret;
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 21/46] vdi: change goto to loop
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (18 preceding siblings ...)
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 20/46] vdi: do not create useless iovecs Kevin Wolf
@ 2012-04-05 15:51 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 22/46] Use DMADirection type for dma_bdrv_io Kevin Wolf
                   ` (19 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:51 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Paolo Bonzini <pbonzini@redhat.com>

Finally reindent all code and change goto statements to a loop.

Acked-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vdi.c |  143 ++++++++++++++++++++++++++++------------------------------
 1 files changed, 69 insertions(+), 74 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index bfeafea..119d3c7 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -474,41 +474,38 @@ static int vdi_co_read(BlockDriverState *bs,
     uint32_t block_index;
     uint32_t sector_in_block;
     uint32_t n_sectors;
-    int ret;
+    int ret = 0;
 
     logout("\n");
 
-restart:
-    block_index = sector_num / s->block_sectors;
-    sector_in_block = sector_num % s->block_sectors;
-    n_sectors = s->block_sectors - sector_in_block;
-    if (n_sectors > nb_sectors) {
-        n_sectors = nb_sectors;
-    }
-
-    logout("will read %u sectors starting at sector %" PRIu64 "\n",
-           n_sectors, sector_num);
+    while (ret >= 0 && nb_sectors > 0) {
+        block_index = sector_num / s->block_sectors;
+        sector_in_block = sector_num % s->block_sectors;
+        n_sectors = s->block_sectors - sector_in_block;
+        if (n_sectors > nb_sectors) {
+            n_sectors = nb_sectors;
+        }
 
-    /* prepare next AIO request */
-    bmap_entry = le32_to_cpu(s->bmap[block_index]);
-    if (!VDI_IS_ALLOCATED(bmap_entry)) {
-        /* Block not allocated, return zeros, no need to wait. */
-        memset(buf, 0, n_sectors * SECTOR_SIZE);
-        ret = 0;
-    } else {
-        uint64_t offset = s->header.offset_data / SECTOR_SIZE +
-                          (uint64_t)bmap_entry * s->block_sectors +
-                          sector_in_block;
-        ret = bdrv_read(bs->file, offset, buf, n_sectors);
-    }
-    logout("%u sectors read\n", n_sectors);
+        logout("will read %u sectors starting at sector %" PRIu64 "\n",
+               n_sectors, sector_num);
 
-    nb_sectors -= n_sectors;
-    sector_num += n_sectors;
-    buf += n_sectors * SECTOR_SIZE;
+        /* prepare next AIO request */
+        bmap_entry = le32_to_cpu(s->bmap[block_index]);
+        if (!VDI_IS_ALLOCATED(bmap_entry)) {
+            /* Block not allocated, return zeros, no need to wait. */
+            memset(buf, 0, n_sectors * SECTOR_SIZE);
+            ret = 0;
+        } else {
+            uint64_t offset = s->header.offset_data / SECTOR_SIZE +
+                              (uint64_t)bmap_entry * s->block_sectors +
+                              sector_in_block;
+            ret = bdrv_read(bs->file, offset, buf, n_sectors);
+        }
+        logout("%u sectors read\n", n_sectors);
 
-    if (ret >= 0 && nb_sectors > 0) {
-        goto restart;
+        nb_sectors -= n_sectors;
+        sector_num += n_sectors;
+        buf += n_sectors * SECTOR_SIZE;
     }
 
     return ret;
@@ -525,57 +522,55 @@ static int vdi_co_write(BlockDriverState *bs,
     uint32_t bmap_first = VDI_UNALLOCATED;
     uint32_t bmap_last = VDI_UNALLOCATED;
     uint8_t *block = NULL;
-    int ret;
+    int ret = 0;
 
     logout("\n");
 
-restart:
-    block_index = sector_num / s->block_sectors;
-    sector_in_block = sector_num % s->block_sectors;
-    n_sectors = s->block_sectors - sector_in_block;
-    if (n_sectors > nb_sectors) {
-        n_sectors = nb_sectors;
-    }
-
-    logout("will write %u sectors starting at sector %" PRIu64 "\n",
-           n_sectors, sector_num);
+    while (ret >= 0 && nb_sectors > 0) {
+        block_index = sector_num / s->block_sectors;
+        sector_in_block = sector_num % s->block_sectors;
+        n_sectors = s->block_sectors - sector_in_block;
+        if (n_sectors > nb_sectors) {
+            n_sectors = nb_sectors;
+        }
 
-    /* prepare next AIO request */
-    bmap_entry = le32_to_cpu(s->bmap[block_index]);
-    if (!VDI_IS_ALLOCATED(bmap_entry)) {
-        /* Allocate new block and write to it. */
-        uint64_t offset;
-        bmap_entry = s->header.blocks_allocated;
-        s->bmap[block_index] = cpu_to_le32(bmap_entry);
-        s->header.blocks_allocated++;
-        offset = s->header.offset_data / SECTOR_SIZE +
-                 (uint64_t)bmap_entry * s->block_sectors;
-        if (block == NULL) {
-            block = g_malloc(s->block_size);
-            bmap_first = block_index;
+        logout("will write %u sectors starting at sector %" PRIu64 "\n",
+               n_sectors, sector_num);
+
+        /* prepare next AIO request */
+        bmap_entry = le32_to_cpu(s->bmap[block_index]);
+        if (!VDI_IS_ALLOCATED(bmap_entry)) {
+            /* Allocate new block and write to it. */
+            uint64_t offset;
+            bmap_entry = s->header.blocks_allocated;
+            s->bmap[block_index] = cpu_to_le32(bmap_entry);
+            s->header.blocks_allocated++;
+            offset = s->header.offset_data / SECTOR_SIZE +
+                     (uint64_t)bmap_entry * s->block_sectors;
+            if (block == NULL) {
+                block = g_malloc(s->block_size);
+                bmap_first = block_index;
+            }
+            bmap_last = block_index;
+            /* Copy data to be written to new block and zero unused parts. */
+            memset(block, 0, sector_in_block * SECTOR_SIZE);
+            memcpy(block + sector_in_block * SECTOR_SIZE,
+                   buf, n_sectors * SECTOR_SIZE);
+            memset(block + (sector_in_block + n_sectors) * SECTOR_SIZE, 0,
+                   (s->block_sectors - n_sectors - sector_in_block) * SECTOR_SIZE);
+            ret = bdrv_write(bs->file, offset, block, s->block_sectors);
+        } else {
+            uint64_t offset = s->header.offset_data / SECTOR_SIZE +
+                              (uint64_t)bmap_entry * s->block_sectors +
+                              sector_in_block;
+            ret = bdrv_write(bs->file, offset, buf, n_sectors);
         }
-        bmap_last = block_index;
-        /* Copy data to be written to new block and zero unused parts. */
-        memset(block, 0, sector_in_block * SECTOR_SIZE);
-        memcpy(block + sector_in_block * SECTOR_SIZE,
-               buf, n_sectors * SECTOR_SIZE);
-        memset(block + (sector_in_block + n_sectors) * SECTOR_SIZE, 0,
-               (s->block_sectors - n_sectors - sector_in_block) * SECTOR_SIZE);
-        ret = bdrv_write(bs->file, offset, block, s->block_sectors);
-    } else {
-        uint64_t offset = s->header.offset_data / SECTOR_SIZE +
-                          (uint64_t)bmap_entry * s->block_sectors +
-                          sector_in_block;
-        ret = bdrv_write(bs->file, offset, buf, n_sectors);
-    }
 
-    nb_sectors -= n_sectors;
-    sector_num += n_sectors;
-    buf += n_sectors * SECTOR_SIZE;
+        nb_sectors -= n_sectors;
+        sector_num += n_sectors;
+        buf += n_sectors * SECTOR_SIZE;
 
-    logout("%u sectors written\n", n_sectors);
-    if (ret >= 0 && nb_sectors > 0) {
-        goto restart;
+        logout("%u sectors written\n", n_sectors);
     }
 
     logout("finished data write\n");
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 22/46] Use DMADirection type for dma_bdrv_io
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (19 preceding siblings ...)
  2012-04-05 15:51 ` [Qemu-devel] [PATCH 21/46] vdi: change goto to loop Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 23/46] block: disable I/O throttling on sync api Kevin Wolf
                   ` (18 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: David Gibson <david@gibson.dropbear.id.au>

Currently dma_bdrv_io() takes a 'to_dev' boolean parameter to
determine the direction of DMA it is emulating.  We already have a
DMADirection enum designed specifically to encode DMA directions.
This patch uses it for dma_bdrv_io() as well.  This involves removing
the DMADirection definition from the #ifdef it was inside, but since that
only existed to protect the definition of dma_addr_t from places where
config.h is not included, there wasn't any reason for it to be there in
the first place.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 dma-helpers.c  |   20 ++++++++++++--------
 dma.h          |   12 ++++++------
 hw/ide/core.c  |    3 ++-
 hw/ide/macio.c |    3 ++-
 4 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/dma-helpers.c b/dma-helpers.c
index 7fcc86d..7971a89 100644
--- a/dma-helpers.c
+++ b/dma-helpers.c
@@ -41,7 +41,7 @@ typedef struct {
     BlockDriverAIOCB *acb;
     QEMUSGList *sg;
     uint64_t sector_num;
-    bool to_dev;
+    DMADirection dir;
     bool in_cancel;
     int sg_cur_index;
     dma_addr_t sg_cur_byte;
@@ -75,7 +75,8 @@ static void dma_bdrv_unmap(DMAAIOCB *dbs)
 
     for (i = 0; i < dbs->iov.niov; ++i) {
         cpu_physical_memory_unmap(dbs->iov.iov[i].iov_base,
-                                  dbs->iov.iov[i].iov_len, !dbs->to_dev,
+                                  dbs->iov.iov[i].iov_len,
+                                  dbs->dir != DMA_DIRECTION_TO_DEVICE,
                                   dbs->iov.iov[i].iov_len);
     }
     qemu_iovec_reset(&dbs->iov);
@@ -122,7 +123,8 @@ static void dma_bdrv_cb(void *opaque, int ret)
     while (dbs->sg_cur_index < dbs->sg->nsg) {
         cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
         cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
-        mem = cpu_physical_memory_map(cur_addr, &cur_len, !dbs->to_dev);
+        mem = cpu_physical_memory_map(cur_addr, &cur_len,
+                                      dbs->dir != DMA_DIRECTION_TO_DEVICE);
         if (!mem)
             break;
         qemu_iovec_add(&dbs->iov, mem, cur_len);
@@ -169,11 +171,11 @@ static AIOPool dma_aio_pool = {
 BlockDriverAIOCB *dma_bdrv_io(
     BlockDriverState *bs, QEMUSGList *sg, uint64_t sector_num,
     DMAIOFunc *io_func, BlockDriverCompletionFunc *cb,
-    void *opaque, bool to_dev)
+    void *opaque, DMADirection dir)
 {
     DMAAIOCB *dbs = qemu_aio_get(&dma_aio_pool, bs, cb, opaque);
 
-    trace_dma_bdrv_io(dbs, bs, sector_num, to_dev);
+    trace_dma_bdrv_io(dbs, bs, sector_num, (dir == DMA_DIRECTION_TO_DEVICE));
 
     dbs->acb = NULL;
     dbs->bs = bs;
@@ -181,7 +183,7 @@ BlockDriverAIOCB *dma_bdrv_io(
     dbs->sector_num = sector_num;
     dbs->sg_cur_index = 0;
     dbs->sg_cur_byte = 0;
-    dbs->to_dev = to_dev;
+    dbs->dir = dir;
     dbs->io_func = io_func;
     dbs->bh = NULL;
     qemu_iovec_init(&dbs->iov, sg->nsg);
@@ -194,14 +196,16 @@ BlockDriverAIOCB *dma_bdrv_read(BlockDriverState *bs,
                                 QEMUSGList *sg, uint64_t sector,
                                 void (*cb)(void *opaque, int ret), void *opaque)
 {
-    return dma_bdrv_io(bs, sg, sector, bdrv_aio_readv, cb, opaque, false);
+    return dma_bdrv_io(bs, sg, sector, bdrv_aio_readv, cb, opaque,
+                       DMA_DIRECTION_FROM_DEVICE);
 }
 
 BlockDriverAIOCB *dma_bdrv_write(BlockDriverState *bs,
                                  QEMUSGList *sg, uint64_t sector,
                                  void (*cb)(void *opaque, int ret), void *opaque)
 {
-    return dma_bdrv_io(bs, sg, sector, bdrv_aio_writev, cb, opaque, true);
+    return dma_bdrv_io(bs, sg, sector, bdrv_aio_writev, cb, opaque,
+                       DMA_DIRECTION_TO_DEVICE);
 }
 
 
diff --git a/dma.h b/dma.h
index 20e86d2..05ac325 100644
--- a/dma.h
+++ b/dma.h
@@ -17,6 +17,11 @@
 
 typedef struct ScatterGatherEntry ScatterGatherEntry;
 
+typedef enum {
+    DMA_DIRECTION_TO_DEVICE = 0,
+    DMA_DIRECTION_FROM_DEVICE = 1,
+} DMADirection;
+
 struct QEMUSGList {
     ScatterGatherEntry *sg;
     int nsg;
@@ -29,11 +34,6 @@ typedef target_phys_addr_t dma_addr_t;
 
 #define DMA_ADDR_FMT TARGET_FMT_plx
 
-typedef enum {
-    DMA_DIRECTION_TO_DEVICE = 0,
-    DMA_DIRECTION_FROM_DEVICE = 1,
-} DMADirection;
-
 struct ScatterGatherEntry {
     dma_addr_t base;
     dma_addr_t len;
@@ -51,7 +51,7 @@ typedef BlockDriverAIOCB *DMAIOFunc(BlockDriverState *bs, int64_t sector_num,
 BlockDriverAIOCB *dma_bdrv_io(BlockDriverState *bs,
                               QEMUSGList *sg, uint64_t sector_num,
                               DMAIOFunc *io_func, BlockDriverCompletionFunc *cb,
-                              void *opaque, bool to_dev);
+                              void *opaque, DMADirection dir);
 BlockDriverAIOCB *dma_bdrv_read(BlockDriverState *bs,
                                 QEMUSGList *sg, uint64_t sector,
                                 BlockDriverCompletionFunc *cb, void *opaque);
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 6e25338..35723fd 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -618,7 +618,8 @@ void ide_dma_cb(void *opaque, int ret)
         break;
     case IDE_DMA_TRIM:
         s->bus->dma->aiocb = dma_bdrv_io(s->bs, &s->sg, sector_num,
-                                         ide_issue_trim, ide_dma_cb, s, true);
+                                         ide_issue_trim, ide_dma_cb, s,
+                                         DMA_DIRECTION_TO_DEVICE);
         break;
     }
     return;
diff --git a/hw/ide/macio.c b/hw/ide/macio.c
index a4df244..7b38d9e 100644
--- a/hw/ide/macio.c
+++ b/hw/ide/macio.c
@@ -149,7 +149,8 @@ static void pmac_ide_transfer_cb(void *opaque, int ret)
         break;
     case IDE_DMA_TRIM:
         m->aiocb = dma_bdrv_io(s->bs, &s->sg, sector_num,
-                               ide_issue_trim, pmac_ide_transfer_cb, s, true);
+                               ide_issue_trim, pmac_ide_transfer_cb, s,
+                               DMA_DIRECTION_TO_DEVICE);
         break;
     }
     return;
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 23/46] block: disable I/O throttling on sync api
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (20 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 22/46] Use DMADirection type for dma_bdrv_io Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 25/46] block: fix streaming/closing race Kevin Wolf
                   ` (17 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c |   20 ++++++++++++++++++++
 1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index 8858be0..0344673 100644
--- a/block.c
+++ b/block.c
@@ -1463,6 +1463,17 @@ static int bdrv_rw_co(BlockDriverState *bs, int64_t sector_num, uint8_t *buf,
 
     qemu_iovec_init_external(&qiov, &iov, 1);
 
+    /**
+     * In sync call context, when the vcpu is blocked, this throttling timer
+     * will not fire; so the I/O throttling function has to be disabled here
+     * if it has been enabled.
+     */
+    if (bs->io_limits_enabled) {
+        fprintf(stderr, "Disabling I/O throttling on '%s' due "
+                        "to synchronous I/O.\n", bdrv_get_device_name(bs));
+        bdrv_io_limits_disable(bs);
+    }
+
     if (qemu_in_coroutine()) {
         /* Fast-path if already in coroutine context */
         bdrv_rw_co_entry(&rwco);
@@ -1969,10 +1980,19 @@ static int guess_disk_lchs(BlockDriverState *bs,
     struct partition *p;
     uint32_t nr_sects;
     uint64_t nb_sectors;
+    bool enabled;
 
     bdrv_get_geometry(bs, &nb_sectors);
 
+    /**
+     * The function will be invoked during startup not only in sync I/O mode,
+     * but also in async I/O mode. So the I/O throttling function has to
+     * be disabled temporarily here, not permanently.
+     */
+    enabled = bs->io_limits_enabled;
+    bs->io_limits_enabled = false;
     ret = bdrv_read(bs, 0, buf, 1);
+    bs->io_limits_enabled = enabled;
     if (ret < 0)
         return -1;
     /* test msdos magic */
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 25/46] block: fix streaming/closing race
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (21 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 23/46] block: disable I/O throttling on sync api Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 27/46] block: document job API Kevin Wolf
                   ` (16 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Paolo Bonzini <pbonzini@redhat.com>

Streaming can issue I/O while qcow2_close is running.  This causes the
L2 caches to become very confused or, alternatively, could cause a
segfault when the streaming coroutine is reentered after closing its
block device.  The fix is to cancel streaming jobs when closing their
underlying device.

The cancellation must be synchronous, on the other hand qemu_aio_wait
will not restart a coroutine that is sleeping in co_sleep.  So add
a flag saying whether streaming has in-flight I/O.  If the busy flag
is false, the coroutine is quiescent and, when cancelled, will not
issue any new I/O.

This protects streaming against closing, but not against deleting.
We have a reference count protecting us against concurrent deletion,
but I still added an assertion to ensure nothing bad happens.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c        |   16 ++++++++++++++++
 block/stream.c |    6 ++++--
 block_int.h    |    2 ++
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 0344673..16e14fa 100644
--- a/block.c
+++ b/block.c
@@ -813,6 +813,9 @@ unlink_and_fail:
 void bdrv_close(BlockDriverState *bs)
 {
     if (bs->drv) {
+        if (bs->job) {
+            block_job_cancel_sync(bs->job);
+        }
         if (bs == bs_snapshots) {
             bs_snapshots = NULL;
         }
@@ -966,6 +969,8 @@ void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top)
 void bdrv_delete(BlockDriverState *bs)
 {
     assert(!bs->dev);
+    assert(!bs->job);
+    assert(!bs->in_use);
 
     /* remove from list, if necessary */
     bdrv_make_anon(bs);
@@ -4095,3 +4100,14 @@ bool block_job_is_cancelled(BlockJob *job)
 {
     return job->cancelled;
 }
+
+void block_job_cancel_sync(BlockJob *job)
+{
+    BlockDriverState *bs = job->bs;
+
+    assert(bs->job == job);
+    block_job_cancel(job);
+    while (bs->job != NULL && bs->job->busy) {
+        qemu_aio_wait();
+    }
+}
diff --git a/block/stream.c b/block/stream.c
index d1b3986..f186bfd 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -175,7 +175,7 @@ retry:
             break;
         }
 
-
+        s->common.busy = true;
         if (base) {
             ret = is_allocated_base(bs, base, sector_num,
                                     STREAM_BUFFER_SIZE / BDRV_SECTOR_SIZE, &n);
@@ -189,6 +189,7 @@ retry:
             if (s->common.speed) {
                 uint64_t delay_ns = ratelimit_calculate_delay(&s->limit, n);
                 if (delay_ns > 0) {
+                    s->common.busy = false;
                     co_sleep_ns(rt_clock, delay_ns);
 
                     /* Recheck cancellation and that sectors are unallocated */
@@ -208,6 +209,7 @@ retry:
         /* Note that even when no rate limit is applied we need to yield
          * with no pending I/O here so that qemu_aio_flush() returns.
          */
+        s->common.busy = false;
         co_sleep_ns(rt_clock, 0);
     }
 
@@ -215,7 +217,7 @@ retry:
         bdrv_disable_copy_on_read(bs);
     }
 
-    if (sector_num == end && ret == 0) {
+    if (!block_job_is_cancelled(&s->common) && sector_num == end && ret == 0) {
         const char *base_id = NULL;
         if (base) {
             base_id = s->backing_file_id;
diff --git a/block_int.h b/block_int.h
index 22c86a5..a96aabd 100644
--- a/block_int.h
+++ b/block_int.h
@@ -83,6 +83,7 @@ struct BlockJob {
     const BlockJobType *job_type;
     BlockDriverState *bs;
     bool cancelled;
+    bool busy;
 
     /* These fields are published by the query-block-jobs QMP API */
     int64_t offset;
@@ -311,6 +312,7 @@ void block_job_complete(BlockJob *job, int ret);
 int block_job_set_speed(BlockJob *job, int64_t value);
 void block_job_cancel(BlockJob *job);
 bool block_job_is_cancelled(BlockJob *job);
+void block_job_cancel_sync(BlockJob *job);
 
 int stream_start(BlockDriverState *bs, BlockDriverState *base,
                  const char *base_id, BlockDriverCompletionFunc *cb,
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 27/46] block: document job API
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (22 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 25/46] block: fix streaming/closing race Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 28/46] qemu-img: add image fragmentation statistics Kevin Wolf
                   ` (15 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Paolo Bonzini <pbonzini@redhat.com>

I am not sure that these are really proper GtkDoc, but they follow
the existing documentation in block_int.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block_int.h |  115 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 112 insertions(+), 3 deletions(-)

diff --git a/block_int.h b/block_int.h
index a96aabd..0e5a032 100644
--- a/block_int.h
+++ b/block_int.h
@@ -63,8 +63,13 @@ typedef struct BlockIOBaseValue {
     uint64_t ios[2];
 } BlockIOBaseValue;
 
-typedef void BlockJobCancelFunc(void *opaque);
 typedef struct BlockJob BlockJob;
+
+/**
+ * BlockJobType:
+ *
+ * A class type for block job objects.
+ */
 typedef struct BlockJobType {
     /** Derived BlockJob struct size */
     size_t instance_size;
@@ -77,20 +82,48 @@ typedef struct BlockJobType {
 } BlockJobType;
 
 /**
- * Long-running operation on a BlockDriverState
+ * BlockJob:
+ *
+ * Long-running operation on a BlockDriverState.
  */
 struct BlockJob {
+    /** The job type, including the job vtable.  */
     const BlockJobType *job_type;
+
+    /** The block device on which the job is operating.  */
     BlockDriverState *bs;
+
+    /**
+     * Set to true if the job should cancel itself.  The flag must
+     * always be tested just before toggling the busy flag from false
+     * to true.  After a job has detected that the cancelled flag is
+     * true, it should not anymore issue any I/O operation to the
+     * block device.
+     */
     bool cancelled;
+
+    /**
+     * Set to false by the job while it is in a quiescent state, where
+     * no I/O is pending and cancellation can be processed without
+     * issuing new I/O.  The busy flag must be set to false when the
+     * job goes to sleep on any condition that is not detected by
+     * #qemu_aio_wait, such as a timer.
+     */
     bool busy;
 
-    /* These fields are published by the query-block-jobs QMP API */
+    /** Offset that is published by the query-block-jobs QMP API */
     int64_t offset;
+
+    /** Length that is published by the query-block-jobs QMP API */
     int64_t len;
+
+    /** Speed that was set with @block_job_set_speed.  */
     int64_t speed;
 
+    /** The completion function that will be called when the job completes.  */
     BlockDriverCompletionFunc *cb;
+
+    /** The opaque value that is passed to the completion function.  */
     void *opaque;
 };
 
@@ -306,14 +339,90 @@ void bdrv_set_io_limits(BlockDriverState *bs,
 int is_windows_drive(const char *filename);
 #endif
 
+/**
+ * block_job_create:
+ * @job_type: The class object for the newly-created job.
+ * @bs: The block
+ * @cb: Completion function for the job.
+ * @opaque: Opaque pointer value passed to @cb.
+ *
+ * Create a new long-running block device job and return it.  The job
+ * will call @cb asynchronously when the job completes.  Note that
+ * @bs may have been closed at the time the @cb it is called.  If
+ * this is the case, the job may be reported as either cancelled or
+ * completed.
+ *
+ * This function is not part of the public job interface; it should be
+ * called from a wrapper that is specific to the job type.
+ */
 void *block_job_create(const BlockJobType *job_type, BlockDriverState *bs,
                        BlockDriverCompletionFunc *cb, void *opaque);
+
+/**
+ * block_job_complete:
+ * @job: The job being completed.
+ * @ret: The status code.
+ *
+ * Call the completion function that was registered at creation time, and
+ * free @job.
+ */
 void block_job_complete(BlockJob *job, int ret);
+
+/**
+ * block_job_set_speed:
+ * @job: The job to set the speed for.
+ * @speed: The new value
+ *
+ * Set a rate-limiting parameter for the job; the actual meaning may
+ * vary depending on the job type.
+ */
 int block_job_set_speed(BlockJob *job, int64_t value);
+
+/**
+ * block_job_cancel:
+ * @job: The job to be canceled.
+ *
+ * Asynchronously cancel the specified job.
+ */
 void block_job_cancel(BlockJob *job);
+
+/**
+ * block_job_is_cancelled:
+ * @job: The job being queried.
+ *
+ * Returns whether the job is scheduled for cancellation.
+ */
 bool block_job_is_cancelled(BlockJob *job);
+
+/**
+ * block_job_cancel:
+ * @job: The job to be canceled.
+ *
+ * Asynchronously cancel the job and wait for it to reach a quiescent
+ * state.  Note that the completion callback will still be called
+ * asynchronously, hence it is *not* valid to call #bdrv_delete
+ * immediately after #block_job_cancel_sync.  Users of block jobs
+ * will usually protect the BlockDriverState objects with a reference
+ * count, should this be a concern.
+ */
 void block_job_cancel_sync(BlockJob *job);
 
+/**
+ * stream_start:
+ * @bs: Block device to operate on.
+ * @base: Block device that will become the new base, or %NULL to
+ * flatten the whole backing file chain onto @bs.
+ * @base_id: The file name that will be written to @bs as the new
+ * backing file if the job completes.  Ignored if @base is %NULL.
+ * @cb: Completion function for the job.
+ * @opaque: Opaque pointer value passed to @cb.
+ *
+ * Start a streaming operation on @bs.  Clusters that are unallocated
+ * in @bs, but allocated in any image between @base and @bs (both
+ * exclusive) will be written to @bs.  At the end of a successful
+ * streaming job, the backing file of @bs will be changed to
+ * @base_id in the written image and to @base in the live BlockDriverState.
+ */
 int stream_start(BlockDriverState *bs, BlockDriverState *base,
                  const char *base_id, BlockDriverCompletionFunc *cb,
                  void *opaque);
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 28/46] qemu-img: add image fragmentation statistics
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (23 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 27/46] block: document job API Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 29/46] qed: " Kevin Wolf
                   ` (14 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>

Discussion can be found at:
http://patchwork.ozlabs.org/patch/128730/

This patch add image fragmentation statistics while using qemu-img check.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.h    |    7 +++++++
 qemu-img.c |    9 ++++++++-
 2 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/block.h b/block.h
index c51ab16..ea12f5d 100644
--- a/block.h
+++ b/block.h
@@ -17,6 +17,12 @@ typedef struct BlockDriverInfo {
     int64_t vm_state_offset;
 } BlockDriverInfo;
 
+typedef struct BlockFragInfo {
+    uint64_t allocated_clusters;
+    uint64_t total_clusters;
+    uint64_t fragmented_clusters;
+} BlockFragInfo;
+
 typedef struct QEMUSnapshotInfo {
     char id_str[128]; /* unique snapshot id */
     /* the following fields are informative. They are not needed for
@@ -175,6 +181,7 @@ typedef struct BdrvCheckResult {
     int corruptions;
     int leaks;
     int check_errors;
+    BlockFragInfo bfi;
 } BdrvCheckResult;
 
 int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res);
diff --git a/qemu-img.c b/qemu-img.c
index 0e48b35..4de48ba 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -428,6 +428,13 @@ static int img_check(int argc, char **argv)
         }
     }
 
+    if (result.bfi.total_clusters != 0 && result.bfi.allocated_clusters != 0) {
+        printf("%" PRId64 "/%" PRId64 "= %0.2f%% allocated, %0.2f%% fragmented\n",
+        result.bfi.allocated_clusters, result.bfi.total_clusters,
+        result.bfi.allocated_clusters * 100.0 / result.bfi.total_clusters,
+        result.bfi.fragmented_clusters * 100.0 / result.bfi.allocated_clusters);
+    }
+
     bdrv_delete(bs);
 
     if (ret < 0 || result.check_errors) {
@@ -716,7 +723,7 @@ static int img_convert(int argc, char **argv)
         ret = -1;
         goto out;
     }
-        
+
     qemu_progress_init(progress, 2.0);
     qemu_progress_print(0, 100);
 
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 29/46] qed: image fragmentation statistics
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (24 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 28/46] qemu-img: add image fragmentation statistics Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 30/46] qemu-img: add dirty flag status Kevin Wolf
                   ` (13 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qed-check.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/block/qed-check.c b/block/qed-check.c
index e4a49ce..94327ff 100644
--- a/block/qed-check.c
+++ b/block/qed-check.c
@@ -68,6 +68,7 @@ static unsigned int qed_check_l2_table(QEDCheck *check, QEDTable *table)
 {
     BDRVQEDState *s = check->s;
     unsigned int i, num_invalid = 0;
+    uint64_t last_offset = 0;
 
     for (i = 0; i < s->table_nelems; i++) {
         uint64_t offset = table->offsets[i];
@@ -76,6 +77,11 @@ static unsigned int qed_check_l2_table(QEDCheck *check, QEDTable *table)
             qed_offset_is_zero_cluster(offset)) {
             continue;
         }
+        check->result->bfi.allocated_clusters++;
+        if (last_offset && (last_offset + s->header.cluster_size != offset)) {
+            check->result->bfi.fragmented_clusters++;
+        }
+        last_offset = offset;
 
         /* Detect invalid cluster offset */
         if (!qed_check_cluster_offset(s, offset)) {
@@ -200,6 +206,9 @@ int qed_check(BDRVQEDState *s, BdrvCheckResult *result, bool fix)
     check.used_clusters = g_malloc0(((check.nclusters + 31) / 32) *
                                        sizeof(check.used_clusters[0]));
 
+    check.result->bfi.total_clusters =
+        (s->header.image_size + s->header.cluster_size - 1) /
+            s->header.cluster_size;
     ret = qed_check_l1_table(&check, s->l1_table);
     if (ret == 0) {
         /* Only check for leaks if entire image was scanned successfully */
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 30/46] qemu-img: add dirty flag status
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (25 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 29/46] qed: " Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 31/46] qed: track " Kevin Wolf
                   ` (12 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>

Some block drivers can verify their image files are clean or not. So we can show
it while using "qemu-img info".

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.h    |    1 +
 qemu-img.c |    3 +++
 2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/block.h b/block.h
index ea12f5d..4e99e18 100644
--- a/block.h
+++ b/block.h
@@ -15,6 +15,7 @@ typedef struct BlockDriverInfo {
     int cluster_size;
     /* offset at which the VM state can be saved (0 if not possible) */
     int64_t vm_state_offset;
+    bool is_dirty;
 } BlockDriverInfo;
 
 typedef struct BlockFragInfo {
diff --git a/qemu-img.c b/qemu-img.c
index 4de48ba..6a61ca8 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1132,6 +1132,9 @@ static int img_info(int argc, char **argv)
         if (bdi.cluster_size != 0) {
             printf("cluster_size: %d\n", bdi.cluster_size);
         }
+        if (bdi.is_dirty) {
+            printf("cleanly shut down: no\n");
+        }
     }
     bdrv_get_backing_filename(bs, backing_filename, sizeof(backing_filename));
     if (backing_filename[0] != '\0') {
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 31/46] qed: track dirty flag status
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (26 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 30/46] qemu-img: add dirty flag status Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 32/46] block: bdrv_append() fixes Kevin Wolf
                   ` (11 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qed.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index d6c1580..19d87f3 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -1434,6 +1434,7 @@ static int bdrv_qed_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
 
     memset(bdi, 0, sizeof(*bdi));
     bdi->cluster_size = s->header.cluster_size;
+    bdi->is_dirty = s->header.features & QED_F_NEED_CHECK;
     return 0;
 }
 
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 32/46] block: bdrv_append() fixes
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (27 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 31/46] qed: track " Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 33/46] sheepdog: implement SD_OP_FLUSH_VDI operation Kevin Wolf
                   ` (10 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Jeff Cody <jcody@redhat.com>

A few fixups for bdrv_append():

The new bs (bs_new) passed into bdrv_append() should be anonymous.  Rather
than call bdrv_make_anon() to enforce this, use an assert to catch when a caller
is passing in a bs_new that is not anonymous.

Also, the new top layer should have its backing_format reflect the original
top's format.

And last, after the swap of bs contents, the device_name will have been copied
down. This needs to be cleared to reflect the anonymity of the bs that was
pushed down.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c |   13 +++++++++++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 33630eb..b3117ef 100644
--- a/block.c
+++ b/block.c
@@ -892,14 +892,16 @@ void bdrv_make_anon(BlockDriverState *bs)
  * This will modify the BlockDriverState fields, and swap contents
  * between bs_new and bs_top. Both bs_new and bs_top are modified.
  *
+ * bs_new is required to be anonymous.
+ *
  * This function does not create any image files.
  */
 void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top)
 {
     BlockDriverState tmp;
 
-    /* the new bs must not be in bdrv_states */
-    bdrv_make_anon(bs_new);
+    /* bs_new must be anonymous */
+    assert(bs_new->device_name[0] == '\0');
 
     tmp = *bs_new;
 
@@ -944,11 +946,18 @@ void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top)
      * swapping bs_new and bs_top contents. */
     tmp.backing_hd = bs_new;
     pstrcpy(tmp.backing_file, sizeof(tmp.backing_file), bs_top->filename);
+    bdrv_get_format(bs_top, tmp.backing_format, sizeof(tmp.backing_format));
 
     /* swap contents of the fixed new bs and the current top */
     *bs_new = *bs_top;
     *bs_top = tmp;
 
+    /* device_name[] was carried over from the old bs_top.  bs_new
+     * shouldn't be in bdrv_states, so we need to make device_name[]
+     * reflect the anonymity of bs_new
+     */
+    bs_new->device_name[0] = '\0';
+
     /* clear the copied fields in the new backing file */
     bdrv_detach_dev(bs_new, bs_new->dev);
 
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 33/46] sheepdog: implement SD_OP_FLUSH_VDI operation
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (28 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 32/46] block: bdrv_append() fixes Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 34/46] sheepdog: fix send req helpers Kevin Wolf
                   ` (9 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Liu Yuan <tailai.ly@taobao.com>

Flush operation is supposed to flush the write-back cache of
sheepdog cluster.

By issuing flush operation, we can assure the Guest of data
reaching the sheepdog cluster storage.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/sheepdog.c |  142 ++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 files changed, 128 insertions(+), 14 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 00276f6f..1248534 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -32,9 +32,11 @@
 #define SD_OP_RELEASE_VDI    0x13
 #define SD_OP_GET_VDI_INFO   0x14
 #define SD_OP_READ_VDIS      0x15
+#define SD_OP_FLUSH_VDI      0x16
 
 #define SD_FLAG_CMD_WRITE    0x01
 #define SD_FLAG_CMD_COW      0x02
+#define SD_FLAG_CMD_CACHE    0x04
 
 #define SD_RES_SUCCESS       0x00 /* Success */
 #define SD_RES_UNKNOWN       0x01 /* Unknown error */
@@ -293,10 +295,12 @@ typedef struct BDRVSheepdogState {
 
     char name[SD_MAX_VDI_LEN];
     int is_snapshot;
+    uint8_t cache_enabled;
 
     char *addr;
     char *port;
     int fd;
+    int flush_fd;
 
     CoMutex lock;
     Coroutine *co_send;
@@ -516,6 +520,23 @@ static int send_req(int sockfd, SheepdogReq *hdr, void *data,
     return ret;
 }
 
+static int send_co_req(int sockfd, SheepdogReq *hdr, void *data,
+                       unsigned int *wlen)
+{
+    int ret;
+
+    ret = qemu_co_send(sockfd, hdr, sizeof(*hdr));
+    if (ret < sizeof(*hdr)) {
+        error_report("failed to send a req, %s", strerror(errno));
+    }
+
+    ret = qemu_co_send(sockfd, data, *wlen);
+    if (ret < *wlen) {
+        error_report("failed to send a req, %s", strerror(errno));
+    }
+
+    return ret;
+}
 static int do_req(int sockfd, SheepdogReq *hdr, void *data,
                   unsigned int *wlen, unsigned int *rlen)
 {
@@ -550,6 +571,40 @@ out:
     return ret;
 }
 
+static int do_co_req(int sockfd, SheepdogReq *hdr, void *data,
+                     unsigned int *wlen, unsigned int *rlen)
+{
+    int ret;
+
+    socket_set_block(sockfd);
+    ret = send_co_req(sockfd, hdr, data, wlen);
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = qemu_co_recv(sockfd, hdr, sizeof(*hdr));
+    if (ret < sizeof(*hdr)) {
+        error_report("failed to get a rsp, %s", strerror(errno));
+        goto out;
+    }
+
+    if (*rlen > hdr->data_length) {
+        *rlen = hdr->data_length;
+    }
+
+    if (*rlen) {
+        ret = qemu_co_recv(sockfd, data, *rlen);
+        if (ret < *rlen) {
+            error_report("failed to get the data, %s", strerror(errno));
+            goto out;
+        }
+    }
+    ret = 0;
+out:
+    socket_set_nonblock(sockfd);
+    return ret;
+}
+
 static int coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
                            struct iovec *iov, int niov, int create,
                            enum AIOCBState aiocb_type);
@@ -900,6 +955,10 @@ static int coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
         hdr.flags = SD_FLAG_CMD_WRITE | flags;
     }
 
+    if (s->cache_enabled) {
+        hdr.flags |= SD_FLAG_CMD_CACHE;
+    }
+
     hdr.oid = oid;
     hdr.cow_oid = old_oid;
     hdr.copies = s->inode.nr_copies;
@@ -942,7 +1001,7 @@ static int coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
 
 static int read_write_object(int fd, char *buf, uint64_t oid, int copies,
                              unsigned int datalen, uint64_t offset,
-                             int write, int create)
+                             int write, int create, uint8_t cache)
 {
     SheepdogObjReq hdr;
     SheepdogObjRsp *rsp = (SheepdogObjRsp *)&hdr;
@@ -965,6 +1024,11 @@ static int read_write_object(int fd, char *buf, uint64_t oid, int copies,
         rlen = datalen;
         hdr.opcode = SD_OP_READ_OBJ;
     }
+
+    if (cache) {
+        hdr.flags |= SD_FLAG_CMD_CACHE;
+    }
+
     hdr.oid = oid;
     hdr.data_length = datalen;
     hdr.offset = offset;
@@ -986,15 +1050,18 @@ static int read_write_object(int fd, char *buf, uint64_t oid, int copies,
 }
 
 static int read_object(int fd, char *buf, uint64_t oid, int copies,
-                       unsigned int datalen, uint64_t offset)
+                       unsigned int datalen, uint64_t offset, uint8_t cache)
 {
-    return read_write_object(fd, buf, oid, copies, datalen, offset, 0, 0);
+    return read_write_object(fd, buf, oid, copies, datalen, offset, 0, 0,
+                             cache);
 }
 
 static int write_object(int fd, char *buf, uint64_t oid, int copies,
-                        unsigned int datalen, uint64_t offset, int create)
+                        unsigned int datalen, uint64_t offset, int create,
+                        uint8_t cache)
 {
-    return read_write_object(fd, buf, oid, copies, datalen, offset, 1, create);
+    return read_write_object(fd, buf, oid, copies, datalen, offset, 1, create,
+                             cache);
 }
 
 static int sd_open(BlockDriverState *bs, const char *filename, int flags)
@@ -1026,6 +1093,15 @@ static int sd_open(BlockDriverState *bs, const char *filename, int flags)
         goto out;
     }
 
+    if (flags & BDRV_O_CACHE_WB) {
+        s->cache_enabled = 1;
+        s->flush_fd = connect_to_sdog(s->addr, s->port);
+        if (s->flush_fd < 0) {
+            error_report("failed to connect");
+            goto out;
+        }
+    }
+
     if (snapid) {
         dprintf("%" PRIx32 " snapshot inode was open.\n", vid);
         s->is_snapshot = 1;
@@ -1038,7 +1114,8 @@ static int sd_open(BlockDriverState *bs, const char *filename, int flags)
     }
 
     buf = g_malloc(SD_INODE_SIZE);
-    ret = read_object(fd, buf, vid_to_vdi_oid(vid), 0, SD_INODE_SIZE, 0);
+    ret = read_object(fd, buf, vid_to_vdi_oid(vid), 0, SD_INODE_SIZE, 0,
+                      s->cache_enabled);
 
     closesocket(fd);
 
@@ -1272,6 +1349,9 @@ static void sd_close(BlockDriverState *bs)
 
     qemu_aio_set_fd_handler(s->fd, NULL, NULL, NULL, NULL, NULL);
     closesocket(s->fd);
+    if (s->cache_enabled) {
+        closesocket(s->flush_fd);
+    }
     g_free(s->addr);
 }
 
@@ -1305,7 +1385,7 @@ static int sd_truncate(BlockDriverState *bs, int64_t offset)
     datalen = SD_INODE_SIZE - sizeof(s->inode.data_vdi_id);
     s->inode.vdi_size = offset;
     ret = write_object(fd, (char *)&s->inode, vid_to_vdi_oid(s->inode.vdi_id),
-                       s->inode.nr_copies, datalen, 0, 0);
+                       s->inode.nr_copies, datalen, 0, 0, s->cache_enabled);
     close(fd);
 
     if (ret < 0) {
@@ -1387,7 +1467,7 @@ static int sd_create_branch(BDRVSheepdogState *s)
     }
 
     ret = read_object(fd, buf, vid_to_vdi_oid(vid), s->inode.nr_copies,
-                      SD_INODE_SIZE, 0);
+                      SD_INODE_SIZE, 0, s->cache_enabled);
 
     closesocket(fd);
 
@@ -1575,6 +1655,36 @@ static coroutine_fn int sd_co_readv(BlockDriverState *bs, int64_t sector_num,
     return acb->ret;
 }
 
+static int coroutine_fn sd_co_flush_to_disk(BlockDriverState *bs)
+{
+    BDRVSheepdogState *s = bs->opaque;
+    SheepdogObjReq hdr = { 0 };
+    SheepdogObjRsp *rsp = (SheepdogObjRsp *)&hdr;
+    SheepdogInode *inode = &s->inode;
+    int ret;
+    unsigned int wlen = 0, rlen = 0;
+
+    if (!s->cache_enabled) {
+        return 0;
+    }
+
+    hdr.opcode = SD_OP_FLUSH_VDI;
+    hdr.oid = vid_to_vdi_oid(inode->vdi_id);
+
+    ret = do_co_req(s->flush_fd, (SheepdogReq *)&hdr, NULL, &wlen, &rlen);
+    if (ret) {
+        error_report("failed to send a request to the sheep");
+        return ret;
+    }
+
+    if (rsp->result != SD_RES_SUCCESS) {
+        error_report("%s", sd_strerror(rsp->result));
+        return -EIO;
+    }
+
+    return 0;
+}
+
 static int sd_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
 {
     BDRVSheepdogState *s = bs->opaque;
@@ -1610,7 +1720,7 @@ static int sd_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
     }
 
     ret = write_object(fd, (char *)&s->inode, vid_to_vdi_oid(s->inode.vdi_id),
-                       s->inode.nr_copies, datalen, 0, 0);
+                       s->inode.nr_copies, datalen, 0, 0, s->cache_enabled);
     if (ret < 0) {
         error_report("failed to write snapshot's inode.");
         ret = -EIO;
@@ -1629,7 +1739,7 @@ static int sd_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
     inode = (SheepdogInode *)g_malloc(datalen);
 
     ret = read_object(fd, (char *)inode, vid_to_vdi_oid(new_vid),
-                      s->inode.nr_copies, datalen, 0);
+                      s->inode.nr_copies, datalen, 0, s->cache_enabled);
 
     if (ret < 0) {
         error_report("failed to read new inode info. %s", strerror(errno));
@@ -1684,7 +1794,7 @@ static int sd_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
 
     buf = g_malloc(SD_INODE_SIZE);
     ret = read_object(fd, buf, vid_to_vdi_oid(vid), s->inode.nr_copies,
-                      SD_INODE_SIZE, 0);
+                      SD_INODE_SIZE, 0, s->cache_enabled);
 
     closesocket(fd);
 
@@ -1779,7 +1889,8 @@ static int sd_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
 
         /* we don't need to read entire object */
         ret = read_object(fd, (char *)&inode, vid_to_vdi_oid(vid),
-                          0, SD_INODE_SIZE - sizeof(inode.data_vdi_id), 0);
+                          0, SD_INODE_SIZE - sizeof(inode.data_vdi_id), 0,
+                          s->cache_enabled);
 
         if (ret) {
             continue;
@@ -1835,10 +1946,12 @@ static int do_load_save_vmstate(BDRVSheepdogState *s, uint8_t *data,
         create = (offset == 0);
         if (load) {
             ret = read_object(fd, (char *)data, vmstate_oid,
-                              s->inode.nr_copies, data_len, offset);
+                              s->inode.nr_copies, data_len, offset,
+                              s->cache_enabled);
         } else {
             ret = write_object(fd, (char *)data, vmstate_oid,
-                               s->inode.nr_copies, data_len, offset, create);
+                               s->inode.nr_copies, data_len, offset, create,
+                               s->cache_enabled);
         }
 
         if (ret < 0) {
@@ -1904,6 +2017,7 @@ BlockDriver bdrv_sheepdog = {
 
     .bdrv_co_readv  = sd_co_readv,
     .bdrv_co_writev = sd_co_writev,
+    .bdrv_co_flush_to_disk  = sd_co_flush_to_disk,
 
     .bdrv_snapshot_create   = sd_snapshot_create,
     .bdrv_snapshot_goto     = sd_snapshot_goto,
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 34/46] sheepdog: fix send req helpers
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (29 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 33/46] sheepdog: implement SD_OP_FLUSH_VDI operation Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 36/46] qemu-iotests: Test unknown qcow2 header extensions Kevin Wolf
                   ` (8 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Liu Yuan <tailai.ly@taobao.com>

We should return if reading of the header fails.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Acked-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/sheepdog.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 1248534..3eaf625 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -510,6 +510,7 @@ static int send_req(int sockfd, SheepdogReq *hdr, void *data,
     ret = qemu_send_full(sockfd, hdr, sizeof(*hdr), 0);
     if (ret < sizeof(*hdr)) {
         error_report("failed to send a req, %s", strerror(errno));
+        return ret;
     }
 
     ret = qemu_send_full(sockfd, data, *wlen, 0);
@@ -528,6 +529,7 @@ static int send_co_req(int sockfd, SheepdogReq *hdr, void *data,
     ret = qemu_co_send(sockfd, hdr, sizeof(*hdr));
     if (ret < sizeof(*hdr)) {
         error_report("failed to send a req, %s", strerror(errno));
+        return ret;
     }
 
     ret = qemu_co_send(sockfd, data, *wlen);
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 36/46] qemu-iotests: Test unknown qcow2 header extensions
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (30 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 34/46] sheepdog: fix send req helpers Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 37/46] qemu-iotests: Fix call syntax for qemu-img Kevin Wolf
                   ` (7 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

The immportant thing here is that header extensions don't get silently
dropped when the header is rewritten, e.g. during a rebase.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 tests/qemu-iotests/031     |   72 +++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/031.out |   76 ++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/group   |    1 +
 3 files changed, 149 insertions(+), 0 deletions(-)
 create mode 100755 tests/qemu-iotests/031
 create mode 100644 tests/qemu-iotests/031.out

diff --git a/tests/qemu-iotests/031 b/tests/qemu-iotests/031
new file mode 100755
index 0000000..6365f28
--- /dev/null
+++ b/tests/qemu-iotests/031
@@ -0,0 +1,72 @@
+#!/bin/bash
+#
+# Test that all qcow2 header extensions survive a header rewrite
+#
+# Copyright (C) 2011 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=kwolf@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+
+_cleanup()
+{
+	_cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+. ./common.pattern
+
+# This tests qcow2-specific low-level functionality
+_supported_fmt qcow2
+_supported_proto generic
+_supported_os Linux
+
+CLUSTER_SIZE=65536
+echo
+echo === Create image with unknown header extension ===
+echo
+_make_test_img 64M
+./qcow2.py $TEST_IMG add-header-ext 0x12345678 "This is a test header extension"
+./qcow2.py $TEST_IMG dump-header
+_check_test_img
+
+echo
+echo === Rewrite header with no backing file ===
+echo
+$QEMU_IMG rebase -u -b "" $TEST_IMG
+./qcow2.py $TEST_IMG dump-header
+_check_test_img
+
+echo
+echo === Add a backing file and format ===
+echo
+$QEMU_IMG rebase -u -b "/some/backing/file/path" -F host_device $TEST_IMG
+./qcow2.py $TEST_IMG dump-header
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/031.out b/tests/qemu-iotests/031.out
new file mode 100644
index 0000000..0f1bf68
--- /dev/null
+++ b/tests/qemu-iotests/031.out
@@ -0,0 +1,76 @@
+QA output created by 031
+
+=== Create image with unknown header extension ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 cluster_size=65536 
+magic                     0x514649fb
+version                   2
+backing_file_offset       0x0
+backing_file_size         0x0
+cluster_bits              16
+size                      67108864
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              0
+snapshot_offset           0x0
+
+Header extension:
+magic                     0x12345678
+length                    31
+data                      'This is a test header extension'
+
+No errors were found on the image.
+
+=== Rewrite header with no backing file ===
+
+magic                     0x514649fb
+version                   2
+backing_file_offset       0x0
+backing_file_size         0x0
+cluster_bits              16
+size                      67108864
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              0
+snapshot_offset           0x0
+
+Header extension:
+magic                     0x12345678
+length                    31
+data                      'This is a test header extension'
+
+No errors were found on the image.
+
+=== Add a backing file and format ===
+
+magic                     0x514649fb
+version                   2
+backing_file_offset       0x90
+backing_file_size         0x17
+cluster_bits              16
+size                      67108864
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              0
+snapshot_offset           0x0
+
+Header extension:
+magic                     0xe2792aca
+length                    11
+data                      'host_device'
+
+Header extension:
+magic                     0x12345678
+length                    31
+data                      'This is a test header extension'
+
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index b549f10..1742ede 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -37,3 +37,4 @@
 028 rw backing auto
 029 rw auto quick
 030 rw auto
+031 rw auto quick
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 37/46] qemu-iotests: Fix call syntax for qemu-img
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (31 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 36/46] qemu-iotests: Test unknown qcow2 header extensions Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 38/46] qemu-iotests: Fix call syntax for qemu-io Kevin Wolf
                   ` (6 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Stefan Weil <sw@weilnetz.de>

qemu-img requires first options, then file name, then size.

GNU getopt also allows options at the end, but POSIX getopt
doesn't. Try "export POSIXLY_CORRECT=y" to get the POSIX
behaviour with GNU getopt, too.

Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 tests/qemu-iotests/common.rc |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index 26811ca..4cb8dae 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -57,16 +57,21 @@ _make_test_img()
 {
     # extra qemu-img options can be added by tests
     # at least one argument (the image size) needs to be added
-    local extra_img_options=$*
+    local extra_img_options=""
     local cluster_size_filter="s# cluster_size=[0-9]\\+##g"
+    local image_size=$*
 
+    if [ "$1" = "-b" ]; then
+        extra_img_options="$1 $2"
+        image_size=$3
+    fi
     if [ \( "$IMGFMT" = "qcow2" -o "$IMGFMT" = "qed" \) -a -n "$CLUSTER_SIZE" ]; then
         extra_img_options="-o cluster_size=$CLUSTER_SIZE $extra_img_options"
         cluster_size_filter=""
     fi
 
     # XXX(hch): have global image options?
-    $QEMU_IMG create -f $IMGFMT $TEST_IMG $extra_img_options | \
+    $QEMU_IMG create -f $IMGFMT $extra_img_options $TEST_IMG $image_size | \
     	sed -e "s#$IMGPROTO:$TEST_DIR#TEST_DIR#g" | \
     	sed -e "s#$TEST_DIR#TEST_DIR#g" | \
     	sed -e "s#$IMGFMT#IMGFMT#g" | \
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 38/46] qemu-iotests: Fix call syntax for qemu-io
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (32 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 37/46] qemu-iotests: Fix call syntax for qemu-img Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 40/46] block: Add new BDRV_O_INCOMING flag to notice incoming live migration Kevin Wolf
                   ` (5 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Stefan Weil <sw@weilnetz.de>

qemu-io requires options first, then fixed parameters.

GNU getopt also allows options at the end, but POSIX getopt
doesn't. Try "export POSIXLY_CORRECT=y" to get the POSIX
behaviour with GNU getopt, too.

Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 tests/qemu-iotests/009 |    4 ++--
 tests/qemu-iotests/010 |    6 +++---
 tests/qemu-iotests/011 |    2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/009 b/tests/qemu-iotests/009
index f7262b5..25368c8 100755
--- a/tests/qemu-iotests/009
+++ b/tests/qemu-iotests/009
@@ -53,10 +53,10 @@ _make_test_img $size
 echo
 echo "creating pattern"
 $QEMU_IO \
-	-c "write 2048k 4k -P 65" \
+	-c "write -P 65 2048k 4k" \
 	-c "write 4k 4k" \
 	-c "write 9M 4k" \
-	-c "read 2044k 8k -P 65 -s 4k -l 4k" \
+	-c "read -P 65 -s 4k -l 4k 2044k 8k" \
 $TEST_IMG | _filter_qemu_io
 
 echo
diff --git a/tests/qemu-iotests/010 b/tests/qemu-iotests/010
index e3205aa..7b57929 100755
--- a/tests/qemu-iotests/010
+++ b/tests/qemu-iotests/010
@@ -53,11 +53,11 @@ _make_test_img $size
 echo
 echo "creating pattern"
 $QEMU_IO \
-	-c "write 2048k 4k -P 165" \
+	-c "write -P 165 2048k 4k" \
 	-c "write 64k 4k" \
 	-c "write 9M 4k" \
-	-c "write 2044k 4k -P 165" \
-	-c "write 8M 4k -P 99" \
+	-c "write -P 165 2044k 4k" \
+	-c "write -P 99 8M 4k" \
 	-c "read -P 165 2044k 8k" \
 $TEST_IMG | _filter_qemu_io
 
diff --git a/tests/qemu-iotests/011 b/tests/qemu-iotests/011
index 59df1ae..b03df68 100755
--- a/tests/qemu-iotests/011
+++ b/tests/qemu-iotests/011
@@ -60,7 +60,7 @@ for i in `seq 1 10`; do
     # Note that we filter away the actual offset.  That's because qemu
     # may re-order the two aio requests.  We only want to make sure the
     # filesystem isn't corrupted afterwards anyway.
-    $QEMU_IO $TEST_IMG -c "aio_write $off1 1M" -c "aio_write $off2 1M" | \
+    $QEMU_IO -c "aio_write $off1 1M" -c "aio_write $off2 1M" $TEST_IMG | \
     	_filter_qemu_io | \
 	sed -e 's/bytes at offset [0-9]*/bytes at offset XXX/g'
 done
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 40/46] block: Add new BDRV_O_INCOMING flag to notice incoming live migration
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (33 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 38/46] qemu-iotests: Fix call syntax for qemu-io Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 42/46] blockdev: open images with BDRV_O_INCOMING on " Kevin Wolf
                   ` (4 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Benoît Canet <benoit.canet@gmail.com>

>From original patch with Patchwork-id: 31110 by
Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

"Add a flag to indicate that incoming migration is pending and care needs
to be taken for data consistency.  Block drivers should not modify the
image file before incoming migration is complete since the migration
source host is still using the image file."

The rationale for not using bdrv->read_only is the following.

"Unfortunately this is not possible because too many other places in QEMU
test bdrv_is_read_only() and use it for their own evil purposes.  For
example, ide_init_drive() will error out because read-only harddisks are
not supported.  We're mixing guest and host side read-only concepts so
this simpler alternative does not work."

Signed-off-by: Benoit Canet <benoit.canet@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/block.h b/block.h
index 4e99e18..5151dea 100644
--- a/block.h
+++ b/block.h
@@ -78,6 +78,7 @@ typedef struct BlockDevOps {
 #define BDRV_O_NO_BACKING  0x0100 /* don't open the backing file */
 #define BDRV_O_NO_FLUSH    0x0200 /* disable flushing on this disk */
 #define BDRV_O_COPY_ON_READ 0x0400 /* copy read backing sectors into image */
+#define BDRV_O_INCOMING    0x0800  /* consistency hint for incoming migration */
 
 #define BDRV_O_CACHE_MASK  (BDRV_O_NOCACHE | BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH)
 
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 42/46] blockdev: open images with BDRV_O_INCOMING on incoming live migration
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (34 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 40/46] block: Add new BDRV_O_INCOMING flag to notice incoming live migration Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 43/46] qed: add bdrv_invalidate_cache to be called after " Kevin Wolf
                   ` (3 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Benoît Canet <benoit.canet@gmail.com>

Open images with BDRV_O_INCOMING in order to inform block drivers
that an incoming live migration is coming.

Signed-off-by: Benoit Canet <benoit.canet@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 blockdev.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 4d17486..0c2440e 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -595,6 +595,10 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi)
         bdrv_flags |= BDRV_O_COPY_ON_READ;
     }
 
+    if (runstate_check(RUN_STATE_INMIGRATE)) {
+        bdrv_flags |= BDRV_O_INCOMING;
+    }
+
     if (media == MEDIA_CDROM) {
         /* CDROM is fine for any interface, don't check.  */
         ro = 1;
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 43/46] qed: add bdrv_invalidate_cache to be called after incoming live migration
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (35 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 42/46] blockdev: open images with BDRV_O_INCOMING on " Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 44/46] migration: clear BDRV_O_INCOMING flags on end of " Kevin Wolf
                   ` (2 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Benoît Canet <benoit.canet@gmail.com>

The QED image is reopened to flush metadata and check consistency.

Signed-off-by: Benoit Canet <benoit.canet@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qed.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index 19d87f3..a5e9d57 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -1510,6 +1510,15 @@ static int bdrv_qed_change_backing_file(BlockDriverState *bs,
     return ret;
 }
 
+static void bdrv_qed_invalidate_cache(BlockDriverState *bs)
+{
+    BDRVQEDState *s = bs->opaque;
+
+    bdrv_qed_close(bs);
+    memset(s, 0, sizeof(BDRVQEDState));
+    bdrv_qed_open(bs, bs->open_flags);
+}
+
 static int bdrv_qed_check(BlockDriverState *bs, BdrvCheckResult *result)
 {
     BDRVQEDState *s = bs->opaque;
@@ -1561,6 +1570,7 @@ static BlockDriver bdrv_qed = {
     .bdrv_getlength           = bdrv_qed_getlength,
     .bdrv_get_info            = bdrv_qed_get_info,
     .bdrv_change_backing_file = bdrv_qed_change_backing_file,
+    .bdrv_invalidate_cache    = bdrv_qed_invalidate_cache,
     .bdrv_check               = bdrv_qed_check,
 };
 
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 44/46] migration: clear BDRV_O_INCOMING flags on end of incoming live migration
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (36 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 43/46] qed: add bdrv_invalidate_cache to be called after " Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 45/46] qed: honor BDRV_O_INCOMING for " Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 46/46] qed: remove incoming live migration blocker Kevin Wolf
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Benoît Canet <benoit.canet@gmail.com>

Signed-off-by: Benoît Canet <benoit.canet@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 migration.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/migration.c b/migration.c
index 8c119ba..94f7839 100644
--- a/migration.c
+++ b/migration.c
@@ -91,6 +91,7 @@ void process_incoming_migration(QEMUFile *f)
     qemu_announce_self();
     DPRINTF("successfully loaded vm state\n");
 
+    bdrv_clear_incoming_migration_all();
     /* Make sure all file formats flush their mutable metadata */
     bdrv_invalidate_cache_all();
 
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 45/46] qed: honor BDRV_O_INCOMING for incoming live migration
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (37 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 44/46] migration: clear BDRV_O_INCOMING flags on end of " Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 46/46] qed: remove incoming live migration blocker Kevin Wolf
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Benoît Canet <benoit.canet@gmail.com>

>From original commit with Patchwork-id: 31108 by
Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

"The QED image format includes a file header bit to mark images dirty.
QED normally checks dirty images on open and fixes inconsistent
metadata.  This is undesirable during live migration since the dirty bit
may be set if the source host is modifying the image file.  The check
should be postponed until migration completes.

Skip operations that modify the image file if the BDRV_O_INCOMING flag
is set."

Signed-off-by: Benoit Canet <benoit.canet@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qed.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index a5e9d57..aea2772 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -450,7 +450,7 @@ static int bdrv_qed_open(BlockDriverState *bs, int flags)
      * feature is no longer valid.
      */
     if ((s->header.autoclear_features & ~QED_AUTOCLEAR_FEATURE_MASK) != 0 &&
-        !bdrv_is_read_only(bs->file)) {
+        !bdrv_is_read_only(bs->file) && !(flags & BDRV_O_INCOMING)) {
         s->header.autoclear_features &= QED_AUTOCLEAR_FEATURE_MASK;
 
         ret = qed_write_header_sync(s);
@@ -477,7 +477,8 @@ static int bdrv_qed_open(BlockDriverState *bs, int flags)
          * potentially inconsistent images to be opened read-only.  This can
          * aid data recovery from an otherwise inconsistent image.
          */
-        if (!bdrv_is_read_only(bs->file)) {
+        if (!bdrv_is_read_only(bs->file) &&
+            !(flags & BDRV_O_INCOMING)) {
             BdrvCheckResult result = {0};
 
             ret = qed_check(s, &result, true);
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [Qemu-devel] [PATCH 46/46] qed: remove incoming live migration blocker
  2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
                   ` (38 preceding siblings ...)
  2012-04-05 15:52 ` [Qemu-devel] [PATCH 45/46] qed: honor BDRV_O_INCOMING for " Kevin Wolf
@ 2012-04-05 15:52 ` Kevin Wolf
  39 siblings, 0 replies; 41+ messages in thread
From: Kevin Wolf @ 2012-04-05 15:52 UTC (permalink / raw)
  To: anthony; +Cc: kwolf, qemu-devel

From: Benoît Canet <benoit.canet@gmail.com>

Signed-off-by: Benoit Canet <benoit.canet@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qed.c |    9 ---------
 block/qed.h |    2 --
 2 files changed, 0 insertions(+), 11 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index aea2772..366cde7 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -498,12 +498,6 @@ static int bdrv_qed_open(BlockDriverState *bs, int flags)
     s->need_check_timer = qemu_new_timer_ns(vm_clock,
                                             qed_need_check_timer_cb, s);
 
-    error_set(&s->migration_blocker,
-              QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
-              "qed", bs->device_name, "live migration");
-    migrate_add_blocker(s->migration_blocker);
-
-
 out:
     if (ret) {
         qed_free_l2_cache(&s->l2_cache);
@@ -516,9 +510,6 @@ static void bdrv_qed_close(BlockDriverState *bs)
 {
     BDRVQEDState *s = bs->opaque;
 
-    migrate_del_blocker(s->migration_blocker);
-    error_free(s->migration_blocker);
-
     qed_cancel_need_check_timer(s);
     qemu_free_timer(s->need_check_timer);
 
diff --git a/block/qed.h b/block/qed.h
index 62624a1..c716772 100644
--- a/block/qed.h
+++ b/block/qed.h
@@ -169,8 +169,6 @@ typedef struct {
 
     /* Periodic flush and clear need check flag */
     QEMUTimer *need_check_timer;
-
-    Error *migration_blocker;
 } BDRVQEDState;
 
 enum {
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2012-04-05 16:38 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-05 15:51 [Qemu-devel] [PULL 00/46] Block patches Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 01/46] trace-events: Rename 'next' argument Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 02/46] tracetool: Forbid argument name 'next' Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 03/46] qcow2: Remove unused parameter in get_cluster_table() Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 04/46] qemu-io: add option to enable tracing Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 05/46] block: push recursive flushing up from drivers Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 07/46] ide: IDENTIFY word 86 bit 14 is reserved Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 08/46] ide: Add "model=s" qdev option Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 09/46] ide: Change serial number strncpy() to pstrcpy() Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 10/46] ide: Adds wwn=hex qdev option Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 11/46] block/vpc: write checksum back to footer after check Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 12/46] qerror: fix QERR_PROPERTY_VALUE_OUT_OF_RANGE description Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 13/46] qdev: add blocksize property type Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 14/46] block: enforce constraints on block size properties Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 15/46] vdi: basic conversion to coroutines Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 16/46] vdi: move end-of-I/O handling at the end Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 17/46] vdi: merge aio_read_cb and aio_write_cb into callers Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 18/46] vdi: move aiocb fields to locals Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 19/46] vdi: leave bounce buffering to block layer Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 20/46] vdi: do not create useless iovecs Kevin Wolf
2012-04-05 15:51 ` [Qemu-devel] [PATCH 21/46] vdi: change goto to loop Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 22/46] Use DMADirection type for dma_bdrv_io Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 23/46] block: disable I/O throttling on sync api Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 25/46] block: fix streaming/closing race Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 27/46] block: document job API Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 28/46] qemu-img: add image fragmentation statistics Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 29/46] qed: " Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 30/46] qemu-img: add dirty flag status Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 31/46] qed: track " Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 32/46] block: bdrv_append() fixes Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 33/46] sheepdog: implement SD_OP_FLUSH_VDI operation Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 34/46] sheepdog: fix send req helpers Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 36/46] qemu-iotests: Test unknown qcow2 header extensions Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 37/46] qemu-iotests: Fix call syntax for qemu-img Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 38/46] qemu-iotests: Fix call syntax for qemu-io Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 40/46] block: Add new BDRV_O_INCOMING flag to notice incoming live migration Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 42/46] blockdev: open images with BDRV_O_INCOMING on " Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 43/46] qed: add bdrv_invalidate_cache to be called after " Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 44/46] migration: clear BDRV_O_INCOMING flags on end of " Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 45/46] qed: honor BDRV_O_INCOMING for " Kevin Wolf
2012-04-05 15:52 ` [Qemu-devel] [PATCH 46/46] qed: remove incoming live migration blocker Kevin Wolf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.