* [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
@ 2016-05-10  2:50 Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 01/27] block: Add BDRV_O_NO_LOCK Fam Zheng
                   ` (28 more replies)
  0 siblings, 29 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

v4: Don't lock read-only images. [Rich]

v3: The highlight is the handling of image locks during close(2) and
    bdrv_reopen(); a number of new patches have been added as a result.



Fam Zheng (27):
  block: Add BDRV_O_NO_LOCK
  qapi: Add lock-image in blockdev-add options
  blockdev: Add and parse "lock-image" option for block devices
  block: Introduce image file locking
  block: Add bdrv_image_locked
  block: Make bdrv_reopen_{commit, abort} private functions
  block: Handle image locking during reopen
  osdep: Add qemu_lock_fd and qemu_unlock_fd
  osdep: Introduce qemu_dup
  raw-posix: Use qemu_dup
  raw-posix: Implement .bdrv_lockf
  gluster: Implement .bdrv_lockf
  qemu-io: Add "-L" option for BDRV_O_NO_LOCK
  qemu-img: Add "-L" option to sub commands
  qemu-img: Update documentation of "-L" option
  qemu-nbd: Add "--no-lock/-L" option
  block: Don't lock drive-backup target image in none mode
  mirror: Disable image locking on target backing chain
  qemu-iotests: 140: Disable image lock for qemu-io access
  qemu-iotests: 046: Move version detection out from verify_io
  qemu-iotests: Wait for QEMU processes before checking image in 091
  qemu-iotests: 030: Disable image lock when checking test image
  iotests: 087: Disable image lock in cases where file is shared
  iotests: Disable image locking in 085
  tests: Use null-co:// instead of /dev/null
  block: Turn on image locking by default
  qemu-iotests: Add test case 153 for image locking

 block.c                    |  76 +++++++++++++-
 block/gluster.c            |  31 ++++++
 block/raw-posix.c          |  32 ++++--
 blockdev.c                 |  23 +++++
 include/block/block.h      |   6 +-
 include/block/block_int.h  |  12 +++
 include/qemu/osdep.h       |   3 +
 qapi/block-core.json       |   6 +-
 qemu-img-cmds.hx           |  44 ++++-----
 qemu-img.c                 |  90 +++++++++++++----
 qemu-img.texi              |   3 +
 qemu-io.c                  |  24 ++++-
 qemu-nbd.c                 |   7 +-
 qemu-nbd.texi              |   2 +
 tests/drive_del-test.c     |   2 +-
 tests/nvme-test.c          |   2 +-
 tests/qemu-iotests/030     |   2 +-
 tests/qemu-iotests/046     |  22 +++--
 tests/qemu-iotests/085     |   3 +-
 tests/qemu-iotests/087     |   6 ++
 tests/qemu-iotests/091     |   3 +
 tests/qemu-iotests/091.out |   1 +
 tests/qemu-iotests/140     |   2 +-
 tests/qemu-iotests/153     | 191 +++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/153.out | 241 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/group   |   1 +
 tests/usb-hcd-uhci-test.c  |   2 +-
 tests/usb-hcd-xhci-test.c  |   2 +-
 tests/virtio-blk-test.c    |   2 +-
 tests/virtio-scsi-test.c   |   4 +-
 util/osdep.c               | 215 +++++++++++++++++++++++++++++++++++++---
 31 files changed, 968 insertions(+), 92 deletions(-)
 create mode 100755 tests/qemu-iotests/153
 create mode 100644 tests/qemu-iotests/153.out

-- 
2.8.2

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Qemu-devel] [PATCH v4 01/27] block: Add BDRV_O_NO_LOCK
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 02/27] qapi: Add lock-image in blockdev-add options Fam Zheng
                   ` (27 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

Later patches will make the block layer automatically lock images to avoid
unexpected concurrent access to the same image, which can easily corrupt the
metadata or user data, except in a few very special cases, such as migration.

Exceptional cases like shared storage migration and testing should set the
BDRV_O_NO_LOCK flag, which indicates that the block layer should skip the
automatic locking of the image and keep the old behavior.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
---
 include/block/block.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/block/block.h b/include/block/block.h
index 3a73137..b803597 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -94,6 +94,7 @@ typedef struct HDGeometry {
                                       select an appropriate protocol driver,
                                       ignoring the format layer */
 #define BDRV_O_NO_IO       0x10000 /* don't initialize for I/O */
+#define BDRV_O_NO_LOCK     0x20000 /* don't lock image file */
 
 #define BDRV_O_CACHE_MASK  (BDRV_O_NOCACHE | BDRV_O_NO_FLUSH)
 
-- 
2.8.2

* [Qemu-devel] [PATCH v4 02/27] qapi: Add lock-image in blockdev-add options
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 01/27] block: Add BDRV_O_NO_LOCK Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 03/27] blockdev: Add and parse "lock-image" option for block devices Fam Zheng
                   ` (26 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

This allows overriding the default locking behavior when opening the image.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 qapi/block-core.json | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 1d09079..2913f3e 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2065,6 +2065,9 @@
 # @detect-zeroes: #optional detect and optimize zero writes (Since 2.1)
 #                 (default: off)
 #
+# @lock-image: #optional whether to lock the image (default: true)
+#              (Since 2.7)
+#
 # Remaining options are determined by the block driver.
 #
 # Since: 1.7
@@ -2082,7 +2085,8 @@
             '*stats-account-invalid': 'bool',
             '*stats-account-failed': 'bool',
             '*stats-intervals': ['int'],
-            '*detect-zeroes': 'BlockdevDetectZeroesOptions' },
+            '*detect-zeroes': 'BlockdevDetectZeroesOptions',
+            '*lock-image': 'bool' },
   'discriminator': 'driver',
   'data': {
       'archipelago':'BlockdevOptionsArchipelago',
-- 
2.8.2

* [Qemu-devel] [PATCH v4 03/27] blockdev: Add and parse "lock-image" option for block devices
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 01/27] block: Add BDRV_O_NO_LOCK Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 02/27] qapi: Add lock-image in blockdev-add options Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 04/27] block: Introduce image file locking Fam Zheng
                   ` (25 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

Honor the locking switch specified on the command line or via QMP, and set
the image's open flags accordingly.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index f1f520a..200fa56 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -382,6 +382,10 @@ static void extract_common_blockdev_options(QemuOpts *opts, int *bdrv_flags,
                return;
             }
         }
+
+        if (!qemu_opt_get_bool(opts, "lock-image", true)) {
+            *bdrv_flags |= BDRV_O_NO_LOCK;
+        }
     }
 
     /* disk I/O throttling */
@@ -4249,6 +4253,10 @@ QemuOptsList qemu_common_drive_opts = {
             .type = QEMU_OPT_BOOL,
             .help = "whether to account for failed I/O operations "
                     "in the statistics",
+        },{
+            .name = "lock-image",
+            .type = QEMU_OPT_BOOL,
+            .help = "whether to lock the image (default: on)",
         },
         { /* end of list */ }
     },
@@ -4278,6 +4286,10 @@ static QemuOptsList qemu_root_bds_opts = {
             .name = "detect-zeroes",
             .type = QEMU_OPT_STRING,
             .help = "try to optimize zero writes (off, on, unmap)",
+        },{
+            .name = "lock-image",
+            .type = QEMU_OPT_BOOL,
+            .help = "whether to lock the image (default: on)",
         },
         { /* end of list */ }
     },
-- 
2.8.2

* [Qemu-devel] [PATCH v4 04/27] block: Introduce image file locking
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (2 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 03/27] blockdev: Add and parse "lock-image" option for block devices Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 05/27] block: Add bdrv_image_locked Fam Zheng
                   ` (24 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

Block drivers can implement the new .bdrv_lockf operation to lock the image
in a protocol-specific way.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c                   | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 include/block/block_int.h | 12 ++++++++++++
 2 files changed, 59 insertions(+)

diff --git a/block.c b/block.c
index d4939b4..16c7d58 100644
--- a/block.c
+++ b/block.c
@@ -846,6 +846,40 @@ out:
     g_free(gen_node_name);
 }
 
+static int bdrv_lock_unlock_image_do(BlockDriverState *bs, bool lock_image)
+{
+    int cmd = BDRV_LOCKF_UNLOCK;
+    int ret;
+
+    if (bs->image_locked == lock_image) {
+        return 0;
+    } else if (!bs->drv) {
+        return -ENOMEDIUM;
+    } else if (!bs->drv->bdrv_lockf) {
+        return 0;
+    }
+    if (lock_image) {
+        cmd = bs->open_flags & BDRV_O_RDWR ? BDRV_LOCKF_RWLOCK :
+                                             BDRV_LOCKF_ROLOCK;
+    }
+    ret = bs->drv->bdrv_lockf(bs, cmd);
+    if (ret == -ENOTSUP) {
+        /* Handle it the same way as !bs->drv->bdrv_lockf */
+        ret = 0;
+    }
+    return ret;
+}
+
+static int bdrv_lock_image(BlockDriverState *bs)
+{
+    return bdrv_lock_unlock_image_do(bs, true);
+}
+
+static int bdrv_unlock_image(BlockDriverState *bs)
+{
+    return bdrv_lock_unlock_image_do(bs, false);
+}
+
 static QemuOptsList bdrv_runtime_opts = {
     .name = "bdrv_common",
     .head = QTAILQ_HEAD_INITIALIZER(bdrv_runtime_opts.head),
@@ -995,6 +1029,14 @@ static int bdrv_open_common(BlockDriverState *bs, BdrvChild *file,
         goto free_and_fail;
     }
 
+    if (!(open_flags & (BDRV_O_NO_LOCK | BDRV_O_INACTIVE))) {
+        ret = bdrv_lock_image(bs);
+        if (ret) {
+            error_setg(errp, "Failed to lock image");
+            goto free_and_fail;
+        }
+    }
+
     ret = refresh_total_sectors(bs, bs->total_sectors);
     if (ret < 0) {
         error_setg_errno(errp, -ret, "Could not refresh total sector count");
@@ -2144,6 +2186,7 @@ static void bdrv_close(BlockDriverState *bs)
     if (bs->drv) {
         BdrvChild *child, *next;
 
+        bdrv_unlock_image(bs);
         bs->drv->bdrv_close(bs);
         bs->drv = NULL;
 
@@ -3230,6 +3273,9 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
         error_setg_errno(errp, -ret, "Could not refresh total sector count");
         return;
     }
+    if (!(bs->open_flags & BDRV_O_NO_LOCK)) {
+        bdrv_lock_image(bs);
+    }
 }
 
 void bdrv_invalidate_cache_all(Error **errp)
@@ -3261,6 +3307,7 @@ static int bdrv_inactivate(BlockDriverState *bs)
         }
     }
 
+    ret = bdrv_unlock_image(bs);
+    if (ret < 0) {
+        return ret;
+    }
     bs->open_flags |= BDRV_O_INACTIVE;
     return 0;
 }
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 10d8759..ffa30b0 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -85,6 +85,12 @@ typedef struct BdrvTrackedRequest {
     struct BdrvTrackedRequest *waiting_for;
 } BdrvTrackedRequest;
 
+typedef enum {
+    BDRV_LOCKF_RWLOCK,
+    BDRV_LOCKF_ROLOCK,
+    BDRV_LOCKF_UNLOCK,
+} BdrvLockfCmd;
+
 struct BlockDriver {
     const char *format_name;
     int instance_size;
@@ -317,6 +323,11 @@ struct BlockDriver {
      */
     void (*bdrv_drain)(BlockDriverState *bs);
 
+    /**
+     * Lock/unlock the image.
+     */
+    int (*bdrv_lockf)(BlockDriverState *bs, BdrvLockfCmd cmd);
+
     QLIST_ENTRY(BlockDriver) list;
 };
 
@@ -485,6 +496,7 @@ struct BlockDriverState {
     NotifierWithReturn write_threshold_notifier;
 
     int quiesce_counter;
+    bool image_locked;
 };
 
 struct BlockBackendRootState {
-- 
2.8.2

* [Qemu-devel] [PATCH v4 05/27] block: Add bdrv_image_locked
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (3 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 04/27] block: Introduce image file locking Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 06/27] block: Make bdrv_reopen_{commit, abort} private functions Fam Zheng
                   ` (23 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c               | 5 +++++
 include/block/block.h | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 16c7d58..bb7ebed 100644
--- a/block.c
+++ b/block.c
@@ -880,6 +880,11 @@ static int bdrv_unlock_image(BlockDriverState *bs)
     return bdrv_lock_unlock_image_do(bs, false);
 }
 
+bool bdrv_image_locked(BlockDriverState *bs)
+{
+    return bs->image_locked;
+}
+
 static QemuOptsList bdrv_runtime_opts = {
     .name = "bdrv_common",
     .head = QTAILQ_HEAD_INITIALIZER(bdrv_runtime_opts.head),
diff --git a/include/block/block.h b/include/block/block.h
index b803597..bd623d9 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -282,7 +282,7 @@ int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
 BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
                                     BlockDriverState *bs);
 BlockDriverState *bdrv_find_base(BlockDriverState *bs);
-
+bool bdrv_image_locked(BlockDriverState *bs);
 
 typedef struct BdrvCheckResult {
     int corruptions;
-- 
2.8.2

* [Qemu-devel] [PATCH v4 06/27] block: Make bdrv_reopen_{commit, abort} private functions
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (4 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 05/27] block: Add bdrv_image_locked Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 07/27] block: Handle image locking during reopen Fam Zheng
                   ` (22 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

As parts of the transactional reopen, they are not necessary outside
block.c. Make them static.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c               | 6 ++++--
 include/block/block.h | 2 --
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index bb7ebed..3f5369a 100644
--- a/block.c
+++ b/block.c
@@ -1944,6 +1944,8 @@ BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
  * to all devices.
  *
  */
+static void bdrv_reopen_commit(BDRVReopenState *reopen_state);
+static void bdrv_reopen_abort(BDRVReopenState *reopen_state);
 int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
 {
     int ret = -1;
@@ -2123,7 +2125,7 @@ error:
  * makes them final by swapping the staging BlockDriverState contents into
  * the active BlockDriverState contents.
  */
-void bdrv_reopen_commit(BDRVReopenState *reopen_state)
+static void bdrv_reopen_commit(BDRVReopenState *reopen_state)
 {
     BlockDriver *drv;
 
@@ -2150,7 +2152,7 @@ void bdrv_reopen_commit(BDRVReopenState *reopen_state)
  * Abort the reopen, and delete and free the staged changes in
  * reopen_state
  */
-void bdrv_reopen_abort(BDRVReopenState *reopen_state)
+static void bdrv_reopen_abort(BDRVReopenState *reopen_state)
 {
     BlockDriver *drv;
 
diff --git a/include/block/block.h b/include/block/block.h
index bd623d9..d240a03 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -227,8 +227,6 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp);
 int bdrv_reopen(BlockDriverState *bs, int bdrv_flags, Error **errp);
 int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
                         BlockReopenQueue *queue, Error **errp);
-void bdrv_reopen_commit(BDRVReopenState *reopen_state);
-void bdrv_reopen_abort(BDRVReopenState *reopen_state);
 int bdrv_read(BlockDriverState *bs, int64_t sector_num,
               uint8_t *buf, int nb_sectors);
 int bdrv_write(BlockDriverState *bs, int64_t sector_num,
-- 
2.8.2

* [Qemu-devel] [PATCH v4 07/27] block: Handle image locking during reopen
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (5 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 06/27] block: Make bdrv_reopen_{commit, abort} private functions Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 08/27] osdep: Add qemu_lock_fd and qemu_unlock_fd Fam Zheng
                   ` (21 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

Stash the locking state in BDRVReopenState. If the image was locked, unlock
it in prepare, and lock it again on commit or abort.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c               | 11 +++++++++++
 include/block/block.h |  1 +
 2 files changed, 12 insertions(+)

diff --git a/block.c b/block.c
index 3f5369a..1b3aac4 100644
--- a/block.c
+++ b/block.c
@@ -2113,6 +2113,11 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
         } while ((entry = qdict_next(reopen_state->options, entry)));
     }
 
+    reopen_state->was_locked = reopen_state->bs->image_locked;
+    if (reopen_state->was_locked) {
+        bdrv_unlock_image(reopen_state->bs);
+    }
+
     ret = 0;
 
 error:
@@ -2137,6 +2142,9 @@ static void bdrv_reopen_commit(BDRVReopenState *reopen_state)
     if (drv->bdrv_reopen_commit) {
         drv->bdrv_reopen_commit(reopen_state);
     }
+    if (reopen_state->was_locked) {
+        bdrv_lock_image(reopen_state->bs);
+    }
 
     /* set BDS specific flags now */
     QDECREF(reopen_state->bs->explicit_options);
@@ -2163,6 +2171,9 @@ static void bdrv_reopen_abort(BDRVReopenState *reopen_state)
     if (drv->bdrv_reopen_abort) {
         drv->bdrv_reopen_abort(reopen_state);
     }
+    if (reopen_state->was_locked) {
+        bdrv_lock_image(reopen_state->bs);
+    }
 
     QDECREF(reopen_state->explicit_options);
 }
diff --git a/include/block/block.h b/include/block/block.h
index d240a03..7839f69 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -157,6 +157,7 @@ typedef struct BDRVReopenState {
     QDict *options;
     QDict *explicit_options;
     void *opaque;
+    bool was_locked;
 } BDRVReopenState;
 
 /*
-- 
2.8.2

* [Qemu-devel] [PATCH v4 08/27] osdep: Add qemu_lock_fd and qemu_unlock_fd
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (6 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 07/27] block: Handle image locking during reopen Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  7:54   ` Richard W.M. Jones
  2016-05-10  8:57   ` Daniel P. Berrange
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 09/27] osdep: Introduce qemu_dup Fam Zheng
                   ` (20 subsequent siblings)
  28 siblings, 2 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

They are wrappers around POSIX fcntl file locking, with additional
interception of open/close (through qemu_open and qemu_close) to offer
better semantics that preserve locks across the life cycles of different
fds on the same file.  The reason for making this semantic change on top
of fcntl locks is to make the API cleaner for QEMU subsystems.

More precisely, we remove this "feature" of fcntl locks:

    If a process closes any file descriptor referring to a file, then
    all of the process's locks on that file are released, regardless of
    the file descriptor(s) on which the locks were obtained.

as long as the fd is always open/closed through qemu_open and
qemu_close.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 include/qemu/osdep.h |   2 +
 util/osdep.c         | 200 +++++++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 188 insertions(+), 14 deletions(-)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index 408783f..089c13f 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -251,6 +251,8 @@ int qemu_madvise(void *addr, size_t len, int advice);
 
 int qemu_open(const char *name, int flags, ...);
 int qemu_close(int fd);
+int qemu_lock_fd(int fd, int64_t start, int64_t len, bool readonly);
+int qemu_unlock_fd(int fd, int64_t start, int64_t len);
 
 #if defined(__HAIKU__) && defined(__i386__)
 #define FMT_pid "%ld"
diff --git a/util/osdep.c b/util/osdep.c
index d56d071..1510cbf 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -41,6 +41,7 @@ extern int madvise(caddr_t, size_t, int);
 #include "qemu/sockets.h"
 #include "qemu/error-report.h"
 #include "monitor/monitor.h"
+#include <glib.h>
 
 static bool fips_enabled = false;
 
@@ -146,13 +147,9 @@ static int qemu_parse_fdset(const char *param)
 }
 #endif
 
-/*
- * Opens a file with FD_CLOEXEC set
- */
-int qemu_open(const char *name, int flags, ...)
+static int qemu_fd_open(const char *name, int flags, int mode)
 {
     int ret;
-    int mode = 0;
 
 #ifndef _WIN32
     const char *fdset_id_str;
@@ -189,13 +186,6 @@ int qemu_open(const char *name, int flags, ...)
     }
 #endif
 
-    if (flags & O_CREAT) {
-        va_list ap;
-
-        va_start(ap, flags);
-        mode = va_arg(ap, int);
-        va_end(ap);
-    }
 
 #ifdef O_CLOEXEC
     ret = open(name, flags | O_CLOEXEC, mode);
@@ -216,6 +206,188 @@ int qemu_open(const char *name, int flags, ...)
     return ret;
 }
 
+typedef struct LockEntry LockEntry;
+struct LockEntry {
+    int fd;
+    bool readonly;
+    int64_t start;
+    int64_t len;
+    QLIST_ENTRY(LockEntry) next;
+};
+
+typedef struct {
+    QLIST_HEAD(, LockEntry) records;
+} LockRecord;
+
+static GHashTable *fd_to_path;
+static GHashTable *lock_map;
+
+static void fd_map_init(void)
+{
+    if (!fd_to_path) {
+        fd_to_path = g_hash_table_new(g_direct_hash, g_direct_equal);
+        lock_map = g_hash_table_new(g_str_hash, g_str_equal);
+    }
+}
+
+static int qemu_lock_do(int fd, int64_t start, int64_t len, int fl_type)
+{
+    struct flock fl = (struct flock) {
+        .l_whence  = SEEK_SET,
+        /* Locking byte 1 avoids interfering with virtlockd. */
+        .l_start = start,
+        .l_len = len,
+        .l_type = fl_type,
+    };
+
+    return fcntl(fd, F_SETLK, &fl);
+}
+
+int qemu_lock_fd(int fd, int64_t start, int64_t len, bool readonly)
+{
+    LockEntry *ent;
+    LockRecord *rec;
+    char *path;
+    int ret;
+
+    ret = qemu_lock_do(fd, start, len, readonly ? F_RDLCK : F_WRLCK);
+    if (ret == -1) {
+        return -errno;
+    }
+    path = g_hash_table_lookup(fd_to_path, GINT_TO_POINTER(fd));
+    assert(path);
+    rec = g_hash_table_lookup(lock_map, path);
+    if (!rec) {
+        gboolean inserted;
+        rec = g_new(LockRecord, 1);
+        QLIST_INIT(&rec->records);
+        inserted = g_hash_table_insert(lock_map, path, rec);
+        assert(inserted);
+    } else {
+        /* Check the range is not locked by any fd in the record yet. */
+        int64_t end = start + len;
+        QLIST_FOREACH(ent, &rec->records, next) {
+            int64_t ent_end = ent->start + ent->len;
+            if (ent->start < end && start < ent_end) {
+                /* There is an overlap between ent and us, only shared locking
+                 * is permitted. */
+                if (!(ent->readonly && readonly)) {
+                    return -EAGAIN;
+                }
+            }
+        }
+    }
+    ent = g_new(LockEntry, 1);
+    *ent = (LockEntry) {
+        .fd = fd,
+        .start = start,
+        .len = len,
+        .readonly = readonly,
+    };
+    QLIST_INSERT_HEAD(&rec->records, ent, next);
+    return 0;
+}
+
+int qemu_unlock_fd(int fd, int64_t start, int64_t len)
+{
+    LockEntry *ent;
+    LockRecord *rec;
+    int ret;
+    char *path;
+
+    ret = qemu_lock_do(fd, start, len, F_UNLCK);
+    if (ret == -1) {
+        return ret;
+    }
+    path = g_hash_table_lookup(fd_to_path, GINT_TO_POINTER(fd));
+    assert(path);
+    rec = g_hash_table_lookup(lock_map, path);
+    assert(rec);
+    QLIST_FOREACH(ent, &rec->records, next) {
+        if (ent->fd == fd && ent->start == start && ent->len == len) {
+            QLIST_REMOVE(ent, next);
+            break;
+        }
+    }
+    assert(ent);
+    return ret;
+}
+
+static void qemu_fd_add_record(int fd, const char *path)
+{
+    gboolean ret;
+    char *rpath;
+
+    fd_map_init();
+    rpath = realpath(path, NULL);
+    if (!rpath) {
+        fprintf(stderr, "Failed to get real path for file %s: %s\n", path,
+                strerror(errno));
+        rpath = strdup(path);
+    }
+    ret = g_hash_table_insert(fd_to_path, GINT_TO_POINTER(fd), g_strdup(rpath));
+    /* It's a caller bug if the record already exists */
+    assert(ret);
+    free(rpath);
+}
+
+static int qemu_fd_close(int fd)
+{
+    LockEntry *ent, *tmp;
+    LockRecord *rec;
+    char *path;
+    int ret;
+
+    assert(fd_to_path);
+    path = g_hash_table_lookup(fd_to_path, GINT_TO_POINTER(fd));
+    assert(path);
+    g_hash_table_remove(fd_to_path, GINT_TO_POINTER(fd));
+    rec = g_hash_table_lookup(lock_map, path);
+    ret = close(fd);
+
+    if (rec) {
+        QLIST_FOREACH_SAFE(ent, &rec->records, next, tmp) {
+            int r;
+            if (ent->fd == fd) {
+                QLIST_REMOVE(ent, next);
+                g_free(ent);
+                continue;
+            }
+            r = qemu_lock_do(ent->fd, ent->start, ent->len,
+                             ent->readonly ? F_RDLCK : F_WRLCK);
+            if (r == -1) {
+                fprintf(stderr, "Failed to acquire lock on fd %d: %s\n",
+                        ent->fd, strerror(errno));
+            }
+        }
+    }
+
+    return ret;
+}
+
+/*
+ * Opens a file with FD_CLOEXEC set
+ */
+int qemu_open(const char *name, int flags, ...)
+{
+    int mode = 0;
+    int ret;
+
+    if (flags & O_CREAT) {
+        va_list ap;
+
+        va_start(ap, flags);
+        mode = va_arg(ap, int);
+        va_end(ap);
+    }
+    ret = qemu_fd_open(name, flags, mode);
+    if (ret >= 0) {
+        qemu_fd_add_record(ret, name);
+    }
+    return ret;
+}
+
 int qemu_close(int fd)
 {
     int64_t fdset_id;
@@ -225,7 +397,7 @@ int qemu_close(int fd)
     if (fdset_id != -1) {
         int ret;
 
-        ret = close(fd);
+        ret = qemu_fd_close(fd);
         if (ret == 0) {
             monitor_fdset_dup_fd_remove(fd);
         }
@@ -233,7 +405,7 @@ int qemu_close(int fd)
         return ret;
     }
 
-    return close(fd);
+    return qemu_fd_close(fd);
 }
 
 /*
-- 
2.8.2

* [Qemu-devel] [PATCH v4 09/27] osdep: Introduce qemu_dup
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (7 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 08/27] osdep: Add qemu_lock_fd and qemu_unlock_fd Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 10/27] raw-posix: Use qemu_dup Fam Zheng
                   ` (19 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

This takes care of both the CLOEXEC flag and the fd-to-path mapping used
for image locking.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 include/qemu/osdep.h |  1 +
 util/osdep.c         | 15 +++++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index 089c13f..8174902 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -251,6 +251,7 @@ int qemu_madvise(void *addr, size_t len, int advice);
 
 int qemu_open(const char *name, int flags, ...);
 int qemu_close(int fd);
+int qemu_dup(int fd);
 int qemu_lock_fd(int fd, int64_t start, int64_t len, bool readonly);
 int qemu_unlock_fd(int fd, int64_t start, int64_t len);
 
diff --git a/util/osdep.c b/util/osdep.c
index 1510cbf..66c993a 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -408,6 +408,21 @@ int qemu_close(int fd)
     return qemu_fd_close(fd);
 }
 
+int qemu_dup(int fd)
+{
+    char *path;
+    int ret = qemu_dup_flags(fd, 0);
+    if (ret == -1) {
+        return ret;
+    }
+
+    path = g_hash_table_lookup(fd_to_path, GINT_TO_POINTER(fd));
+    assert(path);
+    qemu_fd_add_record(ret, path);
+    return ret;
+}
+
 /*
  * A variant of write(2) which handles partial write.
  *
-- 
2.8.2

* [Qemu-devel] [PATCH v4 10/27] raw-posix: Use qemu_dup
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (8 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 09/27] osdep: Introduce qemu_dup Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 11/27] raw-posix: Implement .bdrv_lockf Fam Zheng
                   ` (18 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/raw-posix.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 906d5c9..4e4d0d2 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -644,15 +644,7 @@ static int raw_reopen_prepare(BDRVReopenState *state,
 
     if ((raw_s->open_flags & ~fcntl_flags) == (s->open_flags & ~fcntl_flags)) {
         /* dup the original fd */
-        /* TODO: use qemu fcntl wrapper */
-#ifdef F_DUPFD_CLOEXEC
-        raw_s->fd = fcntl(s->fd, F_DUPFD_CLOEXEC, 0);
-#else
-        raw_s->fd = dup(s->fd);
-        if (raw_s->fd != -1) {
-            qemu_set_cloexec(raw_s->fd);
-        }
-#endif
+        raw_s->fd = qemu_dup(s->fd);
         if (raw_s->fd >= 0) {
             ret = fcntl_setfl(raw_s->fd, raw_s->open_flags);
             if (ret) {
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [Qemu-devel] [PATCH v4 11/27] raw-posix: Implement .bdrv_lockf
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (9 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 10/27] raw-posix: Use qemu_dup Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 12/27] gluster: " Fam Zheng
                   ` (17 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

virtlockd in libvirt locks the first byte (offset 0), so we lock byte 1 to
avoid interfering with it.

Both file and host device protocols are covered.

Suggested-by: "Daniel P. Berrange" <berrange@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/raw-posix.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 4e4d0d2..926aee1 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -35,6 +35,7 @@
 #include "raw-aio.h"
 #include "qapi/util.h"
 #include "qapi/qmp/qstring.h"
+#include "glib.h"
 
 #if defined(__APPLE__) && (__MACH__)
 #include <paths.h>
@@ -397,6 +398,23 @@ static void raw_attach_aio_context(BlockDriverState *bs,
 #endif
 }
 
+static int raw_lockf(BlockDriverState *bs, BdrvLockfCmd cmd)
+{
+
+    BDRVRawState *s = bs->opaque;
+
+    switch (cmd) {
+    case BDRV_LOCKF_RWLOCK:
+        return qemu_lock_fd(s->fd, 1, 1, false);
+    case BDRV_LOCKF_ROLOCK:
+        return qemu_lock_fd(s->fd, 1, 1, true);
+    case BDRV_LOCKF_UNLOCK:
+        return qemu_unlock_fd(s->fd, 1, 1);
+    default:
+        abort();
+    }
+}
+
 #ifdef CONFIG_LINUX_AIO
 static int raw_set_aio(void **aio_ctx, int *use_aio, int bdrv_flags)
 {
@@ -1952,6 +1970,8 @@ BlockDriver bdrv_file = {
     .bdrv_detach_aio_context = raw_detach_aio_context,
     .bdrv_attach_aio_context = raw_attach_aio_context,
 
+    .bdrv_lockf = raw_lockf,
+
     .create_opts = &raw_create_opts,
 };
 
@@ -2407,6 +2427,8 @@ static BlockDriver bdrv_host_device = {
 #ifdef __linux__
     .bdrv_aio_ioctl     = hdev_aio_ioctl,
 #endif
+
+    .bdrv_lockf = raw_lockf,
 };
 
 #if defined(__linux__) || defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [Qemu-devel] [PATCH v4 12/27] gluster: Implement .bdrv_lockf
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (10 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 11/27] raw-posix: Implement .bdrv_lockf Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 13/27] qemu-io: Add "-L" option for BDRV_O_NO_LOCK Fam Zheng
                   ` (16 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

Reviewed-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/gluster.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/block/gluster.c b/block/gluster.c
index a8aaacf..f9c5050 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -723,6 +723,33 @@ static int64_t qemu_gluster_allocated_file_size(BlockDriverState *bs)
     }
 }
 
+
+static int qemu_gluster_lockf(BlockDriverState *bs, BdrvLockfCmd cmd)
+{
+    BDRVGlusterState *s = bs->opaque;
+    int ret;
+    struct flock fl = (struct flock) {
+        .l_start = 0,
+        .l_whence  = SEEK_SET,
+        .l_len = 0,
+    };
+    switch (cmd) {
+    case BDRV_LOCKF_RWLOCK:
+        fl.l_type = F_WRLCK;
+        break;
+    case BDRV_LOCKF_ROLOCK:
+        fl.l_type = F_RDLCK;
+        break;
+    case BDRV_LOCKF_UNLOCK:
+        fl.l_type = F_UNLCK;
+        break;
+    default:
+        abort();
+    }
+    ret = glfs_posix_lock(s->fd, F_SETLK, &fl);
+    return ret == -1 ? -errno : 0;
+}
+
 static int qemu_gluster_has_zero_init(BlockDriverState *bs)
 {
     /* GlusterFS volume could be backed by a block device */
@@ -764,6 +791,7 @@ static BlockDriver bdrv_gluster = {
     .bdrv_co_readv                = qemu_gluster_co_readv,
     .bdrv_co_writev               = qemu_gluster_co_writev,
     .bdrv_co_flush_to_disk        = qemu_gluster_co_flush_to_disk,
+    .bdrv_lockf                   = qemu_gluster_lockf,
     .bdrv_has_zero_init           = qemu_gluster_has_zero_init,
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
@@ -791,6 +819,7 @@ static BlockDriver bdrv_gluster_tcp = {
     .bdrv_co_readv                = qemu_gluster_co_readv,
     .bdrv_co_writev               = qemu_gluster_co_writev,
     .bdrv_co_flush_to_disk        = qemu_gluster_co_flush_to_disk,
+    .bdrv_lockf                   = qemu_gluster_lockf,
     .bdrv_has_zero_init           = qemu_gluster_has_zero_init,
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
@@ -818,6 +847,7 @@ static BlockDriver bdrv_gluster_unix = {
     .bdrv_co_readv                = qemu_gluster_co_readv,
     .bdrv_co_writev               = qemu_gluster_co_writev,
     .bdrv_co_flush_to_disk        = qemu_gluster_co_flush_to_disk,
+    .bdrv_lockf                   = qemu_gluster_lockf,
     .bdrv_has_zero_init           = qemu_gluster_has_zero_init,
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
@@ -845,6 +875,7 @@ static BlockDriver bdrv_gluster_rdma = {
     .bdrv_co_readv                = qemu_gluster_co_readv,
     .bdrv_co_writev               = qemu_gluster_co_writev,
     .bdrv_co_flush_to_disk        = qemu_gluster_co_flush_to_disk,
+    .bdrv_lockf                   = qemu_gluster_lockf,
     .bdrv_has_zero_init           = qemu_gluster_has_zero_init,
 #ifdef CONFIG_GLUSTERFS_DISCARD
     .bdrv_co_discard              = qemu_gluster_co_discard,
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [Qemu-devel] [PATCH v4 13/27] qemu-io: Add "-L" option for BDRV_O_NO_LOCK
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (11 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 12/27] gluster: " Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 14/27] qemu-img: Add "-L" option to sub commands Fam Zheng
                   ` (15 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 qemu-io.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/qemu-io.c b/qemu-io.c
index 0598251..a601fed 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -107,6 +107,7 @@ static void open_help(void)
 " -r, -- open file read-only\n"
 " -s, -- use snapshot file\n"
 " -n, -- disable host cache\n"
+" -L, -- disable image lock\n"
 " -o, -- options to be given to the block driver"
 "\n");
 }
@@ -120,7 +121,7 @@ static const cmdinfo_t open_cmd = {
     .argmin     = 1,
     .argmax     = -1,
     .flags      = CMD_NOFILE_OK,
-    .args       = "[-Crsn] [-o options] [path]",
+    .args       = "[-CrsnL] [-o options] [path]",
     .oneline    = "open the file specified by path",
     .help       = open_help,
 };
@@ -139,12 +140,13 @@ static int open_f(BlockBackend *blk, int argc, char **argv)
 {
     int flags = 0;
     int readonly = 0;
+    int nolock = 0;
     bool writethrough = true;
     int c;
     QemuOpts *qopts;
     QDict *opts;
 
-    while ((c = getopt(argc, argv, "snrgo:")) != -1) {
+    while ((c = getopt(argc, argv, "snrgLo:")) != -1) {
         switch (c) {
         case 's':
             flags |= BDRV_O_SNAPSHOT;
@@ -156,6 +158,9 @@ static int open_f(BlockBackend *blk, int argc, char **argv)
         case 'r':
             readonly = 1;
             break;
+        case 'L':
+            nolock = 1;
+            break;
         case 'o':
             if (imageOpts) {
                 printf("--image-opts and 'open -o' are mutually exclusive\n");
@@ -176,6 +181,10 @@ static int open_f(BlockBackend *blk, int argc, char **argv)
         flags |= BDRV_O_RDWR;
     }
 
+    if (nolock) {
+        flags |= BDRV_O_NO_LOCK;
+    }
+
     if (imageOpts && (optind == argc - 1)) {
         if (!qemu_opts_parse_noisily(&empty_opts, argv[optind], false)) {
             qemu_opts_reset(&empty_opts);
@@ -410,7 +419,8 @@ static QemuOptsList file_opts = {
 int main(int argc, char **argv)
 {
     int readonly = 0;
-    const char *sopt = "hVc:d:f:rsnmgkt:T:";
+    int nolock = 0;
+    const char *sopt = "hVc:d:f:rLsnmgkt:T:";
     const struct option lopt[] = {
         { "help", no_argument, NULL, 'h' },
         { "version", no_argument, NULL, 'V' },
@@ -418,6 +428,7 @@ int main(int argc, char **argv)
         { "cmd", required_argument, NULL, 'c' },
         { "format", required_argument, NULL, 'f' },
         { "read-only", no_argument, NULL, 'r' },
+        { "no-lock", no_argument, NULL, 'L' },
         { "snapshot", no_argument, NULL, 's' },
         { "nocache", no_argument, NULL, 'n' },
         { "misalign", no_argument, NULL, 'm' },
@@ -477,6 +488,9 @@ int main(int argc, char **argv)
         case 'r':
             readonly = 1;
             break;
+        case 'L':
+            nolock = 1;
+            break;
         case 'm':
             qemuio_misalign = true;
             break;
@@ -557,6 +571,10 @@ int main(int argc, char **argv)
         flags |= BDRV_O_RDWR;
     }
 
+    if (nolock) {
+        flags |= BDRV_O_NO_LOCK;
+    }
+
     if ((argc - optind) == 1) {
         if (imageOpts) {
             QemuOpts *qopts = NULL;
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [Qemu-devel] [PATCH v4 14/27] qemu-img: Add "-L" option to sub commands
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (12 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 13/27] qemu-io: Add "-L" option for BDRV_O_NO_LOCK Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 15/27] qemu-img: Update documentation of "-L" option Fam Zheng
                   ` (14 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

If specified, the BDRV_O_NO_LOCK flag will be set when opening the image.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 qemu-img.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 72 insertions(+), 17 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 46f2a6d..28e350c 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -598,6 +598,7 @@ static int img_check(int argc, char **argv)
     ImageCheck *check;
     bool quiet = false;
     bool image_opts = false;
+    bool nolock = false;
 
     fmt = NULL;
     output = NULL;
@@ -614,7 +615,7 @@ static int img_check(int argc, char **argv)
             {"image-opts", no_argument, 0, OPTION_IMAGE_OPTS},
             {0, 0, 0, 0}
         };
-        c = getopt_long(argc, argv, "hf:r:T:q",
+        c = getopt_long(argc, argv, "hf:r:T:qL",
                         long_options, &option_index);
         if (c == -1) {
             break;
@@ -648,6 +649,9 @@ static int img_check(int argc, char **argv)
         case 'q':
             quiet = true;
             break;
+        case 'L':
+            nolock = true;
+            break;
         case OPTION_OBJECT: {
             QemuOpts *opts;
             opts = qemu_opts_parse_noisily(&qemu_object_opts,
@@ -681,6 +685,7 @@ static int img_check(int argc, char **argv)
         return 1;
     }
 
+    flags |= nolock ? BDRV_O_NO_LOCK : 0;
     ret = bdrv_parse_cache_mode(cache, &flags, &writethrough);
     if (ret < 0) {
         error_report("Invalid source cache option: %s", cache);
@@ -801,6 +806,7 @@ static int img_commit(int argc, char **argv)
     Error *local_err = NULL;
     CommonBlockJobCBInfo cbi;
     bool image_opts = false;
+    bool nolock = false;
 
     fmt = NULL;
     cache = BDRV_DEFAULT_CACHE;
@@ -812,7 +818,7 @@ static int img_commit(int argc, char **argv)
             {"image-opts", no_argument, 0, OPTION_IMAGE_OPTS},
             {0, 0, 0, 0}
         };
-        c = getopt_long(argc, argv, "f:ht:b:dpq",
+        c = getopt_long(argc, argv, "f:ht:b:dpqL",
                         long_options, NULL);
         if (c == -1) {
             break;
@@ -842,6 +848,9 @@ static int img_commit(int argc, char **argv)
         case 'q':
             quiet = true;
             break;
+        case 'L':
+            nolock = true;
+            break;
         case OPTION_OBJECT: {
             QemuOpts *opts;
             opts = qemu_opts_parse_noisily(&qemu_object_opts,
@@ -873,6 +882,7 @@ static int img_commit(int argc, char **argv)
     }
 
     flags = BDRV_O_RDWR | BDRV_O_UNMAP;
+    flags |= nolock ? BDRV_O_NO_LOCK : 0;
     ret = bdrv_parse_cache_mode(cache, &flags, &writethrough);
     if (ret < 0) {
         error_report("Invalid cache option: %s", cache);
@@ -1130,6 +1140,7 @@ static int img_compare(int argc, char **argv)
     int c, pnum;
     uint64_t progress_base;
     bool image_opts = false;
+    bool nolock = false;
 
     cache = BDRV_DEFAULT_CACHE;
     for (;;) {
@@ -1139,7 +1150,7 @@ static int img_compare(int argc, char **argv)
             {"image-opts", no_argument, 0, OPTION_IMAGE_OPTS},
             {0, 0, 0, 0}
         };
-        c = getopt_long(argc, argv, "hf:F:T:pqs",
+        c = getopt_long(argc, argv, "hf:F:T:pqsL",
                         long_options, NULL);
         if (c == -1) {
             break;
@@ -1167,6 +1178,9 @@ static int img_compare(int argc, char **argv)
         case 's':
             strict = true;
             break;
+        case 'L':
+            nolock = true;
+            break;
         case OPTION_OBJECT: {
             QemuOpts *opts;
             opts = qemu_opts_parse_noisily(&qemu_object_opts,
@@ -1205,6 +1219,7 @@ static int img_compare(int argc, char **argv)
     qemu_progress_init(progress, 2.0);
 
     flags = 0;
+    flags |= nolock ? BDRV_O_NO_LOCK : 0;
     ret = bdrv_parse_cache_mode(cache, &flags, &writethrough);
     if (ret < 0) {
         error_report("Invalid source cache option: %s", cache);
@@ -1736,6 +1751,7 @@ static int img_convert(int argc, char **argv)
     QemuOpts *sn_opts = NULL;
     ImgConvertState state;
     bool image_opts = false;
+    bool nolock = false;
 
     fmt = NULL;
     out_fmt = "raw";
@@ -1751,7 +1767,7 @@ static int img_convert(int argc, char **argv)
             {"image-opts", no_argument, 0, OPTION_IMAGE_OPTS},
             {0, 0, 0, 0}
         };
-        c = getopt_long(argc, argv, "hf:O:B:ce6o:s:l:S:pt:T:qn",
+        c = getopt_long(argc, argv, "hf:O:B:ce6o:s:l:S:pt:T:qnL",
                         long_options, NULL);
         if (c == -1) {
             break;
@@ -1840,6 +1856,9 @@ static int img_convert(int argc, char **argv)
         case 'q':
             quiet = true;
             break;
+        case 'L':
+            nolock = true;
+            break;
         case 'n':
             skip_create = 1;
             break;
@@ -1889,6 +1908,7 @@ static int img_convert(int argc, char **argv)
     }
 
     src_flags = 0;
+    src_flags |= nolock ? BDRV_O_NO_LOCK : 0;
     ret = bdrv_parse_cache_mode(src_cache, &src_flags, &src_writethrough);
     if (ret < 0) {
         error_report("Invalid source cache option: %s", src_cache);
@@ -2038,6 +2058,7 @@ static int img_convert(int argc, char **argv)
     }
 
     flags = min_sparse ? (BDRV_O_RDWR | BDRV_O_UNMAP) : BDRV_O_RDWR;
+    flags |= nolock ? BDRV_O_NO_LOCK : 0;
     ret = bdrv_parse_cache_mode(cache, &flags, &writethrough);
     if (ret < 0) {
         error_report("Invalid cache option: %s", cache);
@@ -2218,12 +2239,14 @@ static gboolean str_equal_func(gconstpointer a, gconstpointer b)
 static ImageInfoList *collect_image_info_list(bool image_opts,
                                               const char *filename,
                                               const char *fmt,
-                                              bool chain)
+                                              bool chain,
+                                              bool nolock)
 {
     ImageInfoList *head = NULL;
     ImageInfoList **last = &head;
     GHashTable *filenames;
     Error *err = NULL;
+    int flags;
 
     filenames = g_hash_table_new_full(g_str_hash, str_equal_func, NULL, NULL);
 
@@ -2240,8 +2263,9 @@ static ImageInfoList *collect_image_info_list(bool image_opts,
         }
         g_hash_table_insert(filenames, (gpointer)filename, NULL);
 
-        blk = img_open(image_opts, filename, fmt,
-                       BDRV_O_NO_BACKING | BDRV_O_NO_IO, false, false);
+        flags = BDRV_O_NO_BACKING | BDRV_O_NO_IO;
+        flags |= nolock ? BDRV_O_NO_LOCK : 0;
+        blk = img_open(image_opts, filename, fmt, flags, false, false);
         if (!blk) {
             goto err;
         }
@@ -2293,6 +2317,7 @@ static int img_info(int argc, char **argv)
     const char *filename, *fmt, *output;
     ImageInfoList *list;
     bool image_opts = false;
+    bool nolock = false;
 
     fmt = NULL;
     output = NULL;
@@ -2307,7 +2332,7 @@ static int img_info(int argc, char **argv)
             {"image-opts", no_argument, 0, OPTION_IMAGE_OPTS},
             {0, 0, 0, 0}
         };
-        c = getopt_long(argc, argv, "f:h",
+        c = getopt_long(argc, argv, "f:hL",
                         long_options, &option_index);
         if (c == -1) {
             break;
@@ -2320,6 +2345,9 @@ static int img_info(int argc, char **argv)
         case 'f':
             fmt = optarg;
             break;
+        case 'L':
+            nolock = true;
+            break;
         case OPTION_OUTPUT:
             output = optarg;
             break;
@@ -2359,7 +2387,7 @@ static int img_info(int argc, char **argv)
         return 1;
     }
 
-    list = collect_image_info_list(image_opts, filename, fmt, chain);
+    list = collect_image_info_list(image_opts, filename, fmt, chain, nolock);
     if (!list) {
         return 1;
     }
@@ -2505,6 +2533,8 @@ static int img_map(int argc, char **argv)
     MapEntry curr = { .length = 0 }, next;
     int ret = 0;
     bool image_opts = false;
+    bool nolock = false;
+    int flags;
 
     fmt = NULL;
     output = NULL;
@@ -2518,7 +2548,7 @@ static int img_map(int argc, char **argv)
             {"image-opts", no_argument, 0, OPTION_IMAGE_OPTS},
             {0, 0, 0, 0}
         };
-        c = getopt_long(argc, argv, "f:h",
+        c = getopt_long(argc, argv, "f:hL",
                         long_options, &option_index);
         if (c == -1) {
             break;
@@ -2531,6 +2561,9 @@ static int img_map(int argc, char **argv)
         case 'f':
             fmt = optarg;
             break;
+        case 'L':
+            nolock = true;
+            break;
         case OPTION_OUTPUT:
             output = optarg;
             break;
@@ -2567,7 +2600,8 @@ static int img_map(int argc, char **argv)
         return 1;
     }
 
-    blk = img_open(image_opts, filename, fmt, 0, false, false);
+    flags = nolock ? BDRV_O_NO_LOCK : 0;
+    blk = img_open(image_opts, filename, fmt, flags, false, false);
     if (!blk) {
         return 1;
     }
@@ -2630,6 +2664,7 @@ static int img_snapshot(int argc, char **argv)
     bool quiet = false;
     Error *err = NULL;
     bool image_opts = false;
+    bool nolock = false;
 
     bdrv_oflags = BDRV_O_RDWR;
     /* Parse commandline parameters */
@@ -2640,7 +2675,7 @@ static int img_snapshot(int argc, char **argv)
             {"image-opts", no_argument, 0, OPTION_IMAGE_OPTS},
             {0, 0, 0, 0}
         };
-        c = getopt_long(argc, argv, "la:c:d:hq",
+        c = getopt_long(argc, argv, "la:c:d:hqL",
                         long_options, NULL);
         if (c == -1) {
             break;
@@ -2685,6 +2720,9 @@ static int img_snapshot(int argc, char **argv)
         case 'q':
             quiet = true;
             break;
+        case 'L':
+            nolock = true;
+            break;
         case OPTION_OBJECT: {
             QemuOpts *opts;
             opts = qemu_opts_parse_noisily(&qemu_object_opts,
@@ -2710,6 +2748,7 @@ static int img_snapshot(int argc, char **argv)
         return 1;
     }
 
+    bdrv_oflags |= nolock ? BDRV_O_NO_LOCK : 0;
     /* Open the image */
     blk = img_open(image_opts, filename, NULL, bdrv_oflags, false, quiet);
     if (!blk) {
@@ -2779,6 +2818,7 @@ static int img_rebase(int argc, char **argv)
     bool quiet = false;
     Error *local_err = NULL;
     bool image_opts = false;
+    bool nolock = false;
 
     /* Parse commandline parameters */
     fmt = NULL;
@@ -2793,7 +2833,7 @@ static int img_rebase(int argc, char **argv)
             {"image-opts", no_argument, 0, OPTION_IMAGE_OPTS},
             {0, 0, 0, 0}
         };
-        c = getopt_long(argc, argv, "hf:F:b:upt:T:q",
+        c = getopt_long(argc, argv, "hf:F:b:upt:T:qL",
                         long_options, NULL);
         if (c == -1) {
             break;
@@ -2824,6 +2864,9 @@ static int img_rebase(int argc, char **argv)
         case 'T':
             src_cache = optarg;
             break;
+        case 'L':
+            nolock = true;
+            break;
         case 'q':
             quiet = true;
             break;
@@ -2863,6 +2906,7 @@ static int img_rebase(int argc, char **argv)
     qemu_progress_print(0, 100);
 
     flags = BDRV_O_RDWR | (unsafe ? BDRV_O_NO_BACKING : 0);
+    flags |= nolock ? BDRV_O_NO_LOCK : 0;
     ret = bdrv_parse_cache_mode(cache, &flags, &writethrough);
     if (ret < 0) {
         error_report("Invalid cache option: %s", cache);
@@ -3120,6 +3164,8 @@ static int img_resize(int argc, char **argv)
     bool quiet = false;
     BlockBackend *blk = NULL;
     QemuOpts *param;
+    int flags;
+    bool nolock = false;
 
     static QemuOptsList resize_options = {
         .name = "resize_options",
@@ -3154,7 +3200,7 @@ static int img_resize(int argc, char **argv)
             {"image-opts", no_argument, 0, OPTION_IMAGE_OPTS},
             {0, 0, 0, 0}
         };
-        c = getopt_long(argc, argv, "f:hq",
+        c = getopt_long(argc, argv, "f:hqL",
                         long_options, NULL);
         if (c == -1) {
             break;
@@ -3170,6 +3216,9 @@ static int img_resize(int argc, char **argv)
         case 'q':
             quiet = true;
             break;
+        case 'L':
+            nolock = true;
+            break;
         case OPTION_OBJECT: {
             QemuOpts *opts;
             opts = qemu_opts_parse_noisily(&qemu_object_opts,
@@ -3221,8 +3270,9 @@ static int img_resize(int argc, char **argv)
     n = qemu_opt_get_size(param, BLOCK_OPT_SIZE, 0);
     qemu_opts_del(param);
 
-    blk = img_open(image_opts, filename, fmt,
-                   BDRV_O_RDWR, false, quiet);
+    flags = BDRV_O_RDWR;
+    flags |= nolock ? BDRV_O_NO_LOCK : 0;
+    blk = img_open(image_opts, filename, fmt, flags, false, quiet);
     if (!blk) {
         ret = -1;
         goto out;
@@ -3283,6 +3333,7 @@ static int img_amend(int argc, char **argv)
     BlockBackend *blk = NULL;
     BlockDriverState *bs = NULL;
     bool image_opts = false;
+    bool nolock = false;
 
     cache = BDRV_DEFAULT_CACHE;
     for (;;) {
@@ -3292,7 +3343,7 @@ static int img_amend(int argc, char **argv)
             {"image-opts", no_argument, 0, OPTION_IMAGE_OPTS},
             {0, 0, 0, 0}
         };
-        c = getopt_long(argc, argv, "ho:f:t:pq",
+        c = getopt_long(argc, argv, "ho:f:t:pqL",
                         long_options, NULL);
         if (c == -1) {
             break;
@@ -3329,6 +3380,9 @@ static int img_amend(int argc, char **argv)
             case 'q':
                 quiet = true;
                 break;
+            case 'L':
+                nolock = true;
+                break;
             case OPTION_OBJECT:
                 opts = qemu_opts_parse_noisily(&qemu_object_opts,
                                                optarg, true);
@@ -3374,6 +3428,7 @@ static int img_amend(int argc, char **argv)
     }
 
     flags = BDRV_O_RDWR;
+    flags |= nolock ? BDRV_O_NO_LOCK : 0;
     ret = bdrv_parse_cache_mode(cache, &flags, &writethrough);
     if (ret < 0) {
         error_report("Invalid cache option: %s", cache);
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [Qemu-devel] [PATCH v4 15/27] qemu-img: Update documentation of "-L" option
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (13 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 14/27] qemu-img: Add "-L" option to sub commands Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 16/27] qemu-nbd: Add "--no-lock/-L" option Fam Zheng
                   ` (13 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 qemu-img-cmds.hx | 44 ++++++++++++++++++++++----------------------
 qemu-img.c       |  1 +
 qemu-img.texi    |  3 +++
 3 files changed, 26 insertions(+), 22 deletions(-)

diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index e7cded6..fa87942 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -10,68 +10,68 @@ STEXI
 ETEXI
 
 DEF("check", img_check,
-    "check [-q] [--object objectdef] [--image-opts] [-f fmt] [--output=ofmt] [-r [leaks | all]] [-T src_cache] filename")
+    "check [-q] [-L] [--object objectdef] [--image-opts] [-f fmt] [--output=ofmt] [-r [leaks | all]] [-T src_cache] filename")
 STEXI
-@item check [--object @var{objectdef}] [--image-opts] [-q] [-f @var{fmt}] [--output=@var{ofmt}] [-r [leaks | all]] [-T @var{src_cache}] @var{filename}
+@item check [--object @var{objectdef}] [--image-opts] [-q] [-L] [-f @var{fmt}] [--output=@var{ofmt}] [-r [leaks | all]] [-T @var{src_cache}] @var{filename}
 ETEXI
 
 DEF("create", img_create,
-    "create [-q] [--object objectdef] [--image-opts] [-f fmt] [-o options] filename [size]")
+    "create [-q] [-L] [--object objectdef] [--image-opts] [-f fmt] [-o options] filename [size]")
 STEXI
-@item create [--object @var{objectdef}] [--image-opts] [-q] [-f @var{fmt}] [-o @var{options}] @var{filename} [@var{size}]
+@item create [--object @var{objectdef}] [--image-opts] [-q] [-L] [-f @var{fmt}] [-o @var{options}] @var{filename} [@var{size}]
 ETEXI
 
 DEF("commit", img_commit,
-    "commit [-q] [--object objectdef] [--image-opts] [-f fmt] [-t cache] [-b base] [-d] [-p] filename")
+    "commit [-q] [-L] [--object objectdef] [--image-opts] [-f fmt] [-t cache] [-b base] [-d] [-p] filename")
 STEXI
-@item commit [--object @var{objectdef}] [--image-opts] [-q] [-f @var{fmt}] [-t @var{cache}] [-b @var{base}] [-d] [-p] @var{filename}
+@item commit [--object @var{objectdef}] [--image-opts] [-q] [-L] [-f @var{fmt}] [-t @var{cache}] [-b @var{base}] [-d] [-p] @var{filename}
 ETEXI
 
 DEF("compare", img_compare,
-    "compare [--object objectdef] [--image-opts] [-f fmt] [-F fmt] [-T src_cache] [-p] [-q] [-s] filename1 filename2")
+    "compare [--object objectdef] [--image-opts] [-f fmt] [-F fmt] [-T src_cache] [-p] [-q] [-L] [-s] filename1 filename2")
 STEXI
-@item compare [--object @var{objectdef}] [--image-opts] [-f @var{fmt}] [-F @var{fmt}] [-T @var{src_cache}] [-p] [-q] [-s] @var{filename1} @var{filename2}
+@item compare [--object @var{objectdef}] [--image-opts] [-f @var{fmt}] [-F @var{fmt}] [-T @var{src_cache}] [-p] [-q] [-L] [-s] @var{filename1} @var{filename2}
 ETEXI
 
 DEF("convert", img_convert,
-    "convert [--object objectdef] [--image-opts] [-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-T src_cache] [-O output_fmt] [-o options] [-s snapshot_id_or_name] [-l snapshot_param] [-S sparse_size] filename [filename2 [...]] output_filename")
+    "convert [--object objectdef] [--image-opts] [-c] [-p] [-q] [-L] [-n] [-f fmt] [-t cache] [-T src_cache] [-O output_fmt] [-o options] [-s snapshot_id_or_name] [-l snapshot_param] [-S sparse_size] filename [filename2 [...]] output_filename")
 STEXI
-@item convert [--object @var{objectdef}] [--image-opts] [-c] [-p] [-q] [-n] [-f @var{fmt}] [-t @var{cache}] [-T @var{src_cache}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_id_or_name}] [-l @var{snapshot_param}] [-S @var{sparse_size}] @var{filename} [@var{filename2} [...]] @var{output_filename}
+@item convert [--object @var{objectdef}] [--image-opts] [-c] [-p] [-q] [-L] [-n] [-f @var{fmt}] [-t @var{cache}] [-T @var{src_cache}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_id_or_name}] [-l @var{snapshot_param}] [-S @var{sparse_size}] @var{filename} [@var{filename2} [...]] @var{output_filename}
 ETEXI
 
 DEF("info", img_info,
-    "info [--object objectdef] [--image-opts] [-f fmt] [--output=ofmt] [--backing-chain] filename")
+    "info [--object objectdef] [--image-opts] [-f fmt] [-L] [--output=ofmt] [--backing-chain] filename")
 STEXI
-@item info [--object @var{objectdef}] [--image-opts] [-f @var{fmt}] [--output=@var{ofmt}] [--backing-chain] @var{filename}
+@item info [--object @var{objectdef}] [--image-opts] [-f @var{fmt}] [-L] [--output=@var{ofmt}] [--backing-chain] @var{filename}
 ETEXI
 
 DEF("map", img_map,
-    "map [--object objectdef] [--image-opts] [-f fmt] [--output=ofmt] filename")
+    "map [--object objectdef] [--image-opts] [-f fmt] [-L] [--output=ofmt] filename")
 STEXI
-@item map [--object @var{objectdef}] [--image-opts] [-f @var{fmt}] [--output=@var{ofmt}] @var{filename}
+@item map [--object @var{objectdef}] [--image-opts] [-f @var{fmt}] [-L] [--output=@var{ofmt}] @var{filename}
 ETEXI
 
 DEF("snapshot", img_snapshot,
-    "snapshot [--object objectdef] [--image-opts] [-q] [-l | -a snapshot | -c snapshot | -d snapshot] filename")
+    "snapshot [--object objectdef] [--image-opts] [-q] [-L] [-l | -a snapshot | -c snapshot | -d snapshot] filename")
 STEXI
-@item snapshot [--object @var{objectdef}] [--image-opts] [-q] [-l | -a @var{snapshot} | -c @var{snapshot} | -d @var{snapshot}] @var{filename}
+@item snapshot [--object @var{objectdef}] [--image-opts] [-q] [-L] [-l | -a @var{snapshot} | -c @var{snapshot} | -d @var{snapshot}] @var{filename}
 ETEXI
 
 DEF("rebase", img_rebase,
-    "rebase [--object objectdef] [--image-opts] [-q] [-f fmt] [-t cache] [-T src_cache] [-p] [-u] -b backing_file [-F backing_fmt] filename")
+    "rebase [--object objectdef] [--image-opts] [-q] [-L] [-f fmt] [-t cache] [-T src_cache] [-p] [-u] -b backing_file [-F backing_fmt] filename")
 STEXI
-@item rebase [--object @var{objectdef}] [--image-opts] [-q] [-f @var{fmt}] [-t @var{cache}] [-T @var{src_cache}] [-p] [-u] -b @var{backing_file} [-F @var{backing_fmt}] @var{filename}
+@item rebase [--object @var{objectdef}] [--image-opts] [-q] [-L] [-f @var{fmt}] [-t @var{cache}] [-T @var{src_cache}] [-p] [-u] -b @var{backing_file} [-F @var{backing_fmt}] @var{filename}
 ETEXI
 
 DEF("resize", img_resize,
-    "resize [--object objectdef] [--image-opts] [-q] filename [+ | -]size")
+    "resize [--object objectdef] [--image-opts] [-q] [-L] filename [+ | -]size")
 STEXI
-@item resize [--object @var{objectdef}] [--image-opts] [-q] @var{filename} [+ | -]@var{size}
+@item resize [--object @var{objectdef}] [--image-opts] [-q] [-L] @var{filename} [+ | -]@var{size}
 ETEXI
 
 DEF("amend", img_amend,
-    "amend [--object objectdef] [--image-opts] [-p] [-q] [-f fmt] [-t cache] -o options filename")
+    "amend [--object objectdef] [--image-opts] [-p] [-q] [-L] [-f fmt] [-t cache] -o options filename")
 STEXI
-@item amend [--object @var{objectdef}] [--image-opts] [-p] [-q] [-f @var{fmt}] [-t @var{cache}] -o @var{options} @var{filename}
+@item amend [--object @var{objectdef}] [--image-opts] [-p] [-q] [-L] [-f @var{fmt}] [-t @var{cache}] -o @var{options} @var{filename}
 @end table
 ETEXI
diff --git a/qemu-img.c b/qemu-img.c
index 28e350c..6fd9e83 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -130,6 +130,7 @@ static void QEMU_NORETURN help(void)
            "  '-h' with or without a command shows this help and lists the supported formats\n"
            "  '-p' show progress of command (only certain commands)\n"
            "  '-q' use Quiet mode - do not print any output (except errors)\n"
+           "  '-L' disables image locking\n"
            "  '-S' indicates the consecutive number of bytes (defaults to 4k) that must\n"
            "       contain only zeros for qemu-img to create a sparse image during\n"
            "       conversion. If the number of bytes is 0, the source will not be scanned for\n"
diff --git a/qemu-img.texi b/qemu-img.texi
index afaebdd..b91f2e1 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -77,6 +77,9 @@ progress is reported when the process receives a @code{SIGUSR1} signal.
 @item -q
 Quiet mode - do not print any output (except errors). There's no progress bar
 in case both @var{-q} and @var{-p} options are used.
+@item -L
+disables image locking. The image will not be locked, so other processes can
+access and lock this image while we are using it.
 @item -S @var{size}
 indicates the consecutive number of bytes that must contain only zeros
 for qemu-img to create a sparse image during conversion. This value is rounded
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [Qemu-devel] [PATCH v4 16/27] qemu-nbd: Add "--no-lock/-L" option
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (14 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 15/27] qemu-img: Update documentation of "-L" option Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 17/27] block: Don't lock drive-backup target image in none mode Fam Zheng
                   ` (12 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 qemu-nbd.c    | 7 ++++++-
 qemu-nbd.texi | 2 ++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index c55b40f..a675a19 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -95,6 +95,7 @@ static void usage(const char *name)
 "Block device options:\n"
 "  -f, --format=FORMAT       set image format (raw, qcow2, ...)\n"
 "  -r, --read-only           export read-only\n"
+"  -L, --no-lock             disable image locking\n"
 "  -s, --snapshot            use FILE as an external snapshot, create a temporary\n"
 "                            file with backing_file=FILE, redirect the write to\n"
 "                            the temporary one\n"
@@ -464,7 +465,7 @@ int main(int argc, char **argv)
     off_t fd_size;
     QemuOpts *sn_opts = NULL;
     const char *sn_id_or_name = NULL;
-    const char *sopt = "hVb:o:p:rsnP:c:dvk:e:f:tl:x:";
+    const char *sopt = "hVb:o:p:rsnLP:c:dvk:e:f:tl:x:";
     struct option lopt[] = {
         { "help", no_argument, NULL, 'h' },
         { "version", no_argument, NULL, 'V' },
@@ -473,6 +474,7 @@ int main(int argc, char **argv)
         { "socket", required_argument, NULL, 'k' },
         { "offset", required_argument, NULL, 'o' },
         { "read-only", no_argument, NULL, 'r' },
+        { "no-lock", no_argument, NULL, 'L' },
         { "partition", required_argument, NULL, 'P' },
         { "connect", required_argument, NULL, 'c' },
         { "disconnect", no_argument, NULL, 'd' },
@@ -628,6 +630,9 @@ int main(int argc, char **argv)
             nbdflags |= NBD_FLAG_READ_ONLY;
             flags &= ~BDRV_O_RDWR;
             break;
+        case 'L':
+            flags |= BDRV_O_NO_LOCK;
+            break;
         case 'P':
             partition = strtol(optarg, &end, 0);
             if (*end) {
diff --git a/qemu-nbd.texi b/qemu-nbd.texi
index 9f23343..6b7b1d1 100644
--- a/qemu-nbd.texi
+++ b/qemu-nbd.texi
@@ -43,6 +43,8 @@ Force the use of the block driver for format @var{fmt} instead of
 auto-detecting
 @item -r, --read-only
 Export the disk as read-only
+@item -L, --no-lock
+Disable image locking
 @item -P, --partition=@var{num}
 Only expose partition @var{num}
 @item -s, --snapshot
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 61+ messages in thread
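The patch above wires the new "-L" short option and "--no-lock" long option into qemu-nbd's getopt loop. As a rough, hypothetical illustration of the same short-plus-long option pattern (using Python's getopt rather than QEMU's C code):

```python
import getopt

def parse(argv):
    # Mirror the qemu-nbd pattern: short "-L" with long alias "--no-lock".
    opts, args = getopt.gnu_getopt(argv, "rL", ["read-only", "no-lock"])
    no_lock = any(o in ("-L", "--no-lock") for o, _ in opts)
    return no_lock, args

print(parse(["-L", "disk.img"]))        # (True, ['disk.img'])
print(parse(["--no-lock", "disk.img"])) # (True, ['disk.img'])
```

Both spellings set the same flag, just as both "-L" and "--no-lock" set BDRV_O_NO_LOCK in the patch.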

* [Qemu-devel] [PATCH v4 17/27] block: Don't lock drive-backup target image in none mode
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (15 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 16/27] qemu-nbd: Add "--no-lock/-L" option Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 18/27] mirror: Disable image locking on target backing chain Fam Zheng
                   ` (11 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

As a very special case, in sync=none mode the source, being the backing
image of the target, will be opened read-only a second time, which image
locking does not accept because the first open could be exclusive.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index 200fa56..e3882d6 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3243,6 +3243,11 @@ static void do_drive_backup(const char *device, const char *target,
         }
     }
     if (sync == MIRROR_SYNC_MODE_NONE) {
+        /* XXX: bs will be opened a second time as the backing file of the
+         * target; disable image locking. Once the block layer allows sharing
+         * backing BDS, change below to BDRV_O_NO_BACKING and assign it after
+         * bdrv_open(). */
+        flags |= BDRV_O_NO_LOCK;
         source = bs;
     }
 
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 61+ messages in thread
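The conflict described above can be sketched with plain flock(2) advisory locks. This is an illustration of the semantics only, not QEMU's actual locking code (the series implements locking through a .bdrv_lockf driver callback): once the first open holds an exclusive lock, even a shared lock from a second open of the same file fails, which is why the RO backing reopen must pass BDRV_O_NO_LOCK.

```python
import fcntl
import tempfile

# A temporary file stands in for the source image.
img = tempfile.NamedTemporaryFile(delete=False)

# First open: the running QEMU takes an exclusive lock.
writer = open(img.name, "r+b")
fcntl.flock(writer, fcntl.LOCK_EX)

# Second open: re-opening the same file as the backing image and
# requesting even a shared lock fails while the exclusive lock is held.
reader = open(img.name, "rb")
try:
    fcntl.flock(reader, fcntl.LOCK_SH | fcntl.LOCK_NB)
    second_lock_ok = True
except BlockingIOError:
    second_lock_ok = False

print(second_lock_ok)  # False: the read-only reopen must skip locking
```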

* [Qemu-devel] [PATCH v4 18/27] mirror: Disable image locking on target backing chain
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (16 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 17/27] block: Don't lock drive-backup target image in none mode Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 19/27] qemu-iotests: 140: Disable image lock for qemu-io access Fam Zheng
                   ` (10 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

In sync=none mode the backing image of s->target is s->common.bs, which
could be exclusively locked, so image locking wouldn't work here.

Later we can update completion code to lock it after the replaced node
has dropped its lock.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index e3882d6..9968568 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3617,6 +3617,12 @@ void qmp_drive_mirror(const char *device, const char *target,
      * file.
      */
     target_bs = NULL;
+    if (sync == MIRROR_SYNC_MODE_NONE) {
+        flags |= BDRV_O_NO_LOCK;
+    }
+    /* TODO: in mirror completion, after target_bs is switched to and the
+     * original BDS's lock is dropped, we should enable the lock on
+     * target_bs. */
     ret = bdrv_open(&target_bs, target, NULL, options,
                     flags | BDRV_O_NO_BACKING, &local_err);
     if (ret < 0) {
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [Qemu-devel] [PATCH v4 19/27] qemu-iotests: 140: Disable image lock for qemu-io access
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (17 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 18/27] mirror: Disable image locking on target backing chain Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 20/27] qemu-iotests: 046: Move version detection out from verify_io Fam Zheng
                   ` (9 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

The VM is still running, so the image locking check would complain.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/140 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/140 b/tests/qemu-iotests/140
index 49f9df4..3be656a 100755
--- a/tests/qemu-iotests/140
+++ b/tests/qemu-iotests/140
@@ -70,7 +70,7 @@ _send_qemu_cmd $QEMU_HANDLE \
        'arguments': { 'device': 'drv' }}" \
     'return'
 
-$QEMU_IO_PROG -f raw -c 'read -P 42 0 64k' \
+$QEMU_IO_PROG -f raw -L -c 'read -P 42 0 64k' \
     "nbd+unix:///drv?socket=$TEST_DIR/nbd" 2>&1 \
     | _filter_qemu_io | _filter_nbd
 
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [Qemu-devel] [PATCH v4 20/27] qemu-iotests: 046: Move version detection out from verify_io
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (18 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 19/27] qemu-iotests: 140: Disable image lock for qemu-io access Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 21/27] qemu-iotests: Wait for QEMU processes before checking image in 091 Fam Zheng
                   ` (8 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

So the image lock won't complain.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/046 | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/tests/qemu-iotests/046 b/tests/qemu-iotests/046
index e528b67..365658e 100755
--- a/tests/qemu-iotests/046
+++ b/tests/qemu-iotests/046
@@ -192,15 +192,7 @@ echo "== Verify image content =="
 
 function verify_io()
 {
-    if ($QEMU_IMG info -f "$IMGFMT" "$TEST_IMG" | grep "compat: 0.10" > /dev/null); then
-        # For v2 images, discarded clusters are read from the backing file
-        # Keep the variable empty so that the backing file value can be used as
-        # the default below
-        discarded=
-    else
-        # Discarded clusters are zeroed for v3 or later
-        discarded=0
-    fi
+    discarded=$1
 
     echo read -P 0 0 0x10000
 
@@ -261,7 +253,17 @@ function verify_io()
     echo read -P 17  0x11c000 0x4000
 }
 
-verify_io | $QEMU_IO "$TEST_IMG" | _filter_qemu_io
+if ($QEMU_IMG info -f "$IMGFMT" "$TEST_IMG" | grep "compat: 0.10" > /dev/null); then
+    # For v2 images, discarded clusters are read from the backing file
+    # Keep the variable empty so that the backing file value can be used as
+    # the default below
+    discarded=
+else
+    # Discarded clusters are zeroed for v3 or later
+    discarded=0
+fi
+
+verify_io $discarded | $QEMU_IO -L "$TEST_IMG" | _filter_qemu_io
 
 _check_test_img
 
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [Qemu-devel] [PATCH v4 21/27] qemu-iotests: Wait for QEMU processes before checking image in 091
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (19 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 20/27] qemu-iotests: 046: Move version detection out from verify_io Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 22/27] qemu-iotests: 030: Disable image lock when checking test image Fam Zheng
                   ` (7 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

We should wait for the QEMU process to terminate and close the image
before we check the data.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/091     | 3 +++
 tests/qemu-iotests/091.out | 1 +
 2 files changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/091 b/tests/qemu-iotests/091
index 32bbd56..32aa5c3 100755
--- a/tests/qemu-iotests/091
+++ b/tests/qemu-iotests/091
@@ -95,6 +95,9 @@ echo "vm2: qemu process running successfully"
 echo "vm2: flush io, and quit"
 _send_qemu_cmd $h2 'qemu-io disk flush' "(qemu)"
 _send_qemu_cmd $h2 'quit' ""
+echo "vm1: quit"
+_send_qemu_cmd $h1 'quit' ""
+wait
 
 echo "Check image pattern"
 ${QEMU_IO} -c "read -P 0x22 0 4M" "${TEST_IMG}" | _filter_testdir | _filter_qemu_io
diff --git a/tests/qemu-iotests/091.out b/tests/qemu-iotests/091.out
index 5017f8c..6658ca8 100644
--- a/tests/qemu-iotests/091.out
+++ b/tests/qemu-iotests/091.out
@@ -18,6 +18,7 @@ vm1: live migration completed
 vm2: qemu-io disk write complete
 vm2: qemu process running successfully
 vm2: flush io, and quit
+vm1: quit
 Check image pattern
 read 4194304/4194304 bytes at offset 0
 4 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [Qemu-devel] [PATCH v4 22/27] qemu-iotests: 030: Disable image lock when checking test image
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (20 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 21/27] qemu-iotests: Wait for QEMU processes before checking image in 091 Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 23/27] iotests: 087: Disable image lock in cases where file is shared Fam Zheng
                   ` (6 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

The VM is running, so qemu-io would fail the lock acquisition.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/030 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/030 b/tests/qemu-iotests/030
index 3ac2443..fa996ef 100755
--- a/tests/qemu-iotests/030
+++ b/tests/qemu-iotests/030
@@ -95,7 +95,7 @@ class TestSingleDrive(iotests.QMPTestCase):
         self.assert_no_active_block_jobs()
 
         # The image map is empty before the operation
-        empty_map = qemu_io('-f', iotests.imgfmt, '-c', 'map', test_img)
+        empty_map = qemu_io('-L', '-f', iotests.imgfmt, '-c', 'map', test_img)
 
         # This is a no-op: no data should ever be copied from the base image
         result = self.vm.qmp('block-stream', device='drive0', base=mid_img)
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [Qemu-devel] [PATCH v4 23/27] iotests: 087: Disable image lock in cases where file is shared
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (21 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 22/27] qemu-iotests: 030: Disable image lock when checking test image Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 24/27] iotests: Disable image locking in 085 Fam Zheng
                   ` (5 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

Otherwise the error handling we are expecting will be masked by the
preceding image locking check, and will be indistinguishable because the
error messages are all the same.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/087 | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tests/qemu-iotests/087 b/tests/qemu-iotests/087
index e7bca37..b3c69bd 100755
--- a/tests/qemu-iotests/087
+++ b/tests/qemu-iotests/087
@@ -84,6 +84,7 @@ run_qemu <<EOF
       "options": {
         "driver": "$IMGFMT",
         "id": "disk",
+        "lock-image": false,
         "node-name": "test-node",
         "file": {
             "driver": "file",
@@ -97,6 +98,7 @@ run_qemu <<EOF
       "options": {
         "driver": "$IMGFMT",
         "id": "disk",
+        "lock-image": false,
         "file": {
             "driver": "file",
             "filename": "$TEST_IMG"
@@ -109,6 +111,7 @@ run_qemu <<EOF
       "options": {
         "driver": "$IMGFMT",
         "id": "test-node",
+        "lock-image": false,
         "file": {
             "driver": "file",
             "filename": "$TEST_IMG"
@@ -122,6 +125,7 @@ run_qemu <<EOF
         "driver": "$IMGFMT",
         "id": "disk2",
         "node-name": "disk",
+        "lock-image": false,
         "file": {
             "driver": "file",
             "filename": "$TEST_IMG"
@@ -135,6 +139,7 @@ run_qemu <<EOF
         "driver": "$IMGFMT",
         "id": "disk2",
         "node-name": "test-node",
+        "lock-image": false,
         "file": {
             "driver": "file",
             "filename": "$TEST_IMG"
@@ -148,6 +153,7 @@ run_qemu <<EOF
         "driver": "$IMGFMT",
         "id": "disk3",
         "node-name": "disk3",
+        "lock-image": false,
         "file": {
             "driver": "file",
             "filename": "$TEST_IMG"
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [Qemu-devel] [PATCH v4 24/27] iotests: Disable image locking in 085
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (22 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 23/27] iotests: 087: Disable image lock in cases where file is shared Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 25/27] tests: Use null-co:// instead of /dev/null Fam Zheng
                   ` (4 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

These cases are about live snapshot features. Disable image locking,
because otherwise a few tests would fail as we reuse the same images
in blockdev-add.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/085 | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/085 b/tests/qemu-iotests/085
index aa77eca..710dd7a 100755
--- a/tests/qemu-iotests/085
+++ b/tests/qemu-iotests/085
@@ -102,6 +102,7 @@ function add_snapshot_image()
     cmd="{ 'execute': 'blockdev-add', 'arguments':
            { 'options':
              { 'driver': 'qcow2', 'node-name': 'snap_${1}', ${extra_params}
+               'lock-image': false,
                'file':
                { 'driver': 'file', 'filename': '${snapshot_file}',
                  'node-name': 'file_${1}' } } } }"
@@ -130,7 +131,7 @@ echo === Running QEMU ===
 echo
 
 qemu_comm_method="qmp"
-_launch_qemu -drive file="${TEST_IMG}.1",if=virtio -drive file="${TEST_IMG}.2",if=virtio
+_launch_qemu -drive file="${TEST_IMG}.1",if=virtio,lock-image=off -drive file="${TEST_IMG}.2",if=virtio,lock-image=off
 h=$QEMU_HANDLE
 
 echo
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [Qemu-devel] [PATCH v4 25/27] tests: Use null-co:// instead of /dev/null
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (23 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 24/27] iotests: Disable image locking in 085 Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 26/27] block: Turn on image locking by default Fam Zheng
                   ` (3 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

With image locking, opening /dev/null can fail when multiple tests run
in parallel (make -j2, for example). Use null-co:// as the null protocol
doesn't do image locking.

While it's arguable that we could special-case /dev/null, /dev/zero,
/dev/urandom etc. in the raw-posix driver, it is not really necessary because
the user can always specify lock-image=off when it is appropriate. So let's
write sensible testing code too.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/drive_del-test.c    | 2 +-
 tests/nvme-test.c         | 2 +-
 tests/usb-hcd-uhci-test.c | 2 +-
 tests/usb-hcd-xhci-test.c | 2 +-
 tests/virtio-blk-test.c   | 2 +-
 tests/virtio-scsi-test.c  | 4 ++--
 6 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/tests/drive_del-test.c b/tests/drive_del-test.c
index fe03236..b9a6036 100644
--- a/tests/drive_del-test.c
+++ b/tests/drive_del-test.c
@@ -93,7 +93,7 @@ static void test_after_failed_device_add(void)
 static void test_drive_del_device_del(void)
 {
     /* Start with a drive used by a device that unplugs instantaneously */
-    qtest_start("-drive if=none,id=drive0,file=/dev/null,format=raw"
+    qtest_start("-drive if=none,id=drive0,file=null-co://,format=raw"
                 " -device virtio-scsi-pci"
                 " -device scsi-hd,drive=drive0,id=dev0");
 
diff --git a/tests/nvme-test.c b/tests/nvme-test.c
index ec06893..60734e8 100644
--- a/tests/nvme-test.c
+++ b/tests/nvme-test.c
@@ -23,7 +23,7 @@ int main(int argc, char **argv)
     g_test_init(&argc, &argv, NULL);
     qtest_add_func("/nvme/nop", nop);
 
-    qtest_start("-drive id=drv0,if=none,file=/dev/null,format=raw "
+    qtest_start("-drive id=drv0,if=none,file=null-co://,format=raw "
                 "-device nvme,drive=drv0,serial=foo");
     ret = g_test_run();
 
diff --git a/tests/usb-hcd-uhci-test.c b/tests/usb-hcd-uhci-test.c
index 71ff2ea..4a2f3ad 100644
--- a/tests/usb-hcd-uhci-test.c
+++ b/tests/usb-hcd-uhci-test.c
@@ -86,7 +86,7 @@ int main(int argc, char **argv)
     qtest_add_func("/uhci/pci/hotplug/usb-storage", test_usb_storage_hotplug);
 
     qtest_start("-device piix3-usb-uhci,id=uhci,addr=1d.0"
-                " -drive id=drive0,if=none,file=/dev/null,format=raw"
+                " -drive id=drive0,if=none,file=null-co://,format=raw"
                 " -device usb-tablet,bus=uhci.0,port=1");
     ret = g_test_run();
     qtest_end();
diff --git a/tests/usb-hcd-xhci-test.c b/tests/usb-hcd-xhci-test.c
index 7e2e212..5e96bec 100644
--- a/tests/usb-hcd-xhci-test.c
+++ b/tests/usb-hcd-xhci-test.c
@@ -90,7 +90,7 @@ int main(int argc, char **argv)
     qtest_add_func("/xhci/pci/hotplug/usb-uas", test_usb_uas_hotplug);
 
     qtest_start("-device nec-usb-xhci,id=xhci"
-                " -drive id=drive0,if=none,file=/dev/null,format=raw");
+                " -drive id=drive0,if=none,file=null-co://,format=raw");
     ret = g_test_run();
     qtest_end();
 
diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c
index 3a66630..7c21938 100644
--- a/tests/virtio-blk-test.c
+++ b/tests/virtio-blk-test.c
@@ -81,7 +81,7 @@ static QPCIBus *pci_test_start(void)
     tmp_path = drive_create();
 
     cmdline = g_strdup_printf("-drive if=none,id=drive0,file=%s,format=raw "
-                        "-drive if=none,id=drive1,file=/dev/null,format=raw "
+                        "-drive if=none,id=drive1,file=null-co://,format=raw "
                         "-device virtio-blk-pci,id=drv0,drive=drive0,"
                         "addr=%x.%x",
                         tmp_path, PCI_SLOT, PCI_FN);
diff --git a/tests/virtio-scsi-test.c b/tests/virtio-scsi-test.c
index d78747a..74f85a8 100644
--- a/tests/virtio-scsi-test.c
+++ b/tests/virtio-scsi-test.c
@@ -57,7 +57,7 @@ static void qvirtio_scsi_start(const char *extra_opts)
     char *cmdline;
 
     cmdline = g_strdup_printf(
-                "-drive id=drv0,if=none,file=/dev/null,format=raw "
+                "-drive id=drv0,if=none,file=null-co://,format=raw "
                 "-device virtio-scsi-pci,id=vs0 "
                 "-device scsi-hd,bus=vs0.0,drive=drv0 %s",
                 extra_opts ? : "");
@@ -208,7 +208,7 @@ static void hotplug(void)
 {
     QDict *response;
 
-    qvirtio_scsi_start("-drive id=drv1,if=none,file=/dev/null,format=raw");
+    qvirtio_scsi_start("-drive id=drv1,if=none,file=null-co://,format=raw");
     response = qmp("{\"execute\": \"device_add\","
                    " \"arguments\": {"
                    "   \"driver\": \"scsi-hd\","
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [Qemu-devel] [PATCH v4 26/27] block: Turn on image locking by default
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (24 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 25/27] tests: Use null-co:// instead of /dev/null Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 27/27] qemu-iotests: Add test case 153 for image locking Fam Zheng
                   ` (2 subsequent siblings)
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

Now that the test cases are covered, we can turn it on. The RO (shared) lock
is disabled to allow existing libguestfs use cases (invoking QEMU to read an
image that is exclusively locked by another QEMU).

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c    | 7 +++++++
 blockdev.c | 2 +-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 1b3aac4..a490bfe 100644
--- a/block.c
+++ b/block.c
@@ -972,6 +972,13 @@ static int bdrv_open_common(BlockDriverState *bs, BdrvChild *file,
     bs->zero_beyond_eof = true;
     bs->read_only = !(bs->open_flags & BDRV_O_RDWR);
 
+    if (bs->read_only) {
+        /* libguestfs uses us to _read_ images that are opened rw by another
+         * QEMU; skip locking ro images because the other QEMU may have an
+         * exclusive lock. */
+        bs->open_flags |= BDRV_O_NO_LOCK;
+    }
+
     if (use_bdrv_whitelist && !bdrv_is_whitelisted(drv, bs->read_only)) {
         error_setg(errp,
                    !bs->read_only && bdrv_is_whitelisted(drv, true)
diff --git a/blockdev.c b/blockdev.c
index 9968568..007690b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -383,7 +383,7 @@ static void extract_common_blockdev_options(QemuOpts *opts, int *bdrv_flags,
             }
         }
 
-        if (!qemu_opt_get_bool(opts, "lock-image", false)) {
+        if (!qemu_opt_get_bool(opts, "lock-image", true)) {
             *bdrv_flags |= BDRV_O_NO_LOCK;
         }
     }
-- 
2.8.2

^ permalink raw reply related	[flat|nested] 61+ messages in thread
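The shared-versus-exclusive distinction this patch relies on can be sketched with flock(2) advisory locks. Again this is only an analogy for the semantics, not the series' actual .bdrv_lockf implementation: any number of shared (reader) locks coexist, but an exclusive (writer) lock conflicts with all of them, which is why skipping the lock on read-only opens keeps the libguestfs reader case working.

```python
import fcntl
import tempfile

# A temporary file stands in for an image file.
img = tempfile.NamedTemporaryFile(delete=False)

a = open(img.name, "rb")
b = open(img.name, "rb")

# Two shared (read) locks can coexist...
fcntl.flock(a, fcntl.LOCK_SH)
fcntl.flock(b, fcntl.LOCK_SH | fcntl.LOCK_NB)  # succeeds

# ...but an exclusive (write) lock conflicts with either of them.
c = open(img.name, "r+b")
try:
    fcntl.flock(c, fcntl.LOCK_EX | fcntl.LOCK_NB)
    exclusive_ok = True
except BlockingIOError:
    exclusive_ok = False

print(exclusive_ok)  # False while the shared locks are held
```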

* [Qemu-devel] [PATCH v4 27/27] qemu-iotests: Add test case 153 for image locking
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (25 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 26/27] block: Turn on image locking by default Fam Zheng
@ 2016-05-10  2:50 ` Fam Zheng
  2016-05-10  8:14 ` [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Richard W.M. Jones
  2016-05-11 11:48 ` Richard W.M. Jones
  28 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-10  2:50 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster, Eric Blake,
	qemu-block, rjones, stefanha, pbonzini, John Snow, berrange, den

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/153     | 191 +++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/153.out | 241 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 433 insertions(+)
 create mode 100755 tests/qemu-iotests/153
 create mode 100644 tests/qemu-iotests/153.out

diff --git a/tests/qemu-iotests/153 b/tests/qemu-iotests/153
new file mode 100755
index 0000000..456e4f6
--- /dev/null
+++ b/tests/qemu-iotests/153
@@ -0,0 +1,191 @@
+#!/bin/bash
+#
+# Test image locking
+#
+# Copyright (C) 2016 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=famz@redhat.com
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+here="$PWD"
+tmp=/tmp/$$
+status=1	# failure is the default!
+
+_cleanup()
+{
+    _cleanup_test_img
+    rm -f "${TEST_IMG}.base"
+    rm -f "${TEST_IMG}.convert"
+    rm -f "${TEST_IMG}.a"
+    rm -f "${TEST_IMG}.b"
+    rm -f "${TEST_IMG}.lnk"
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+. ./common.qemu
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+size=32M
+
+_run_cmd()
+{
+    echo
+    (echo "$@"; "$@" 2>&1 1>/dev/null) | _filter_testdir
+}
+
+function _do_run_qemu()
+{
+    (
+        if ! test -t 0; then
+            while read cmd; do
+                echo $cmd
+            done
+        fi
+        echo quit
+    ) | $QEMU -nographic -monitor stdio -serial none "$@" 1>/dev/null
+}
+
+function _run_qemu_with_images()
+{
+    _do_run_qemu \
+        $(for i in $@; do echo "-drive if=none,file=$i"; done) 2>&1 \
+        | _filter_testdir | _filter_qemu
+}
+
+for opts1 in "" "readonly=on" "lock-image=off"; do
+    echo
+    echo "== Creating base image =="
+    TEST_IMG="${TEST_IMG}.base" _make_test_img $size
+
+    echo
+    echo "== Creating test image =="
+    $QEMU_IMG create -f $IMGFMT "${TEST_IMG}" -b ${TEST_IMG}.base | _filter_img_create
+
+    echo
+    echo "== Launching QEMU $opts1 =="
+    _launch_qemu -drive file="${TEST_IMG}",if=none,$opts1
+    h=$QEMU_HANDLE
+
+    for opts2 in "" "lock-image=on" "lock-image=off"; do
+        echo
+        echo "== Launching another QEMU $opts2 =="
+        echo "quit" | \
+            $QEMU -nographic -monitor stdio \
+            -drive file="${TEST_IMG}",if=none,$opts2 2>&1 1>/dev/null | \
+            _filter_testdir | _filter_qemu
+    done
+
+    for L in "" "-L"; do
+
+        echo
+        echo "== Running utility commands $(test -n "$L" && echo "(lock disabled)") =="
+        _run_cmd $QEMU_IO $L -c "read 0 512" "${TEST_IMG}"
+        _run_cmd $QEMU_IO $L -r -c "read 0 512" "${TEST_IMG}"
+        _run_cmd $QEMU_IMG info        $L "${TEST_IMG}"
+        _run_cmd $QEMU_IMG check       $L "${TEST_IMG}"
+        _run_cmd $QEMU_IMG compare     $L "${TEST_IMG}" "${TEST_IMG}"
+        _run_cmd $QEMU_IMG map         $L "${TEST_IMG}"
+        _run_cmd $QEMU_IMG amend -o "" $L "${TEST_IMG}"
+        _run_cmd $QEMU_IMG commit      $L "${TEST_IMG}"
+        _run_cmd $QEMU_IMG resize      $L "${TEST_IMG}" $size
+        _run_cmd $QEMU_IMG rebase      $L "${TEST_IMG}" -b "${TEST_IMG}.base"
+        _run_cmd $QEMU_IMG snapshot -l $L "${TEST_IMG}"
+        _run_cmd $QEMU_IMG convert     $L "${TEST_IMG}" "${TEST_IMG}.convert"
+    done
+    _send_qemu_cmd $h "{ 'execute': 'quit', }" ""
+    echo
+    echo "Round done"
+    _cleanup_qemu
+done
+
+echo
+echo "== Two devices with the same image (rw - rw) =="
+_run_qemu_with_images "${TEST_IMG}" "${TEST_IMG}"
+
+echo
+echo "== Two devices with the same image (rw - ro) =="
+_run_qemu_with_images "${TEST_IMG}" "${TEST_IMG},readonly=on"
+
+echo
+echo "== Two devices with the same image (ro - ro) =="
+_run_qemu_with_images "${TEST_IMG},readonly=on" "${TEST_IMG},readonly=on"
+
+(
+    $QEMU_IMG create -f qcow2 "${TEST_IMG}.a" -b "${TEST_IMG}"
+    $QEMU_IMG create -f qcow2 "${TEST_IMG}.b" -b "${TEST_IMG}"
+    $QEMU_IMG create -f qcow2 "${TEST_IMG}.c" -b "${TEST_IMG}.b"
+) | _filter_img_create
+
+echo
+echo "== Two devices sharing the same file in backing chain =="
+_run_qemu_with_images "${TEST_IMG}.a" "${TEST_IMG}.b"
+_run_qemu_with_images "${TEST_IMG}.a" "${TEST_IMG}.c"
+
+echo
+echo "== Backing image also as an active device =="
+_run_qemu_with_images "${TEST_IMG}.a" "${TEST_IMG}"
+
+echo
+echo "== Backing image also as an active device (ro) =="
+_run_qemu_with_images "${TEST_IMG}.a" "${TEST_IMG},readonly=on"
+
+echo
+echo "== Symbolic link =="
+rm -f "${TEST_IMG}.lnk" &>/dev/null
+ln -s ${TEST_IMG} "${TEST_IMG}.lnk" || echo "Failed to create link"
+_run_qemu_with_images "${TEST_IMG}.lnk" "${TEST_IMG}"
+
+echo
+echo "== Test that close one BDS doesn't unlock others =="
+_launch_qemu -drive file="${TEST_IMG}",if=none,id=d0
+
+_send_qemu_cmd $QEMU_HANDLE \
+    "{ 'execute': 'qmp_capabilities' }" \
+    'return'
+
+_send_qemu_cmd $QEMU_HANDLE \
+    "{ 'execute': 'human-monitor-command',
+       'arguments': { 'command-line': 'drive_add 0 if=none,id=d1,file=${TEST_IMG}' } }" \
+    "Failed to lock image"
+
+_send_qemu_cmd $QEMU_HANDLE \
+    "{ 'execute': 'human-monitor-command',
+       'arguments': { 'command-line': 'drive_add 0 if=none,id=d1,file=${TEST_IMG},lock-image=off' } }" \
+    "OK"
+
+_send_qemu_cmd $QEMU_HANDLE \
+    "{ 'execute': 'human-monitor-command',
+       'arguments': { 'command-line': 'drive_del d1' } }" \
+    ""
+
+_run_cmd $QEMU_IO -c quit "${TEST_IMG}"
+
+_cleanup_qemu
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/153.out b/tests/qemu-iotests/153.out
new file mode 100644
index 0000000..15d0e3f
--- /dev/null
+++ b/tests/qemu-iotests/153.out
@@ -0,0 +1,241 @@
+QA output created by 153
+
+== Creating base image ==
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=33554432
+
+== Creating test image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=33554432 backing_file=TEST_DIR/t.IMGFMT.base
+
+== Launching QEMU  ==
+
+== Launching another QEMU  ==
+QEMU_PROG: -drive file=TEST_DIR/t.qcow2,if=none,: Failed to lock image
+
+== Launching another QEMU lock-image=on ==
+QEMU_PROG: -drive file=TEST_DIR/t.qcow2,if=none,lock-image=on: Failed to lock image
+
+== Launching another QEMU lock-image=off ==
+
+== Running utility commands  ==
+
+_qemu_io_wrapper -c read 0 512 TEST_DIR/t.qcow2
+can't open device TEST_DIR/t.qcow2: Failed to lock image
+no file open, try 'help open'
+
+_qemu_io_wrapper -r -c read 0 512 TEST_DIR/t.qcow2
+
+_qemu_img_wrapper info TEST_DIR/t.qcow2
+
+_qemu_img_wrapper check TEST_DIR/t.qcow2
+
+_qemu_img_wrapper compare TEST_DIR/t.qcow2 TEST_DIR/t.qcow2
+
+_qemu_img_wrapper map TEST_DIR/t.qcow2
+
+_qemu_img_wrapper amend -o  TEST_DIR/t.qcow2
+qemu-img: Could not open 'TEST_DIR/t.qcow2': Failed to lock image
+
+_qemu_img_wrapper commit TEST_DIR/t.qcow2
+qemu-img: Could not open 'TEST_DIR/t.qcow2': Failed to lock image
+
+_qemu_img_wrapper resize TEST_DIR/t.qcow2 32M
+qemu-img: Could not open 'TEST_DIR/t.qcow2': Failed to lock image
+
+_qemu_img_wrapper rebase TEST_DIR/t.qcow2 -b TEST_DIR/t.qcow2.base
+qemu-img: Could not open 'TEST_DIR/t.qcow2': Failed to lock image
+
+_qemu_img_wrapper snapshot -l TEST_DIR/t.qcow2
+
+_qemu_img_wrapper convert TEST_DIR/t.qcow2 TEST_DIR/t.qcow2.convert
+
+== Running utility commands (lock disabled) ==
+
+_qemu_io_wrapper -L -c read 0 512 TEST_DIR/t.qcow2
+
+_qemu_io_wrapper -L -r -c read 0 512 TEST_DIR/t.qcow2
+
+_qemu_img_wrapper info -L TEST_DIR/t.qcow2
+
+_qemu_img_wrapper check -L TEST_DIR/t.qcow2
+
+_qemu_img_wrapper compare -L TEST_DIR/t.qcow2 TEST_DIR/t.qcow2
+
+_qemu_img_wrapper map -L TEST_DIR/t.qcow2
+
+_qemu_img_wrapper amend -o  -L TEST_DIR/t.qcow2
+
+_qemu_img_wrapper commit -L TEST_DIR/t.qcow2
+
+_qemu_img_wrapper resize -L TEST_DIR/t.qcow2 32M
+
+_qemu_img_wrapper rebase -L TEST_DIR/t.qcow2 -b TEST_DIR/t.qcow2.base
+
+_qemu_img_wrapper snapshot -l -L TEST_DIR/t.qcow2
+
+_qemu_img_wrapper convert -L TEST_DIR/t.qcow2 TEST_DIR/t.qcow2.convert
+
+Round done
+
+== Creating base image ==
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=33554432
+
+== Creating test image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=33554432 backing_file=TEST_DIR/t.IMGFMT.base
+
+== Launching QEMU readonly=on ==
+
+== Launching another QEMU  ==
+
+== Launching another QEMU lock-image=on ==
+
+== Launching another QEMU lock-image=off ==
+
+== Running utility commands  ==
+
+_qemu_io_wrapper -c read 0 512 TEST_DIR/t.qcow2
+
+_qemu_io_wrapper -r -c read 0 512 TEST_DIR/t.qcow2
+
+_qemu_img_wrapper info TEST_DIR/t.qcow2
+
+_qemu_img_wrapper check TEST_DIR/t.qcow2
+
+_qemu_img_wrapper compare TEST_DIR/t.qcow2 TEST_DIR/t.qcow2
+
+_qemu_img_wrapper map TEST_DIR/t.qcow2
+
+_qemu_img_wrapper amend -o  TEST_DIR/t.qcow2
+
+_qemu_img_wrapper commit TEST_DIR/t.qcow2
+
+_qemu_img_wrapper resize TEST_DIR/t.qcow2 32M
+
+_qemu_img_wrapper rebase TEST_DIR/t.qcow2 -b TEST_DIR/t.qcow2.base
+
+_qemu_img_wrapper snapshot -l TEST_DIR/t.qcow2
+
+_qemu_img_wrapper convert TEST_DIR/t.qcow2 TEST_DIR/t.qcow2.convert
+
+== Running utility commands (lock disabled) ==
+
+_qemu_io_wrapper -L -c read 0 512 TEST_DIR/t.qcow2
+
+_qemu_io_wrapper -L -r -c read 0 512 TEST_DIR/t.qcow2
+
+_qemu_img_wrapper info -L TEST_DIR/t.qcow2
+
+_qemu_img_wrapper check -L TEST_DIR/t.qcow2
+
+_qemu_img_wrapper compare -L TEST_DIR/t.qcow2 TEST_DIR/t.qcow2
+
+_qemu_img_wrapper map -L TEST_DIR/t.qcow2
+
+_qemu_img_wrapper amend -o  -L TEST_DIR/t.qcow2
+
+_qemu_img_wrapper commit -L TEST_DIR/t.qcow2
+
+_qemu_img_wrapper resize -L TEST_DIR/t.qcow2 32M
+
+_qemu_img_wrapper rebase -L TEST_DIR/t.qcow2 -b TEST_DIR/t.qcow2.base
+
+_qemu_img_wrapper snapshot -l -L TEST_DIR/t.qcow2
+
+_qemu_img_wrapper convert -L TEST_DIR/t.qcow2 TEST_DIR/t.qcow2.convert
+
+Round done
+
+== Creating base image ==
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=33554432
+
+== Creating test image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=33554432 backing_file=TEST_DIR/t.IMGFMT.base
+
+== Launching QEMU lock-image=off ==
+
+== Launching another QEMU  ==
+
+== Launching another QEMU lock-image=on ==
+
+== Launching another QEMU lock-image=off ==
+
+== Running utility commands  ==
+
+_qemu_io_wrapper -c read 0 512 TEST_DIR/t.qcow2
+
+_qemu_io_wrapper -r -c read 0 512 TEST_DIR/t.qcow2
+
+_qemu_img_wrapper info TEST_DIR/t.qcow2
+
+_qemu_img_wrapper check TEST_DIR/t.qcow2
+
+_qemu_img_wrapper compare TEST_DIR/t.qcow2 TEST_DIR/t.qcow2
+
+_qemu_img_wrapper map TEST_DIR/t.qcow2
+
+_qemu_img_wrapper amend -o  TEST_DIR/t.qcow2
+
+_qemu_img_wrapper commit TEST_DIR/t.qcow2
+
+_qemu_img_wrapper resize TEST_DIR/t.qcow2 32M
+
+_qemu_img_wrapper rebase TEST_DIR/t.qcow2 -b TEST_DIR/t.qcow2.base
+
+_qemu_img_wrapper snapshot -l TEST_DIR/t.qcow2
+
+_qemu_img_wrapper convert TEST_DIR/t.qcow2 TEST_DIR/t.qcow2.convert
+
+== Running utility commands (lock disabled) ==
+
+_qemu_io_wrapper -L -c read 0 512 TEST_DIR/t.qcow2
+
+_qemu_io_wrapper -L -r -c read 0 512 TEST_DIR/t.qcow2
+
+_qemu_img_wrapper info -L TEST_DIR/t.qcow2
+
+_qemu_img_wrapper check -L TEST_DIR/t.qcow2
+
+_qemu_img_wrapper compare -L TEST_DIR/t.qcow2 TEST_DIR/t.qcow2
+
+_qemu_img_wrapper map -L TEST_DIR/t.qcow2
+
+_qemu_img_wrapper amend -o  -L TEST_DIR/t.qcow2
+
+_qemu_img_wrapper commit -L TEST_DIR/t.qcow2
+
+_qemu_img_wrapper resize -L TEST_DIR/t.qcow2 32M
+
+_qemu_img_wrapper rebase -L TEST_DIR/t.qcow2 -b TEST_DIR/t.qcow2.base
+
+_qemu_img_wrapper snapshot -l -L TEST_DIR/t.qcow2
+
+_qemu_img_wrapper convert -L TEST_DIR/t.qcow2 TEST_DIR/t.qcow2.convert
+
+Round done
+
+== Two devices with the same image (rw - rw) ==
+QEMU_PROG: -drive if=none,file=TEST_DIR/t.qcow2: Failed to lock image
+
+== Two devices with the same image (rw - ro) ==
+
+== Two devices with the same image (ro - ro) ==
+Formatting 'TEST_DIR/t.IMGFMT.a', fmt=IMGFMT size=33554432 backing_file=TEST_DIR/t.IMGFMT
+Formatting 'TEST_DIR/t.IMGFMT.b', fmt=IMGFMT size=33554432 backing_file=TEST_DIR/t.IMGFMT
+Formatting 'TEST_DIR/t.IMGFMT.c', fmt=IMGFMT size=33554432 backing_file=TEST_DIR/t.IMGFMT.b
+
+== Two devices sharing the same file in backing chain ==
+
+== Backing image also as an active device ==
+
+== Backing image also as an active device (ro) ==
+
+== Symbolic link ==
+QEMU_PROG: -drive if=none,file=TEST_DIR/t.qcow2: Failed to lock image
+
+== Test that close one BDS doesn't unlock others ==
+{"return": {}}
+{"return": "Failed to lock image\r\n"}
+{"return": "OK\r\n"}
+
+_qemu_io_wrapper -c quit TEST_DIR/t.qcow2
+can't open device TEST_DIR/t.qcow2: Failed to lock image
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 822953b..3f6d4a9 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -153,3 +153,4 @@
 149 rw auto sudo
 150 rw auto quick
 152 rw auto quick
+153 rw auto quick
-- 
2.8.2


* Re: [Qemu-devel] [PATCH v4 08/27] osdep: Add qemu_lock_fd and qemu_unlock_fd
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 08/27] osdep: Add qemu_lock_fd and qemu_unlock_fd Fam Zheng
@ 2016-05-10  7:54   ` Richard W.M. Jones
  2016-05-10  8:57   ` Daniel P. Berrange
  1 sibling, 0 replies; 61+ messages in thread
From: Richard W.M. Jones @ 2016-05-10  7:54 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster,
	Eric Blake, qemu-block, stefanha, pbonzini, John Snow, berrange,
	den

On Tue, May 10, 2016 at 10:50:40AM +0800, Fam Zheng wrote:
> +int qemu_lock_fd(int fd, int64_t start, int64_t len, bool readonly)

I find this new API to be very unintuitive.  When I was reading the
other code in block/raw-posix.c I had to refer back to this file to
find out what all the integers meant.

It is also misleading.  There aren't really such things as
"readonly locks", or "read locks" or "write locks".  I know that
(some) POSIX APIs use these terms, but the terms are wrong.

There are shared locks, and there are exclusive locks.  The locks have
nothing to do with reading or writing.  In fact the locks only apply
when writing, and are to do with whether you want to allow multiple
writers at the same time (shared lock), or only a single writer
(exclusive lock).

Anyway, I think the boolean "readonly" should be replaced by
some flag like:

#define QEMU_LOCK_SHARED 1
#define QEMU_LOCK_EXCLUSIVE 2

or similar.

Not sure about the start/len.  Do we need them in the API at all given
that we've decided to lock a particular byte of the file?
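
[Editorial aside, not part of the thread: the shared/exclusive distinction above can be demonstrated with a short standalone program. This is a minimal sketch assuming Linux/POSIX fcntl semantics and a hypothetical scratch file; a child's lock request conflicts with the parent's exclusive lock but coexists with a shared one. The fork is needed because a process's own locks never conflict with themselves.]

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Try to take a lock of the given type in a child process; returns 1 on
 * conflict, 0 on success. */
static int try_lock_in_child(const char *path, short type)
{
    pid_t pid = fork();
    if (pid == 0) {
        int fd = open(path, O_RDWR);
        struct flock fl = { .l_type = type, .l_whence = SEEK_SET,
                            .l_start = 0, .l_len = 1 };
        _exit(fcntl(fd, F_SETLK, &fl) == -1 ? 1 : 0);
    }
    int status;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Returns 0 if fcntl behaves as described: a shared request conflicts
 * with an exclusive lock, but two shared locks coexist. */
int demo_shared_vs_exclusive(void)
{
    char path[] = "/tmp/lockdemo.XXXXXX";  /* hypothetical scratch file */
    int fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }

    struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                        .l_start = 0, .l_len = 1 };
    fcntl(fd, F_SETLK, &fl);                              /* exclusive */
    int vs_exclusive = try_lock_in_child(path, F_RDLCK);  /* conflicts */

    fl.l_type = F_RDLCK;
    fcntl(fd, F_SETLK, &fl);                              /* downgrade */
    int vs_shared = try_lock_in_child(path, F_RDLCK);     /* coexists  */

    unlink(path);
    close(fd);
    return (vs_exclusive == 1 && vs_shared == 0) ? 0 : -1;
}
```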

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html


* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (26 preceding siblings ...)
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 27/27] qemu-iotests: Add test case 153 for image locking Fam Zheng
@ 2016-05-10  8:14 ` Richard W.M. Jones
  2016-05-10  8:43   ` Richard W.M. Jones
  2016-05-11 11:48 ` Richard W.M. Jones
  28 siblings, 1 reply; 61+ messages in thread
From: Richard W.M. Jones @ 2016-05-10  8:14 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster,
	Eric Blake, qemu-block, stefanha, pbonzini, John Snow, berrange,
	den

On Tue, May 10, 2016 at 10:50:32AM +0800, Fam Zheng wrote:
> v4: Don't lock RO image. [Rich]

I applied this on top of HEAD and added some debugging messages to
block/raw-posix.c so I can see when files are being opened and locked,
and it does appear to do the right thing for read-only and
write-exclusive files, ie. it applies an exclusive lock when opening a
file for write-exclusive, and no lock when opening a file for read.

However I didn't test the write-shareable case (the libvirt
<shareable/> flag which should map to a shared lock -- is that right Dan?).

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top


* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10  8:14 ` [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Richard W.M. Jones
@ 2016-05-10  8:43   ` Richard W.M. Jones
  2016-05-10  8:50     ` Daniel P. Berrange
  0 siblings, 1 reply; 61+ messages in thread
From: Richard W.M. Jones @ 2016-05-10  8:43 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-block, John Snow, Jeff Cody, Markus Armbruster,
	qemu-devel, stefanha, den, pbonzini, Max Reitz

On Tue, May 10, 2016 at 09:14:26AM +0100, Richard W.M. Jones wrote:
> However I didn't test the write-shareable case (the libvirt
> <shareable/> flag which should map to a shared lock -- is that right Dan?).

To Dan (mainly): I think setting the <shareable/> flag in libvirt only
sets cache=unsafe on the qemu drive (it may have other effects for
virtlockd).  Should there be an extra qemu drive flag to communicate
that the drive is write-shareable?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org


* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10  8:43   ` Richard W.M. Jones
@ 2016-05-10  8:50     ` Daniel P. Berrange
  2016-05-10  9:14       ` Kevin Wolf
  0 siblings, 1 reply; 61+ messages in thread
From: Daniel P. Berrange @ 2016-05-10  8:50 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Fam Zheng, Kevin Wolf, qemu-block, Jeff Cody, Markus Armbruster,
	qemu-devel, stefanha, pbonzini, den, Max Reitz, John Snow

On Tue, May 10, 2016 at 09:43:04AM +0100, Richard W.M. Jones wrote:
> On Tue, May 10, 2016 at 09:14:26AM +0100, Richard W.M. Jones wrote:
> > However I didn't test the write-shareable case (the libvirt
> > <shareable/> flag which should map to a shared lock -- is that right Dan?).
> 
> To Dan (mainly): I think setting the <shareable/> flag in libvirt only
> sets cache=unsafe on the qemu drive (it may have other effects for
> virtlockd).  Should there be an extra qemu drive flag to communicate
> that the drive is write-shareable?

Sure, if QEMU had a way to indicate that the disk was used in a
write-sharable mode, libvirt would use it.

I agree with your general point that we should do no locking for
read-only access, F_RDLCK for shared-write and F_WRLCK for
exclusive-write access. I think I made that point similarly against
an earlier version of this patchset.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


* Re: [Qemu-devel] [PATCH v4 08/27] osdep: Add qemu_lock_fd and qemu_unlock_fd
  2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 08/27] osdep: Add qemu_lock_fd and qemu_unlock_fd Fam Zheng
  2016-05-10  7:54   ` Richard W.M. Jones
@ 2016-05-10  8:57   ` Daniel P. Berrange
  2016-05-10  9:06     ` Richard W.M. Jones
  2016-05-11  0:48     ` Fam Zheng
  1 sibling, 2 replies; 61+ messages in thread
From: Daniel P. Berrange @ 2016-05-10  8:57 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster,
	Eric Blake, qemu-block, rjones, stefanha, pbonzini, John Snow,
	den

On Tue, May 10, 2016 at 10:50:40AM +0800, Fam Zheng wrote:
> They are wrappers of POSIX fcntl file locking, with the additional
> interception of open/close (through qemu_open and qemu_close) to offer a
> better semantics that preserves the locks across multiple life cycles of
> different fds on the same file.  The reason to make this semantics
> change over the fcntl locks is to make the API cleaner for QEMU
> subsystems.
> 
> More precisely, we remove this "feature" of fcntl lock:
> 
>     If a process closes any file descriptor referring to a file, then
>     all of the process's locks on that file are released, regardless of
>     the file descriptor(s) on which the locks were obtained.
> 
> as long as the fd is always open/closed through qemu_open and
> qemu_close.

You're not actually really removing that problem - this is just hacking
around it in a manner which is racy & susceptible to silent failure.


> +static int qemu_fd_close(int fd)
> +{
> +    LockEntry *ent, *tmp;
> +    LockRecord *rec;
> +    char *path;
> +    int ret;
> +
> +    assert(fd_to_path);
> +    path = g_hash_table_lookup(fd_to_path, GINT_TO_POINTER(fd));
> +    assert(path);
> +    g_hash_table_remove(fd_to_path, GINT_TO_POINTER(fd));
> +    rec = g_hash_table_lookup(lock_map, path);
> +    ret = close(fd);
> +
> +    if (rec) {
> +        QLIST_FOREACH_SAFE(ent, &rec->records, next, tmp) {
> +            int r;
> +            if (ent->fd == fd) {
> +                QLIST_REMOVE(ent, next);
> +                g_free(ent);
> +                continue;
> +            }
> +            r = qemu_lock_do(ent->fd, ent->start, ent->len,
> +                             ent->readonly ? F_RDLCK : F_WRLCK);
> +            if (r == -1) {
> +                fprintf(stderr, "Failed to acquire lock on fd %d: %s\n",
> +                        ent->fd, strerror(errno));
> +            }

If another app has acquired the lock between the close and this attempt
to re-acquire the lock, then QEMU is silently carrying on without any
lock being held. For something that's intending to provide protection
against concurrent use I think this is not an acceptable failure
scenario.


> +int qemu_open(const char *name, int flags, ...)
> +{
> +    int mode = 0;
> +    int ret;
> +
> +    if (flags & O_CREAT) {
> +        va_list ap;
> +
> +        va_start(ap, flags);
> +        mode = va_arg(ap, int);
> +        va_end(ap);
> +    }
> +    ret = qemu_fd_open(name, flags, mode);
> +    if (ret >= 0) {
> +        qemu_fd_add_record(ret, name);
> +    }
> +    return ret;
> +}

I think the approach you have here is fundamentally not usable with
fcntl locks, because it is still using the pattern

   fd = open(path)
   lock(fd)
   if failed lock
      close(fd)
   ...do stuff.


As mentioned in previous posting I believe the block layer must be
checking whether it already has a lock held against "path" *before*
even attempting to open the file. Once QEMU has the file descriptor
open it is too later, because closing the FD will release pre-existing
locks QEMU has.
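
[Editorial aside, not part of the thread: the fcntl behaviour referred to here can be reproduced with a short standalone program. A minimal sketch assuming Linux and a hypothetical scratch file; the lock taken via the first fd is silently released as soon as any other fd for the same file is closed.]

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if the pitfall is observed: after closing a *second* fd on
 * the same file, the lock taken via the first fd is gone. */
int demo_close_releases_locks(void)
{
    char path[] = "/tmp/lockdemo.XXXXXX";  /* hypothetical scratch file */
    int fd1 = mkstemp(path);
    if (fd1 < 0) {
        return -1;
    }

    struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                        .l_start = 0, .l_len = 1 };
    fcntl(fd1, F_SETLK, &fl);   /* lock held via fd1... */

    int fd2 = open(path, O_RDONLY);
    close(fd2);                 /* ...and silently released here */

    /* Probe from a child process: F_GETLK in the same process would
     * never report a conflict with the process's own locks. */
    pid_t pid = fork();
    if (pid == 0) {
        int fd = open(path, O_RDWR);
        struct flock probe = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                               .l_start = 0, .l_len = 1 };
        fcntl(fd, F_GETLK, &probe);
        _exit(probe.l_type == F_UNLCK ? 0 : 1);  /* 0: lock is gone */
    }
    int status;
    waitpid(pid, &status, 0);
    unlink(path);
    close(fd1);
    return WIFEXITED(status) && WEXITSTATUS(status) == 0 ? 0 : -1;
}
```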

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


* Re: [Qemu-devel] [PATCH v4 08/27] osdep: Add qemu_lock_fd and qemu_unlock_fd
  2016-05-10  8:57   ` Daniel P. Berrange
@ 2016-05-10  9:06     ` Richard W.M. Jones
  2016-05-10  9:20       ` Daniel P. Berrange
  2016-05-11  0:48     ` Fam Zheng
  1 sibling, 1 reply; 61+ messages in thread
From: Richard W.M. Jones @ 2016-05-10  9:06 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Fam Zheng, qemu-devel, Kevin Wolf, Max Reitz, Jeff Cody,
	Markus Armbruster, Eric Blake, qemu-block, stefanha, pbonzini,
	John Snow, den

On Tue, May 10, 2016 at 09:57:48AM +0100, Daniel P. Berrange wrote:
> On Tue, May 10, 2016 at 10:50:40AM +0800, Fam Zheng wrote:
> > They are wrappers of POSIX fcntl file locking, with the additional
> > interception of open/close (through qemu_open and qemu_close) to offer a
> > better semantics that preserves the locks across multiple life cycles of
> > different fds on the same file.  The reason to make this semantics
> > change over the fcntl locks is to make the API cleaner for QEMU
> > subsystems.
> > 
> > More precisely, we remove this "feature" of fcntl lock:
> > 
> >     If a process closes any file descriptor referring to a file, then
> >     all of the process's locks on that file are released, regardless of
> >     the file descriptor(s) on which the locks were obtained.
> > 
> > as long as the fd is always open/closed through qemu_open and
> > qemu_close.
> 
> You're not actually really removing that problem - this is just hacking
> around it in a manner which is racy & susceptible to silent failure.

Whatever happened to file-private locks (https://lwn.net/Articles/586904/)?
My very recent glibc doesn't appear to include them, unless the
final standard used something different from `F_SETLKPW'.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top


* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10  8:50     ` Daniel P. Berrange
@ 2016-05-10  9:14       ` Kevin Wolf
  2016-05-10  9:23         ` Daniel P. Berrange
  2016-05-10 10:02         ` Richard W.M. Jones
  0 siblings, 2 replies; 61+ messages in thread
From: Kevin Wolf @ 2016-05-10  9:14 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Richard W.M. Jones, Fam Zheng, qemu-block, Jeff Cody,
	Markus Armbruster, qemu-devel, stefanha, pbonzini, den,
	Max Reitz, John Snow

Am 10.05.2016 um 10:50 hat Daniel P. Berrange geschrieben:
> On Tue, May 10, 2016 at 09:43:04AM +0100, Richard W.M. Jones wrote:
> > On Tue, May 10, 2016 at 09:14:26AM +0100, Richard W.M. Jones wrote:
> > > However I didn't test the write-shareable case (the libvirt
> > > <shareable/> flag which should map to a shared lock -- is that right Dan?).
> > 
> > To Dan (mainly): I think setting the <shareable/> flag in libvirt only
> > sets cache=unsafe on the qemu drive (it may have other effects for
> > virtlockd).  Should there be an extra qemu drive flag to communicate
> > that the drive is write-shareable?
> 
> Sure, if QEMU had a way to indicate that the disk was used in a
> write-sharable mode, libvirt would use it.
> 
> I agree with your general point that we should do no locking for
> read-only access, F_RDLCK for shared-write and F_WRLCK for
> exclusive-write access. I think I made that point similarly against
> an earlier version of this patchset

Why should we do no locking for read-only access by default? If an image
is written to, read-only accesses are potentially inconsistent and we
should protect users against it.

The only argument I can see in the old versions of this series is
"libguestfs does it and usually it gets away with it". For me, that's
not an argument for generally allowing this, but at most for providing a
switch to bypass the locking.

Because let's be clear about this: If someone lost data because they
took an inconsistent backup this way and comes to us qemu developers,
all we can tell them is "sorry, what you did is not supported". And
that's a pretty strong sign that locking should prevent it by default.

Kevin


* Re: [Qemu-devel] [PATCH v4 08/27] osdep: Add qemu_lock_fd and qemu_unlock_fd
  2016-05-10  9:06     ` Richard W.M. Jones
@ 2016-05-10  9:20       ` Daniel P. Berrange
  0 siblings, 0 replies; 61+ messages in thread
From: Daniel P. Berrange @ 2016-05-10  9:20 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Fam Zheng, qemu-devel, Kevin Wolf, Max Reitz, Jeff Cody,
	Markus Armbruster, Eric Blake, qemu-block, stefanha, pbonzini,
	John Snow, den

On Tue, May 10, 2016 at 10:06:35AM +0100, Richard W.M. Jones wrote:
> On Tue, May 10, 2016 at 09:57:48AM +0100, Daniel P. Berrange wrote:
> > On Tue, May 10, 2016 at 10:50:40AM +0800, Fam Zheng wrote:
> > > They are wrappers of POSIX fcntl file locking, with the additional
> > > interception of open/close (through qemu_open and qemu_close) to offer a
> > > better semantics that preserves the locks across multiple life cycles of
> > > different fds on the same file.  The reason to make this semantics
> > > change over the fcntl locks is to make the API cleaner for QEMU
> > > subsystems.
> > > 
> > > More precisely, we remove this "feature" of fcntl lock:
> > > 
> > >     If a process closes any file descriptor referring to a file, then
> > >     all of the process's locks on that file are released, regardless of
> > >     the file descriptor(s) on which the locks were obtained.
> > > 
> > > as long as the fd is always open/closed through qemu_open and
> > > qemu_close.
> > 
> > You're not actually really removing that problem - this is just hacking
> > around it in a manner which is racy & susceptible to silent failure.
> 
> Whatever happened to file-private locks (https://lwn.net/Articles/586904/)?
> My very recent glibc doesn't appear to include them, unless the
> final standard used something different from `F_SETLKPW'.

They merged in 3.15, but the constant names got renamed just after merge :-)
Look for F_OFD_SETLKW instead


commit 0d3f7a2dd2f5cf9642982515e020c1aee2cf7af6
Author: Jeff Layton <jlayton@redhat.com>
Date:   Tue Apr 22 08:23:58 2014 -0400

    locks: rename file-private locks to "open file description locks"
    
    File-private locks have been merged into Linux for v3.15, and *now*
    people are commenting that the name and macro definitions for the new
    file-private locks suck.
    
    ...and I can't even disagree. The names and command macros do suck.
    
    We're going to have to live with these for a long time, so it's
    important that we be happy with the names before we're stuck with them.
    The consensus on the lists so far is that they should be rechristened as
    "open file description locks".
    
    The name isn't a big deal for the kernel, but the command macros are not
    visually distinct enough from the traditional POSIX lock macros. The
    glibc and documentation folks are recommending that we change them to
    look like F_OFD_{GETLK|SETLK|SETLKW}. That lessens the chance that a
    programmer will typo one of the commands wrong, and also makes it easier
    to spot this difference when reading code.
    
    This patch makes the following changes that I think are necessary before
    v3.15 ships:
    
    1) rename the command macros to their new names. These end up in the uapi
       headers and so are part of the external-facing API. It turns out that
       glibc doesn't actually use the fcntl.h uapi header, but it's hard to
       be sure that something else won't. Changing it now is safest.
    
    2) make the the /proc/locks output display these as type "OFDLCK"
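
[Editorial aside, not part of the thread: open file description locks avoid exactly the close() pitfall discussed above, because the lock belongs to the open file description rather than the process. A minimal Linux-only (>= 3.15) sketch with a hypothetical scratch file; note l_pid must be 0 for the OFD commands.]

```c
#define _GNU_SOURCE             /* F_OFD_SETLK/F_OFD_GETLK on glibc */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if the OFD lock taken via fd1 survives closing a second fd
 * on the same file (unlike a classic POSIX record lock). */
int demo_ofd_lock_survives_close(void)
{
    char path[] = "/tmp/ofddemo.XXXXXX";   /* hypothetical scratch file */
    int fd1 = mkstemp(path);
    if (fd1 < 0) {
        return -1;
    }

    struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                        .l_start = 0, .l_len = 1, .l_pid = 0 };
    if (fcntl(fd1, F_OFD_SETLK, &fl) == -1) {
        return -1;
    }

    int fd2 = open(path, O_RDONLY);
    close(fd2);                 /* OFD lock on fd1 is NOT released */

    /* Probe from a child process for a conflicting lock. */
    pid_t pid = fork();
    if (pid == 0) {
        int fd = open(path, O_RDWR);
        struct flock probe = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                               .l_start = 0, .l_len = 1, .l_pid = 0 };
        fcntl(fd, F_OFD_GETLK, &probe);
        _exit(probe.l_type == F_WRLCK ? 0 : 1);  /* 0: still locked */
    }
    int status;
    waitpid(pid, &status, 0);
    unlink(path);
    close(fd1);
    return WIFEXITED(status) && WEXITSTATUS(status) == 0 ? 0 : -1;
}
```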
    

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10  9:14       ` Kevin Wolf
@ 2016-05-10  9:23         ` Daniel P. Berrange
  2016-05-10  9:35           ` Kevin Wolf
  2016-05-10 10:02         ` Richard W.M. Jones
  1 sibling, 1 reply; 61+ messages in thread
From: Daniel P. Berrange @ 2016-05-10  9:23 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Richard W.M. Jones, Fam Zheng, qemu-block, Jeff Cody,
	Markus Armbruster, qemu-devel, stefanha, pbonzini, den,
	Max Reitz, John Snow

On Tue, May 10, 2016 at 11:14:22AM +0200, Kevin Wolf wrote:
> Am 10.05.2016 um 10:50 hat Daniel P. Berrange geschrieben:
> > On Tue, May 10, 2016 at 09:43:04AM +0100, Richard W.M. Jones wrote:
> > > On Tue, May 10, 2016 at 09:14:26AM +0100, Richard W.M. Jones wrote:
> > > > However I didn't test the write-shareable case (the libvirt
> > > > <shareable/> flag which should map to a shared lock -- is that right Dan?).
> > > 
> > > To Dan (mainly): I think setting the <shareable/> flag in libvirt only
> > > sets cache=unsafe on the qemu drive (it may have other effects for
> > > virtlockd).  Should there be an extra qemu drive flag to communicate
> > > that the drive is write-shareable?
> > 
> > Sure, if QEMU had a way to indicate that the disk was used in a
> > write-sharable mode, libvirt would use it.
> > 
> > I agree with your general point that we should do no locking for
> > read-only access, F_RDLCK for shared-write and F_WRLCK for
> > exclusive-write access. I think I made that point similarly against
> > an earlier version of this patchset
> 
> Why should we do no locking for read-only access by default? If an image
> is written to, read-only accesses are potentially inconsistent and we
> should protect users against it.
> 
> The only argument I can see in the old versions of this series is
> "libguestfs does it and usually it gets away with it". For me, that's
> not an argument for generally allowing this, but at most for providing a
> switch to bypass the locking.
> 
> Because let's be clear about this: If someone lost data because they
> took an inconsistent backup this way and comes to us qemu developers,
> all we can tell them is "sorry, what you did is not supported". And
> that's a pretty strong sign that locking should prevent it by default.

We have 3 usage scenarios - readonly-shared, writable-shared and
writable-exclusive, and only 2 lock types to play with. This series
does no locking at all in the writable-shared case, so we still have
the problem you describe in that someone opening the image for readonly
access will succeed even when it is used writable-shared by another
process.

So we have to pick a trade-off here. IMHO it is more important to
ensure we have locks in both the write-shared & write-exclusive cases,
as both of those can definitely result in corrupted images if we are
not careful, whereas read-only access merely results in reading bad
data, with no risk of corrupting the original image.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10  9:23         ` Daniel P. Berrange
@ 2016-05-10  9:35           ` Kevin Wolf
  2016-05-10  9:43             ` Daniel P. Berrange
  0 siblings, 1 reply; 61+ messages in thread
From: Kevin Wolf @ 2016-05-10  9:35 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Richard W.M. Jones, Fam Zheng, qemu-block, Jeff Cody,
	Markus Armbruster, qemu-devel, stefanha, pbonzini, den,
	Max Reitz, John Snow

Am 10.05.2016 um 11:23 hat Daniel P. Berrange geschrieben:
> On Tue, May 10, 2016 at 11:14:22AM +0200, Kevin Wolf wrote:
> > Am 10.05.2016 um 10:50 hat Daniel P. Berrange geschrieben:
> > > On Tue, May 10, 2016 at 09:43:04AM +0100, Richard W.M. Jones wrote:
> > > > On Tue, May 10, 2016 at 09:14:26AM +0100, Richard W.M. Jones wrote:
> > > > > However I didn't test the write-shareable case (the libvirt
> > > > > <shareable/> flag which should map to a shared lock -- is that right Dan?).
> > > > 
> > > > To Dan (mainly): I think setting the <shareable/> flag in libvirt only
> > > > sets cache=unsafe on the qemu drive (it may have other effects for
> > > > virtlockd).  Should there be an extra qemu drive flag to communicate
> > > > that the drive is write-shareable?
> > > 
> > > Sure, if QEMU had a way to indicate that the disk was used in a
> > > write-sharable mode, libvirt would use it.
> > > 
> > > I agree with your general point that we should do no locking for
> > > read-only access, F_RDLCK for shared-write and F_WRLCK for
> > > exclusive-write access. I think I made that point similarly against
> > > an earlier version of this patchset
> > 
> > Why should we do no locking for read-only access by default? If an image
> > is written to, read-only accesses are potentially inconsistent and we
> > should protect users against it.
> > 
> > The only argument I can see in the old versions of this series is
> > "libguestfs does it and usually it gets away with it". For me, that's
> > not an argument for generally allowing this, but at most for providing a
> > switch to bypass the locking.
> > 
> > Because let's be clear about this: If someone lost data because they
> > took an inconsistent backup this way and comes to us qemu developers,
> > all we can tell them is "sorry, what you did is not supported". And
> > that's a pretty strong sign that locking should prevent it by default.
> 
> We have 3 usage scenarios - readonly-share, writable-shared and
> writable-exclusive, and only 2 lock types to play with. This series
> does no locking at all in the writable-shared case, so we still have
> the problem you describe in that someone opening the image for readonly
> access will succeeed even when it is used writable-shared by another
> process.
> 
> So we have to pick a trade-off here. IMHO it is more important to
> ensure we have locks in both the write-shared & write-exclusive case,
> as both of those can definitely result in corrupted images if not
> careful, where as read-only access merely results in your reading
> bad data, no risk of corrupting the original image.

I agree that we want locking for the writable-shared case. That doesn't
mean no locking for read-only, though. I think read-only and writeable-
shared are the same thing as far as locking is concerned.

This is the matrix of the allowed combinations (without a manual lock
override that enables libguestfs to use unsupported cases), as I see it:

                    wr-excl     wr-shared   read-only
write-exclusive     no          no          no
write-shared        no          yes         yes
read-only           no          yes         yes

Do you disagree with any of the entries?

Otherwise, this suggests that write-exclusive is F_WRLCK and both
write-shared and read-only are F_RDLCK.

Kevin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10  9:35           ` Kevin Wolf
@ 2016-05-10  9:43             ` Daniel P. Berrange
  2016-05-10 10:07               ` Kevin Wolf
  0 siblings, 1 reply; 61+ messages in thread
From: Daniel P. Berrange @ 2016-05-10  9:43 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Richard W.M. Jones, Fam Zheng, qemu-block, Jeff Cody,
	Markus Armbruster, qemu-devel, stefanha, pbonzini, den,
	Max Reitz, John Snow

On Tue, May 10, 2016 at 11:35:14AM +0200, Kevin Wolf wrote:
> Am 10.05.2016 um 11:23 hat Daniel P. Berrange geschrieben:
> > On Tue, May 10, 2016 at 11:14:22AM +0200, Kevin Wolf wrote:
> > > Am 10.05.2016 um 10:50 hat Daniel P. Berrange geschrieben:
> > > > On Tue, May 10, 2016 at 09:43:04AM +0100, Richard W.M. Jones wrote:
> > > > > On Tue, May 10, 2016 at 09:14:26AM +0100, Richard W.M. Jones wrote:
> > > > > > However I didn't test the write-shareable case (the libvirt
> > > > > > <shareable/> flag which should map to a shared lock -- is that right Dan?).
> > > > > 
> > > > > To Dan (mainly): I think setting the <shareable/> flag in libvirt only
> > > > > sets cache=unsafe on the qemu drive (it may have other effects for
> > > > > virtlockd).  Should there be an extra qemu drive flag to communicate
> > > > > that the drive is write-shareable?
> > > > 
> > > > Sure, if QEMU had a way to indicate that the disk was used in a
> > > > write-sharable mode, libvirt would use it.
> > > > 
> > > > I agree with your general point that we should do no locking for
> > > > read-only access, F_RDLCK for shared-write and F_WRLCK for
> > > > exclusive-write access. I think I made that point similarly against
> > > > an earlier version of this patchset
> > > 
> > > Why should we do no locking for read-only access by default? If an image
> > > is written to, read-only accesses are potentially inconsistent and we
> > > should protect users against it.
> > > 
> > > The only argument I can see in the old versions of this series is
> > > "libguestfs does it and usually it gets away with it". For me, that's
> > > not an argument for generally allowing this, but at most for providing a
> > > switch to bypass the locking.
> > > 
> > > Because let's be clear about this: If someone lost data because they
> > > took an inconsistent backup this way and comes to us qemu developers,
> > > all we can tell them is "sorry, what you did is not supported". And
> > > that's a pretty strong sign that locking should prevent it by default.
> > 
> > We have 3 usage scenarios - readonly-share, writable-shared and
> > writable-exclusive, and only 2 lock types to play with. This series
> > does no locking at all in the writable-shared case, so we still have
> > the problem you describe in that someone opening the image for readonly
> > access will succeeed even when it is used writable-shared by another
> > process.
> > 
> > So we have to pick a trade-off here. IMHO it is more important to
> > ensure we have locks in both the write-shared & write-exclusive case,
> > as both of those can definitely result in corrupted images if not
> > careful, where as read-only access merely results in your reading
> > bad data, no risk of corrupting the original image.
> 
> I agree that we want locking for the writable-shared case. That doesn't
> mean no locking for read-only, though. I think read-only and writeable-
> shared are the same thing as far as locking is concerned.

It doesn't make much sense to say that it is unsafe to use read-only
in combination with write-exclusive, but then allow read-only with
write-shared. In both cases you have the same scenario that the
read-only app may get inconsistent data when reading.

> This is the matrix of the allowed combinations (without a manual lock
> override that enables libguestfs to use unsupported cases), as I see it:
> 
>                     wr-excl     wr-shared   read-only
> write-exclusive     no          no          no
> write-shared        no          yes         yes
> read-only           no          yes         yes
> 
> Do you disagree with any of the entries?

I would have 'yes' in all the read-only cells.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10  9:14       ` Kevin Wolf
  2016-05-10  9:23         ` Daniel P. Berrange
@ 2016-05-10 10:02         ` Richard W.M. Jones
  1 sibling, 0 replies; 61+ messages in thread
From: Richard W.M. Jones @ 2016-05-10 10:02 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Daniel P. Berrange, Fam Zheng, qemu-block, Jeff Cody,
	Markus Armbruster, qemu-devel, stefanha, pbonzini, den,
	Max Reitz, John Snow


On Tue, May 10, 2016 at 11:14:22AM +0200, Kevin Wolf wrote:
> Am 10.05.2016 um 10:50 hat Daniel P. Berrange geschrieben:
> > On Tue, May 10, 2016 at 09:43:04AM +0100, Richard W.M. Jones wrote:
> > > On Tue, May 10, 2016 at 09:14:26AM +0100, Richard W.M. Jones wrote:
> > > > However I didn't test the write-shareable case (the libvirt
> > > > <shareable/> flag which should map to a shared lock -- is that right Dan?).
> > > 
> > > To Dan (mainly): I think setting the <shareable/> flag in libvirt only
> > > sets cache=unsafe on the qemu drive (it may have other effects for
> > > virtlockd).  Should there be an extra qemu drive flag to communicate
> > > that the drive is write-shareable?
> > 
> > Sure, if QEMU had a way to indicate that the disk was used in a
> > write-sharable mode, libvirt would use it.
> > 
> > I agree with your general point that we should do no locking for
> > read-only access, F_RDLCK for shared-write and F_WRLCK for
> > exclusive-write access. I think I made that point similarly against
> > an earlier version of this patchset
> 
> Why should we do no locking for read-only access by default? If an image
> is written to, read-only accesses are potentially inconsistent and we
> should protect users against it.
> 
> The only argument I can see in the old versions of this series is
> "libguestfs does it and usually it gets away with it". For me, that's
> not an argument for generally allowing this, but at most for providing a
> switch to bypass the locking.

Most of the interesting stats derived from disks come from the
superblock (eg. virt-df information) or parts of the filesystem that
rarely change (for example contents of /usr).  I don't claim that this
is always safe, but I do claim that it works almost all the time, and
when it doesn't, it doesn't especially matter because you're
collecting time series data where one missed stat is irrelevant.

> Because let's be clear about this: If someone lost data because they
> took an inconsistent backup this way and comes to us qemu developers,
> all we can tell them is "sorry, what you did is not supported". And
> that's a pretty strong sign that locking should prevent it by default.

Mostly that would happen because someone copies the disk image (eg.
rsync /var/lib/libvirt/images/*.qcow2 /backups).  Nothing in this
patch or anywhere else will protect against that.

Libguestfs has prominent warnings in every manual page about both
writing to live VMs and reading live VMs, eg:
http://libguestfs.org/guestfish.1.html#warning

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10  9:43             ` Daniel P. Berrange
@ 2016-05-10 10:07               ` Kevin Wolf
  2016-05-10 10:16                 ` Richard W.M. Jones
  2016-05-10 10:29                 ` Daniel P. Berrange
  0 siblings, 2 replies; 61+ messages in thread
From: Kevin Wolf @ 2016-05-10 10:07 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Richard W.M. Jones, Fam Zheng, qemu-block, Jeff Cody,
	Markus Armbruster, qemu-devel, stefanha, pbonzini, den,
	Max Reitz, John Snow

Am 10.05.2016 um 11:43 hat Daniel P. Berrange geschrieben:
> On Tue, May 10, 2016 at 11:35:14AM +0200, Kevin Wolf wrote:
> > Am 10.05.2016 um 11:23 hat Daniel P. Berrange geschrieben:
> > > On Tue, May 10, 2016 at 11:14:22AM +0200, Kevin Wolf wrote:
> > > > Am 10.05.2016 um 10:50 hat Daniel P. Berrange geschrieben:
> > > > > On Tue, May 10, 2016 at 09:43:04AM +0100, Richard W.M. Jones wrote:
> > > > > > On Tue, May 10, 2016 at 09:14:26AM +0100, Richard W.M. Jones wrote:
> > > > > > > However I didn't test the write-shareable case (the libvirt
> > > > > > > <shareable/> flag which should map to a shared lock -- is that right Dan?).
> > > > > > 
> > > > > > To Dan (mainly): I think setting the <shareable/> flag in libvirt only
> > > > > > sets cache=unsafe on the qemu drive (it may have other effects for
> > > > > > virtlockd).  Should there be an extra qemu drive flag to communicate
> > > > > > that the drive is write-shareable?
> > > > > 
> > > > > Sure, if QEMU had a way to indicate that the disk was used in a
> > > > > write-sharable mode, libvirt would use it.
> > > > > 
> > > > > I agree with your general point that we should do no locking for
> > > > > read-only access, F_RDLCK for shared-write and F_WRLCK for
> > > > > exclusive-write access. I think I made that point similarly against
> > > > > an earlier version of this patchset
> > > > 
> > > > Why should we do no locking for read-only access by default? If an image
> > > > is written to, read-only accesses are potentially inconsistent and we
> > > > should protect users against it.
> > > > 
> > > > The only argument I can see in the old versions of this series is
> > > > "libguestfs does it and usually it gets away with it". For me, that's
> > > > not an argument for generally allowing this, but at most for providing a
> > > > switch to bypass the locking.
> > > > 
> > > > Because let's be clear about this: If someone lost data because they
> > > > took an inconsistent backup this way and comes to us qemu developers,
> > > > all we can tell them is "sorry, what you did is not supported". And
> > > > that's a pretty strong sign that locking should prevent it by default.
> > > 
> > > We have 3 usage scenarios - readonly-share, writable-shared and
> > > writable-exclusive, and only 2 lock types to play with. This series
> > > does no locking at all in the writable-shared case, so we still have
> > > the problem you describe in that someone opening the image for readonly
> > > access will succeeed even when it is used writable-shared by another
> > > process.
> > > 
> > > So we have to pick a trade-off here. IMHO it is more important to
> > > ensure we have locks in both the write-shared & write-exclusive case,
> > > as both of those can definitely result in corrupted images if not
> > > careful, where as read-only access merely results in your reading
> > > bad data, no risk of corrupting the original image.
> > 
> > I agree that we want locking for the writable-shared case. That doesn't
> > mean no locking for read-only, though. I think read-only and writeable-
> > shared are the same thing as far as locking is concerned.
> 
> It doesn't make much sense to say that it is unsafe to use read-only
> in combination with write-exclusive, but then allow read-only with
> write-shared. In both cases you have the same scenario that the
> read-only app may get inconsistent data when reading.

Doesn't write-shared usually indicate that the contents can actually be
consistently read/written to, for example because a cluster filesystem
is used? If you couldn't, you would use write-exclusive instead.

In contrast, write-exclusive indicates that correct concurrent access
isn't possible, for example because you're using a "normal" filesystem
that might additionally keep things in a writeback cache in the guest.
You wouldn't use write-shared there.

> > This is the matrix of the allowed combinations (without a manual lock
> > override that enables libguestfs to use unsupported cases), as I see it:
> > 
> >                     wr-excl     wr-shared   read-only
> > write-exclusive     no          no          no
> > write-shared        no          yes         yes
> > read-only           no          yes         yes
> > 
> > Do you disagree with any of the entries?
> 
> I would have 'yes' in all the read-only cells.

I'm surprised how low the standards seem to be when we're talking about
data integrity. If occasionally losing data is okay, the qemu block
layer could be quite a bit simpler.

Kevin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10 10:07               ` Kevin Wolf
@ 2016-05-10 10:16                 ` Richard W.M. Jones
  2016-05-10 11:08                   ` Kevin Wolf
  2016-05-10 10:29                 ` Daniel P. Berrange
  1 sibling, 1 reply; 61+ messages in thread
From: Richard W.M. Jones @ 2016-05-10 10:16 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Daniel P. Berrange, Fam Zheng, qemu-block, Jeff Cody,
	Markus Armbruster, qemu-devel, stefanha, pbonzini, den,
	Max Reitz, John Snow

On Tue, May 10, 2016 at 12:07:06PM +0200, Kevin Wolf wrote:
> I'm surprised how low the standards seem to be when we're talking about
> data integrity. If occasionally losing data is okay, the qemu block
> layer could be quite a bit simpler.

I welcome this patch because it fixes a real data integrity issue
which we've seen in the field: people using guestfish (in write mode)
on live VMs.

We try our very best to prevent this happening -- for example if you
use guestfish via libvirt, it will check if the VM is live and refuse
access.  Though this is not and cannot be bulletproof (since someone
can start a VM up after we have opened it).  We also have prominent
warnings in the manual and in the FAQ about this.

However _reading_ disks doesn't corrupt live VMs.  The worst that
happens is guestfish will error out or you'll see some inconsistent
stats from virt-df.

None of this has anything to do with data integrity in the qemu block
layer, and no one is arguing that it should be weakened.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10 10:07               ` Kevin Wolf
  2016-05-10 10:16                 ` Richard W.M. Jones
@ 2016-05-10 10:29                 ` Daniel P. Berrange
  2016-05-10 11:14                   ` Kevin Wolf
  1 sibling, 1 reply; 61+ messages in thread
From: Daniel P. Berrange @ 2016-05-10 10:29 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Richard W.M. Jones, Fam Zheng, qemu-block, Jeff Cody,
	Markus Armbruster, qemu-devel, stefanha, pbonzini, den,
	Max Reitz, John Snow

On Tue, May 10, 2016 at 12:07:06PM +0200, Kevin Wolf wrote:
> Am 10.05.2016 um 11:43 hat Daniel P. Berrange geschrieben:
> > On Tue, May 10, 2016 at 11:35:14AM +0200, Kevin Wolf wrote:
> > > Am 10.05.2016 um 11:23 hat Daniel P. Berrange geschrieben:
> > > > On Tue, May 10, 2016 at 11:14:22AM +0200, Kevin Wolf wrote:
> > > > > Am 10.05.2016 um 10:50 hat Daniel P. Berrange geschrieben:
> > > > > > On Tue, May 10, 2016 at 09:43:04AM +0100, Richard W.M. Jones wrote:
> > > > > > > On Tue, May 10, 2016 at 09:14:26AM +0100, Richard W.M. Jones wrote:
> > > > > > > > However I didn't test the write-shareable case (the libvirt
> > > > > > > > <shareable/> flag which should map to a shared lock -- is that right Dan?).
> > > > > > > 
> > > > > > > To Dan (mainly): I think setting the <shareable/> flag in libvirt only
> > > > > > > sets cache=unsafe on the qemu drive (it may have other effects for
> > > > > > > virtlockd).  Should there be an extra qemu drive flag to communicate
> > > > > > > that the drive is write-shareable?
> > > > > > 
> > > > > > Sure, if QEMU had a way to indicate that the disk was used in a
> > > > > > write-sharable mode, libvirt would use it.
> > > > > > 
> > > > > > I agree with your general point that we should do no locking for
> > > > > > read-only access, F_RDLCK for shared-write and F_WRLCK for
> > > > > > exclusive-write access. I think I made that point similarly against
> > > > > > an earlier version of this patchset
> > > > > 
> > > > > Why should we do no locking for read-only access by default? If an image
> > > > > is written to, read-only accesses are potentially inconsistent and we
> > > > > should protect users against it.
> > > > > 
> > > > > The only argument I can see in the old versions of this series is
> > > > > "libguestfs does it and usually it gets away with it". For me, that's
> > > > > not an argument for generally allowing this, but at most for providing a
> > > > > switch to bypass the locking.
> > > > > 
> > > > > Because let's be clear about this: If someone lost data because they
> > > > > took an inconsistent backup this way and comes to us qemu developers,
> > > > > all we can tell them is "sorry, what you did is not supported". And
> > > > > that's a pretty strong sign that locking should prevent it by default.
> > > > 
> > > > We have 3 usage scenarios - readonly-share, writable-shared and
> > > > writable-exclusive, and only 2 lock types to play with. This series
> > > > does no locking at all in the writable-shared case, so we still have
> > > > the problem you describe in that someone opening the image for readonly
> > > > access will succeeed even when it is used writable-shared by another
> > > > process.
> > > > 
> > > > So we have to pick a trade-off here. IMHO it is more important to
> > > > ensure we have locks in both the write-shared & write-exclusive case,
> > > > as both of those can definitely result in corrupted images if not
> > > > careful, where as read-only access merely results in your reading
> > > > bad data, no risk of corrupting the original image.
> > > 
> > > I agree that we want locking for the writable-shared case. That doesn't
> > > mean no locking for read-only, though. I think read-only and writeable-
> > > shared are the same thing as far as locking is concerned.
> > 
> > It doesn't make much sense to say that it is unsafe to use read-only
> > in combination with write-exclusive, but then allow read-only with
> > write-shared. In both cases you have the same scenario that the
> > read-only app may get inconsistent data when reading.
> 
> Doesn't write-shared usually indicate that the contents can actually be
> consistently read/written to, for example because a cluster filesystem
> is used? If you couldn't, you would use write-exclusive instead.
> 
> In contrast, write-exclusive indicates that correct concurrent access
> isn't possible, for example because you're using a "normal" filesystem
> that might additionally keep things in a writeback cache in the guest.
> You wouldn't use write-shared there.
> 
> > > This is the matrix of the allowed combinations (without a manual lock
> > > override that enables libguestfs to use unsupported cases), as I see it:
> > > 
> > >                     wr-excl     wr-shared   read-only
> > > write-exclusive     no          no          no
> > > write-shared        no          yes         yes
> > > read-only           no          yes         yes
> > > 
> > > Do you disagree with any of the entries?
> > 
> > I would have 'yes' in all the read-only cells.
> 
> I'm surprised how low the standards seem to be when we're talking about
> data integrity. If occasionally losing data is okay, the qemu block
> layer could be quite a bit simpler.

To my mind there are two separate issues at play here: protection against
two processes writing to the same image in a way that causes unrecoverable
corruption, and protection against reading bad data from an otherwise fine
disk image.

Both write-exclusive & write-shared can obviously cause unrecoverable
corruption, so we need locking for both of those scenarios.

Whether you lock or not, read-only access can never cause corruption of
a disk image, so IMHO locking is irrelevant there. Whether it is safe
for a reader to access disk contents concurrently with a writer (whether
shared or exclusive) is dependent on the format the data in the image
is stored in. That data format is up to the guest OS and invisible to
QEMU - it may or may not be valid to have readers accessing a disk that
is in shared-write mode, and we have no way of knowing. As such IMHO it
doesn't make sense to claim we provide useful protection between a
reader and writers - there will always be cases where QEMU is either
too strict or too lax, giving a false sense of safety or forcing apps
to unconditionally turn off locking in read-only mode, defeating the
point.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10 10:16                 ` Richard W.M. Jones
@ 2016-05-10 11:08                   ` Kevin Wolf
  2016-05-10 11:46                     ` Richard W.M. Jones
  0 siblings, 1 reply; 61+ messages in thread
From: Kevin Wolf @ 2016-05-10 11:08 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Daniel P. Berrange, Fam Zheng, qemu-block, Jeff Cody,
	Markus Armbruster, qemu-devel, stefanha, pbonzini, den,
	Max Reitz, John Snow

Am 10.05.2016 um 12:16 hat Richard W.M. Jones geschrieben:
> On Tue, May 10, 2016 at 12:07:06PM +0200, Kevin Wolf wrote:
> > I'm surprised how low the standards seem to be when we're talking about
> > data integrity. If occasionally losing data is okay, the qemu block
> > layer could be quite a bit simpler.
> 
> I welcome this patch because it fixes a real data integrity issue
> which we've seen in the field: people using guestfish (in write mode)
> on live VMs.

Yes, people writing to live disks is a very real issue. They don't only
use guestfish for it, but also qemu-img (e.g. taking snapshots), and
these are the cases that become visible on the qemu mailing list.

But if you imply that the read-only case isn't real, I have to disagree.
Sometimes people also try to copy data out from a live VM with qemu-img
convert, and while this seems to succeed, they may actually have
produced a corrupt copy. This is why I want to protect the read-only
case as well.

> We try our very best to prevent this happening -- for example if you
> use guestfish via libvirt, it will check if the VM is live and refuse
> access.  Though this is not and cannot be bulletproof (since someone
> can start a VM up after we have opened it).  We also have prominent
> warnings in the manual and in the FAQ about this.
> 
> However _reading_ disks doesn't corrupt live VMs.  The worst that
> happens is guestfish will error out or you'll see some inconsistent
> stats from virt-df.

Are you saying that libguestfs only allows operations like df on live
images, but not e.g. copying files out of the VM?

Because if copying data out was allowed, I'd suspect that people would
use it on live VMs and would be surprised if they didn't get what they
expected (which they often only notice when it's too late).

I guess you're right that we can tolerate some df command not being 100%
sure to return the right numbers, but it's a very special case and I
think it's not demanding too much if you need to pass a lock-override
flag when you do something like this, when this can protect people
against accidentally creating corrupted copies.

> None of this has anything to do with data integrity in the qemu block
> layer, and no one is arguing that it should be weakened.

We're talking about real data corruption in both cases.

Kevin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10 10:29                 ` Daniel P. Berrange
@ 2016-05-10 11:14                   ` Kevin Wolf
  0 siblings, 0 replies; 61+ messages in thread
From: Kevin Wolf @ 2016-05-10 11:14 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Richard W.M. Jones, Fam Zheng, qemu-block, Jeff Cody,
	Markus Armbruster, qemu-devel, stefanha, pbonzini, den,
	Max Reitz, John Snow

Am 10.05.2016 um 12:29 hat Daniel P. Berrange geschrieben:
> On Tue, May 10, 2016 at 12:07:06PM +0200, Kevin Wolf wrote:
> > Am 10.05.2016 um 11:43 hat Daniel P. Berrange geschrieben:
> > > On Tue, May 10, 2016 at 11:35:14AM +0200, Kevin Wolf wrote:
> > > > Am 10.05.2016 um 11:23 hat Daniel P. Berrange geschrieben:
> > > > > On Tue, May 10, 2016 at 11:14:22AM +0200, Kevin Wolf wrote:
> > > > > > Am 10.05.2016 um 10:50 hat Daniel P. Berrange geschrieben:
> > > > > > > On Tue, May 10, 2016 at 09:43:04AM +0100, Richard W.M. Jones wrote:
> > > > > > > > On Tue, May 10, 2016 at 09:14:26AM +0100, Richard W.M. Jones wrote:
> > > > > > > > > However I didn't test the write-shareable case (the libvirt
> > > > > > > > > <shareable/> flag which should map to a shared lock -- is that right Dan?).
> > > > > > > > 
> > > > > > > > To Dan (mainly): I think setting the <shareable/> flag in libvirt only
> > > > > > > > sets cache=unsafe on the qemu drive (it may have other effects for
> > > > > > > > virtlockd).  Should there be an extra qemu drive flag to communicate
> > > > > > > > that the drive is write-shareable?
> > > > > > > 
> > > > > > > Sure, if QEMU had a way to indicate that the disk was used in a
> > > > > > > write-sharable mode, libvirt would use it.
> > > > > > > 
> > > > > > > I agree with your general point that we should do no locking for
> > > > > > > read-only access, F_RDLCK for shared-write and F_WRLCK for
> > > > > > > exclusive-write access. I think I made that point similarly against
> > > > > > > an earlier version of this patchset
> > > > > > 
> > > > > > Why should we do no locking for read-only access by default? If an image
> > > > > > is written to, read-only accesses are potentially inconsistent and we
> > > > > > should protect users against it.
> > > > > > 
> > > > > > The only argument I can see in the old versions of this series is
> > > > > > "libguestfs does it and usually it gets away with it". For me, that's
> > > > > > not an argument for generally allowing this, but at most for providing a
> > > > > > switch to bypass the locking.
> > > > > > 
> > > > > > Because let's be clear about this: If someone lost data because they
> > > > > > took an inconsistent backup this way and comes to us qemu developers,
> > > > > > all we can tell them is "sorry, what you did is not supported". And
> > > > > > that's a pretty strong sign that locking should prevent it by default.
> > > > > 
> > > > > We have 3 usage scenarios - read-only, writable-shared and
> > > > > writable-exclusive, and only 2 lock types to play with. This series
> > > > > does no locking at all in the writable-shared case, so we still have
> > > > > the problem you describe in that someone opening the image for readonly
> > > > > access will succeed even when it is used writable-shared by another
> > > > > process.
> > > > > 
> > > > > So we have to pick a trade-off here. IMHO it is more important to
> > > > > ensure we have locks in both the write-shared & write-exclusive case,
> > > > > as both of those can definitely result in corrupted images if not
> > > > > careful, where as read-only access merely results in your reading
> > > > > bad data, no risk of corrupting the original image.
> > > > 
> > > > I agree that we want locking for the writable-shared case. That doesn't
> > > > mean no locking for read-only, though. I think read-only and writeable-
> > > > shared are the same thing as far as locking is concerned.
> > > 
> > > It doesn't make much sense to say that it is unsafe to use read-only
> > > in combination with write-exclusive, but then allow read-only with
> > > write-shared. In both cases you have the same scenario that the
> > > read-only app may get inconsistent data when reading.
> > 
> > Doesn't write-shared usually indicate that the contents can actually be
> > consistently read/written to, for example because a cluster filesystem
> > is used? If you couldn't, you would use write-exclusive instead.
> > 
> > In contrast, write-exclusive indicates that correct concurrent access
> > isn't possible, for example because you're using a "normal" filesystem
> > that might additionally keep things in a writeback cache in the guest.
> > You wouldn't use write-shared there.
> > 
> > > > This is the matrix of the allowed combinations (without a manual lock
> > > > override that enables libguestfs to use unsupported cases), as I see it:
> > > > 
> > > >                     wr-excl     wr-shared   read-only
> > > > write-exclusive     no          no          no
> > > > write-shared        no          yes         yes
> > > > read-only           no          yes         yes
> > > > 
> > > > Do you disagree with any of the entries?
> > > 
> > > I would have 'yes' in all the read-only cells.
> > 
> > I'm surprised how low the standards seem to be when we're talking about
> > data integrity. If occasionally losing data is okay, the qemu block
> > layer could be quite a bit simpler.
> 
> To my mind there's two separate issues at play here. Protection against
> two processes writing to the same image in a way that causes unrecoverable
> corruption. Protection against reading bad data from an otherwise fine
> disk image.
> 
> Both write-exclusive & write-shared can obviously cause unrecoverable
> corruption, so we need locking for both of those scenarios.
> 
> Whether you lock or not, read-only access can never cause corruption of
> a disk image, so IMHO locking is irrelevant there. Whether it is safe
> to have a reader concurrently access disk contents with a writer (whether
> shared or exclusive) is dependent on what type of format the data in
> the image is stored in.

Basically this says "we're protecting our image metadata, please protect
your actual data yourself". This isn't much better than what we have
today. We are implementing locking exactly because users are _not_
completely aware of the problems that concurrent access causes. Telling
them to be more careful doesn't solve any problem.

> That data format is up to the guest OS and
> invisible to QEMU - it may or may not be valid to have readers accessing
> a disk that is in shared-write mode, we have no way of knowing. As such
> IMHO it doesn't make sense to claim we provide useful protection between
> a reader and writers - there will always be cases where QEMU is either
> too strict, or too lax, thus giving a false sense of safety or forcing
> apps to just unconditionally turn off locking in read-only mode defeating
> the point.

I think being strict and having options to be laxer is useful when we're
talking about corruption risks.
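Kevin's allowed-combinations matrix quoted above collapses to a single rule: any pairing that involves an exclusive writer is refused, while write-shared and read-only openers coexist freely. A minimal sketch of that predicate (illustrative helper names only, not QEMU code):

```python
# Modes from the matrix above: any combination involving an exclusive
# writer is disallowed; write-shared and read-only coexist freely.
WRITE_EXCLUSIVE = "write-exclusive"
WRITE_SHARED = "write-shared"
READ_ONLY = "read-only"

def lock_compatible(existing, requested):
    """Return True if a new opener with mode `requested` may share the
    image with an existing opener in mode `existing`."""
    return WRITE_EXCLUSIVE not in (existing, requested)
```

Every "no" cell in the matrix is a row or column containing write-exclusive, which is why one membership test reproduces the whole table.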

Kevin


* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10 11:08                   ` Kevin Wolf
@ 2016-05-10 11:46                     ` Richard W.M. Jones
  2016-05-10 12:01                       ` Kevin Wolf
  0 siblings, 1 reply; 61+ messages in thread
From: Richard W.M. Jones @ 2016-05-10 11:46 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Daniel P. Berrange, Fam Zheng, qemu-block, Jeff Cody,
	Markus Armbruster, qemu-devel, stefanha, pbonzini, den,
	Max Reitz, John Snow

On Tue, May 10, 2016 at 01:08:49PM +0200, Kevin Wolf wrote:
> Are you saying that libguestfs only allows operations like df on live
> images, but not e.g. copying files out of the VM?
[...]

virt-copy-out will let you copy out files from a live VM.

There's no difference between "safe" and "unsafe" operations, because
(a) it depends on unknowable information about the guest -- it's safe
to read (even write) a filesystem if it's not mounted by the guest,
and (b) even reading a superblock field from an in-use mounted
filesystem is subject to an unlikely but possible race.

Users of libguestfs on live VMs just have to be aware of this, and we
make them aware over and over again of the potential problems.
Importantly, readonly access won't result in corrupt filesystems in
the live VM.

I'm much more interested in stopping people from writing to live VMs.
That is a serious problem, results in unrecoverable filesystems and
near-100% certain data loss [especially with journalled fses], and is
something that has no (or very very few) valid use cases.  It's also
something which only qemu is in a position to properly protect
against.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top


* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10 11:46                     ` Richard W.M. Jones
@ 2016-05-10 12:01                       ` Kevin Wolf
  2016-05-10 12:11                         ` Richard W.M. Jones
  0 siblings, 1 reply; 61+ messages in thread
From: Kevin Wolf @ 2016-05-10 12:01 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Daniel P. Berrange, Fam Zheng, qemu-block, Jeff Cody,
	Markus Armbruster, qemu-devel, stefanha, pbonzini, den,
	Max Reitz, John Snow

Am 10.05.2016 um 13:46 hat Richard W.M. Jones geschrieben:
> On Tue, May 10, 2016 at 01:08:49PM +0200, Kevin Wolf wrote:
> > Are you saying that libguestfs only allows operations like df on live
> > images, but not e.g. copying files out of the VM?
> [...]
> 
> virt-copy-out will let you copy out files from a live VM.
> 
> There's no difference between "safe" and "unsafe" operations, because
> (a) it depends on unknowable information about the guest -- it's safe
> to read (even write) a filesystem if it's not mounted by the guest,
> and (b) even reading a superblock field from an in-use mounted
> filesystem is subject to an unlikely but possible race.

The consequences of a failure are different. Specifically, in the quote
that you stripped from this email, you said as a justification for
ignoring the possible problems:

> However _reading_ disks doesn't corrupt live VMs.  The worst that
> happens is guestfish will error out or you'll see some inconsistent
> stats from virt-df.

Creating corrupted copies of data is much worse than wrong stats in a
virt-df result, in my opinion.

> Users of libguestfs on live VMs just have to be aware of this, and we
> make them aware over and over again of the potential problems.

The correct way to make them aware is requiring something like a --force
flag. Just writing it in the documentation is not much more than an
excuse to tell the users that it was their own fault.

> Importantly, readonly access won't result in corrupt filesystems in
> the live VM.

Yes, it only results in corrupt data in the copy. Is that much better?

> I'm much more interested in stopping people from writing to live VMs.
> That is a serious problem, results in unrecoverable filesystems and
> near-100% certain data loss [especially with journalled fses], and is
> something that has no (or very very few) valid use cases.  It's also
> something which only qemu is in a position to properly protect
> against.

Nobody is saying that we shouldn't protect against writing to live VMs.

We're discussing that you are objecting to protecting users against
reading from live VMs, which may not be quite as disastrous in many
cases, but can result in corruption, too. And here as well only qemu is
in the position to properly protect against it.

Kevin


* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10 12:01                       ` Kevin Wolf
@ 2016-05-10 12:11                         ` Richard W.M. Jones
  2016-05-10 12:22                           ` Daniel P. Berrange
  0 siblings, 1 reply; 61+ messages in thread
From: Richard W.M. Jones @ 2016-05-10 12:11 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Daniel P. Berrange, Fam Zheng, qemu-block, Jeff Cody,
	Markus Armbruster, qemu-devel, stefanha, pbonzini, den,
	Max Reitz, John Snow

At no point did I say that it was safe to use libguestfs on live VMs
or that you would always get consistent data out.

But the fact that it can fail is understood, the chance of failure is
really tiny (it has literally only happened twice that I've read
corrupted data, in years of daily use), and the operation is very
useful.

So I think this patch series should either not lock r/o VMs, or should
add a nolock flag to override the locking (which libguestfs will
always use).

That's all I have to say on this.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW


* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10 12:11                         ` Richard W.M. Jones
@ 2016-05-10 12:22                           ` Daniel P. Berrange
  2016-05-10 12:45                             ` Kevin Wolf
                                               ` (2 more replies)
  0 siblings, 3 replies; 61+ messages in thread
From: Daniel P. Berrange @ 2016-05-10 12:22 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Jeff Cody, Markus Armbruster,
	qemu-devel, stefanha, pbonzini, den, Max Reitz, John Snow

On Tue, May 10, 2016 at 01:11:30PM +0100, Richard W.M. Jones wrote:
> At no point did I say that it was safe to use libguestfs on live VMs
> or that you would always get consistent data out.
> 
> But the fact that it can fail is understood, the chance of failure is
> really tiny (it has literally only happened twice that I've read
> corrupted data, in years of daily use), and the operation is very
> useful.
> 
> So I think this patch series should either not lock r/o VMs, or should
> add a nolock flag to override the locking (which libguestfs will
> always use).

If QEMU locks r/o disks, then libvirt would likely end up setting the
"nolock" flag unconditionally too, in order to avoid breaking libguestfs
and other application usage of libvirt.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10 12:22                           ` Daniel P. Berrange
@ 2016-05-10 12:45                             ` Kevin Wolf
  2016-05-11  8:04                             ` Markus Armbruster
  2016-05-11  8:04                             ` Fam Zheng
  2 siblings, 0 replies; 61+ messages in thread
From: Kevin Wolf @ 2016-05-10 12:45 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Richard W.M. Jones, Fam Zheng, qemu-block, Jeff Cody,
	Markus Armbruster, qemu-devel, stefanha, pbonzini, den,
	Max Reitz, John Snow

Am 10.05.2016 um 14:22 hat Daniel P. Berrange geschrieben:
> On Tue, May 10, 2016 at 01:11:30PM +0100, Richard W.M. Jones wrote:
> > At no point did I say that it was safe to use libguestfs on live VMs
> > or that you would always get consistent data out.
> > 
> > But the fact that it can fail is understood, the chance of failure is
> > really tiny (it has literally only happened twice that I've read
> > corrupted data, in years of daily use), and the operation is very
> > useful.

Is it understood by you (which I'm happy to believe) or by all of
your users (which I doubt)?

> > So I think this patch series should either not lock r/o VMs, or should
> > add a nolock flag to override the locking (which libguestfs will
> > always use).
> 
> If QEMU locks r/o disks, then libvirt would likely end up setting the
> "nolock" flag unconditionally too, in order to avoid breaking libguestfs
> and other application usage of libvirt.

It would still have some value as we would protect command line users at
least. Of course I would prefer if libvirt could find a way to protect
their users as well without breaking valid use cases. But if we can
point at libvirt, I guess that's good for qemu at least...

Kevin


* Re: [Qemu-devel] [PATCH v4 08/27] osdep: Add qemu_lock_fd and qemu_unlock_fd
  2016-05-10  8:57   ` Daniel P. Berrange
  2016-05-10  9:06     ` Richard W.M. Jones
@ 2016-05-11  0:48     ` Fam Zheng
  2016-05-11  1:05       ` Fam Zheng
  2016-05-11  9:01       ` Daniel P. Berrange
  1 sibling, 2 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-11  0:48 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Kevin Wolf, qemu-block, rjones, John Snow, Jeff Cody,
	Markus Armbruster, qemu-devel, stefanha, den, pbonzini,
	Max Reitz

On Tue, 05/10 09:57, Daniel P. Berrange wrote:
> On Tue, May 10, 2016 at 10:50:40AM +0800, Fam Zheng wrote:
> > They are wrappers around POSIX fcntl file locking, with the additional
> > interception of open/close (through qemu_open and qemu_close) to offer
> > better semantics that preserve the locks across multiple life cycles of
> > different fds on the same file.  The reason for this semantic change
> > over plain fcntl locks is to make the API cleaner for QEMU subsystems.
> > 
> > More precisely, we remove this "feature" of fcntl lock:
> > 
> >     If a process closes any file descriptor referring to a file, then
> >     all of the process's locks on that file are released, regardless of
> >     the file descriptor(s) on which the locks were obtained.
> > 
> > as long as the fd is always open/closed through qemu_open and
> > qemu_close.
> 
> You're not actually really removing that problem - this is just hacking
> around it in a manner which is racy & susceptible to silent failure.
> 
> 
> > +static int qemu_fd_close(int fd)
> > +{
> > +    LockEntry *ent, *tmp;
> > +    LockRecord *rec;
> > +    char *path;
> > +    int ret;
> > +
> > +    assert(fd_to_path);
> > +    path = g_hash_table_lookup(fd_to_path, GINT_TO_POINTER(fd));
> > +    assert(path);
> > +    g_hash_table_remove(fd_to_path, GINT_TO_POINTER(fd));
> > +    rec = g_hash_table_lookup(lock_map, path);
> > +    ret = close(fd);
> > +
> > +    if (rec) {
> > +        QLIST_FOREACH_SAFE(ent, &rec->records, next, tmp) {
> > +            int r;
> > +            if (ent->fd == fd) {
> > +                QLIST_REMOVE(ent, next);
> > +                g_free(ent);
> > +                continue;
> > +            }
> > +            r = qemu_lock_do(ent->fd, ent->start, ent->len,
> > +                             ent->readonly ? F_RDLCK : F_WRLCK);
> > +            if (r == -1) {
> > +                fprintf(stderr, "Failed to acquire lock on fd %d: %s\n",
> > +                        ent->fd, strerror(errno));
> > +            }
> 
> If another app has acquired the lock between the close and this attempt
> to re-acquire the lock, then QEMU is silently carrying on without any
> lock being held. For something that's intending to provide protection
> against concurrent use I think this is not an acceptable failure
> scenario.
> 
> 
> > +int qemu_open(const char *name, int flags, ...)
> > +{
> > +    int mode = 0;
> > +    int ret;
> > +
> > +    if (flags & O_CREAT) {
> > +        va_list ap;
> > +
> > +        va_start(ap, flags);
> > +        mode = va_arg(ap, int);
> > +        va_end(ap);
> > +    }
> > +    ret = qemu_fd_open(name, flags, mode);
> > +    if (ret >= 0) {
> > +        qemu_fd_add_record(ret, name);
> > +    }
> > +    return ret;
> > +}
> 
> I think the approach you have here is fundamentally not usable with
> fcntl locks, because it is still using the pattern
> 
>    fd = open(path)
>    lock(fd)
>    if failed lock
>       close(fd)
>    ...do stuff.
> 
> 
> As mentioned in previous posting I believe the block layer must be
> checking whether it already has a lock held against "path" *before*
> even attempting to open the file. Once QEMU has the file descriptor
> open it is too late, because closing the FD will release pre-existing
> locks QEMU has.

There will still be valid close() calls that will release necessary locks
temporarily.  You are right that the "close + re-acquire" in qemu_fd_close() is a
racy problem. Any suggestion how this could be fixed?  Something like
introducing a "close2" syscall that doesn't drop locks on other fds?
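For reference, the fcntl drop-on-close behaviour quoted above can be reproduced in a few lines. This is a sketch using Python's fcntl wrapper; the fork is needed only because POSIX locks never conflict with other locks held by the same process:

```python
import fcntl
import os
import tempfile

# Ask a *separate* process whether it can take an exclusive lock.
def other_process_can_lock(path):
    pid = os.fork()
    if pid == 0:
        fd = os.open(path, os.O_RDWR)
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            os._exit(0)  # lock acquired: our lock is gone
        except OSError:
            os._exit(1)  # lock refused: we still hold it
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0

with tempfile.NamedTemporaryFile() as f:
    fd1 = os.open(f.name, os.O_RDWR)
    fcntl.lockf(fd1, fcntl.LOCK_EX)          # take the lock via fd1
    before = other_process_can_lock(f.name)  # False: lock is held

    fd2 = os.open(f.name, os.O_RDWR)         # unrelated fd, same file...
    os.close(fd2)                            # ...closing it drops fd1's lock
    after = other_process_can_lock(f.name)   # True: lock silently released

    os.close(fd1)
```

The second open/close pair never touched the lock itself, yet the lock taken through fd1 is gone afterwards, which is exactly the race being discussed.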

Fam


* Re: [Qemu-devel] [PATCH v4 08/27] osdep: Add qemu_lock_fd and qemu_unlock_fd
  2016-05-11  0:48     ` Fam Zheng
@ 2016-05-11  1:05       ` Fam Zheng
  2016-05-11  9:01       ` Daniel P. Berrange
  1 sibling, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-11  1:05 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Kevin Wolf, qemu-block, qemu-devel, Jeff Cody, rjones,
	Markus Armbruster, stefanha, pbonzini, den, Max Reitz, John Snow

On Wed, 05/11 08:48, Fam Zheng wrote:
> racy problem. Any suggestion how this could be fixed?

Reading into the subthread I see the answer: the file-private locks look
promising. Will take a look at that! Thanks.

Fam


* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10 12:22                           ` Daniel P. Berrange
  2016-05-10 12:45                             ` Kevin Wolf
@ 2016-05-11  8:04                             ` Markus Armbruster
  2016-05-11  8:52                               ` Daniel P. Berrange
  2016-05-11  8:04                             ` Fam Zheng
  2 siblings, 1 reply; 61+ messages in thread
From: Markus Armbruster @ 2016-05-11  8:04 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Richard W.M. Jones, Kevin Wolf, Fam Zheng, qemu-block, Jeff Cody,
	qemu-devel, stefanha, den, pbonzini, Max Reitz, John Snow

"Daniel P. Berrange" <berrange@redhat.com> writes:

> On Tue, May 10, 2016 at 01:11:30PM +0100, Richard W.M. Jones wrote:
>> At no point did I say that it was safe to use libguestfs on live VMs
>> or that you would always get consistent data out.
>> 
>> But the fact that it can fail is understood, the chance of failure is
>> really tiny (it has literally only happened twice that I've read
>> corrupted data, in years of daily use), and the operation is very
>> useful.
>> 
>> So I think this patch series should either not lock r/o VMs, or should
>> add a nolock flag to override the locking (which libguestfs will
>> always use).
>
> If QEMU locks r/o disks, then libvirt would likely end up setting the
> "nolock" flag unconditionally too, in order to avoid breaking libguestfs
> and other application usage of libvirt.

Could a QEMU + libvirt together provide both safe and unsafe read-only
access?  Safe means you get consistent data.  Unsafe means you're taking
your chances.

Libguestfs could then use unsafe if the user asks for it.  Or even by
default; that's really libguestfs's business.

Backward compatibility may complicate things, but getting into a
reasonable state is sometimes worth a lengthy and somewhat messy
transition.


* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10 12:22                           ` Daniel P. Berrange
  2016-05-10 12:45                             ` Kevin Wolf
  2016-05-11  8:04                             ` Markus Armbruster
@ 2016-05-11  8:04                             ` Fam Zheng
  2016-05-11  9:28                               ` Richard W.M. Jones
  2 siblings, 1 reply; 61+ messages in thread
From: Fam Zheng @ 2016-05-11  8:04 UTC (permalink / raw)
  To: Daniel P. Berrange, rjones
  Cc: Kevin Wolf, qemu-block, Jeff Cody, Markus Armbruster, qemu-devel,
	stefanha, pbonzini, den, Max Reitz, John Snow

On Tue, 05/10 13:22, Daniel P. Berrange wrote:
> On Tue, May 10, 2016 at 01:11:30PM +0100, Richard W.M. Jones wrote:
> > At no point did I say that it was safe to use libguestfs on live VMs
> > or that you would always get consistent data out.
> > 
> > But the fact that it can fail is understood, the chance of failure is
> > really tiny (it has literally only happened twice that I've read
> > corrupted data, in years of daily use), and the operation is very
> > useful.
> > 
> > So I think this patch series should either not lock r/o VMs, or should
> > add a nolock flag to override the locking (which libguestfs will
> > always use).

It sounds like you are happy either way, and this series actually does both. So
would it be okay for libguestfs if we go for "lock r/o VMs by default and
provide a nolock flag"? That would give the best default for non-libguestfs
users.

Fam

> 
> If QEMU locks r/o disks, then libvirt would likely end up setting the
> "nolock" flag unconditionally too, in order to avoid breaking libguestfs
> and other application usage of libvirt.
> 
> Regards,
> Daniel
> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-11  8:04                             ` Markus Armbruster
@ 2016-05-11  8:52                               ` Daniel P. Berrange
  0 siblings, 0 replies; 61+ messages in thread
From: Daniel P. Berrange @ 2016-05-11  8:52 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Richard W.M. Jones, Kevin Wolf, Fam Zheng, qemu-block, Jeff Cody,
	qemu-devel, stefanha, den, pbonzini, Max Reitz, John Snow

On Wed, May 11, 2016 at 10:04:12AM +0200, Markus Armbruster wrote:
> "Daniel P. Berrange" <berrange@redhat.com> writes:
> 
> > On Tue, May 10, 2016 at 01:11:30PM +0100, Richard W.M. Jones wrote:
> >> At no point did I say that it was safe to use libguestfs on live VMs
> >> or that you would always get consistent data out.
> >> 
> >> But the fact that it can fail is understood, the chance of failure is
> >> really tiny (it has literally only happened twice that I've read
> >> corrupted data, in years of daily use), and the operation is very
> >> useful.
> >> 
> >> So I think this patch series should either not lock r/o VMs, or should
> >> add a nolock flag to override the locking (which libguestfs will
> >> always use).
> >
> > If QEMU locks r/o disks, then libvirt would likely end up setting the
> > "nolock" flag unconditionally too, in order to avoid breaking libguestfs
> > and other application usage of libvirt.
> 
> Could a QEMU + libvirt together provide both safe and unsafe read-only
> access?  Safe means you get consistent data.  Unsafe means you're taking
> your chances.
> 
> Libguestfs could then use unsafe if the user asks for it.  Or even by
> > default; that's really libguestfs's business.
> 
> Backward compatibility may complicate things, but getting into a
> reasonable state is sometimes worth a lengthy and somewhat messy
> transition.

We would have to use 'nolock' by default, and provide apps an opt-in
config flag to request locking of r/o images if they wanted it.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


* Re: [Qemu-devel] [PATCH v4 08/27] osdep: Add qemu_lock_fd and qemu_unlock_fd
  2016-05-11  0:48     ` Fam Zheng
  2016-05-11  1:05       ` Fam Zheng
@ 2016-05-11  9:01       ` Daniel P. Berrange
  1 sibling, 0 replies; 61+ messages in thread
From: Daniel P. Berrange @ 2016-05-11  9:01 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Kevin Wolf, qemu-block, rjones, John Snow, Jeff Cody,
	Markus Armbruster, qemu-devel, stefanha, den, pbonzini,
	Max Reitz

On Wed, May 11, 2016 at 08:48:18AM +0800, Fam Zheng wrote:
> On Tue, 05/10 09:57, Daniel P. Berrange wrote:
> > On Tue, May 10, 2016 at 10:50:40AM +0800, Fam Zheng wrote:
> > > They are wrappers around POSIX fcntl file locking, with the additional
> > > interception of open/close (through qemu_open and qemu_close) to offer
> > > better semantics that preserve the locks across multiple life cycles of
> > > different fds on the same file.  The reason for this semantic change
> > > over plain fcntl locks is to make the API cleaner for QEMU subsystems.
> > > 
> > > More precisely, we remove this "feature" of fcntl lock:
> > > 
> > >     If a process closes any file descriptor referring to a file, then
> > >     all of the process's locks on that file are released, regardless of
> > >     the file descriptor(s) on which the locks were obtained.
> > > 
> > > as long as the fd is always open/closed through qemu_open and
> > > qemu_close.
> > 
> > You're not actually really removing that problem - this is just hacking
> > around it in a manner which is racy & susceptible to silent failure.
> > 
> > 
> > > +static int qemu_fd_close(int fd)
> > > +{
> > > +    LockEntry *ent, *tmp;
> > > +    LockRecord *rec;
> > > +    char *path;
> > > +    int ret;
> > > +
> > > +    assert(fd_to_path);
> > > +    path = g_hash_table_lookup(fd_to_path, GINT_TO_POINTER(fd));
> > > +    assert(path);
> > > +    g_hash_table_remove(fd_to_path, GINT_TO_POINTER(fd));
> > > +    rec = g_hash_table_lookup(lock_map, path);
> > > +    ret = close(fd);
> > > +
> > > +    if (rec) {
> > > +        QLIST_FOREACH_SAFE(ent, &rec->records, next, tmp) {
> > > +            int r;
> > > +            if (ent->fd == fd) {
> > > +                QLIST_REMOVE(ent, next);
> > > +                g_free(ent);
> > > +                continue;
> > > +            }
> > > +            r = qemu_lock_do(ent->fd, ent->start, ent->len,
> > > +                             ent->readonly ? F_RDLCK : F_WRLCK);
> > > +            if (r == -1) {
> > > +                fprintf(stderr, "Failed to acquire lock on fd %d: %s\n",
> > > +                        ent->fd, strerror(errno));
> > > +            }
> > 
> > If another app has acquired the lock between the close and this attempt
> > to re-acquire the lock, then QEMU is silently carrying on without any
> > lock being held. For something that's intending to provide protection
> > against concurrent use I think this is not an acceptable failure
> > scenario.
> > 
> > 
> > > +int qemu_open(const char *name, int flags, ...)
> > > +{
> > > +    int mode = 0;
> > > +    int ret;
> > > +
> > > +    if (flags & O_CREAT) {
> > > +        va_list ap;
> > > +
> > > +        va_start(ap, flags);
> > > +        mode = va_arg(ap, int);
> > > +        va_end(ap);
> > > +    }
> > > +    ret = qemu_fd_open(name, flags, mode);
> > > +    if (ret >= 0) {
> > > +        qemu_fd_add_record(ret, name);
> > > +    }
> > > +    return ret;
> > > +}
> > 
> > I think the approach you have here is fundamentally not usable with
> > fcntl locks, because it is still using the pattern
> > 
> >    fd = open(path)
> >    lock(fd)
> >    if failed lock
> >       close(fd)
> >    ...do stuff.
> > 
> > 
> > As mentioned in previous posting I believe the block layer must be
> > checking whether it already has a lock held against "path" *before*
> > even attempting to open the file. Once QEMU has the file descriptor
> > open it is too late, because closing the FD will release pre-existing
> > locks QEMU has.
> 
> There will still be valid close() calls that will release necessary locks
> temporarily.  You are right that the "close + re-acquire" in qemu_fd_close() is a
> racy problem. Any suggestion how this could be fixed?  Something like
> introducing a "close2" syscall that doesn't drop locks on other fds?

I guess the problem scenario is with the "nolock" option: if one block
driver has an image open and locked, and another block driver opens that
same image with 'nolock', then you hit the potential close() problem.

One idea might be simply to have a delayed close(), i.e. don't close the
file descriptor until all locks have been released by the other block
driver.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-11  8:04                             ` Fam Zheng
@ 2016-05-11  9:28                               ` Richard W.M. Jones
  2016-05-11 10:03                                 ` Kevin Wolf
  0 siblings, 1 reply; 61+ messages in thread
From: Richard W.M. Jones @ 2016-05-11  9:28 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Daniel P. Berrange, Kevin Wolf, qemu-block, Jeff Cody,
	Markus Armbruster, qemu-devel, stefanha, pbonzini, den,
	Max Reitz, John Snow


On Wed, May 11, 2016 at 04:04:40PM +0800, Fam Zheng wrote:
> On Tue, 05/10 13:22, Daniel P. Berrange wrote:
> > On Tue, May 10, 2016 at 01:11:30PM +0100, Richard W.M. Jones wrote:
> > > At no point did I say that it was safe to use libguestfs on live VMs
> > > or that you would always get consistent data out.
> > > 
> > > But the fact that it can fail is understood, the chance of failure is
> > > really tiny (it has literally only happened twice that I've read
> > > corrupted data, in years of daily use), and the operation is very
> > > useful.
> > > 
> > > So I think this patch series should either not lock r/o VMs, or should
> > > add a nolock flag to override the locking (which libguestfs will
> > > always use).
> 
It sounds like you are happy either way, and this series actually does both.
So, would it be okay for libguestfs if we go for "lock r/o VMs by default and
provide a nolock flag"? That would give the best default for non-libguestfs
users.

Yes, and libguestfs will always pass the nolock flag for readonly disks.

Another question is whether doing:

  qemu-img create -b disk.img -f qcow2 overlay.qcow2

is permitted, since that is the first command that libguestfs issues
when opening a disk read-only (where disk.img could be a live VM in
the case we're talking about).

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org


* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-11  9:28                               ` Richard W.M. Jones
@ 2016-05-11 10:03                                 ` Kevin Wolf
  0 siblings, 0 replies; 61+ messages in thread
From: Kevin Wolf @ 2016-05-11 10:03 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Fam Zheng, Daniel P. Berrange, qemu-block, Jeff Cody,
	Markus Armbruster, qemu-devel, stefanha, pbonzini, den,
	Max Reitz, John Snow

On 11.05.2016 at 11:28, Richard W.M. Jones wrote:
> 
> On Wed, May 11, 2016 at 04:04:40PM +0800, Fam Zheng wrote:
> > On Tue, 05/10 13:22, Daniel P. Berrange wrote:
> > > On Tue, May 10, 2016 at 01:11:30PM +0100, Richard W.M. Jones wrote:
> > > > At no point did I say that it was safe to use libguestfs on live VMs
> > > > or that you would always get consistent data out.
> > > > 
> > > > But the fact that it can fail is understood, the chance of failure is
> > > > really tiny (it has literally only happened twice that I've read
> > > > corrupted data, in years of daily use), and the operation is very
> > > > useful.
> > > > 
> > > > So I think this patch series should either not lock r/o VMs, or should
> > > > add a nolock flag to override the locking (which libguestfs will
> > > > always use).
> > 
> > It sounds like you are happy either way, and this series actually does both.
> > So, would it be okay for libguestfs if we go for "lock r/o VMs by default and
> > provide a nolock flag"? That would give the best default for non-libguestfs
> > users.
> 
> Yes, and libguestfs will always pass the nolock flag for readonly disks.
> 
> Another question is whether doing:
> 
>   qemu-img create -b disk.img -f qcow2 overlay.qcow2
> 
> is permitted, since that is the first command that libguestfs issues
> when opening a disk read-only (where disk.img could be a live VM in
> the case we're talking about).

I think qemu-img has a few special cases where the lock can be ignored
as well. Opening the backing file to query its size could be one of
them; but if you pass a size, qemu-img doesn't even need to open it.
'qemu-img info' could be another operation that is reasonable to allow on an
image that is in use.

Kevin


* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
                   ` (27 preceding siblings ...)
  2016-05-10  8:14 ` [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Richard W.M. Jones
@ 2016-05-11 11:48 ` Richard W.M. Jones
  2016-05-11 11:56   ` Kevin Wolf
  28 siblings, 1 reply; 61+ messages in thread
From: Richard W.M. Jones @ 2016-05-11 11:48 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, Kevin Wolf, Max Reitz, Jeff Cody, Markus Armbruster,
	Eric Blake, qemu-block, stefanha, pbonzini, John Snow, berrange,
	den


While I remember there is another concern that doesn't seem to be
addressed in this patch series.  What happens when a guest is paused?
I think exclusive locks should be converted to shared locks in that
case, since (only while the guest is paused) it _is_ safe to fish
around inside the guest's state.  Of course the lock must be restored
before the guest resumes.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW


* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-11 11:48 ` Richard W.M. Jones
@ 2016-05-11 11:56   ` Kevin Wolf
  2016-05-12  1:07     ` Fam Zheng
  0 siblings, 1 reply; 61+ messages in thread
From: Kevin Wolf @ 2016-05-11 11:56 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Fam Zheng, qemu-devel, Max Reitz, Jeff Cody, Markus Armbruster,
	Eric Blake, qemu-block, stefanha, pbonzini, John Snow, berrange,
	den

On 11.05.2016 at 13:48, Richard W.M. Jones wrote:
> While I remember there is another concern that doesn't seem to be
> addressed in this patch series.  What happens when a guest is paused?
> I think exclusive locks should be converted to shared locks in that
> case, since (only while the guest is paused) it _is_ safe to fish
> around inside the guest's state.  Of course the lock must be restored
> before the guest resumes.

I think it's still one of the cases where it's appropriate to require an
"I know what I'm doing" flag. In paused guests, you still don't
necessarily see the same contents as the guest does (because of guest
caches). Apart from that, things like block jobs and NBD servers keep
running even with a stopped VM.

The lock can only be dropped in cases where we can justify switching to
BDRV_O_INACTIVE, and I don't think a simple stop/cont should do this.

Kevin


* Re: [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening
  2016-05-11 11:56   ` Kevin Wolf
@ 2016-05-12  1:07     ` Fam Zheng
  0 siblings, 0 replies; 61+ messages in thread
From: Fam Zheng @ 2016-05-12  1:07 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Richard W.M. Jones, qemu-devel, Max Reitz, Jeff Cody,
	Markus Armbruster, Eric Blake, qemu-block, stefanha, pbonzini,
	John Snow, berrange, den

On Wed, 05/11 13:56, Kevin Wolf wrote:
> Am 11.05.2016 um 13:48 hat Richard W.M. Jones geschrieben:
> > While I remember there is another concern that doesn't seem to be
> > addressed in this patch series.  What happens when a guest is paused?
> > I think exclusive locks should be converted to shared locks in that
> > case, since (only while the guest is paused) it _is_ safe to fish
> > around inside the guest's state.  Of course the lock must be restored
> > before the guest resumes.
> 
> I think it's still one of the cases where it's appropriate to require an
> "I know what I'm doing" flag. In paused guests, you still don't
> necessarily see the same contents as the guest does (because of guest
> caches). Apart from that, things like block jobs and NBD servers keep
> running even with a stopped VM.

Agreed. The external accessor can explicitly specify "lock-image=off" on its
command line to fish around.

Fam

> 
> The lock can only be dropped in cases where we can justify switching to
> BDRV_O_INACTIVE, and I don't think a simple stop/cont should do this.
> 
> Kevin



Thread overview: 61+ messages
2016-05-10  2:50 [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 01/27] block: Add BDRV_O_NO_LOCK Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 02/27] qapi: Add lock-image in blockdev-add options Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 03/27] blockdev: Add and parse "lock-image" option for block devices Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 04/27] block: Introduce image file locking Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 05/27] block: Add bdrv_image_locked Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 06/27] block: Make bdrv_reopen_{commit, abort} private functions Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 07/27] block: Handle image locking during reopen Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 08/27] osdep: Add qemu_lock_fd and qemu_unlock_fd Fam Zheng
2016-05-10  7:54   ` Richard W.M. Jones
2016-05-10  8:57   ` Daniel P. Berrange
2016-05-10  9:06     ` Richard W.M. Jones
2016-05-10  9:20       ` Daniel P. Berrange
2016-05-11  0:48     ` Fam Zheng
2016-05-11  1:05       ` Fam Zheng
2016-05-11  9:01       ` Daniel P. Berrange
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 09/27] osdep: Introduce qemu_dup Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 10/27] raw-posix: Use qemu_dup Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 11/27] raw-posix: Implement .bdrv_lockf Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 12/27] gluster: " Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 13/27] qemu-io: Add "-L" option for BDRV_O_NO_LOCK Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 14/27] qemu-img: Add "-L" option to sub commands Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 15/27] qemu-img: Update documentation of "-L" option Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 16/27] qemu-nbd: Add "--no-lock/-L" option Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 17/27] block: Don't lock drive-backup target image in none mode Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 18/27] mirror: Disable image locking on target backing chain Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 19/27] qemu-iotests: 140: Disable image lock for qemu-io access Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 20/27] qemu-iotests: 046: Move version detection out from verify_io Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 21/27] qemu-iotests: Wait for QEMU processes before checking image in 091 Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 22/27] qemu-iotests: 030: Disable image lock when checking test image Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 23/27] iotests: 087: Disable image lock in cases where file is shared Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 24/27] iotests: Disable image locking in 085 Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 25/27] tests: Use null-co:// instead of /dev/null Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 26/27] block: Turn on image locking by default Fam Zheng
2016-05-10  2:50 ` [Qemu-devel] [PATCH v4 27/27] qemu-iotests: Add test case 153 for image locking Fam Zheng
2016-05-10  8:14 ` [Qemu-devel] [PATCH v4 00/27] block: Lock images when opening Richard W.M. Jones
2016-05-10  8:43   ` Richard W.M. Jones
2016-05-10  8:50     ` Daniel P. Berrange
2016-05-10  9:14       ` Kevin Wolf
2016-05-10  9:23         ` Daniel P. Berrange
2016-05-10  9:35           ` Kevin Wolf
2016-05-10  9:43             ` Daniel P. Berrange
2016-05-10 10:07               ` Kevin Wolf
2016-05-10 10:16                 ` Richard W.M. Jones
2016-05-10 11:08                   ` Kevin Wolf
2016-05-10 11:46                     ` Richard W.M. Jones
2016-05-10 12:01                       ` Kevin Wolf
2016-05-10 12:11                         ` Richard W.M. Jones
2016-05-10 12:22                           ` Daniel P. Berrange
2016-05-10 12:45                             ` Kevin Wolf
2016-05-11  8:04                             ` Markus Armbruster
2016-05-11  8:52                               ` Daniel P. Berrange
2016-05-11  8:04                             ` Fam Zheng
2016-05-11  9:28                               ` Richard W.M. Jones
2016-05-11 10:03                                 ` Kevin Wolf
2016-05-10 10:29                 ` Daniel P. Berrange
2016-05-10 11:14                   ` Kevin Wolf
2016-05-10 10:02         ` Richard W.M. Jones
2016-05-11 11:48 ` Richard W.M. Jones
2016-05-11 11:56   ` Kevin Wolf
2016-05-12  1:07     ` Fam Zheng
