All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v10 00/16] block: Image locking series
@ 2017-01-19 14:38 Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 01/16] osdep: Add qemu_lock_fd and qemu_unlock_fd Fam Zheng
                   ` (15 more replies)
  0 siblings, 16 replies; 21+ messages in thread
From: Fam Zheng @ 2017-01-19 14:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Blake, Kevin Wolf, Max Reitz, qemu-block, rjones

v10: While we still don't have comprehensive propagation mechanism that will be
provided by new op blocker system for "permissive modes", the locking enabled
by default is regardlessly useful and long overdue. So I think we should merge
this for 2.9 and build user options on top later when the op blocker API
settles.

Address comments from Max and Eric:
  - Use F_OFD_GETLK instead of opening r/w for ro images. [Max]
  - Lock both bytes exclusively for non-shared write. [Max]
  - Spell fixes. [Eric]
  - Fix test matrix. [Max]
  - Comment tweaks. [Max]
  - Return code cleanups. [Max]
  - Don't abuse "disable_lock" for migration. [Max]
  - Use bs->exact_filename instead of bs->filename. [Max]
  - Force protect qcow2 concurrent write.
  - Fix indentation. [Max]
  - Always use re-open for lockfd instead of dup. [Max]
  - Fall through to "abort" where "prepare" failed. [Max]
  - Fix option handling in raw_reopen_handle_lock. [Max]
  - Use "error_abort" in commit and abort. [Max]
  - Fix cleanup of raw_reopen_handle_lock() failure. [Max]
  - Add a patch for qcow2 to mask BDRV_O_SHARE_RW if r/w.
  - Rebase and fix new more cases that will be broken by "lock by default".

Fam Zheng (16):
  osdep: Add qemu_lock_fd and qemu_unlock_fd
  block: Define BDRV_O_SHARE_RW
  qemu-io: Set "share-rw" flag together with read-only
  qemu-img: Set "share-rw" flag in read-only commands
  block: Set "share-rw" flag in drive-backup when sync=none
  iotests: 055: Don't attach the drive to vm for drive-backup
  iotests: 030: Read-only open image for getting map
  iotests: 087: Don't attach test image twice
  iotests: 085: Avoid image locking conflict
  iotests: 091: Quit QEMU before checking image
  iotests: 172: Use separate images for multiple devices
  tests: Use null-co:// instead of /dev/null as the dummy image
  tests: Disable image lock in test-replication
  file-posix: Implement image locking
  qcow2: Force "no other writer" lock on bs->file
  tests: Add test-image-lock

 block/file-posix.c         | 667 ++++++++++++++++++++++++++++++++++++++++++++-
 block/qcow2.c              |  11 +
 blockdev.c                 |   3 +
 include/block/block.h      |   2 +
 include/qemu/osdep.h       |   3 +
 qemu-img.c                 |  10 +-
 qemu-io.c                  |   2 +
 tests/Makefile.include     |   2 +
 tests/drive_del-test.c     |   2 +-
 tests/nvme-test.c          |   2 +-
 tests/qemu-iotests/030     |   4 +-
 tests/qemu-iotests/055     |  32 ++-
 tests/qemu-iotests/085     |  32 ++-
 tests/qemu-iotests/085.out |   3 +-
 tests/qemu-iotests/087     |   6 +-
 tests/qemu-iotests/091     |   2 +
 tests/qemu-iotests/172     |  53 ++--
 tests/qemu-iotests/172.out |  50 ++--
 tests/test-image-lock.c    | 200 ++++++++++++++
 tests/test-replication.c   |   9 +-
 tests/usb-hcd-uhci-test.c  |   2 +-
 tests/usb-hcd-xhci-test.c  |   2 +-
 tests/virtio-blk-test.c    |   2 +-
 tests/virtio-scsi-test.c   |   4 +-
 util/osdep.c               |  48 ++++
 25 files changed, 1056 insertions(+), 97 deletions(-)
 create mode 100644 tests/test-image-lock.c

-- 
2.9.3

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v10 01/16] osdep: Add qemu_lock_fd and qemu_unlock_fd
  2017-01-19 14:38 [Qemu-devel] [PATCH v10 00/16] block: Image locking series Fam Zheng
@ 2017-01-19 14:38 ` Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 02/16] block: Define BDRV_O_SHARE_RW Fam Zheng
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-01-19 14:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Blake, Kevin Wolf, Max Reitz, qemu-block, rjones

They are wrappers of POSIX fcntl "file private locking", with a
convenient "try lock" wrapper implemented with F_OFD_GETLK.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 include/qemu/osdep.h |  3 +++
 util/osdep.c         | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index 689f253..e864fe8 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -297,6 +297,9 @@ int qemu_close(int fd);
 #ifndef _WIN32
 int qemu_dup(int fd);
 #endif
+int qemu_lock_fd(int fd, int64_t start, int64_t len, bool exclusive);
+int qemu_unlock_fd(int fd, int64_t start, int64_t len);
+int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive);
 
 #if defined(__HAIKU__) && defined(__i386__)
 #define FMT_pid "%ld"
diff --git a/util/osdep.c b/util/osdep.c
index 06fb1cf..3de4a18 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -140,6 +140,54 @@ static int qemu_parse_fdset(const char *param)
 {
     return qemu_parse_fd(param);
 }
+
+static int qemu_lock_fcntl(int fd, int64_t start, int64_t len, int fl_type)
+{
+#ifdef F_OFD_SETLK
+    int ret;
+    struct flock fl = {
+        .l_whence = SEEK_SET,
+        .l_start  = start,
+        .l_len    = len,
+        .l_type   = fl_type,
+    };
+    ret = fcntl(fd, F_OFD_SETLK, &fl);
+    return ret == -1 ? -errno : 0;
+#else
+    return -ENOTSUP;
+#endif
+}
+
+int qemu_lock_fd(int fd, int64_t start, int64_t len, bool exclusive)
+{
+    return qemu_lock_fcntl(fd, start, len, exclusive ? F_WRLCK : F_RDLCK);
+}
+
+int qemu_unlock_fd(int fd, int64_t start, int64_t len)
+{
+    return qemu_lock_fcntl(fd, start, len, F_UNLCK);
+}
+
+int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive)
+{
+#ifdef F_OFD_SETLK
+    int ret;
+    struct flock fl = {
+        .l_whence = SEEK_SET,
+        .l_start  = start,
+        .l_len    = len,
+        .l_type   = exclusive ? F_WRLCK : F_RDLCK,
+    };
+    ret = fcntl(fd, F_OFD_GETLK, &fl);
+    if (ret == -1) {
+        return -errno;
+    } else {
+        return fl.l_type == F_UNLCK ? 0 : -EAGAIN;
+    }
+#else
+    return -ENOTSUP;
+#endif
+}
 #endif
 
 /*
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v10 02/16] block: Define BDRV_O_SHARE_RW
  2017-01-19 14:38 [Qemu-devel] [PATCH v10 00/16] block: Image locking series Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 01/16] osdep: Add qemu_lock_fd and qemu_unlock_fd Fam Zheng
@ 2017-01-19 14:38 ` Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 03/16] qemu-io: Set "share-rw" flag together with read-only Fam Zheng
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-01-19 14:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Blake, Kevin Wolf, Max Reitz, qemu-block, rjones

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 include/block/block.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index 8b0dcda..243839d 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -97,6 +97,8 @@ typedef struct HDGeometry {
                                       select an appropriate protocol driver,
                                       ignoring the format layer */
 #define BDRV_O_NO_IO       0x10000 /* don't initialize for I/O */
+#define BDRV_O_SHARE_RW    0x20000 /* accept shared read-write from other users
+                                      of the image */
 
 #define BDRV_O_CACHE_MASK  (BDRV_O_NOCACHE | BDRV_O_NO_FLUSH)
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v10 03/16] qemu-io: Set "share-rw" flag together with read-only
  2017-01-19 14:38 [Qemu-devel] [PATCH v10 00/16] block: Image locking series Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 01/16] osdep: Add qemu_lock_fd and qemu_unlock_fd Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 02/16] block: Define BDRV_O_SHARE_RW Fam Zheng
@ 2017-01-19 14:38 ` Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 04/16] qemu-img: Set "share-rw" flag in read-only commands Fam Zheng
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-01-19 14:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Blake, Kevin Wolf, Max Reitz, qemu-block, rjones

qemu-io is a low level tool to read or modify guest visible data, which
implies the user knows very well what is being done. Allowing reading
from a locked image is harmless in most cases, so do it.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 qemu-io.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/qemu-io.c b/qemu-io.c
index 23a229f..f000504 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -585,6 +585,8 @@ int main(int argc, char **argv)
     /* open the device */
     if (!readonly) {
         flags |= BDRV_O_RDWR;
+    } else {
+        flags |= BDRV_O_SHARE_RW;
     }
 
     if ((argc - optind) == 1) {
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v10 04/16] qemu-img: Set "share-rw" flag in read-only commands
  2017-01-19 14:38 [Qemu-devel] [PATCH v10 00/16] block: Image locking series Fam Zheng
                   ` (2 preceding siblings ...)
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 03/16] qemu-io: Set "share-rw" flag together with read-only Fam Zheng
@ 2017-01-19 14:38 ` Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 05/16] block: Set "share-rw" flag in drive-backup when sync=none Fam Zheng
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-01-19 14:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Blake, Kevin Wolf, Max Reitz, qemu-block, rjones

Checking the status of an image when it is being used by guest is
usually useful, and there is no risk of corrupting data, so don't let
the upcoming image locking feature limit this use case.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 qemu-img.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 5df66fe..6a091e0 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -705,6 +705,10 @@ static int img_check(int argc, char **argv)
         return 1;
     }
 
+    if (!(flags & BDRV_O_RDWR)) {
+        flags |= BDRV_O_SHARE_RW;
+    }
+
     blk = img_open(image_opts, filename, fmt, flags, writethrough, quiet);
     if (!blk) {
         return 1;
@@ -1238,6 +1242,7 @@ static int img_compare(int argc, char **argv)
         goto out3;
     }
 
+    flags |= BDRV_O_SHARE_RW;
     blk1 = img_open(image_opts, filename1, fmt1, flags, writethrough, quiet);
     if (!blk1) {
         ret = 2;
@@ -2286,7 +2291,8 @@ static ImageInfoList *collect_image_info_list(bool image_opts,
         g_hash_table_insert(filenames, (gpointer)filename, NULL);
 
         blk = img_open(image_opts, filename, fmt,
-                       BDRV_O_NO_BACKING | BDRV_O_NO_IO, false, false);
+                       BDRV_O_NO_BACKING | BDRV_O_NO_IO | BDRV_O_SHARE_RW,
+                       false, false);
         if (!blk) {
             goto err;
         }
@@ -2612,7 +2618,7 @@ static int img_map(int argc, char **argv)
         return 1;
     }
 
-    blk = img_open(image_opts, filename, fmt, 0, false, false);
+    blk = img_open(image_opts, filename, fmt, BDRV_O_SHARE_RW, false, false);
     if (!blk) {
         return 1;
     }
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v10 05/16] block: Set "share-rw" flag in drive-backup when sync=none
  2017-01-19 14:38 [Qemu-devel] [PATCH v10 00/16] block: Image locking series Fam Zheng
                   ` (3 preceding siblings ...)
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 04/16] qemu-img: Set "share-rw" flag in read-only commands Fam Zheng
@ 2017-01-19 14:38 ` Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 06/16] iotests: 055: Don't attach the drive to vm for drive-backup Fam Zheng
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-01-19 14:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Blake, Kevin Wolf, Max Reitz, qemu-block, rjones

In this case we may open the source's backing image chain multiple
times. Setting share flag means the new open won't try to acquire or
check any lock, once we implement image locking.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index 245e1e1..c97e97f 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3177,6 +3177,9 @@ static BlockJob *do_drive_backup(DriveBackup *backup, BlockJobTxn *txn,
     }
     if (backup->sync == MIRROR_SYNC_MODE_NONE) {
         source = bs;
+        /* FIXME: block layer should really open target with BDRV_O_NO_BACKING
+         * and reuse source's backing chain, if they share one. */
+        flags |= BDRV_O_SHARE_RW;
     }
 
     size = bdrv_getlength(bs);
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v10 06/16] iotests: 055: Don't attach the drive to vm for drive-backup
  2017-01-19 14:38 [Qemu-devel] [PATCH v10 00/16] block: Image locking series Fam Zheng
                   ` (4 preceding siblings ...)
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 05/16] block: Set "share-rw" flag in drive-backup when sync=none Fam Zheng
@ 2017-01-19 14:38 ` Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 07/16] iotests: 030: Read-only open image for getting map Fam Zheng
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-01-19 14:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Blake, Kevin Wolf, Max Reitz, qemu-block, rjones

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/055 | 32 ++++++++++++++++++--------------
 1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/tests/qemu-iotests/055 b/tests/qemu-iotests/055
index 1d3fd04..20a7596 100755
--- a/tests/qemu-iotests/055
+++ b/tests/qemu-iotests/055
@@ -455,17 +455,18 @@ class TestDriveCompression(iotests.QMPTestCase):
         except OSError:
             pass
 
-    def do_prepare_drives(self, fmt, args):
+    def do_prepare_drives(self, fmt, args, attach):
         self.vm = iotests.VM().add_drive(test_img)
 
         qemu_img('create', '-f', fmt, blockdev_target_img,
                  str(TestDriveCompression.image_len), *args)
-        self.vm.add_drive(blockdev_target_img, format=fmt)
+        if attach:
+            self.vm.add_drive(blockdev_target_img, format=fmt)
 
         self.vm.launch()
 
-    def do_test_compress_complete(self, cmd, format, **args):
-        self.do_prepare_drives(format['type'], format['args'])
+    def do_test_compress_complete(self, cmd, format, attach, **args):
+        self.do_prepare_drives(format['type'], format['args'], attach)
 
         self.assert_no_active_block_jobs()
 
@@ -481,15 +482,16 @@ class TestDriveCompression(iotests.QMPTestCase):
 
     def test_complete_compress_drive_backup(self):
         for format in TestDriveCompression.fmt_supports_compression:
-            self.do_test_compress_complete('drive-backup', format,
+            self.do_test_compress_complete('drive-backup', format, attach=False,
                                            target=blockdev_target_img, mode='existing')
 
     def test_complete_compress_blockdev_backup(self):
         for format in TestDriveCompression.fmt_supports_compression:
-            self.do_test_compress_complete('blockdev-backup', format, target='drive1')
+            self.do_test_compress_complete('blockdev-backup', format,
+                                           attach=True, target='drive1')
 
-    def do_test_compress_cancel(self, cmd, format, **args):
-        self.do_prepare_drives(format['type'], format['args'])
+    def do_test_compress_cancel(self, cmd, format, attach, **args):
+        self.do_prepare_drives(format['type'], format['args'], attach)
 
         self.assert_no_active_block_jobs()
 
@@ -503,15 +505,16 @@ class TestDriveCompression(iotests.QMPTestCase):
 
     def test_compress_cancel_drive_backup(self):
         for format in TestDriveCompression.fmt_supports_compression:
-            self.do_test_compress_cancel('drive-backup', format,
+            self.do_test_compress_cancel('drive-backup', format, attach=False,
                                          target=blockdev_target_img, mode='existing')
 
     def test_compress_cancel_blockdev_backup(self):
        for format in TestDriveCompression.fmt_supports_compression:
-            self.do_test_compress_cancel('blockdev-backup', format, target='drive1')
+            self.do_test_compress_cancel('blockdev-backup', format, attach=True,
+                                         target='drive1')
 
-    def do_test_compress_pause(self, cmd, format, **args):
-        self.do_prepare_drives(format['type'], format['args'])
+    def do_test_compress_pause(self, cmd, format, attach, **args):
+        self.do_prepare_drives(format['type'], format['args'], attach)
 
         self.assert_no_active_block_jobs()
 
@@ -543,12 +546,13 @@ class TestDriveCompression(iotests.QMPTestCase):
 
     def test_compress_pause_drive_backup(self):
         for format in TestDriveCompression.fmt_supports_compression:
-            self.do_test_compress_pause('drive-backup', format,
+            self.do_test_compress_pause('drive-backup', format, attach=False,
                                         target=blockdev_target_img, mode='existing')
 
     def test_compress_pause_blockdev_backup(self):
         for format in TestDriveCompression.fmt_supports_compression:
-            self.do_test_compress_pause('blockdev-backup', format, target='drive1')
+            self.do_test_compress_pause('blockdev-backup', format, attach=True,
+                                        target='drive1')
 
 if __name__ == '__main__':
     iotests.main(supported_fmts=['raw', 'qcow2'])
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v10 07/16] iotests: 030: Read-only open image for getting map
  2017-01-19 14:38 [Qemu-devel] [PATCH v10 00/16] block: Image locking series Fam Zheng
                   ` (5 preceding siblings ...)
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 06/16] iotests: 055: Don't attach the drive to vm for drive-backup Fam Zheng
@ 2017-01-19 14:38 ` Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 08/16] iotests: 087: Don't attach test image twice Fam Zheng
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-01-19 14:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Blake, Kevin Wolf, Max Reitz, qemu-block, rjones

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/030 | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/030 b/tests/qemu-iotests/030
index 54db54a..fe0c73f 100755
--- a/tests/qemu-iotests/030
+++ b/tests/qemu-iotests/030
@@ -114,7 +114,7 @@ class TestSingleDrive(iotests.QMPTestCase):
         self.assert_no_active_block_jobs()
 
         # The image map is empty before the operation
-        empty_map = qemu_io('-f', iotests.imgfmt, '-c', 'map', test_img)
+        empty_map = qemu_io('-f', iotests.imgfmt, '-c', 'map', '-r', test_img)
 
         # This is a no-op: no data should ever be copied from the base image
         result = self.vm.qmp('block-stream', device='drive0', base=mid_img)
@@ -125,7 +125,7 @@ class TestSingleDrive(iotests.QMPTestCase):
         self.assert_no_active_block_jobs()
         self.vm.shutdown()
 
-        self.assertEqual(qemu_io('-f', iotests.imgfmt, '-c', 'map', test_img),
+        self.assertEqual(qemu_io('-f', iotests.imgfmt, '-c', 'map', '-r', test_img),
                          empty_map, 'image file map changed after a no-op')
 
     def test_stream_partial(self):
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v10 08/16] iotests: 087: Don't attach test image twice
  2017-01-19 14:38 [Qemu-devel] [PATCH v10 00/16] block: Image locking series Fam Zheng
                   ` (6 preceding siblings ...)
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 07/16] iotests: 030: Read-only open image for getting map Fam Zheng
@ 2017-01-19 14:38 ` Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 09/16] iotests: 085: Avoid image locking conflict Fam Zheng
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-01-19 14:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Blake, Kevin Wolf, Max Reitz, qemu-block, rjones

The test scenario doesn't require the same image, instead it focuses on
the duplicated node-name, so use null-co to avoid locking conflict.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/087 | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/tests/qemu-iotests/087 b/tests/qemu-iotests/087
index 9de57dd..6d52f7d 100755
--- a/tests/qemu-iotests/087
+++ b/tests/qemu-iotests/087
@@ -82,8 +82,7 @@ run_qemu -drive driver=$IMGFMT,id=disk,node-name=test-node,file="$TEST_IMG" <<EO
       "driver": "$IMGFMT",
       "node-name": "disk",
       "file": {
-          "driver": "file",
-          "filename": "$TEST_IMG"
+          "driver": "null-co"
       }
     }
   }
@@ -92,8 +91,7 @@ run_qemu -drive driver=$IMGFMT,id=disk,node-name=test-node,file="$TEST_IMG" <<EO
       "driver": "$IMGFMT",
       "node-name": "test-node",
       "file": {
-          "driver": "file",
-          "filename": "$TEST_IMG"
+          "driver": "null-co"
       }
     }
   }
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v10 09/16] iotests: 085: Avoid image locking conflict
  2017-01-19 14:38 [Qemu-devel] [PATCH v10 00/16] block: Image locking series Fam Zheng
                   ` (7 preceding siblings ...)
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 08/16] iotests: 087: Don't attach test image twice Fam Zheng
@ 2017-01-19 14:38 ` Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 10/16] iotests: 091: Quit QEMU before checking image Fam Zheng
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-01-19 14:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Blake, Kevin Wolf, Max Reitz, qemu-block, rjones

In the case where we test the expected error when a blockdev-snapshot
target already has a backing image, backing chain is opened multiple
times. This will be a problem when we use image locking, so use a
different backing file that is not already open.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/085     | 32 +++++++++++++++++++-------------
 tests/qemu-iotests/085.out |  3 ++-
 2 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/tests/qemu-iotests/085 b/tests/qemu-iotests/085
index c53e97f..c91d78d 100755
--- a/tests/qemu-iotests/085
+++ b/tests/qemu-iotests/085
@@ -87,24 +87,26 @@ function create_group_snapshot()
 }
 
 # ${1}: unique identifier for the snapshot filename
-# ${2}: true: open backing images; false: don't open them (default)
+# ${2}: extra_params to the blockdev-add command
+# ${3}: filename
+function do_blockdev_add()
+{
+    cmd="{ 'execute': 'blockdev-add', 'arguments':
+           { 'driver': 'qcow2', 'node-name': 'snap_${1}', ${2}
+             'file':
+             { 'driver': 'file', 'filename': '${3}',
+               'node-name': 'file_${1}' } } }"
+    _send_qemu_cmd $h "${cmd}" "return"
+}
+
+# ${1}: unique identifier for the snapshot filename
 function add_snapshot_image()
 {
-    if [ "${2}" = "true" ]; then
-        extra_params=""
-    else
-        extra_params="'backing': '', "
-    fi
     base_image="${TEST_DIR}/$((${1}-1))-${snapshot_virt0}"
     snapshot_file="${TEST_DIR}/${1}-${snapshot_virt0}"
     _make_test_img -b "${base_image}" "$size"
     mv "${TEST_IMG}" "${snapshot_file}"
-    cmd="{ 'execute': 'blockdev-add', 'arguments':
-           { 'driver': 'qcow2', 'node-name': 'snap_${1}', ${extra_params}
-             'file':
-             { 'driver': 'file', 'filename': '${snapshot_file}',
-               'node-name': 'file_${1}' } } }"
-    _send_qemu_cmd $h "${cmd}" "return"
+    do_blockdev_add "$1" "'backing': '', " "${snapshot_file}"
 }
 
 # ${1}: unique identifier for the snapshot filename
@@ -222,7 +224,11 @@ echo === Invalid command - snapshot node has a backing image ===
 echo
 
 SNAPSHOTS=$((${SNAPSHOTS}+1))
-add_snapshot_image ${SNAPSHOTS} true
+
+_make_test_img "$size"
+mv "${TEST_IMG}" "${TEST_IMG}.base"
+_make_test_img -b "${TEST_IMG}.base" "$size"
+do_blockdev_add ${SNAPSHOTS} "" "${TEST_IMG}"
 blockdev_snapshot ${SNAPSHOTS} error
 
 echo
diff --git a/tests/qemu-iotests/085.out b/tests/qemu-iotests/085.out
index 08e4bb7..7bbf84d 100644
--- a/tests/qemu-iotests/085.out
+++ b/tests/qemu-iotests/085.out
@@ -78,7 +78,8 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 backing_file=TEST_DIR/
 
 === Invalid command - snapshot node has a backing image ===
 
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 backing_file=TEST_DIR/12-snapshot-v0.IMGFMT
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 backing_file=TEST_DIR/t.IMGFMT.base
 {"return": {}}
 {"error": {"class": "GenericError", "desc": "The snapshot already has a backing image"}}
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v10 10/16] iotests: 091: Quit QEMU before checking image
  2017-01-19 14:38 [Qemu-devel] [PATCH v10 00/16] block: Image locking series Fam Zheng
                   ` (8 preceding siblings ...)
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 09/16] iotests: 085: Avoid image locking conflict Fam Zheng
@ 2017-01-19 14:38 ` Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 11/16] iotests: 172: Use separate images for multiple devices Fam Zheng
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-01-19 14:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Blake, Kevin Wolf, Max Reitz, qemu-block, rjones

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/091 | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/qemu-iotests/091 b/tests/qemu-iotests/091
index 32bbd56..10ac4a8 100755
--- a/tests/qemu-iotests/091
+++ b/tests/qemu-iotests/091
@@ -95,7 +95,9 @@ echo "vm2: qemu process running successfully"
 echo "vm2: flush io, and quit"
 _send_qemu_cmd $h2 'qemu-io disk flush' "(qemu)"
 _send_qemu_cmd $h2 'quit' ""
+_send_qemu_cmd $h1 'quit' ""
 
+wait
 echo "Check image pattern"
 ${QEMU_IO} -c "read -P 0x22 0 4M" "${TEST_IMG}" | _filter_testdir | _filter_qemu_io
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v10 11/16] iotests: 172: Use separate images for multiple devices
  2017-01-19 14:38 [Qemu-devel] [PATCH v10 00/16] block: Image locking series Fam Zheng
                   ` (9 preceding siblings ...)
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 10/16] iotests: 091: Quit QEMU before checking image Fam Zheng
@ 2017-01-19 14:38 ` Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 12/16] tests: Use null-co:// instead of /dev/null as the dummy image Fam Zheng
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-01-19 14:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Blake, Kevin Wolf, Max Reitz, qemu-block, rjones

To avoid image lock failures.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/172     | 53 ++++++++++++++++++++++++----------------------
 tests/qemu-iotests/172.out | 50 ++++++++++++++++++++++---------------------
 2 files changed, 54 insertions(+), 49 deletions(-)

diff --git a/tests/qemu-iotests/172 b/tests/qemu-iotests/172
index 1b7d3a1..f9d4cff 100755
--- a/tests/qemu-iotests/172
+++ b/tests/qemu-iotests/172
@@ -86,6 +86,9 @@ size=720k
 
 _make_test_img $size
 
+TEST_IMG="$TEST_IMG.2" _make_test_img $size
+TEST_IMG="$TEST_IMG.3" _make_test_img $size
+
 # Default drive semantics:
 #
 # By default you get a single empty floppy drive. You can override it with
@@ -105,7 +108,7 @@ echo === Using -fda/-fdb options ===
 
 check_floppy_qtree -fda "$TEST_IMG"
 check_floppy_qtree -fdb "$TEST_IMG"
-check_floppy_qtree -fda "$TEST_IMG" -fdb "$TEST_IMG"
+check_floppy_qtree -fda "$TEST_IMG" -fdb "$TEST_IMG.2"
 
 
 echo
@@ -114,7 +117,7 @@ echo === Using -drive options ===
 
 check_floppy_qtree -drive if=floppy,file="$TEST_IMG"
 check_floppy_qtree -drive if=floppy,file="$TEST_IMG",index=1
-check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive if=floppy,file="$TEST_IMG",index=1
+check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive if=floppy,file="$TEST_IMG.2",index=1
 
 echo
 echo
@@ -122,7 +125,7 @@ echo === Using -drive if=none and -global ===
 
 check_floppy_qtree -drive if=none,file="$TEST_IMG" -global isa-fdc.driveA=none0
 check_floppy_qtree -drive if=none,file="$TEST_IMG" -global isa-fdc.driveB=none0
-check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" \
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" \
                    -global isa-fdc.driveA=none0 -global isa-fdc.driveB=none1
 
 echo
@@ -131,7 +134,7 @@ echo === Using -drive if=none and -device ===
 
 check_floppy_qtree -drive if=none,file="$TEST_IMG" -device floppy,drive=none0
 check_floppy_qtree -drive if=none,file="$TEST_IMG" -device floppy,drive=none0,unit=1
-check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" \
+check_floppy_qtree -drive if=none,file="$TEST_IMG",file.disable-lock=on -drive if=none,file="$TEST_IMG.2" \
                    -device floppy,drive=none0 -device floppy,drive=none1,unit=1
 
 echo
@@ -139,58 +142,58 @@ echo
 echo === Mixing -fdX and -global ===
 
 # Working
-check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG" -global isa-fdc.driveB=none0
-check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG" -global isa-fdc.driveA=none0
+check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -global isa-fdc.driveB=none0
+check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -global isa-fdc.driveA=none0
 
 # Conflicting (-fdX wins)
-check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG" -global isa-fdc.driveA=none0
-check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG" -global isa-fdc.driveB=none0
+check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -global isa-fdc.driveA=none0
+check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -global isa-fdc.driveB=none0
 
 echo
 echo
 echo === Mixing -fdX and -device ===
 
 # Working
-check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG" -device floppy,drive=none0
-check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG" -device floppy,drive=none0,unit=1
+check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -device floppy,drive=none0
+check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -device floppy,drive=none0,unit=1
 
-check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG" -device floppy,drive=none0
-check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG" -device floppy,drive=none0,unit=0
+check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -device floppy,drive=none0
+check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -device floppy,drive=none0,unit=0
 
 # Conflicting
-check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG" -device floppy,drive=none0,unit=0
-check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG" -device floppy,drive=none0,unit=1
+check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -device floppy,drive=none0,unit=0
+check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -device floppy,drive=none0,unit=1
 
 echo
 echo
 echo === Mixing -drive and -device ===
 
 # Working
-check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" -device floppy,drive=none0
-check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" -device floppy,drive=none0,unit=1
+check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -device floppy,drive=none0
+check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -device floppy,drive=none0,unit=1
 
 # Conflicting
-check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" -device floppy,drive=none0,unit=0
+check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -device floppy,drive=none0,unit=0
 
 echo
 echo
 echo === Mixing -global and -device ===
 
 # Working
-check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" \
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" \
                    -global isa-fdc.driveA=none0 -device floppy,drive=none1
-check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" \
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" \
                    -global isa-fdc.driveA=none0 -device floppy,drive=none1,unit=1
 
-check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" \
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" \
                    -global isa-fdc.driveB=none0 -device floppy,drive=none1
-check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" \
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" \
                    -global isa-fdc.driveB=none0 -device floppy,drive=none1,unit=0
 
 # Conflicting
-check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" \
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" \
                    -global isa-fdc.driveA=none0 -device floppy,drive=none1,unit=0
-check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" \
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" \
                    -global isa-fdc.driveB=none0 -device floppy,drive=none1,unit=1
 
 echo
@@ -199,8 +202,8 @@ echo === Too many floppy drives ===
 
 # Working
 check_floppy_qtree -drive if=floppy,file="$TEST_IMG" \
-                   -drive if=none,file="$TEST_IMG" \
-                   -drive if=none,file="$TEST_IMG" \
+                   -drive if=none,file="$TEST_IMG.2" \
+                   -drive if=none,file="$TEST_IMG.3" \
                    -global isa-fdc.driveB=none0 \
                    -device floppy,drive=none1
 
diff --git a/tests/qemu-iotests/172.out b/tests/qemu-iotests/172.out
index 6b7edaf..7f557ac 100644
--- a/tests/qemu-iotests/172.out
+++ b/tests/qemu-iotests/172.out
@@ -1,5 +1,7 @@
 QA output created by 172
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=737280
+Formatting 'TEST_DIR/t.IMGFMT.2', fmt=IMGFMT size=737280
+Formatting 'TEST_DIR/t.IMGFMT.3', fmt=IMGFMT size=737280
 
 
 === Default ===
@@ -95,7 +97,7 @@ Testing: -fdb TEST_DIR/t.qcow2
                 write-cache = "auto"
                 drive-type = "288"
 
-Testing: -fda TEST_DIR/t.qcow2 -fdb TEST_DIR/t.qcow2
+Testing: -fda TEST_DIR/t.qcow2 -fdb TEST_DIR/t.qcow2.2
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -196,7 +198,7 @@ Testing: -drive if=floppy,file=TEST_DIR/t.qcow2,index=1
                 write-cache = "auto"
                 drive-type = "288"
 
-Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=floppy,file=TEST_DIR/t.qcow2,index=1
+Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=floppy,file=TEST_DIR/t.qcow2.2,index=1
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -287,7 +289,7 @@ Testing: -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveB=none0
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveA=none0 -global isa-fdc.driveB=none1
+Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveA=none0 -global isa-fdc.driveB=none1
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -378,7 +380,7 @@ Testing: -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0,unit=1
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0 -device floppy,drive=none1,unit=1
+Testing: -drive if=none,file=TEST_DIR/t.qcow2,file.disable-lock=on -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0 -device floppy,drive=none1,unit=1
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -417,7 +419,7 @@ Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qco
 
 === Mixing -fdX and -global ===
 
-Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveB=none0
+Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveB=none0
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -453,7 +455,7 @@ Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveA=none0
+Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveA=none0
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -489,7 +491,7 @@ Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveA=none0
+Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveA=none0
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -515,7 +517,7 @@ Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveB=none0
+Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveB=none0
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -544,7 +546,7 @@ Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-
 
 === Mixing -fdX and -device ===
 
-Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0
+Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -580,7 +582,7 @@ Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device flop
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0,unit=1
+Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0,unit=1
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -616,7 +618,7 @@ Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device flop
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0
+Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -652,7 +654,7 @@ Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device flop
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0,unit=0
+Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0,unit=0
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -688,18 +690,18 @@ Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device flop
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0,unit=0
+Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0,unit=0
 QEMU_PROG: -device floppy,drive=none0,unit=0: Floppy unit 0 is in use
 QEMU_PROG: -device floppy,drive=none0,unit=0: Device initialization failed.
 
-Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0,unit=1
+Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0,unit=1
 QEMU_PROG: -device floppy,drive=none0,unit=1: Floppy unit 1 is in use
 QEMU_PROG: -device floppy,drive=none0,unit=1: Device initialization failed.
 
 
 === Mixing -drive and -device ===
 
-Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0
+Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -735,7 +737,7 @@ Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.q
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0,unit=1
+Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0,unit=1
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -771,14 +773,14 @@ Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.q
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0,unit=0
+Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0,unit=0
 QEMU_PROG: -device floppy,drive=none0,unit=0: Floppy unit 0 is in use
 QEMU_PROG: -device floppy,drive=none0,unit=0: Device initialization failed.
 
 
 === Mixing -global and -device ===
 
-Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveA=none0 -device floppy,drive=none1
+Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveA=none0 -device floppy,drive=none1
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -814,7 +816,7 @@ Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qco
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveA=none0 -device floppy,drive=none1,unit=1
+Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveA=none0 -device floppy,drive=none1,unit=1
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -850,7 +852,7 @@ Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qco
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveB=none0 -device floppy,drive=none1
+Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveB=none0 -device floppy,drive=none1
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -886,7 +888,7 @@ Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qco
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveB=none0 -device floppy,drive=none1,unit=0
+Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveB=none0 -device floppy,drive=none1,unit=0
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -922,18 +924,18 @@ Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qco
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveA=none0 -device floppy,drive=none1,unit=0
+Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveA=none0 -device floppy,drive=none1,unit=0
 QEMU_PROG: -device floppy,drive=none1,unit=0: Floppy unit 0 is in use
 QEMU_PROG: -device floppy,drive=none1,unit=0: Device initialization failed.
 
-Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveB=none0 -device floppy,drive=none1,unit=1
+Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveB=none0 -device floppy,drive=none1,unit=1
 QEMU_PROG: -device floppy,drive=none1,unit=1: Floppy unit 1 is in use
 QEMU_PROG: -device floppy,drive=none1,unit=1: Device initialization failed.
 
 
 === Too many floppy drives ===
 
-Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveB=none0 -device floppy,drive=none1
+Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -drive if=none,file=TEST_DIR/t.qcow2.3 -global isa-fdc.driveB=none0 -device floppy,drive=none1
 QEMU_PROG: -device floppy,drive=none1: Can't create floppy unit 2, bus supports only 2 units
 QEMU_PROG: -device floppy,drive=none1: Device initialization failed.
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v10 12/16] tests: Use null-co:// instead of /dev/null as the dummy image
  2017-01-19 14:38 [Qemu-devel] [PATCH v10 00/16] block: Image locking series Fam Zheng
                   ` (10 preceding siblings ...)
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 11/16] iotests: 172: Use separate images for multiple devices Fam Zheng
@ 2017-01-19 14:38 ` Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 13/16] tests: Disable image lock in test-replication Fam Zheng
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-01-19 14:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Blake, Kevin Wolf, Max Reitz, qemu-block, rjones

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/drive_del-test.c    | 2 +-
 tests/nvme-test.c         | 2 +-
 tests/usb-hcd-uhci-test.c | 2 +-
 tests/usb-hcd-xhci-test.c | 2 +-
 tests/virtio-blk-test.c   | 2 +-
 tests/virtio-scsi-test.c  | 4 ++--
 6 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/tests/drive_del-test.c b/tests/drive_del-test.c
index 121b9c9..2175139 100644
--- a/tests/drive_del-test.c
+++ b/tests/drive_del-test.c
@@ -92,7 +92,7 @@ static void test_after_failed_device_add(void)
 static void test_drive_del_device_del(void)
 {
     /* Start with a drive used by a device that unplugs instantaneously */
-    qtest_start("-drive if=none,id=drive0,file=/dev/null,format=raw"
+    qtest_start("-drive if=none,id=drive0,file=null-co://,format=raw"
                 " -device virtio-scsi-pci"
                 " -device scsi-hd,drive=drive0,id=dev0");
 
diff --git a/tests/nvme-test.c b/tests/nvme-test.c
index c8bece4..7674a44 100644
--- a/tests/nvme-test.c
+++ b/tests/nvme-test.c
@@ -22,7 +22,7 @@ int main(int argc, char **argv)
     g_test_init(&argc, &argv, NULL);
     qtest_add_func("/nvme/nop", nop);
 
-    qtest_start("-drive id=drv0,if=none,file=/dev/null,format=raw "
+    qtest_start("-drive id=drv0,if=none,file=null-co://,format=raw "
                 "-device nvme,drive=drv0,serial=foo");
     ret = g_test_run();
 
diff --git a/tests/usb-hcd-uhci-test.c b/tests/usb-hcd-uhci-test.c
index e956b9c..24c3ea9 100644
--- a/tests/usb-hcd-uhci-test.c
+++ b/tests/usb-hcd-uhci-test.c
@@ -78,7 +78,7 @@ int main(int argc, char **argv)
 {
     const char *arch = qtest_get_arch();
     const char *cmd = "-device piix3-usb-uhci,id=uhci,addr=1d.0"
-                      " -drive id=drive0,if=none,file=/dev/null,format=raw"
+                      " -drive id=drive0,if=none,file=null-co://,format=raw"
                       " -device usb-tablet,bus=uhci.0,port=1";
     int ret;
 
diff --git a/tests/usb-hcd-xhci-test.c b/tests/usb-hcd-xhci-test.c
index 22513e9..031764d 100644
--- a/tests/usb-hcd-xhci-test.c
+++ b/tests/usb-hcd-xhci-test.c
@@ -89,7 +89,7 @@ int main(int argc, char **argv)
     qtest_add_func("/xhci/pci/hotplug/usb-uas", test_usb_uas_hotplug);
 
     qtest_start("-device nec-usb-xhci,id=xhci"
-                " -drive id=drive0,if=none,file=/dev/null,format=raw");
+                " -drive id=drive0,if=none,file=null-co://,format=raw");
     ret = g_test_run();
     qtest_end();
 
diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c
index 0e32e41..0fc7c99 100644
--- a/tests/virtio-blk-test.c
+++ b/tests/virtio-blk-test.c
@@ -63,7 +63,7 @@ static QOSState *pci_test_start(void)
     const char *arch = qtest_get_arch();
     char *tmp_path;
     const char *cmd = "-drive if=none,id=drive0,file=%s,format=raw "
-                      "-drive if=none,id=drive1,file=/dev/null,format=raw "
+                      "-drive if=none,id=drive1,file=null-co://,format=raw "
                       "-device virtio-blk-pci,id=drv0,drive=drive0,"
                       "addr=%x.%x";
 
diff --git a/tests/virtio-scsi-test.c b/tests/virtio-scsi-test.c
index 69220ef..8c27641 100644
--- a/tests/virtio-scsi-test.c
+++ b/tests/virtio-scsi-test.c
@@ -35,7 +35,7 @@ typedef struct {
 static QOSState *qvirtio_scsi_start(const char *extra_opts)
 {
     const char *arch = qtest_get_arch();
-    const char *cmd = "-drive id=drv0,if=none,file=/dev/null,format=raw "
+    const char *cmd = "-drive id=drv0,if=none,file=null-co://,format=raw "
                       "-device virtio-scsi-pci,id=vs0 "
                       "-device scsi-hd,bus=vs0.0,drive=drv0 %s";
 
@@ -195,7 +195,7 @@ static void hotplug(void)
     QDict *response;
     QOSState *qs;
 
-    qs = qvirtio_scsi_start("-drive id=drv1,if=none,file=/dev/null,format=raw");
+    qs = qvirtio_scsi_start("-drive id=drv1,if=none,file=null-co://,format=raw");
     response = qmp("{\"execute\": \"device_add\","
                    " \"arguments\": {"
                    "   \"driver\": \"scsi-hd\","
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v10 13/16] tests: Disable image lock in test-replication
  2017-01-19 14:38 [Qemu-devel] [PATCH v10 00/16] block: Image locking series Fam Zheng
                   ` (11 preceding siblings ...)
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 12/16] tests: Use null-co:// instead of /dev/null as the dummy image Fam Zheng
@ 2017-01-19 14:38 ` Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 14/16] file-posix: Implement image locking Fam Zheng
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-01-19 14:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Blake, Kevin Wolf, Max Reitz, qemu-block, rjones

The COLO block replication architecture requires one disk to be shared
between primary and secondary, in the test both processes use posix file
protocol (instead of over NBD) so it is affected by image locking.
Disable the lock.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/test-replication.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tests/test-replication.c b/tests/test-replication.c
index fac2da3..5bede49 100644
--- a/tests/test-replication.c
+++ b/tests/test-replication.c
@@ -179,7 +179,8 @@ static BlockBackend *start_primary(void)
     char *cmdline;
 
     cmdline = g_strdup_printf("driver=replication,mode=primary,node-name=xxx,"
-                              "file.driver=qcow2,file.file.filename=%s"
+                              "file.driver=qcow2,file.file.filename=%s,"
+                              "file.file.disable-lock=on"
                               , p_local_disk);
     opts = qemu_opts_parse_noisily(&qemu_drive_opts, cmdline, false);
     g_free(cmdline);
@@ -310,7 +311,9 @@ static BlockBackend *start_secondary(void)
     Error *local_err = NULL;
 
     /* add s_local_disk and forge S_LOCAL_DISK_ID */
-    cmdline = g_strdup_printf("file.filename=%s,driver=qcow2", s_local_disk);
+    cmdline = g_strdup_printf("file.filename=%s,driver=qcow2,"
+                              "file.file.disable-lock=on",
+                              s_local_disk);
     opts = qemu_opts_parse_noisily(&qemu_drive_opts, cmdline, false);
     g_free(cmdline);
 
@@ -331,8 +334,10 @@ static BlockBackend *start_secondary(void)
     /* add S_(ACTIVE/HIDDEN)_DISK and forge S_ID */
     cmdline = g_strdup_printf("driver=replication,mode=secondary,top-id=%s,"
                               "file.driver=qcow2,file.file.filename=%s,"
+                              "file.file.disable-lock=on",
                               "file.backing.driver=qcow2,"
                               "file.backing.file.filename=%s,"
+                              "file.backing.file.disable-lock=on",
                               "file.backing.backing=%s"
                               , S_ID, s_active_disk, s_hidden_disk
                               , S_LOCAL_DISK_ID);
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v10 14/16] file-posix: Implement image locking
  2017-01-19 14:38 [Qemu-devel] [PATCH v10 00/16] block: Image locking series Fam Zheng
                   ` (12 preceding siblings ...)
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 13/16] tests: Disable image lock in test-replication Fam Zheng
@ 2017-01-19 14:38 ` Fam Zheng
  2017-01-19 15:49   ` Daniel P. Berrange
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 15/16] qcow2: Force "no other writer" lock on bs->file Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 16/16] tests: Add test-image-lock Fam Zheng
  15 siblings, 1 reply; 21+ messages in thread
From: Fam Zheng @ 2017-01-19 14:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Blake, Kevin Wolf, Max Reitz, qemu-block, rjones

This implements open flag sensible image locking for local file
and host device protocol.

virtlockd in libvirt locks the first byte, so we start looking at the
file bytes from 1.

Quoting what was proposed by Kevin Wolf <kwolf@redhat.com>, there are
four locking modes by combining two bits (BDRV_O_RDWR and
BDRV_O_SHARE_RW), and implemented by taking two locks:

Lock bytes:

* byte 1: I can't allow other processes to write to the image
* byte 2: I am writing to the image

Lock modes:

* shared writer (BDRV_O_RDWR | BDRV_O_SHARE_RW): Take shared lock on
  byte 2. Test whether byte 1 is locked using an exclusive lock, and
  fail if so.

* exclusive writer (BDRV_O_RDWR only): Take shared lock on byte 2. Test
  whether byte 1 is locked using an exclusive lock, and fail if so. Then
  take shared lock on byte 1. I suppose this is racy, but we can
  probably tolerate that.

* reader that can tolerate writers (BDRV_O_SHARE_RW only): Don't do anything

* reader that can't tolerate writers (neither bit is set): Take shared
  lock on byte 1. Test whether byte 2 is locked, and fail if so.

The complication is in the transactional reopen.  To make the reopen
logic managable, and allow better reuse, the code is internally
organized with a table from old mode to the new one.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/file-posix.c | 667 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 664 insertions(+), 3 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 28b47d9..fabce04 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -131,8 +131,39 @@ do { \
 
 #define MAX_BLOCKSIZE	4096
 
+/* Posix file locking bytes. Libvirt takes byte 0, so start from byte 1. */
+#define RAW_LOCK_BYTE_MIN             1
+#define RAW_LOCK_BYTE_NO_OTHER_WRITER 1
+#define RAW_LOCK_BYTE_WRITE           2
+
+/*
+ ** reader that can tolerate writers: Don't do anything
+ *
+ ** reader that can't tolerate writers: Take shared lock on byte 1. Test
+ *  byte 2 is unlocked.
+ *
+ ** shared writer: Take shared lock on byte 2. Test byte 1 is unlocked.
+ *
+ ** exclusive writer: Take exclusive locks on both bytes.
+ */
+
+typedef enum {
+    /* Read only and accept other writers. */
+    RAW_L_READ_SHARE_RW,
+    /* Read only and try to forbid other writers. */
+    RAW_L_READ,
+    /* Read write and accept other writers. */
+    RAW_L_WRITE_SHARE_RW,
+    /* Read write and try to forbid other writers. */
+    RAW_L_WRITE,
+} BDRVRawLockMode;
+
 typedef struct BDRVRawState {
     int fd;
+    /* A dup of @fd to make manipulating lock easier, especially during reopen,
+     * where this will accept BDRVRawReopenState.lock_fd. */
+    int lock_fd;
+    bool disable_lock;
     int type;
     int open_flags;
     size_t buf_align;
@@ -146,11 +177,15 @@ typedef struct BDRVRawState {
     bool use_linux_aio:1;
     bool has_fallocate;
     bool needs_alignment;
+    BDRVRawLockMode cur_lock_mode;
 } BDRVRawState;
 
 typedef struct BDRVRawReopenState {
     int fd;
+    /* A dup of @fd used for acquiring lock. */
+    int lock_fd;
     int open_flags;
+    bool disable_lock;
 } BDRVRawReopenState;
 
 static int fd_open(BlockDriverState *bs);
@@ -368,6 +403,58 @@ static void raw_parse_flags(int bdrv_flags, int *open_flags)
     }
 }
 
+static int raw_lock_fd(int fd, BDRVRawLockMode mode, Error **errp)
+{
+    int ret;
+    assert(fd >= 0);
+    switch (mode) {
+    case RAW_L_READ_SHARE_RW:
+        ret = qemu_unlock_fd(fd, RAW_LOCK_BYTE_MIN, 2);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to unlock fd");
+            goto fail;
+        }
+        break;
+    case RAW_L_READ:
+        ret = qemu_lock_fd(fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock share byte");
+            goto fail;
+        }
+        ret = qemu_lock_fd_test(fd, RAW_LOCK_BYTE_WRITE, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Write byte lock is taken");
+            goto fail;
+        }
+        break;
+    case RAW_L_WRITE_SHARE_RW:
+        ret = qemu_lock_fd(fd, RAW_LOCK_BYTE_WRITE, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock write byte");
+            goto fail;
+        }
+        ret = qemu_lock_fd_test(fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Share byte lock is taken");
+            goto fail;
+        }
+        break;
+    case RAW_L_WRITE:
+        ret = qemu_lock_fd(fd, RAW_LOCK_BYTE_MIN, 2, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock image");
+            goto fail;
+        }
+        break;
+    default:
+        abort();
+    }
+    return 0;
+fail:
+    qemu_unlock_fd(fd, RAW_LOCK_BYTE_MIN, 2);
+    return ret;
+}
+
 static void raw_parse_filename(const char *filename, QDict *options,
                                Error **errp)
 {
@@ -393,10 +480,88 @@ static QemuOptsList raw_runtime_opts = {
             .type = QEMU_OPT_STRING,
             .help = "host AIO implementation (threads, native)",
         },
+        {
+            .name = "disable-lock",
+            .type = QEMU_OPT_BOOL,
+            .help = "don't lock the file",
+        },
         { /* end of list */ }
     },
 };
 
+static BDRVRawLockMode raw_get_lock_mode(int flags)
+{
+    switch (flags & (BDRV_O_RDWR | BDRV_O_SHARE_RW)) {
+    case BDRV_O_RDWR:
+        return RAW_L_WRITE;
+    case BDRV_O_RDWR | BDRV_O_SHARE_RW:
+        return RAW_L_WRITE_SHARE_RW;
+    case BDRV_O_SHARE_RW:
+        return RAW_L_READ_SHARE_RW;
+    case 0:
+        return RAW_L_READ;
+    default:
+        abort();
+    }
+}
+
+static int raw_open_lockfd(const char *filename, int flags,
+                           BDRVRawLockMode *lock_mode, Error **errp)
+{
+    int ret = -1;
+    const char *normalized_filename = filename;
+
+    ret = raw_normalize_devicepath(&normalized_filename);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Could not normalize device path");
+    } else {
+        assert(!(flags & O_CREAT));
+        ret = qemu_open(normalized_filename, flags);
+        if (ret == -1) {
+            error_setg_errno(errp, errno, "Could not open file: %s",
+                             normalized_filename);
+            ret = -errno;
+        }
+    }
+    return ret;
+}
+
+static bool raw_lock_enabled(BlockDriverState *bs)
+{
+    BDRVRawState *s = bs->opaque;
+
+    return !(s->disable_lock || bdrv_get_flags(bs) & BDRV_O_INACTIVE);
+}
+
+static int raw_apply_image_lock(BlockDriverState *bs, int bdrv_flags,
+                                Error **errp)
+{
+    int ret;
+    BDRVRawState *s = bs->opaque;
+    BDRVRawLockMode lock_mode;
+
+    if (!raw_lock_enabled(bs)) {
+        return 0;
+    }
+    assert(s->cur_lock_mode == RAW_L_READ_SHARE_RW);
+    lock_mode = raw_get_lock_mode(bdrv_flags);
+    ret = raw_open_lockfd(bs->exact_filename, s->open_flags, &lock_mode,
+                          errp);
+    if (ret < 0) {
+        return ret;
+    }
+    s->lock_fd = ret;
+    if (lock_mode == RAW_L_READ_SHARE_RW) {
+        return 0;
+    }
+    ret = raw_lock_fd(s->lock_fd, lock_mode, errp);
+    if (ret) {
+        return ret;
+    }
+    s->cur_lock_mode = lock_mode;
+    return 0;
+}
+
 static int raw_open_common(BlockDriverState *bs, QDict *options,
                            int bdrv_flags, int open_flags, Error **errp)
 {
@@ -440,6 +605,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
     raw_parse_flags(bdrv_flags, &s->open_flags);
 
     s->fd = -1;
+    s->lock_fd = -1;
     fd = qemu_open(filename, s->open_flags, 0644);
     if (fd < 0) {
         ret = -errno;
@@ -451,6 +617,13 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
     }
     s->fd = fd;
 
+    s->disable_lock = qemu_opt_get_bool(opts, "disable-lock", false);
+
+    ret = raw_apply_image_lock(bs, bdrv_flags, errp);
+    if (ret) {
+        goto fail;
+    }
+
 #ifdef CONFIG_LINUX_AIO
      /* Currently Linux does AIO only for files opened with O_DIRECT */
     if (s->use_linux_aio && !(s->open_flags & O_DIRECT)) {
@@ -538,6 +711,462 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
     return raw_open_common(bs, options, flags, 0, errp);
 }
 
+typedef enum {
+    RAW_LT_PREPARE,
+    RAW_LT_COMMIT,
+    RAW_LT_ABORT
+} RawLockTransOp;
+
+typedef int (*RawReopenFunc)(RawLockTransOp op,
+                             int old_lock_fd, int new_lock_fd,
+                             BDRVRawLockMode old_lock,
+                             BDRVRawLockMode new_lock,
+                             Error **errp);
+
+static int raw_lt_nop(RawLockTransOp op,
+                      int old_lock_fd, int new_lock_fd,
+                      BDRVRawLockMode old_lock,
+                      BDRVRawLockMode new_lock,
+                      Error **errp)
+{
+    assert(old_lock == new_lock || new_lock == RAW_L_READ_SHARE_RW);
+    return 0;
+}
+
+static int raw_lt_from_unlock(RawLockTransOp op,
+                              int old_lock_fd, int new_lock_fd,
+                              BDRVRawLockMode old_lock,
+                              BDRVRawLockMode new_lock,
+                              Error **errp)
+{
+    assert(old_lock != new_lock);
+    assert(old_lock == RAW_L_READ_SHARE_RW);
+    switch (op) {
+    case RAW_LT_PREPARE:
+        return raw_lock_fd(new_lock_fd, new_lock, errp);
+    case RAW_LT_COMMIT:
+        break;
+    case RAW_LT_ABORT:
+        break;
+    }
+
+    return 0;
+}
+
+static int raw_lt_read_to_write_share(RawLockTransOp op,
+                                      int old_lock_fd, int new_lock_fd,
+                                      BDRVRawLockMode old_lock,
+                                      BDRVRawLockMode new_lock,
+                                      Error **errp)
+{
+    int ret = 0;
+
+    assert(old_lock == RAW_L_READ);
+    assert(new_lock == RAW_L_WRITE_SHARE_RW);
+
+    /*
+     *        lock byte "no other writer"      lock byte "write"
+     * old                S                           0
+     * new                0                           S
+     *
+     * (0 = unlocked; S = shared; X = exclusive.)
+     */
+    switch (op) {
+    case RAW_LT_PREPARE:
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock new fd (write byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock new fd (share byte)");
+            break;
+        }
+        ret = qemu_unlock_fd(old_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to unlock old fd (share byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to upgrade new fd (share byte)");
+            break;
+        }
+        ret = qemu_unlock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1);
+        if (ret) {
+            /* This is very unlikely, but catch it anyway. */
+            error_setg_errno(errp, -ret, "Failed to unlock new fd (share byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to downgrade new fd (write byte)");
+            break;
+        }
+        break;
+    case RAW_LT_COMMIT:
+        break;
+    case RAW_LT_ABORT:
+        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
+        if (ret) {
+            error_report("Failed to restore lock on old fd (share byte)");
+        }
+        break;
+    }
+    return ret;
+}
+
+static int raw_lt_read_to_write(RawLockTransOp op,
+                                int old_lock_fd, int new_lock_fd,
+                                BDRVRawLockMode old_lock,
+                                BDRVRawLockMode new_lock,
+                                Error **errp)
+{
+    int ret = 0;
+
+    assert(old_lock == RAW_L_READ);
+    assert(new_lock == RAW_L_WRITE);
+    /*
+     *        lock byte "no other writer"      lock byte "write"
+     * old                S                           0
+     * new                X                           X
+     *
+     * (0 = unlocked; S = shared; X = exclusive.)
+     */
+    switch (op) {
+    case RAW_LT_PREPARE:
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock new fd (write byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock new fd (share byte)");
+            break;
+        }
+        ret = qemu_unlock_fd(old_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to unlock old fd (share byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to upgrade new fd (share byte)");
+            break;
+        }
+        break;
+    case RAW_LT_COMMIT:
+        break;
+    case RAW_LT_ABORT:
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to downgrade new fd (share byte)");
+            break;
+        }
+        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
+        if (ret) {
+            error_report("Failed to restore lock on old fd (share byte)");
+        }
+        break;
+    }
+    return ret;
+}
+
+static int raw_lt_write_share_to_read(RawLockTransOp op,
+                                      int old_lock_fd, int new_lock_fd,
+                                      BDRVRawLockMode old_lock,
+                                      BDRVRawLockMode new_lock,
+                                      Error **errp)
+{
+    int ret = 0;
+
+    assert(old_lock == RAW_L_WRITE_SHARE_RW);
+    assert(new_lock == RAW_L_READ);
+    /*
+     *        lock byte "no other writer"      lock byte "write"
+     * old                0                           S
+     * new                S                           0
+     *
+     * (0 = unlocked; S = shared; X = exclusive.)
+     */
+    switch (op) {
+    case RAW_LT_PREPARE:
+        /* Make sure there are no other writers. */
+        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_WRITE, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock old fd (write byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock new fd (share byte)");
+            break;
+        }
+        break;
+    case RAW_LT_COMMIT:
+        break;
+    case RAW_LT_ABORT:
+        break;
+    }
+    return ret;
+}
+
+static int raw_lt_write_share_to_write(RawLockTransOp op,
+                                       int old_lock_fd, int new_lock_fd,
+                                       BDRVRawLockMode old_lock,
+                                       BDRVRawLockMode new_lock,
+                                       Error **errp)
+{
+    int ret = 0;
+
+    assert(old_lock == RAW_L_WRITE_SHARE_RW);
+    assert(new_lock == RAW_L_WRITE);
+    /*
+     *        lock byte "no other writer"      lock byte "write"
+     * old                0                           S
+     * new                X                           X
+     *
+     * (0 = unlocked; S = shared; X = exclusive.)
+     */
+    switch (op) {
+    case RAW_LT_PREPARE:
+        /* Make sure there are no other writers. */
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock new fd (share byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock new fd (write byte)");
+            break;
+        }
+        ret = qemu_unlock_fd(old_lock_fd, RAW_LOCK_BYTE_WRITE, 1);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to unlock old fd (write byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to upgrade new fd (write byte)");
+            break;
+        }
+        break;
+    case RAW_LT_COMMIT:
+        break;
+    case RAW_LT_ABORT:
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to downgrade new fd (write byte)");
+            break;
+        }
+        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to restore old fd (write byte)");
+            break;
+        }
+        break;
+    }
+    return ret;
+}
+
+static int raw_lt_write_to_read(RawLockTransOp op,
+                                int old_lock_fd, int new_lock_fd,
+                                BDRVRawLockMode old_lock,
+                                BDRVRawLockMode new_lock,
+                                Error **errp)
+{
+    int ret = 0;
+
+    assert(old_lock == RAW_L_WRITE);
+    assert(new_lock == RAW_L_READ);
+    /*
+     *        lock byte "no other writer"      lock byte "write"
+     * old                X                           X
+     * new                S                           0
+     *
+     * (0 = unlocked; S = shared; X = exclusive.)
+     */
+    switch (op) {
+    case RAW_LT_PREPARE:
+        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to downgrade old fd (share byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock new fd (share byte)");
+            break;
+        }
+        break;
+    case RAW_LT_COMMIT:
+        break;
+    case RAW_LT_ABORT:
+        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to restore old fd (share byte)");
+        }
+        break;
+    }
+    return ret;
+}
+
+static int raw_lt_write_to_write_share(RawLockTransOp op,
+                                       int old_lock_fd, int new_lock_fd,
+                                       BDRVRawLockMode old_lock,
+                                       BDRVRawLockMode new_lock,
+                                       Error **errp)
+{
+    int ret = 0;
+
+    assert(old_lock == RAW_L_WRITE);
+    assert(new_lock == RAW_L_WRITE_SHARE_RW);
+    /*
+     *        lock byte "no other writer"      lock byte "write"
+     * old                X                           X
+     * new                0                           S
+     *
+     * (0 = unlocked; S = shared; X = exclusive.)
+     */
+    switch (op) {
+    case RAW_LT_PREPARE:
+        break;
+    case RAW_LT_COMMIT:
+        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
+        if (ret) {
+            error_report("Failed to downgrade old fd (share byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
+        if (ret) {
+            error_report("Failed to unlock new fd (share byte)");
+            break;
+        }
+        break;
+    case RAW_LT_ABORT:
+        break;
+    }
+    return ret;
+}
+
+/**
+ * Transactionally moving between possible locking states is tricky and must be
+ * done carefully. That is mostly because downgrading an exclusive lock to
+ * shared or unlocked is not guaranteed to be revertible. As a result, in such
+ * cases we have to defer the downgrading to "commit", given that no revert will
+ * happen after that point, and that downgrading a lock should never fail.
+ *
+ * On the other hand, upgrading a lock (e.g. from unlocked or shared to
+ * exclusive lock) must happen in "prepare" because it may fail.
+ *
+ * Manage the operation matrix with this state transition table to make
+ * fulfilling above conditions easier.
+ */
+static const struct RawReopenFuncRecord {
+    BDRVRawLockMode old_lock;
+    BDRVRawLockMode new_lock;
+    RawReopenFunc func;
+    bool need_lock_fd;
+    bool close_old_lock_fd;
+} reopen_functions[] = {
+
+    {RAW_L_READ_SHARE_RW, RAW_L_READ_SHARE_RW, raw_lt_nop, false, false},
+    {RAW_L_READ_SHARE_RW, RAW_L_READ, raw_lt_from_unlock, true},
+    {RAW_L_READ_SHARE_RW, RAW_L_WRITE_SHARE_RW, raw_lt_from_unlock, true},
+    {RAW_L_READ_SHARE_RW, RAW_L_WRITE, raw_lt_from_unlock, true},
+
+    {RAW_L_READ, RAW_L_READ_SHARE_RW, raw_lt_nop, false, true},
+    {RAW_L_READ, RAW_L_READ, raw_lt_nop, false, false},
+    {RAW_L_READ, RAW_L_WRITE_SHARE_RW, raw_lt_read_to_write_share, true},
+    {RAW_L_READ, RAW_L_WRITE, raw_lt_read_to_write, true},
+
+    {RAW_L_WRITE_SHARE_RW, RAW_L_READ_SHARE_RW, raw_lt_nop, false, true},
+    {RAW_L_WRITE_SHARE_RW, RAW_L_READ, raw_lt_write_share_to_read, true},
+    {RAW_L_WRITE_SHARE_RW, RAW_L_WRITE_SHARE_RW, raw_lt_nop, false, false},
+    {RAW_L_WRITE_SHARE_RW, RAW_L_WRITE, raw_lt_write_share_to_write, true},
+
+    {RAW_L_WRITE, RAW_L_READ_SHARE_RW, raw_lt_nop, false, true},
+    {RAW_L_WRITE, RAW_L_READ, raw_lt_write_to_read, true},
+    {RAW_L_WRITE, RAW_L_WRITE_SHARE_RW, raw_lt_write_to_write_share, true},
+    {RAW_L_WRITE, RAW_L_WRITE, raw_lt_nop, false, false},
+};
+
+static int raw_reopen_handle_lock(BDRVReopenState *state,
+                                  RawLockTransOp op,
+                                  Error **errp)
+{
+    BDRVRawReopenState *rs = state->opaque;
+    BDRVRawState *s = state->bs->opaque;
+    BDRVRawLockMode old_lock, new_lock;
+    const struct RawReopenFuncRecord *rec;
+    int ret;
+
+    old_lock = s->cur_lock_mode;
+    rs->disable_lock = qdict_get_try_bool(state->options, "disable-lock",
+                                          false);
+    qdict_del(state->options, "disable-lock");
+
+    if (rs->disable_lock) {
+        new_lock = RAW_L_READ_SHARE_RW;
+    } else {
+        new_lock = raw_get_lock_mode(state->flags);
+    }
+
+    for (rec = &reopen_functions[0];
+         rec < &reopen_functions[ARRAY_SIZE(reopen_functions)];
+         rec++) {
+        if (rec->old_lock == old_lock && rec->new_lock == new_lock) {
+            break;
+        }
+    }
+    assert(rec != &reopen_functions[ARRAY_SIZE(reopen_functions)]);
+
+    switch (op) {
+    case RAW_LT_PREPARE:
+        if (rec->need_lock_fd) {
+            ret = raw_open_lockfd(state->bs->exact_filename,
+                                  rs->open_flags, &new_lock, errp);
+            if (ret < 0) {
+                return ret;
+            }
+            rs->lock_fd = ret;
+        } else {
+            rs->lock_fd = -1;
+        }
+        ret = rec->func(op, s->lock_fd, rs->lock_fd, old_lock, new_lock, errp);
+        if (!ret) {
+            return ret;
+        }
+        /* Only succeeded preparation will be reverted by block layer, we
+         * need to clean up this failure manually. */
+        op = RAW_LT_ABORT;
+        /* fall through */
+    case RAW_LT_ABORT:
+        rec->func(op, s->lock_fd, rs->lock_fd, old_lock, new_lock, &error_abort);
+        if (rs->lock_fd >= 0) {
+            qemu_close(rs->lock_fd);
+            rs->lock_fd = -1;
+        }
+        break;
+    case RAW_LT_COMMIT:
+        rec->func(op, s->lock_fd, rs->lock_fd, old_lock, new_lock, &error_abort);
+        if ((rec->need_lock_fd || rec->close_old_lock_fd) && s->lock_fd >= 0) {
+            qemu_close(s->lock_fd);
+            s->lock_fd = -1;
+        }
+        if (rec->need_lock_fd) {
+            s->lock_fd = rs->lock_fd;
+        }
+        s->cur_lock_mode = new_lock;
+        s->disable_lock = rs->disable_lock;
+        break;
+    }
+    return 0;
+}
+
 static int raw_reopen_prepare(BDRVReopenState *state,
                               BlockReopenQueue *queue, Error **errp)
 {
@@ -609,13 +1238,20 @@ static int raw_reopen_prepare(BDRVReopenState *state,
     if (rs->fd != -1) {
         raw_probe_alignment(state->bs, rs->fd, &local_err);
         if (local_err) {
-            qemu_close(rs->fd);
-            rs->fd = -1;
             error_propagate(errp, local_err);
             ret = -EINVAL;
+            goto fail;
         }
     }
+    ret = raw_reopen_handle_lock(state, RAW_LT_PREPARE, errp);
+    if (ret) {
+        goto fail;
+    }
 
+    return 0;
+fail:
+    qemu_close(rs->fd);
+    rs->fd = -1;
     return ret;
 }
 
@@ -626,6 +1262,8 @@ static void raw_reopen_commit(BDRVReopenState *state)
 
     s->open_flags = rs->open_flags;
 
+    raw_reopen_handle_lock(state, RAW_LT_COMMIT, &error_abort);
+
     qemu_close(s->fd);
     s->fd = rs->fd;
 
@@ -643,6 +1281,8 @@ static void raw_reopen_abort(BDRVReopenState *state)
         return;
     }
 
+    raw_reopen_handle_lock(state, RAW_LT_ABORT, &error_abort);
+
     if (rs->fd >= 0) {
         qemu_close(rs->fd);
         rs->fd = -1;
@@ -1332,6 +1972,10 @@ static void raw_close(BlockDriverState *bs)
         qemu_close(s->fd);
         s->fd = -1;
     }
+    if (s->lock_fd >= 0) {
+        qemu_close(s->lock_fd);
+        s->lock_fd = -1;
+    }
 }
 
 static int raw_truncate(BlockDriverState *bs, int64_t offset)
@@ -1832,6 +2476,22 @@ static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
     return 0;
 }
 
+static int raw_inactivate(BlockDriverState *bs)
+{
+    BDRVRawState *s = bs->opaque;
+    int r = 0;
+
+    if (s->cur_lock_mode != RAW_L_READ_SHARE_RW) {
+        r = raw_lock_fd(s->lock_fd, RAW_L_READ_SHARE_RW, NULL);
+    }
+    return r;
+}
+
+static void raw_invalidate_cache(BlockDriverState *bs, Error **errp)
+{
+    raw_apply_image_lock(bs, bdrv_get_flags(bs), errp);
+}
+
 static QemuOptsList raw_create_opts = {
     .name = "raw-create-opts",
     .head = QTAILQ_HEAD_INITIALIZER(raw_create_opts.head),
@@ -1885,7 +2545,8 @@ BlockDriver bdrv_file = {
     .bdrv_get_info = raw_get_info,
     .bdrv_get_allocated_file_size
                         = raw_get_allocated_file_size,
-
+    .bdrv_inactivate = raw_inactivate,
+    .bdrv_invalidate_cache = raw_invalidate_cache,
     .create_opts = &raw_create_opts,
 };
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v10 15/16] qcow2: Force "no other writer" lock on bs->file
  2017-01-19 14:38 [Qemu-devel] [PATCH v10 00/16] block: Image locking series Fam Zheng
                   ` (13 preceding siblings ...)
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 14/16] file-posix: Implement image locking Fam Zheng
@ 2017-01-19 14:38 ` Fam Zheng
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 16/16] tests: Add test-image-lock Fam Zheng
  15 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-01-19 14:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Blake, Kevin Wolf, Max Reitz, qemu-block, rjones

Writing to the same qcow2 file from two QEMU processes at the same time
will never work correctly, so disable it even when the caller specifies
BDRV_O_RDWR.

Other formats are less vulnerable because they don't have internal
snapshots thus qemu-img is less often misused to create live snapshot.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/qcow2.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 96fb8a8..879361a 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1177,6 +1177,17 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
         }
     }
 
+    if ((flags & BDRV_O_SHARE_RW) && (flags & BDRV_O_RDWR)) {
+        /* Shared write is never a good idea for qcow2, override it.
+         * XXX: Use permission propagation and masking mechanism in op blockers
+         * API once it's there. */
+        ret = bdrv_reopen(bs->file->bs, flags & ~BDRV_O_SHARE_RW, &local_err);
+        if (ret) {
+            error_propagate(errp, local_err);
+            goto fail;
+        }
+    }
+
 #ifdef DEBUG_ALLOC
     {
         BdrvCheckResult result = {0};
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v10 16/16] tests: Add test-image-lock
  2017-01-19 14:38 [Qemu-devel] [PATCH v10 00/16] block: Image locking series Fam Zheng
                   ` (14 preceding siblings ...)
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 15/16] qcow2: Force "no other writer" lock on bs->file Fam Zheng
@ 2017-01-19 14:38 ` Fam Zheng
  15 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-01-19 14:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Blake, Kevin Wolf, Max Reitz, qemu-block, rjones

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/Makefile.include   |   2 +
 tests/test-image-lock.c  | 200 +++++++++++++++++++++++++++++++++++++++++++++++
 tests/test-replication.c |   6 +-
 3 files changed, 205 insertions(+), 3 deletions(-)
 create mode 100644 tests/test-image-lock.c

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 96f5970..7718a9b 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -55,6 +55,7 @@ check-unit-y += tests/test-hbitmap$(EXESUF)
 gcov-files-test-hbitmap-y = blockjob.c
 check-unit-y += tests/test-blockjob$(EXESUF)
 check-unit-y += tests/test-blockjob-txn$(EXESUF)
+check-unit-y += tests/test-image-lock$(EXESUF)
 check-unit-y += tests/test-x86-cpuid$(EXESUF)
 # all code tested by test-x86-cpuid is inside topology.h
 gcov-files-test-x86-cpuid-y =
@@ -516,6 +517,7 @@ tests/test-aio$(EXESUF): tests/test-aio.o $(test-block-obj-y)
 tests/test-throttle$(EXESUF): tests/test-throttle.o $(test-block-obj-y)
 tests/test-blockjob$(EXESUF): tests/test-blockjob.o $(test-block-obj-y) $(test-util-obj-y)
 tests/test-blockjob-txn$(EXESUF): tests/test-blockjob-txn.o $(test-block-obj-y) $(test-util-obj-y)
+tests/test-image-lock$(EXESUF): tests/test-image-lock.o $(test-block-obj-y) $(libqos-obj-y)
 tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(test-block-obj-y)
 tests/test-iov$(EXESUF): tests/test-iov.o $(test-util-obj-y)
 tests/test-hbitmap$(EXESUF): tests/test-hbitmap.o $(test-util-obj-y)
diff --git a/tests/test-image-lock.c b/tests/test-image-lock.c
new file mode 100644
index 0000000..df86ff0
--- /dev/null
+++ b/tests/test-image-lock.c
@@ -0,0 +1,200 @@
+/*
+ * Image lock tests
+ *
+ * Copyright 2016 Red Hat, Inc.
+ *
+ * Authors:
+ *  Fam Zheng <famz@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "qapi/qmp/qbool.h"
+#include "sysemu/block-backend.h"
+
+#define DEBUG_IMAGE_LOCK_TEST 0
+#define DPRINTF(...) do { \
+        if (DEBUG_IMAGE_LOCK_TEST) { \
+            printf(__VA_ARGS__); \
+        } \
+    } while (0)
+
+#define TEST_IMAGE_SIZE 4096
+static char test_image[] = "/tmp/qtest.XXXXXX";
+static int test_image_fd;
+
+static BlockBackend *open_test_image(int flags, bool disable_lock)
+{
+    QDict *opts = qdict_new();
+
+    qdict_set_default_str(opts, "filename", test_image);
+    qdict_set_default_str(opts, "driver", "file");
+    if (disable_lock) {
+        qdict_put(opts, "disable-lock", qbool_from_bool(true));
+    }
+
+    return blk_new_open(NULL, NULL, opts, flags | BDRV_O_ALLOW_RDWR, NULL);
+}
+
+#define RW true
+#define RO false
+#define SHARE true
+#define EXCLU false
+
+static struct CompatData {
+    bool write_1;
+    bool share_1;
+    bool write_2;
+    bool share_2;
+    bool compatible;
+} compat_data[] = {
+    /* Write 1, Share 1, Write 2, Share 2, Compatible. */
+    { RO,       SHARE,   RO,      SHARE,   true,  },
+    { RO,       SHARE,   RO,      EXCLU,   true,  },
+    { RO,       SHARE,   RW,      SHARE,   true,  },
+    { RO,       SHARE,   RW,      EXCLU,   true,  },
+
+    { RO,       EXCLU,   RO,      EXCLU,   true,  },
+    { RO,       EXCLU,   RW,      SHARE,   false, },
+    { RO,       EXCLU,   RW,      EXCLU,   false, },
+
+    { RW,       SHARE,   RW,      SHARE,   true, },
+    { RW,       SHARE,   RW,      EXCLU,   false, },
+
+    { RW,       EXCLU,   RW,      EXCLU,   false, },
+};
+
+/* Test one combination scenario.
+ *
+ * @flags1: The flags of the first blk.
+ * @flags2: The flags of the second blk.
+ * @disable1: The value for raw-posix disable-lock option of the first blk.
+ * @disable2: The value for raw-posix disable-lock option of the second blk.
+ * @from_reopen: Whether or not the first blk should get flags1 from a reopen.
+ * @initial: The source flags from which the blk1 is reopened, only
+ *                effective if @from_reopen is true.
+ */
+static void do_test_compat_one(int flags1, int flags2,
+                               bool disable1, bool disable2,
+                               bool from_reopen, int initial_flags,
+                               bool compatible)
+{
+    BlockBackend *blk1, *blk2;
+
+    DPRINTF("\n===\ndo test compat one\n");
+    DPRINTF("flags %x %x\n", flags1, flags2);
+    DPRINTF("disable %d %d\n", disable1, disable2);
+    DPRINTF("from reopen %d, initial flags %d\n", from_reopen, initial_flags);
+    DPRINTF("compatible %d\n", compatible);
+    if (!from_reopen) {
+        blk1 = open_test_image(flags1, disable1);
+    } else {
+        int ret;
+        blk1 = open_test_image(initial_flags, disable1);
+        BlockReopenQueue *rq = NULL;
+
+        rq = bdrv_reopen_queue(rq, blk_bs(blk1), NULL, flags1);
+        ret = bdrv_reopen_multiple(blk_get_aio_context(blk1), rq, &error_abort);
+        g_assert_cmpint(ret, ==, 0);
+    }
+    g_assert_nonnull(blk1);
+    g_assert_cmphex(blk_get_flags(blk1) & (BDRV_O_SHARE_RW | BDRV_O_RDWR),
+                    ==, flags1);
+    blk2 = open_test_image(flags2, disable2);
+    if (compatible) {
+        g_assert_nonnull(blk2);
+    } else {
+        g_assert_null(blk2);
+    }
+    blk_unref(blk1);
+    blk_unref(blk2);
+}
+
+static void do_test_compat(bool test_disable, bool from_reopen,
+                           int initial_flags)
+{
+    int i;
+    int flags1, flags2;
+
+    for (i = 0; i < ARRAY_SIZE(compat_data); i++) {
+        struct CompatData *data = &compat_data[i];
+        bool compat = data->compatible;
+
+        flags1 = (data->write_1 ? BDRV_O_RDWR : 0) |
+                 (data->share_1 ? BDRV_O_SHARE_RW : 0);
+        flags2 = (data->write_2 ? BDRV_O_RDWR : 0) |
+                 (data->share_2 ? BDRV_O_SHARE_RW : 0);
+        if (!test_disable) {
+            do_test_compat_one(flags1, flags2, false, false,
+                               from_reopen, initial_flags, compat);
+
+            do_test_compat_one(flags2, flags1, false, false,
+                               from_reopen, initial_flags, compat);
+        } else {
+            compat = true;
+            do_test_compat_one(flags1, flags2, true, false,
+                               from_reopen, initial_flags, compat);
+            do_test_compat_one(flags1, flags2, false, true,
+                               from_reopen, initial_flags, compat);
+            do_test_compat_one(flags2, flags1, true, false,
+                               from_reopen, initial_flags, compat);
+            do_test_compat_one(flags2, flags1, false, true,
+                               from_reopen, initial_flags, compat);
+            do_test_compat_one(flags1, flags2, true, true,
+                               from_reopen, initial_flags, compat);
+        }
+    }
+}
+
+static void test_compat(void)
+{
+    do_test_compat(false, false, 0);
+}
+
+static void test_compat_after_reopen(void)
+{
+    do_test_compat(false, true, 0);
+    do_test_compat(false, true, BDRV_O_SHARE_RW);
+    do_test_compat(false, true, BDRV_O_RDWR);
+    do_test_compat(false, true, BDRV_O_RDWR | BDRV_O_SHARE_RW);
+}
+
+static void test_0bytefile(void)
+{
+    ftruncate(test_image_fd, 0);
+    do_test_compat(false, false, 0);
+}
+
+static void test_disable(void)
+{
+    do_test_compat(true, false, 0);
+    do_test_compat(true, true, 0);
+    do_test_compat(true, true, BDRV_O_SHARE_RW);
+    do_test_compat(true, true, BDRV_O_RDWR);
+    do_test_compat(true, true, BDRV_O_RDWR | BDRV_O_SHARE_RW);
+}
+
+int main(int argc, char **argv)
+{
+    int r;
+    test_image_fd = mkstemp(test_image);
+
+    qemu_init_main_loop(&error_fatal);
+    bdrv_init();
+
+    g_assert(test_image_fd >= 0);
+    ftruncate(test_image_fd, TEST_IMAGE_SIZE);
+    g_test_init(&argc, &argv, NULL);
+    g_test_add_func("/image-lock/compat", test_compat);
+    g_test_add_func("/image-lock/compat_after_reopen", test_compat_after_reopen);
+    g_test_add_func("/image-lock/compat_0bytefile", test_0bytefile);
+    g_test_add_func("/image-lock/disable", test_disable);
+    aio_context_acquire(qemu_get_aio_context());
+    r = g_test_run();
+    aio_context_release(qemu_get_aio_context());
+    return r;
+}
diff --git a/tests/test-replication.c b/tests/test-replication.c
index 5bede49..5fb69d2 100644
--- a/tests/test-replication.c
+++ b/tests/test-replication.c
@@ -312,7 +312,7 @@ static BlockBackend *start_secondary(void)
 
     /* add s_local_disk and forge S_LOCAL_DISK_ID */
     cmdline = g_strdup_printf("file.filename=%s,driver=qcow2,"
-                              "file.file.disable-lock=on",
+                              "file.disable-lock=on",
                               s_local_disk);
     opts = qemu_opts_parse_noisily(&qemu_drive_opts, cmdline, false);
     g_free(cmdline);
@@ -334,10 +334,10 @@ static BlockBackend *start_secondary(void)
     /* add S_(ACTIVE/HIDDEN)_DISK and forge S_ID */
     cmdline = g_strdup_printf("driver=replication,mode=secondary,top-id=%s,"
                               "file.driver=qcow2,file.file.filename=%s,"
-                              "file.file.disable-lock=on",
+                              "file.file.disable-lock=on,"
                               "file.backing.driver=qcow2,"
                               "file.backing.file.filename=%s,"
-                              "file.backing.file.disable-lock=on",
+                              "file.backing.file.disable-lock=on,"
                               "file.backing.backing=%s"
                               , S_ID, s_active_disk, s_hidden_disk
                               , S_LOCAL_DISK_ID);
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v10 14/16] file-posix: Implement image locking
  2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 14/16] file-posix: Implement image locking Fam Zheng
@ 2017-01-19 15:49   ` Daniel P. Berrange
  2017-01-19 16:19     ` [Qemu-devel] [Qemu-block] " Eric Blake
  2017-01-20  1:56     ` [Qemu-devel] " Fam Zheng
  0 siblings, 2 replies; 21+ messages in thread
From: Daniel P. Berrange @ 2017-01-19 15:49 UTC (permalink / raw)
  To: Fam Zheng; +Cc: qemu-devel, Kevin Wolf, rjones, qemu-block, Max Reitz

On Thu, Jan 19, 2017 at 10:38:14PM +0800, Fam Zheng wrote:
> This implements open flag sensible image locking for local file
> and host device protocol.
> 
> virtlockd in libvirt locks the first byte, so we start looking at the
> file bytes from 1.
> 
> Quoting what was proposed by Kevin Wolf <kwolf@redhat.com>, there are
> four locking modes by combining two bits (BDRV_O_RDWR and
> BDRV_O_SHARE_RW), and implemented by taking two locks:
> 
> Lock bytes:
> 
> * byte 1: I can't allow other processes to write to the image
> * byte 2: I am writing to the image
> 
> Lock modes:
> 
> * shared writer (BDRV_O_RDWR | BDRV_O_SHARE_RW): Take shared lock on
>   byte 2. Test whether byte 1 is locked using an exclusive lock, and
>   fail if so.
> 
> * exclusive writer (BDRV_O_RDWR only): Take shared lock on byte 2. Test
>   whether byte 1 is locked using an exclusive lock, and fail if so. Then
>   take shared lock on byte 1. I suppose this is racy, but we can
>   probably tolerate that.
> 
> * reader that can tolerate writers (BDRV_O_SHARE_RW only): Don't do anything
> 
> * reader that can't tolerate writers (neither bit is set): Take shared
>   lock on byte 1. Test whether byte 2 is locked, and fail if so.

Ahh, using two bytes is an interesting technique for mapping the four
different access methods onto the more limit fcntl lock semantics. We
might want to copy this approach in libvirt too....

> +/* Posix file locking bytes. Libvirt takes byte 0, so start from byte 1. */
> +#define RAW_LOCK_BYTE_MIN             1
> +#define RAW_LOCK_BYTE_NO_OTHER_WRITER 1
> +#define RAW_LOCK_BYTE_WRITE           2

...would you mind if QEMU started from say byte 10, leaving the first 10
reserved for libvirt uses. This lets libvirt have a continuous space for
its own usage if we want to use more bytes

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v10 14/16] file-posix: Implement image locking
  2017-01-19 15:49   ` Daniel P. Berrange
@ 2017-01-19 16:19     ` Eric Blake
  2017-01-19 17:35       ` Richard W.M. Jones
  2017-01-20  1:56     ` [Qemu-devel] " Fam Zheng
  1 sibling, 1 reply; 21+ messages in thread
From: Eric Blake @ 2017-01-19 16:19 UTC (permalink / raw)
  To: Daniel P. Berrange, Fam Zheng
  Cc: Kevin Wolf, Max Reitz, qemu-devel, qemu-block, rjones

[-- Attachment #1: Type: text/plain, Size: 1278 bytes --]

On 01/19/2017 09:49 AM, Daniel P. Berrange wrote:
> On Thu, Jan 19, 2017 at 10:38:14PM +0800, Fam Zheng wrote:
>> This implements open flag sensible image locking for local file
>> and host device protocol.
>>
>> virtlockd in libvirt locks the first byte, so we start looking at the
>> file bytes from 1.
>>
>> Quoting what was proposed by Kevin Wolf <kwolf@redhat.com>, there are
>> four locking modes by combining two bits (BDRV_O_RDWR and
>> BDRV_O_SHARE_RW), and implemented by taking two locks:
>>

>> +/* Posix file locking bytes. Libvirt takes byte 0, so start from byte 1. */
>> +#define RAW_LOCK_BYTE_MIN             1
>> +#define RAW_LOCK_BYTE_NO_OTHER_WRITER 1
>> +#define RAW_LOCK_BYTE_WRITE           2
> 
> ...would you mind if QEMU started from say byte 10, leaving the first 10
> reserved for libvirt uses. This lets libvirt have a continuous space for
> its own usage if we want to use more bytes

Thankfully, fcntl() locks can be taken beyond end-of-file, so I think
we're okay making an arbitrarily larger range of bytes reserved for each
process, even in the unlikely corner case of passing files smaller than
512 bytes.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v10 14/16] file-posix: Implement image locking
  2017-01-19 16:19     ` [Qemu-devel] [Qemu-block] " Eric Blake
@ 2017-01-19 17:35       ` Richard W.M. Jones
  0 siblings, 0 replies; 21+ messages in thread
From: Richard W.M. Jones @ 2017-01-19 17:35 UTC (permalink / raw)
  To: Eric Blake
  Cc: Daniel P. Berrange, Fam Zheng, Kevin Wolf, Max Reitz, qemu-devel,
	qemu-block

On Thu, Jan 19, 2017 at 10:19:29AM -0600, Eric Blake wrote:
> On 01/19/2017 09:49 AM, Daniel P. Berrange wrote:
> > On Thu, Jan 19, 2017 at 10:38:14PM +0800, Fam Zheng wrote:
> >> This implements open flag sensible image locking for local file
> >> and host device protocol.
> >>
> >> virtlockd in libvirt locks the first byte, so we start looking at the
> >> file bytes from 1.
> >>
> >> Quoting what was proposed by Kevin Wolf <kwolf@redhat.com>, there are
> >> four locking modes by combining two bits (BDRV_O_RDWR and
> >> BDRV_O_SHARE_RW), and implemented by taking two locks:
> >>
> 
> >> +/* Posix file locking bytes. Libvirt takes byte 0, so start from byte 1. */
> >> +#define RAW_LOCK_BYTE_MIN             1
> >> +#define RAW_LOCK_BYTE_NO_OTHER_WRITER 1
> >> +#define RAW_LOCK_BYTE_WRITE           2
> > 
> > ...would you mind if QEMU started from say byte 10, leaving the first 10
> > reserved for libvirt uses. This lets libvirt have a continuous space for
> > its own usage if we want to use more bytes
> 
> Thankfully, fcntl() locks can be taken beyond end-of-file, so I think
> we're okay making an arbitrarily larger range of bytes reserved for each
> process, even in the unlikely corner case of passing files smaller than
> 512 bytes.

Not so unlikely.  libguestfs actually detects and works around this
case because qemu was (possibly still is) very broken when you pass
small files.

https://github.com/libguestfs/libguestfs/blob/master/src/drives.c#L395-L441

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v10 14/16] file-posix: Implement image locking
  2017-01-19 15:49   ` Daniel P. Berrange
  2017-01-19 16:19     ` [Qemu-devel] [Qemu-block] " Eric Blake
@ 2017-01-20  1:56     ` Fam Zheng
  1 sibling, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2017-01-20  1:56 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: Kevin Wolf, Max Reitz, qemu-devel, qemu-block, rjones

On Thu, 01/19 15:49, Daniel P. Berrange wrote:
> On Thu, Jan 19, 2017 at 10:38:14PM +0800, Fam Zheng wrote:
> > This implements open flag sensible image locking for local file
> > and host device protocol.
> > 
> > virtlockd in libvirt locks the first byte, so we start looking at the
> > file bytes from 1.
> > 
> > Quoting what was proposed by Kevin Wolf <kwolf@redhat.com>, there are
> > four locking modes by combining two bits (BDRV_O_RDWR and
> > BDRV_O_SHARE_RW), and implemented by taking two locks:
> > 
> > Lock bytes:
> > 
> > * byte 1: I can't allow other processes to write to the image
> > * byte 2: I am writing to the image
> > 
> > Lock modes:
> > 
> > * shared writer (BDRV_O_RDWR | BDRV_O_SHARE_RW): Take shared lock on
> >   byte 2. Test whether byte 1 is locked using an exclusive lock, and
> >   fail if so.
> > 
> > * exclusive writer (BDRV_O_RDWR only): Take shared lock on byte 2. Test
> >   whether byte 1 is locked using an exclusive lock, and fail if so. Then
> >   take shared lock on byte 1. I suppose this is racy, but we can
> >   probably tolerate that.
> > 
> > * reader that can tolerate writers (BDRV_O_SHARE_RW only): Don't do anything
> > 
> > * reader that can't tolerate writers (neither bit is set): Take shared
> >   lock on byte 1. Test whether byte 2 is locked, and fail if so.
> 
> Ahh, using two bytes is an interesting technique for mapping the four
> different access methods onto the more limit fcntl lock semantics. We
> might want to copy this approach in libvirt too....
> 
> > +/* Posix file locking bytes. Libvirt takes byte 0, so start from byte 1. */
> > +#define RAW_LOCK_BYTE_MIN             1
> > +#define RAW_LOCK_BYTE_NO_OTHER_WRITER 1
> > +#define RAW_LOCK_BYTE_WRITE           2
> 
> ...would you mind if QEMU started from say byte 10, leaving the first 10
> reserved for libvirt uses. This lets libvirt have a continuous space for
> its own usage if we want to use more bytes

That's easy, will do it. (Actually the descriptions above are a bit stale
because exclusive writers now take two bytes exclusively and the lock testings
are done with F_OFD_GETLK so that RO readers can test against shared locks
without acquiring a write lock, which requires O_RDWR).

Fam

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2017-01-20  1:56 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-19 14:38 [Qemu-devel] [PATCH v10 00/16] block: Image locking series Fam Zheng
2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 01/16] osdep: Add qemu_lock_fd and qemu_unlock_fd Fam Zheng
2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 02/16] block: Define BDRV_O_SHARE_RW Fam Zheng
2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 03/16] qemu-io: Set "share-rw" flag together with read-only Fam Zheng
2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 04/16] qemu-img: Set "share-rw" flag in read-only commands Fam Zheng
2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 05/16] block: Set "share-rw" flag in drive-backup when sync=none Fam Zheng
2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 06/16] iotests: 055: Don't attach the drive to vm for drive-backup Fam Zheng
2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 07/16] iotests: 030: Read-only open image for getting map Fam Zheng
2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 08/16] iotests: 087: Don't attach test image twice Fam Zheng
2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 09/16] iotests: 085: Avoid image locking conflict Fam Zheng
2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 10/16] iotests: 091: Quit QEMU before checking image Fam Zheng
2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 11/16] iotests: 172: Use separate images for multiple devices Fam Zheng
2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 12/16] tests: Use null-co:// instead of /dev/null as the dummy image Fam Zheng
2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 13/16] tests: Disable image lock in test-replication Fam Zheng
2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 14/16] file-posix: Implement image locking Fam Zheng
2017-01-19 15:49   ` Daniel P. Berrange
2017-01-19 16:19     ` [Qemu-devel] [Qemu-block] " Eric Blake
2017-01-19 17:35       ` Richard W.M. Jones
2017-01-20  1:56     ` [Qemu-devel] " Fam Zheng
2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 15/16] qcow2: Force "no other writer" lock on bs->file Fam Zheng
2017-01-19 14:38 ` [Qemu-devel] [PATCH v10 16/16] tests: Add test-image-lock Fam Zheng

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.