All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v12 00/16] block: Image locking series
@ 2017-01-23 12:30 Fam Zheng
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 01/16] osdep: Add qemu_lock_fd and qemu_unlock_fd Fam Zheng
                   ` (16 more replies)
  0 siblings, 17 replies; 39+ messages in thread
From: Fam Zheng @ 2017-01-23 12:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones, Max Reitz

v12: Fix test cases on old systems that doesn't have F_OFD_SETLK, such as RHEL
     6. [Patchew]
     Trim the commit message of patch 15 to avoid bitrotting.

v11: Move lock bytes from 1-2 to 0x10-0x12. [Daniel]

v10: While we still don't have comprehensive propagation mechanism that will be
provided by new op blocker system for "permissive modes", the locking enabled
by default is regardlessly useful and long overdue. So I think we should merge
this for 2.9 and build user options on top later when the op blocker API
settles.

Address comments from Max and Eric:
  - Use F_OFD_GETLK instead of opening r/w for ro images. [Max]
  - Lock both bytes exclusively for non-shared write. [Max]
  - Spell fixes. [Eric]
  - Fix test matrix. [Max]
  - Comment tweaks. [Max]
  - Return code cleanups. [Max]
  - Don't abuse "disable_lock" for migration. [Max]
  - Use bs->exact_filename instead of bs->filename. [Max]
  - Force protect qcow2 concurrent write.
  - Fix indentation. [Max]
  - Always use re-open for lockfd instead of dup. [Max]
  - Fall through to "abort" where "prepare" failed. [Max]
  - Fix option handling in raw_reopen_handle_lock. [Max]
  - Use "error_abort" in commit and abort. [Max]
  - Fix cleanup of raw_reopen_handle_lock() failure. [Max]
  - Add a patch for qcow2 to mask BDRV_O_SHARE_RW if r/w.
  - Rebase and fix new more cases that will be broken by "lock by default".

Fam Zheng (16):
  osdep: Add qemu_lock_fd and qemu_unlock_fd
  block: Define BDRV_O_SHARE_RW
  qemu-io: Set "share-rw" flag together with read-only
  qemu-img: Set "share-rw" flag in read-only commands
  block: Set "share-rw" flag in drive-backup when sync=none
  iotests: 055: Don't attach the drive to vm for drive-backup
  iotests: 030: Read-only open image for getting map
  iotests: 087: Don't attach test image twice
  iotests: 085: Avoid image locking conflict
  iotests: 091: Quit QEMU before checking image
  iotests: 172: Use separate images for multiple devices
  tests: Use null-co:// instead of /dev/null as the dummy image
  tests: Disable image lock in test-replication
  file-posix: Implement image locking
  qcow2: Force "no other writer" lock on bs->file
  tests: Add test-image-lock

 block/file-posix.c         | 681 ++++++++++++++++++++++++++++++++++++++++++++-
 block/qcow2.c              |  11 +
 blockdev.c                 |   3 +
 include/block/block.h      |   2 +
 include/qemu/osdep.h       |   3 +
 qemu-img.c                 |  10 +-
 qemu-io.c                  |   2 +
 tests/Makefile.include     |   2 +
 tests/drive_del-test.c     |   2 +-
 tests/nvme-test.c          |   2 +-
 tests/qemu-iotests/030     |   4 +-
 tests/qemu-iotests/055     |  32 ++-
 tests/qemu-iotests/085     |  32 ++-
 tests/qemu-iotests/085.out |   3 +-
 tests/qemu-iotests/087     |   6 +-
 tests/qemu-iotests/091     |   2 +
 tests/qemu-iotests/172     |  53 ++--
 tests/qemu-iotests/172.out |  50 ++--
 tests/test-image-lock.c    | 206 ++++++++++++++
 tests/test-replication.c   |   9 +-
 tests/usb-hcd-uhci-test.c  |   2 +-
 tests/usb-hcd-xhci-test.c  |   2 +-
 tests/virtio-blk-test.c    |   2 +-
 tests/virtio-scsi-test.c   |   4 +-
 util/osdep.c               |  48 ++++
 25 files changed, 1076 insertions(+), 97 deletions(-)
 create mode 100644 tests/test-image-lock.c

-- 
2.9.3

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v12 01/16] osdep: Add qemu_lock_fd and qemu_unlock_fd
  2017-01-23 12:30 [Qemu-devel] [PATCH v12 00/16] block: Image locking series Fam Zheng
@ 2017-01-23 12:30 ` Fam Zheng
  2017-02-05 21:38   ` Max Reitz
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 02/16] block: Define BDRV_O_SHARE_RW Fam Zheng
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Fam Zheng @ 2017-01-23 12:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones, Max Reitz

They are wrappers of POSIX fcntl "file private locking", with a
convenient "try lock" wrapper implemented with F_OFD_GETLK.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 include/qemu/osdep.h |  3 +++
 util/osdep.c         | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index 689f253..e864fe8 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -297,6 +297,9 @@ int qemu_close(int fd);
 #ifndef _WIN32
 int qemu_dup(int fd);
 #endif
+int qemu_lock_fd(int fd, int64_t start, int64_t len, bool exclusive);
+int qemu_unlock_fd(int fd, int64_t start, int64_t len);
+int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive);
 
 #if defined(__HAIKU__) && defined(__i386__)
 #define FMT_pid "%ld"
diff --git a/util/osdep.c b/util/osdep.c
index 06fb1cf..3de4a18 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -140,6 +140,54 @@ static int qemu_parse_fdset(const char *param)
 {
     return qemu_parse_fd(param);
 }
+
+static int qemu_lock_fcntl(int fd, int64_t start, int64_t len, int fl_type)
+{
+#ifdef F_OFD_SETLK
+    int ret;
+    struct flock fl = {
+        .l_whence = SEEK_SET,
+        .l_start  = start,
+        .l_len    = len,
+        .l_type   = fl_type,
+    };
+    ret = fcntl(fd, F_OFD_SETLK, &fl);
+    return ret == -1 ? -errno : 0;
+#else
+    return -ENOTSUP;
+#endif
+}
+
+int qemu_lock_fd(int fd, int64_t start, int64_t len, bool exclusive)
+{
+    return qemu_lock_fcntl(fd, start, len, exclusive ? F_WRLCK : F_RDLCK);
+}
+
+int qemu_unlock_fd(int fd, int64_t start, int64_t len)
+{
+    return qemu_lock_fcntl(fd, start, len, F_UNLCK);
+}
+
+int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive)
+{
+#ifdef F_OFD_SETLK
+    int ret;
+    struct flock fl = {
+        .l_whence = SEEK_SET,
+        .l_start  = start,
+        .l_len    = len,
+        .l_type   = exclusive ? F_WRLCK : F_RDLCK,
+    };
+    ret = fcntl(fd, F_OFD_GETLK, &fl);
+    if (ret == -1) {
+        return -errno;
+    } else {
+        return fl.l_type == F_UNLCK ? 0 : -EAGAIN;
+    }
+#else
+    return -ENOTSUP;
+#endif
+}
 #endif
 
 /*
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v12 02/16] block: Define BDRV_O_SHARE_RW
  2017-01-23 12:30 [Qemu-devel] [PATCH v12 00/16] block: Image locking series Fam Zheng
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 01/16] osdep: Add qemu_lock_fd and qemu_unlock_fd Fam Zheng
@ 2017-01-23 12:30 ` Fam Zheng
  2017-02-05 21:51   ` Max Reitz
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 03/16] qemu-io: Set "share-rw" flag together with read-only Fam Zheng
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Fam Zheng @ 2017-01-23 12:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones, Max Reitz

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 include/block/block.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index 8b0dcda..243839d 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -97,6 +97,8 @@ typedef struct HDGeometry {
                                       select an appropriate protocol driver,
                                       ignoring the format layer */
 #define BDRV_O_NO_IO       0x10000 /* don't initialize for I/O */
+#define BDRV_O_SHARE_RW    0x20000 /* accept shared read-write from other users
+                                      of the image */
 
 #define BDRV_O_CACHE_MASK  (BDRV_O_NOCACHE | BDRV_O_NO_FLUSH)
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v12 03/16] qemu-io: Set "share-rw" flag together with read-only
  2017-01-23 12:30 [Qemu-devel] [PATCH v12 00/16] block: Image locking series Fam Zheng
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 01/16] osdep: Add qemu_lock_fd and qemu_unlock_fd Fam Zheng
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 02/16] block: Define BDRV_O_SHARE_RW Fam Zheng
@ 2017-01-23 12:30 ` Fam Zheng
  2017-02-05 21:52   ` Max Reitz
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 04/16] qemu-img: Set "share-rw" flag in read-only commands Fam Zheng
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Fam Zheng @ 2017-01-23 12:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones, Max Reitz

qemu-io is a low level tool to read or modify guest visible data, which
implies the user knows very well what is being done. Allowing reading
from a locked image is harmless in most cases, so do it.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 qemu-io.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/qemu-io.c b/qemu-io.c
index 23a229f..f000504 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -585,6 +585,8 @@ int main(int argc, char **argv)
     /* open the device */
     if (!readonly) {
         flags |= BDRV_O_RDWR;
+    } else {
+        flags |= BDRV_O_SHARE_RW;
     }
 
     if ((argc - optind) == 1) {
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v12 04/16] qemu-img: Set "share-rw" flag in read-only commands
  2017-01-23 12:30 [Qemu-devel] [PATCH v12 00/16] block: Image locking series Fam Zheng
                   ` (2 preceding siblings ...)
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 03/16] qemu-io: Set "share-rw" flag together with read-only Fam Zheng
@ 2017-01-23 12:30 ` Fam Zheng
  2017-02-05 21:55   ` Max Reitz
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 05/16] block: Set "share-rw" flag in drive-backup when sync=none Fam Zheng
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Fam Zheng @ 2017-01-23 12:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones, Max Reitz

Checking the status of an image when it is being used by guest is
usually useful, and there is no risk of corrupting data, so don't let
the upcoming image locking feature limit this use case.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 qemu-img.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 5df66fe..6a091e0 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -705,6 +705,10 @@ static int img_check(int argc, char **argv)
         return 1;
     }
 
+    if (!(flags & BDRV_O_RDWR)) {
+        flags |= BDRV_O_SHARE_RW;
+    }
+
     blk = img_open(image_opts, filename, fmt, flags, writethrough, quiet);
     if (!blk) {
         return 1;
@@ -1238,6 +1242,7 @@ static int img_compare(int argc, char **argv)
         goto out3;
     }
 
+    flags |= BDRV_O_SHARE_RW;
     blk1 = img_open(image_opts, filename1, fmt1, flags, writethrough, quiet);
     if (!blk1) {
         ret = 2;
@@ -2286,7 +2291,8 @@ static ImageInfoList *collect_image_info_list(bool image_opts,
         g_hash_table_insert(filenames, (gpointer)filename, NULL);
 
         blk = img_open(image_opts, filename, fmt,
-                       BDRV_O_NO_BACKING | BDRV_O_NO_IO, false, false);
+                       BDRV_O_NO_BACKING | BDRV_O_NO_IO | BDRV_O_SHARE_RW,
+                       false, false);
         if (!blk) {
             goto err;
         }
@@ -2612,7 +2618,7 @@ static int img_map(int argc, char **argv)
         return 1;
     }
 
-    blk = img_open(image_opts, filename, fmt, 0, false, false);
+    blk = img_open(image_opts, filename, fmt, BDRV_O_SHARE_RW, false, false);
     if (!blk) {
         return 1;
     }
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v12 05/16] block: Set "share-rw" flag in drive-backup when sync=none
  2017-01-23 12:30 [Qemu-devel] [PATCH v12 00/16] block: Image locking series Fam Zheng
                   ` (3 preceding siblings ...)
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 04/16] qemu-img: Set "share-rw" flag in read-only commands Fam Zheng
@ 2017-01-23 12:30 ` Fam Zheng
  2017-02-05 21:57   ` Max Reitz
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 06/16] iotests: 055: Don't attach the drive to vm for drive-backup Fam Zheng
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Fam Zheng @ 2017-01-23 12:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones, Max Reitz

In this case we may open the source's backing image chain multiple
times. Setting share flag means the new open won't try to acquire or
check any lock, once we implement image locking.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index 245e1e1..c97e97f 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3177,6 +3177,9 @@ static BlockJob *do_drive_backup(DriveBackup *backup, BlockJobTxn *txn,
     }
     if (backup->sync == MIRROR_SYNC_MODE_NONE) {
         source = bs;
+        /* FIXME: block layer should really open target with BDRV_O_NO_BACKING
+         * and reuse source's backing chain, if they share one. */
+        flags |= BDRV_O_SHARE_RW;
     }
 
     size = bdrv_getlength(bs);
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v12 06/16] iotests: 055: Don't attach the drive to vm for drive-backup
  2017-01-23 12:30 [Qemu-devel] [PATCH v12 00/16] block: Image locking series Fam Zheng
                   ` (4 preceding siblings ...)
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 05/16] block: Set "share-rw" flag in drive-backup when sync=none Fam Zheng
@ 2017-01-23 12:30 ` Fam Zheng
  2017-02-08  0:06   ` Max Reitz
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 07/16] iotests: 030: Read-only open image for getting map Fam Zheng
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Fam Zheng @ 2017-01-23 12:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones, Max Reitz

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/055 | 32 ++++++++++++++++++--------------
 1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/tests/qemu-iotests/055 b/tests/qemu-iotests/055
index 1d3fd04..20a7596 100755
--- a/tests/qemu-iotests/055
+++ b/tests/qemu-iotests/055
@@ -455,17 +455,18 @@ class TestDriveCompression(iotests.QMPTestCase):
         except OSError:
             pass
 
-    def do_prepare_drives(self, fmt, args):
+    def do_prepare_drives(self, fmt, args, attach):
         self.vm = iotests.VM().add_drive(test_img)
 
         qemu_img('create', '-f', fmt, blockdev_target_img,
                  str(TestDriveCompression.image_len), *args)
-        self.vm.add_drive(blockdev_target_img, format=fmt)
+        if attach:
+            self.vm.add_drive(blockdev_target_img, format=fmt)
 
         self.vm.launch()
 
-    def do_test_compress_complete(self, cmd, format, **args):
-        self.do_prepare_drives(format['type'], format['args'])
+    def do_test_compress_complete(self, cmd, format, attach, **args):
+        self.do_prepare_drives(format['type'], format['args'], attach)
 
         self.assert_no_active_block_jobs()
 
@@ -481,15 +482,16 @@ class TestDriveCompression(iotests.QMPTestCase):
 
     def test_complete_compress_drive_backup(self):
         for format in TestDriveCompression.fmt_supports_compression:
-            self.do_test_compress_complete('drive-backup', format,
+            self.do_test_compress_complete('drive-backup', format, attach=False,
                                            target=blockdev_target_img, mode='existing')
 
     def test_complete_compress_blockdev_backup(self):
         for format in TestDriveCompression.fmt_supports_compression:
-            self.do_test_compress_complete('blockdev-backup', format, target='drive1')
+            self.do_test_compress_complete('blockdev-backup', format,
+                                           attach=True, target='drive1')
 
-    def do_test_compress_cancel(self, cmd, format, **args):
-        self.do_prepare_drives(format['type'], format['args'])
+    def do_test_compress_cancel(self, cmd, format, attach, **args):
+        self.do_prepare_drives(format['type'], format['args'], attach)
 
         self.assert_no_active_block_jobs()
 
@@ -503,15 +505,16 @@ class TestDriveCompression(iotests.QMPTestCase):
 
     def test_compress_cancel_drive_backup(self):
         for format in TestDriveCompression.fmt_supports_compression:
-            self.do_test_compress_cancel('drive-backup', format,
+            self.do_test_compress_cancel('drive-backup', format, attach=False,
                                          target=blockdev_target_img, mode='existing')
 
     def test_compress_cancel_blockdev_backup(self):
        for format in TestDriveCompression.fmt_supports_compression:
-            self.do_test_compress_cancel('blockdev-backup', format, target='drive1')
+            self.do_test_compress_cancel('blockdev-backup', format, attach=True,
+                                         target='drive1')
 
-    def do_test_compress_pause(self, cmd, format, **args):
-        self.do_prepare_drives(format['type'], format['args'])
+    def do_test_compress_pause(self, cmd, format, attach, **args):
+        self.do_prepare_drives(format['type'], format['args'], attach)
 
         self.assert_no_active_block_jobs()
 
@@ -543,12 +546,13 @@ class TestDriveCompression(iotests.QMPTestCase):
 
     def test_compress_pause_drive_backup(self):
         for format in TestDriveCompression.fmt_supports_compression:
-            self.do_test_compress_pause('drive-backup', format,
+            self.do_test_compress_pause('drive-backup', format, attach=False,
                                         target=blockdev_target_img, mode='existing')
 
     def test_compress_pause_blockdev_backup(self):
         for format in TestDriveCompression.fmt_supports_compression:
-            self.do_test_compress_pause('blockdev-backup', format, target='drive1')
+            self.do_test_compress_pause('blockdev-backup', format, attach=True,
+                                        target='drive1')
 
 if __name__ == '__main__':
     iotests.main(supported_fmts=['raw', 'qcow2'])
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v12 07/16] iotests: 030: Read-only open image for getting map
  2017-01-23 12:30 [Qemu-devel] [PATCH v12 00/16] block: Image locking series Fam Zheng
                   ` (5 preceding siblings ...)
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 06/16] iotests: 055: Don't attach the drive to vm for drive-backup Fam Zheng
@ 2017-01-23 12:30 ` Fam Zheng
  2017-02-08  0:11   ` Max Reitz
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 08/16] iotests: 087: Don't attach test image twice Fam Zheng
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Fam Zheng @ 2017-01-23 12:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones, Max Reitz

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/030 | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/030 b/tests/qemu-iotests/030
index 54db54a..fe0c73f 100755
--- a/tests/qemu-iotests/030
+++ b/tests/qemu-iotests/030
@@ -114,7 +114,7 @@ class TestSingleDrive(iotests.QMPTestCase):
         self.assert_no_active_block_jobs()
 
         # The image map is empty before the operation
-        empty_map = qemu_io('-f', iotests.imgfmt, '-c', 'map', test_img)
+        empty_map = qemu_io('-f', iotests.imgfmt, '-c', 'map', '-r', test_img)
 
         # This is a no-op: no data should ever be copied from the base image
         result = self.vm.qmp('block-stream', device='drive0', base=mid_img)
@@ -125,7 +125,7 @@ class TestSingleDrive(iotests.QMPTestCase):
         self.assert_no_active_block_jobs()
         self.vm.shutdown()
 
-        self.assertEqual(qemu_io('-f', iotests.imgfmt, '-c', 'map', test_img),
+        self.assertEqual(qemu_io('-f', iotests.imgfmt, '-c', 'map', '-r', test_img),
                          empty_map, 'image file map changed after a no-op')
 
     def test_stream_partial(self):
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v12 08/16] iotests: 087: Don't attach test image twice
  2017-01-23 12:30 [Qemu-devel] [PATCH v12 00/16] block: Image locking series Fam Zheng
                   ` (6 preceding siblings ...)
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 07/16] iotests: 030: Read-only open image for getting map Fam Zheng
@ 2017-01-23 12:30 ` Fam Zheng
  2017-02-08  0:14   ` Max Reitz
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 09/16] iotests: 085: Avoid image locking conflict Fam Zheng
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Fam Zheng @ 2017-01-23 12:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones, Max Reitz

The test scenario doesn't require the same image, instead it focuses on
the duplicated node-name, so use null-co to avoid locking conflict.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/087 | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/tests/qemu-iotests/087 b/tests/qemu-iotests/087
index 9de57dd..6d52f7d 100755
--- a/tests/qemu-iotests/087
+++ b/tests/qemu-iotests/087
@@ -82,8 +82,7 @@ run_qemu -drive driver=$IMGFMT,id=disk,node-name=test-node,file="$TEST_IMG" <<EO
       "driver": "$IMGFMT",
       "node-name": "disk",
       "file": {
-          "driver": "file",
-          "filename": "$TEST_IMG"
+          "driver": "null-co"
       }
     }
   }
@@ -92,8 +91,7 @@ run_qemu -drive driver=$IMGFMT,id=disk,node-name=test-node,file="$TEST_IMG" <<EO
       "driver": "$IMGFMT",
       "node-name": "test-node",
       "file": {
-          "driver": "file",
-          "filename": "$TEST_IMG"
+          "driver": "null-co"
       }
     }
   }
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v12 09/16] iotests: 085: Avoid image locking conflict
  2017-01-23 12:30 [Qemu-devel] [PATCH v12 00/16] block: Image locking series Fam Zheng
                   ` (7 preceding siblings ...)
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 08/16] iotests: 087: Don't attach test image twice Fam Zheng
@ 2017-01-23 12:30 ` Fam Zheng
  2017-02-08  0:25   ` Max Reitz
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 10/16] iotests: 091: Quit QEMU before checking image Fam Zheng
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Fam Zheng @ 2017-01-23 12:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones, Max Reitz

In the case where we test the expected error when a blockdev-snapshot
target already has a backing image, backing chain is opened multiple
times. This will be a problem when we use image locking, so use a
different backing file that is not already open.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/085     | 32 +++++++++++++++++++-------------
 tests/qemu-iotests/085.out |  3 ++-
 2 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/tests/qemu-iotests/085 b/tests/qemu-iotests/085
index c53e97f..c91d78d 100755
--- a/tests/qemu-iotests/085
+++ b/tests/qemu-iotests/085
@@ -87,24 +87,26 @@ function create_group_snapshot()
 }
 
 # ${1}: unique identifier for the snapshot filename
-# ${2}: true: open backing images; false: don't open them (default)
+# ${2}: extra_params to the blockdev-add command
+# ${3}: filename
+function do_blockdev_add()
+{
+    cmd="{ 'execute': 'blockdev-add', 'arguments':
+           { 'driver': 'qcow2', 'node-name': 'snap_${1}', ${2}
+             'file':
+             { 'driver': 'file', 'filename': '${3}',
+               'node-name': 'file_${1}' } } }"
+    _send_qemu_cmd $h "${cmd}" "return"
+}
+
+# ${1}: unique identifier for the snapshot filename
 function add_snapshot_image()
 {
-    if [ "${2}" = "true" ]; then
-        extra_params=""
-    else
-        extra_params="'backing': '', "
-    fi
     base_image="${TEST_DIR}/$((${1}-1))-${snapshot_virt0}"
     snapshot_file="${TEST_DIR}/${1}-${snapshot_virt0}"
     _make_test_img -b "${base_image}" "$size"
     mv "${TEST_IMG}" "${snapshot_file}"
-    cmd="{ 'execute': 'blockdev-add', 'arguments':
-           { 'driver': 'qcow2', 'node-name': 'snap_${1}', ${extra_params}
-             'file':
-             { 'driver': 'file', 'filename': '${snapshot_file}',
-               'node-name': 'file_${1}' } } }"
-    _send_qemu_cmd $h "${cmd}" "return"
+    do_blockdev_add "$1" "'backing': '', " "${snapshot_file}"
 }
 
 # ${1}: unique identifier for the snapshot filename
@@ -222,7 +224,11 @@ echo === Invalid command - snapshot node has a backing image ===
 echo
 
 SNAPSHOTS=$((${SNAPSHOTS}+1))
-add_snapshot_image ${SNAPSHOTS} true
+
+_make_test_img "$size"
+mv "${TEST_IMG}" "${TEST_IMG}.base"
+_make_test_img -b "${TEST_IMG}.base" "$size"
+do_blockdev_add ${SNAPSHOTS} "" "${TEST_IMG}"
 blockdev_snapshot ${SNAPSHOTS} error
 
 echo
diff --git a/tests/qemu-iotests/085.out b/tests/qemu-iotests/085.out
index 08e4bb7..7bbf84d 100644
--- a/tests/qemu-iotests/085.out
+++ b/tests/qemu-iotests/085.out
@@ -78,7 +78,8 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 backing_file=TEST_DIR/
 
 === Invalid command - snapshot node has a backing image ===
 
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 backing_file=TEST_DIR/12-snapshot-v0.IMGFMT
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 backing_file=TEST_DIR/t.IMGFMT.base
 {"return": {}}
 {"error": {"class": "GenericError", "desc": "The snapshot already has a backing image"}}
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v12 10/16] iotests: 091: Quit QEMU before checking image
  2017-01-23 12:30 [Qemu-devel] [PATCH v12 00/16] block: Image locking series Fam Zheng
                   ` (8 preceding siblings ...)
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 09/16] iotests: 085: Avoid image locking conflict Fam Zheng
@ 2017-01-23 12:30 ` Fam Zheng
  2017-02-08  0:32   ` Max Reitz
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 11/16] iotests: 172: Use separate images for multiple devices Fam Zheng
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Fam Zheng @ 2017-01-23 12:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones, Max Reitz

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/091 | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/qemu-iotests/091 b/tests/qemu-iotests/091
index 32bbd56..10ac4a8 100755
--- a/tests/qemu-iotests/091
+++ b/tests/qemu-iotests/091
@@ -95,7 +95,9 @@ echo "vm2: qemu process running successfully"
 echo "vm2: flush io, and quit"
 _send_qemu_cmd $h2 'qemu-io disk flush' "(qemu)"
 _send_qemu_cmd $h2 'quit' ""
+_send_qemu_cmd $h1 'quit' ""
 
+wait
 echo "Check image pattern"
 ${QEMU_IO} -c "read -P 0x22 0 4M" "${TEST_IMG}" | _filter_testdir | _filter_qemu_io
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v12 11/16] iotests: 172: Use separate images for multiple devices
  2017-01-23 12:30 [Qemu-devel] [PATCH v12 00/16] block: Image locking series Fam Zheng
                   ` (9 preceding siblings ...)
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 10/16] iotests: 091: Quit QEMU before checking image Fam Zheng
@ 2017-01-23 12:30 ` Fam Zheng
  2017-02-08  0:43   ` Max Reitz
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 12/16] tests: Use null-co:// instead of /dev/null as the dummy image Fam Zheng
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Fam Zheng @ 2017-01-23 12:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones, Max Reitz

To avoid image lock failures.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/172     | 53 ++++++++++++++++++++++++----------------------
 tests/qemu-iotests/172.out | 50 ++++++++++++++++++++++---------------------
 2 files changed, 54 insertions(+), 49 deletions(-)

diff --git a/tests/qemu-iotests/172 b/tests/qemu-iotests/172
index 1b7d3a1..f9d4cff 100755
--- a/tests/qemu-iotests/172
+++ b/tests/qemu-iotests/172
@@ -86,6 +86,9 @@ size=720k
 
 _make_test_img $size
 
+TEST_IMG="$TEST_IMG.2" _make_test_img $size
+TEST_IMG="$TEST_IMG.3" _make_test_img $size
+
 # Default drive semantics:
 #
 # By default you get a single empty floppy drive. You can override it with
@@ -105,7 +108,7 @@ echo === Using -fda/-fdb options ===
 
 check_floppy_qtree -fda "$TEST_IMG"
 check_floppy_qtree -fdb "$TEST_IMG"
-check_floppy_qtree -fda "$TEST_IMG" -fdb "$TEST_IMG"
+check_floppy_qtree -fda "$TEST_IMG" -fdb "$TEST_IMG.2"
 
 
 echo
@@ -114,7 +117,7 @@ echo === Using -drive options ===
 
 check_floppy_qtree -drive if=floppy,file="$TEST_IMG"
 check_floppy_qtree -drive if=floppy,file="$TEST_IMG",index=1
-check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive if=floppy,file="$TEST_IMG",index=1
+check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive if=floppy,file="$TEST_IMG.2",index=1
 
 echo
 echo
@@ -122,7 +125,7 @@ echo === Using -drive if=none and -global ===
 
 check_floppy_qtree -drive if=none,file="$TEST_IMG" -global isa-fdc.driveA=none0
 check_floppy_qtree -drive if=none,file="$TEST_IMG" -global isa-fdc.driveB=none0
-check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" \
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" \
                    -global isa-fdc.driveA=none0 -global isa-fdc.driveB=none1
 
 echo
@@ -131,7 +134,7 @@ echo === Using -drive if=none and -device ===
 
 check_floppy_qtree -drive if=none,file="$TEST_IMG" -device floppy,drive=none0
 check_floppy_qtree -drive if=none,file="$TEST_IMG" -device floppy,drive=none0,unit=1
-check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" \
+check_floppy_qtree -drive if=none,file="$TEST_IMG",file.disable-lock=on -drive if=none,file="$TEST_IMG.2" \
                    -device floppy,drive=none0 -device floppy,drive=none1,unit=1
 
 echo
@@ -139,58 +142,58 @@ echo
 echo === Mixing -fdX and -global ===
 
 # Working
-check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG" -global isa-fdc.driveB=none0
-check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG" -global isa-fdc.driveA=none0
+check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -global isa-fdc.driveB=none0
+check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -global isa-fdc.driveA=none0
 
 # Conflicting (-fdX wins)
-check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG" -global isa-fdc.driveA=none0
-check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG" -global isa-fdc.driveB=none0
+check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -global isa-fdc.driveA=none0
+check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -global isa-fdc.driveB=none0
 
 echo
 echo
 echo === Mixing -fdX and -device ===
 
 # Working
-check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG" -device floppy,drive=none0
-check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG" -device floppy,drive=none0,unit=1
+check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -device floppy,drive=none0
+check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -device floppy,drive=none0,unit=1
 
-check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG" -device floppy,drive=none0
-check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG" -device floppy,drive=none0,unit=0
+check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -device floppy,drive=none0
+check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -device floppy,drive=none0,unit=0
 
 # Conflicting
-check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG" -device floppy,drive=none0,unit=0
-check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG" -device floppy,drive=none0,unit=1
+check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -device floppy,drive=none0,unit=0
+check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -device floppy,drive=none0,unit=1
 
 echo
 echo
 echo === Mixing -drive and -device ===
 
 # Working
-check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" -device floppy,drive=none0
-check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" -device floppy,drive=none0,unit=1
+check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -device floppy,drive=none0
+check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -device floppy,drive=none0,unit=1
 
 # Conflicting
-check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" -device floppy,drive=none0,unit=0
+check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" -device floppy,drive=none0,unit=0
 
 echo
 echo
 echo === Mixing -global and -device ===
 
 # Working
-check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" \
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" \
                    -global isa-fdc.driveA=none0 -device floppy,drive=none1
-check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" \
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" \
                    -global isa-fdc.driveA=none0 -device floppy,drive=none1,unit=1
 
-check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" \
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" \
                    -global isa-fdc.driveB=none0 -device floppy,drive=none1
-check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" \
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" \
                    -global isa-fdc.driveB=none0 -device floppy,drive=none1,unit=0
 
 # Conflicting
-check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" \
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" \
                    -global isa-fdc.driveA=none0 -device floppy,drive=none1,unit=0
-check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" \
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" \
                    -global isa-fdc.driveB=none0 -device floppy,drive=none1,unit=1
 
 echo
@@ -199,8 +202,8 @@ echo === Too many floppy drives ===
 
 # Working
 check_floppy_qtree -drive if=floppy,file="$TEST_IMG" \
-                   -drive if=none,file="$TEST_IMG" \
-                   -drive if=none,file="$TEST_IMG" \
+                   -drive if=none,file="$TEST_IMG.2" \
+                   -drive if=none,file="$TEST_IMG.3" \
                    -global isa-fdc.driveB=none0 \
                    -device floppy,drive=none1
 
diff --git a/tests/qemu-iotests/172.out b/tests/qemu-iotests/172.out
index 6b7edaf..7f557ac 100644
--- a/tests/qemu-iotests/172.out
+++ b/tests/qemu-iotests/172.out
@@ -1,5 +1,7 @@
 QA output created by 172
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=737280
+Formatting 'TEST_DIR/t.IMGFMT.2', fmt=IMGFMT size=737280
+Formatting 'TEST_DIR/t.IMGFMT.3', fmt=IMGFMT size=737280
 
 
 === Default ===
@@ -95,7 +97,7 @@ Testing: -fdb TEST_DIR/t.qcow2
                 write-cache = "auto"
                 drive-type = "288"
 
-Testing: -fda TEST_DIR/t.qcow2 -fdb TEST_DIR/t.qcow2
+Testing: -fda TEST_DIR/t.qcow2 -fdb TEST_DIR/t.qcow2.2
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -196,7 +198,7 @@ Testing: -drive if=floppy,file=TEST_DIR/t.qcow2,index=1
                 write-cache = "auto"
                 drive-type = "288"
 
-Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=floppy,file=TEST_DIR/t.qcow2,index=1
+Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=floppy,file=TEST_DIR/t.qcow2.2,index=1
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -287,7 +289,7 @@ Testing: -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveB=none0
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveA=none0 -global isa-fdc.driveB=none1
+Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveA=none0 -global isa-fdc.driveB=none1
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -378,7 +380,7 @@ Testing: -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0,unit=1
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0 -device floppy,drive=none1,unit=1
+Testing: -drive if=none,file=TEST_DIR/t.qcow2,file.disable-lock=on -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0 -device floppy,drive=none1,unit=1
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -417,7 +419,7 @@ Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qco
 
 === Mixing -fdX and -global ===
 
-Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveB=none0
+Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveB=none0
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -453,7 +455,7 @@ Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveA=none0
+Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveA=none0
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -489,7 +491,7 @@ Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveA=none0
+Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveA=none0
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -515,7 +517,7 @@ Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveB=none0
+Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveB=none0
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -544,7 +546,7 @@ Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-
 
 === Mixing -fdX and -device ===
 
-Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0
+Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -580,7 +582,7 @@ Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device flop
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0,unit=1
+Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0,unit=1
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -616,7 +618,7 @@ Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device flop
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0
+Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -652,7 +654,7 @@ Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device flop
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0,unit=0
+Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0,unit=0
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -688,18 +690,18 @@ Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device flop
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0,unit=0
+Testing: -fda TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0,unit=0
 QEMU_PROG: -device floppy,drive=none0,unit=0: Floppy unit 0 is in use
 QEMU_PROG: -device floppy,drive=none0,unit=0: Device initialization failed.
 
-Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0,unit=1
+Testing: -fdb TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0,unit=1
 QEMU_PROG: -device floppy,drive=none0,unit=1: Floppy unit 1 is in use
 QEMU_PROG: -device floppy,drive=none0,unit=1: Device initialization failed.
 
 
 === Mixing -drive and -device ===
 
-Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0
+Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -735,7 +737,7 @@ Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.q
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0,unit=1
+Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0,unit=1
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -771,14 +773,14 @@ Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.q
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -device floppy,drive=none0,unit=0
+Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -device floppy,drive=none0,unit=0
 QEMU_PROG: -device floppy,drive=none0,unit=0: Floppy unit 0 is in use
 QEMU_PROG: -device floppy,drive=none0,unit=0: Device initialization failed.
 
 
 === Mixing -global and -device ===
 
-Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveA=none0 -device floppy,drive=none1
+Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveA=none0 -device floppy,drive=none1
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -814,7 +816,7 @@ Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qco
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveA=none0 -device floppy,drive=none1,unit=1
+Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveA=none0 -device floppy,drive=none1,unit=1
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -850,7 +852,7 @@ Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qco
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveB=none0 -device floppy,drive=none1
+Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveB=none0 -device floppy,drive=none1
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -886,7 +888,7 @@ Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qco
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveB=none0 -device floppy,drive=none1,unit=0
+Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveB=none0 -device floppy,drive=none1,unit=0
 
           dev: isa-fdc, id ""
             iobase = 1008 (0x3f0)
@@ -922,18 +924,18 @@ Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qco
                 write-cache = "auto"
                 drive-type = "144"
 
-Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveA=none0 -device floppy,drive=none1,unit=0
+Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveA=none0 -device floppy,drive=none1,unit=0
 QEMU_PROG: -device floppy,drive=none1,unit=0: Floppy unit 0 is in use
 QEMU_PROG: -device floppy,drive=none1,unit=0: Device initialization failed.
 
-Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveB=none0 -device floppy,drive=none1,unit=1
+Testing: -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -global isa-fdc.driveB=none0 -device floppy,drive=none1,unit=1
 QEMU_PROG: -device floppy,drive=none1,unit=1: Floppy unit 1 is in use
 QEMU_PROG: -device floppy,drive=none1,unit=1: Device initialization failed.
 
 
 === Too many floppy drives ===
 
-Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2 -global isa-fdc.driveB=none0 -device floppy,drive=none1
+Testing: -drive if=floppy,file=TEST_DIR/t.qcow2 -drive if=none,file=TEST_DIR/t.qcow2.2 -drive if=none,file=TEST_DIR/t.qcow2.3 -global isa-fdc.driveB=none0 -device floppy,drive=none1
 QEMU_PROG: -device floppy,drive=none1: Can't create floppy unit 2, bus supports only 2 units
 QEMU_PROG: -device floppy,drive=none1: Device initialization failed.
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v12 12/16] tests: Use null-co:// instead of /dev/null as the dummy image
  2017-01-23 12:30 [Qemu-devel] [PATCH v12 00/16] block: Image locking series Fam Zheng
                   ` (10 preceding siblings ...)
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 11/16] iotests: 172: Use separate images for multiple devices Fam Zheng
@ 2017-01-23 12:30 ` Fam Zheng
  2017-02-08  0:45   ` Max Reitz
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 13/16] tests: Disable image lock in test-replication Fam Zheng
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Fam Zheng @ 2017-01-23 12:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones, Max Reitz

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/drive_del-test.c    | 2 +-
 tests/nvme-test.c         | 2 +-
 tests/usb-hcd-uhci-test.c | 2 +-
 tests/usb-hcd-xhci-test.c | 2 +-
 tests/virtio-blk-test.c   | 2 +-
 tests/virtio-scsi-test.c  | 4 ++--
 6 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/tests/drive_del-test.c b/tests/drive_del-test.c
index 121b9c9..2175139 100644
--- a/tests/drive_del-test.c
+++ b/tests/drive_del-test.c
@@ -92,7 +92,7 @@ static void test_after_failed_device_add(void)
 static void test_drive_del_device_del(void)
 {
     /* Start with a drive used by a device that unplugs instantaneously */
-    qtest_start("-drive if=none,id=drive0,file=/dev/null,format=raw"
+    qtest_start("-drive if=none,id=drive0,file=null-co://,format=raw"
                 " -device virtio-scsi-pci"
                 " -device scsi-hd,drive=drive0,id=dev0");
 
diff --git a/tests/nvme-test.c b/tests/nvme-test.c
index c8bece4..7674a44 100644
--- a/tests/nvme-test.c
+++ b/tests/nvme-test.c
@@ -22,7 +22,7 @@ int main(int argc, char **argv)
     g_test_init(&argc, &argv, NULL);
     qtest_add_func("/nvme/nop", nop);
 
-    qtest_start("-drive id=drv0,if=none,file=/dev/null,format=raw "
+    qtest_start("-drive id=drv0,if=none,file=null-co://,format=raw "
                 "-device nvme,drive=drv0,serial=foo");
     ret = g_test_run();
 
diff --git a/tests/usb-hcd-uhci-test.c b/tests/usb-hcd-uhci-test.c
index e956b9c..24c3ea9 100644
--- a/tests/usb-hcd-uhci-test.c
+++ b/tests/usb-hcd-uhci-test.c
@@ -78,7 +78,7 @@ int main(int argc, char **argv)
 {
     const char *arch = qtest_get_arch();
     const char *cmd = "-device piix3-usb-uhci,id=uhci,addr=1d.0"
-                      " -drive id=drive0,if=none,file=/dev/null,format=raw"
+                      " -drive id=drive0,if=none,file=null-co://,format=raw"
                       " -device usb-tablet,bus=uhci.0,port=1";
     int ret;
 
diff --git a/tests/usb-hcd-xhci-test.c b/tests/usb-hcd-xhci-test.c
index 22513e9..031764d 100644
--- a/tests/usb-hcd-xhci-test.c
+++ b/tests/usb-hcd-xhci-test.c
@@ -89,7 +89,7 @@ int main(int argc, char **argv)
     qtest_add_func("/xhci/pci/hotplug/usb-uas", test_usb_uas_hotplug);
 
     qtest_start("-device nec-usb-xhci,id=xhci"
-                " -drive id=drive0,if=none,file=/dev/null,format=raw");
+                " -drive id=drive0,if=none,file=null-co://,format=raw");
     ret = g_test_run();
     qtest_end();
 
diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c
index 0e32e41..0fc7c99 100644
--- a/tests/virtio-blk-test.c
+++ b/tests/virtio-blk-test.c
@@ -63,7 +63,7 @@ static QOSState *pci_test_start(void)
     const char *arch = qtest_get_arch();
     char *tmp_path;
     const char *cmd = "-drive if=none,id=drive0,file=%s,format=raw "
-                      "-drive if=none,id=drive1,file=/dev/null,format=raw "
+                      "-drive if=none,id=drive1,file=null-co://,format=raw "
                       "-device virtio-blk-pci,id=drv0,drive=drive0,"
                       "addr=%x.%x";
 
diff --git a/tests/virtio-scsi-test.c b/tests/virtio-scsi-test.c
index 69220ef..8c27641 100644
--- a/tests/virtio-scsi-test.c
+++ b/tests/virtio-scsi-test.c
@@ -35,7 +35,7 @@ typedef struct {
 static QOSState *qvirtio_scsi_start(const char *extra_opts)
 {
     const char *arch = qtest_get_arch();
-    const char *cmd = "-drive id=drv0,if=none,file=/dev/null,format=raw "
+    const char *cmd = "-drive id=drv0,if=none,file=null-co://,format=raw "
                       "-device virtio-scsi-pci,id=vs0 "
                       "-device scsi-hd,bus=vs0.0,drive=drv0 %s";
 
@@ -195,7 +195,7 @@ static void hotplug(void)
     QDict *response;
     QOSState *qs;
 
-    qs = qvirtio_scsi_start("-drive id=drv1,if=none,file=/dev/null,format=raw");
+    qs = qvirtio_scsi_start("-drive id=drv1,if=none,file=null-co://,format=raw");
     response = qmp("{\"execute\": \"device_add\","
                    " \"arguments\": {"
                    "   \"driver\": \"scsi-hd\","
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v12 13/16] tests: Disable image lock in test-replication
  2017-01-23 12:30 [Qemu-devel] [PATCH v12 00/16] block: Image locking series Fam Zheng
                   ` (11 preceding siblings ...)
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 12/16] tests: Use null-co:// instead of /dev/null as the dummy image Fam Zheng
@ 2017-01-23 12:30 ` Fam Zheng
  2017-02-08  0:56   ` Max Reitz
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 14/16] file-posix: Implement image locking Fam Zheng
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Fam Zheng @ 2017-01-23 12:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones, Max Reitz

The COLO block replication architecture requires one disk to be shared
between primary and secondary, in the test both processes use posix file
protocol (instead of over NBD) so it is affected by image locking.
Disable the lock.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/test-replication.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tests/test-replication.c b/tests/test-replication.c
index fac2da3..5bede49 100644
--- a/tests/test-replication.c
+++ b/tests/test-replication.c
@@ -179,7 +179,8 @@ static BlockBackend *start_primary(void)
     char *cmdline;
 
     cmdline = g_strdup_printf("driver=replication,mode=primary,node-name=xxx,"
-                              "file.driver=qcow2,file.file.filename=%s"
+                              "file.driver=qcow2,file.file.filename=%s,"
+                              "file.file.disable-lock=on"
                               , p_local_disk);
     opts = qemu_opts_parse_noisily(&qemu_drive_opts, cmdline, false);
     g_free(cmdline);
@@ -310,7 +311,9 @@ static BlockBackend *start_secondary(void)
     Error *local_err = NULL;
 
     /* add s_local_disk and forge S_LOCAL_DISK_ID */
-    cmdline = g_strdup_printf("file.filename=%s,driver=qcow2", s_local_disk);
+    cmdline = g_strdup_printf("file.filename=%s,driver=qcow2,"
+                              "file.file.disable-lock=on",
+                              s_local_disk);
     opts = qemu_opts_parse_noisily(&qemu_drive_opts, cmdline, false);
     g_free(cmdline);
 
@@ -331,8 +334,10 @@ static BlockBackend *start_secondary(void)
     /* add S_(ACTIVE/HIDDEN)_DISK and forge S_ID */
     cmdline = g_strdup_printf("driver=replication,mode=secondary,top-id=%s,"
                               "file.driver=qcow2,file.file.filename=%s,"
+                              "file.file.disable-lock=on",
                               "file.backing.driver=qcow2,"
                               "file.backing.file.filename=%s,"
+                              "file.backing.file.disable-lock=on",
                               "file.backing.backing=%s"
                               , S_ID, s_active_disk, s_hidden_disk
                               , S_LOCAL_DISK_ID);
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v12 14/16] file-posix: Implement image locking
  2017-01-23 12:30 [Qemu-devel] [PATCH v12 00/16] block: Image locking series Fam Zheng
                   ` (12 preceding siblings ...)
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 13/16] tests: Disable image lock in test-replication Fam Zheng
@ 2017-01-23 12:30 ` Fam Zheng
  2017-02-08  3:05   ` Max Reitz
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 15/16] qcow2: Force "no other writer" lock on bs->file Fam Zheng
                   ` (2 subsequent siblings)
  16 siblings, 1 reply; 39+ messages in thread
From: Fam Zheng @ 2017-01-23 12:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones, Max Reitz

This implements open flag sensible image locking for local file
and host device protocol.

virtlockd in libvirt locks the first byte, so we start looking at the
file bytes from 1.

Quoting what was proposed by Kevin Wolf <kwolf@redhat.com>, there are
four locking modes by combining two bits (BDRV_O_RDWR and
BDRV_O_SHARE_RW), and implemented by taking two locks.

The complication is in the transactional reopen.  To make the reopen
logic managable, and allow better reuse, the code is internally
organized with a table from old mode to the new one.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/file-posix.c | 681 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 678 insertions(+), 3 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 28b47d9..a8c76d6 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -131,8 +131,45 @@ do { \
 
 #define MAX_BLOCKSIZE	4096
 
+/* Posix file locking bytes. Libvirt takes byte 0, we start from byte 0x10,
+ * leaving a few more bytes for its future use. */
+#define RAW_LOCK_BYTE_MIN             0x10
+#define RAW_LOCK_BYTE_NO_OTHER_WRITER 0x10
+#define RAW_LOCK_BYTE_WRITE           0x11
+#ifdef F_OFD_SETLK
+#define RAW_LOCK_SUPPORTED 1
+#else
+#define RAW_LOCK_SUPPORTED 0
+#endif
+
+/*
+ ** reader that can tolerate writers: Don't do anything
+ *
+ ** reader that can't tolerate writers: Take shared lock on byte 1. Test
+ *  byte 2 is unlocked.
+ *
+ ** shared writer: Take shared lock on byte 2. Test byte 1 is unlocked.
+ *
+ ** exclusive writer: Take exclusive locks on both bytes.
+ */
+
+typedef enum {
+    /* Read only and accept other writers. */
+    RAW_L_READ_SHARE_RW,
+    /* Read only and try to forbid other writers. */
+    RAW_L_READ,
+    /* Read write and accept other writers. */
+    RAW_L_WRITE_SHARE_RW,
+    /* Read write and try to forbid other writers. */
+    RAW_L_WRITE,
+} BDRVRawLockMode;
+
 typedef struct BDRVRawState {
     int fd;
+    /* A dup of @fd to make manipulating lock easier, especially during reopen,
+     * where this will accept BDRVRawReopenState.lock_fd. */
+    int lock_fd;
+    bool disable_lock;
     int type;
     int open_flags;
     size_t buf_align;
@@ -146,11 +183,15 @@ typedef struct BDRVRawState {
     bool use_linux_aio:1;
     bool has_fallocate;
     bool needs_alignment;
+    BDRVRawLockMode cur_lock_mode;
 } BDRVRawState;
 
 typedef struct BDRVRawReopenState {
     int fd;
+    /* A dup of @fd used for acquiring lock. */
+    int lock_fd;
     int open_flags;
+    bool disable_lock;
 } BDRVRawReopenState;
 
 static int fd_open(BlockDriverState *bs);
@@ -368,6 +409,59 @@ static void raw_parse_flags(int bdrv_flags, int *open_flags)
     }
 }
 
+static int raw_lock_fd(int fd, BDRVRawLockMode mode, Error **errp)
+{
+    int ret;
+    assert(fd >= 0);
+    assert(RAW_LOCK_SUPPORTED);
+    switch (mode) {
+    case RAW_L_READ_SHARE_RW:
+        ret = qemu_unlock_fd(fd, RAW_LOCK_BYTE_MIN, 2);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to unlock fd");
+            goto fail;
+        }
+        break;
+    case RAW_L_READ:
+        ret = qemu_lock_fd(fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock share byte");
+            goto fail;
+        }
+        ret = qemu_lock_fd_test(fd, RAW_LOCK_BYTE_WRITE, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Write byte lock is taken");
+            goto fail;
+        }
+        break;
+    case RAW_L_WRITE_SHARE_RW:
+        ret = qemu_lock_fd(fd, RAW_LOCK_BYTE_WRITE, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock write byte");
+            goto fail;
+        }
+        ret = qemu_lock_fd_test(fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Share byte lock is taken");
+            goto fail;
+        }
+        break;
+    case RAW_L_WRITE:
+        ret = qemu_lock_fd(fd, RAW_LOCK_BYTE_MIN, 2, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock image");
+            goto fail;
+        }
+        break;
+    default:
+        abort();
+    }
+    return 0;
+fail:
+    qemu_unlock_fd(fd, RAW_LOCK_BYTE_MIN, 2);
+    return ret;
+}
+
 static void raw_parse_filename(const char *filename, QDict *options,
                                Error **errp)
 {
@@ -393,10 +487,88 @@ static QemuOptsList raw_runtime_opts = {
             .type = QEMU_OPT_STRING,
             .help = "host AIO implementation (threads, native)",
         },
+        {
+            .name = "disable-lock",
+            .type = QEMU_OPT_BOOL,
+            .help = "don't lock the file",
+        },
         { /* end of list */ }
     },
 };
 
+static BDRVRawLockMode raw_get_lock_mode(int flags)
+{
+    switch (flags & (BDRV_O_RDWR | BDRV_O_SHARE_RW)) {
+    case BDRV_O_RDWR:
+        return RAW_L_WRITE;
+    case BDRV_O_RDWR | BDRV_O_SHARE_RW:
+        return RAW_L_WRITE_SHARE_RW;
+    case BDRV_O_SHARE_RW:
+        return RAW_L_READ_SHARE_RW;
+    case 0:
+        return RAW_L_READ;
+    default:
+        abort();
+    }
+}
+
+static int raw_open_lockfd(const char *filename, int flags,
+                           BDRVRawLockMode *lock_mode, Error **errp)
+{
+    int ret = -1;
+    const char *normalized_filename = filename;
+
+    ret = raw_normalize_devicepath(&normalized_filename);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Could not normalize device path");
+    } else {
+        assert(!(flags & O_CREAT));
+        ret = qemu_open(normalized_filename, flags);
+        if (ret == -1) {
+            error_setg_errno(errp, errno, "Could not open file: %s",
+                             normalized_filename);
+            ret = -errno;
+        }
+    }
+    return ret;
+}
+
+static bool raw_lock_enabled(BlockDriverState *bs)
+{
+    BDRVRawState *s = bs->opaque;
+
+    return !(s->disable_lock || bdrv_get_flags(bs) & BDRV_O_INACTIVE);
+}
+
+static int raw_apply_image_lock(BlockDriverState *bs, int bdrv_flags,
+                                Error **errp)
+{
+    int ret;
+    BDRVRawState *s = bs->opaque;
+    BDRVRawLockMode lock_mode;
+
+    if (!raw_lock_enabled(bs)) {
+        return 0;
+    }
+    assert(s->cur_lock_mode == RAW_L_READ_SHARE_RW);
+    lock_mode = raw_get_lock_mode(bdrv_flags);
+    ret = raw_open_lockfd(bs->exact_filename, s->open_flags, &lock_mode,
+                          errp);
+    if (ret < 0) {
+        return ret;
+    }
+    s->lock_fd = ret;
+    if (lock_mode == RAW_L_READ_SHARE_RW) {
+        return 0;
+    }
+    ret = raw_lock_fd(s->lock_fd, lock_mode, errp);
+    if (ret) {
+        return ret;
+    }
+    s->cur_lock_mode = lock_mode;
+    return 0;
+}
+
 static int raw_open_common(BlockDriverState *bs, QDict *options,
                            int bdrv_flags, int open_flags, Error **errp)
 {
@@ -440,6 +612,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
     raw_parse_flags(bdrv_flags, &s->open_flags);
 
     s->fd = -1;
+    s->lock_fd = -1;
     fd = qemu_open(filename, s->open_flags, 0644);
     if (fd < 0) {
         ret = -errno;
@@ -451,6 +624,15 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
     }
     s->fd = fd;
 
+    s->disable_lock = qemu_opt_get_bool(opts, "disable-lock", false);
+
+    if (RAW_LOCK_SUPPORTED) {
+        ret = raw_apply_image_lock(bs, bdrv_flags, errp);
+        if (ret) {
+            goto fail;
+        }
+    }
+
 #ifdef CONFIG_LINUX_AIO
      /* Currently Linux does AIO only for files opened with O_DIRECT */
     if (s->use_linux_aio && !(s->open_flags & O_DIRECT)) {
@@ -538,6 +720,465 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
     return raw_open_common(bs, options, flags, 0, errp);
 }
 
+typedef enum {
+    RAW_LT_PREPARE,
+    RAW_LT_COMMIT,
+    RAW_LT_ABORT
+} RawLockTransOp;
+
+typedef int (*RawReopenFunc)(RawLockTransOp op,
+                             int old_lock_fd, int new_lock_fd,
+                             BDRVRawLockMode old_lock,
+                             BDRVRawLockMode new_lock,
+                             Error **errp);
+
+static int raw_lt_nop(RawLockTransOp op,
+                      int old_lock_fd, int new_lock_fd,
+                      BDRVRawLockMode old_lock,
+                      BDRVRawLockMode new_lock,
+                      Error **errp)
+{
+    assert(old_lock == new_lock || new_lock == RAW_L_READ_SHARE_RW);
+    return 0;
+}
+
+static int raw_lt_from_unlock(RawLockTransOp op,
+                              int old_lock_fd, int new_lock_fd,
+                              BDRVRawLockMode old_lock,
+                              BDRVRawLockMode new_lock,
+                              Error **errp)
+{
+    assert(old_lock != new_lock);
+    assert(old_lock == RAW_L_READ_SHARE_RW);
+    switch (op) {
+    case RAW_LT_PREPARE:
+        return raw_lock_fd(new_lock_fd, new_lock, errp);
+    case RAW_LT_COMMIT:
+        break;
+    case RAW_LT_ABORT:
+        break;
+    }
+
+    return 0;
+}
+
+static int raw_lt_read_to_write_share(RawLockTransOp op,
+                                      int old_lock_fd, int new_lock_fd,
+                                      BDRVRawLockMode old_lock,
+                                      BDRVRawLockMode new_lock,
+                                      Error **errp)
+{
+    int ret = 0;
+
+    assert(old_lock == RAW_L_READ);
+    assert(new_lock == RAW_L_WRITE_SHARE_RW);
+
+    /*
+     *        lock byte "no other writer"      lock byte "write"
+     * old                S                           0
+     * new                0                           S
+     *
+     * (0 = unlocked; S = shared; X = exclusive.)
+     */
+    switch (op) {
+    case RAW_LT_PREPARE:
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock new fd (write byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock new fd (share byte)");
+            break;
+        }
+        ret = qemu_unlock_fd(old_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to unlock old fd (share byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to upgrade new fd (share byte)");
+            break;
+        }
+        ret = qemu_unlock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1);
+        if (ret) {
+            /* This is very unlikely, but catch it anyway. */
+            error_setg_errno(errp, -ret, "Failed to unlock new fd (share byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to downgrade new fd (write byte)");
+            break;
+        }
+        break;
+    case RAW_LT_COMMIT:
+        break;
+    case RAW_LT_ABORT:
+        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
+        if (ret) {
+            error_report("Failed to restore lock on old fd (share byte)");
+        }
+        break;
+    }
+    return ret;
+}
+
+static int raw_lt_read_to_write(RawLockTransOp op,
+                                int old_lock_fd, int new_lock_fd,
+                                BDRVRawLockMode old_lock,
+                                BDRVRawLockMode new_lock,
+                                Error **errp)
+{
+    int ret = 0;
+
+    assert(old_lock == RAW_L_READ);
+    assert(new_lock == RAW_L_WRITE);
+    /*
+     *        lock byte "no other writer"      lock byte "write"
+     * old                S                           0
+     * new                X                           X
+     *
+     * (0 = unlocked; S = shared; X = exclusive.)
+     */
+    switch (op) {
+    case RAW_LT_PREPARE:
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock new fd (write byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock new fd (share byte)");
+            break;
+        }
+        ret = qemu_unlock_fd(old_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to unlock old fd (share byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to upgrade new fd (share byte)");
+            break;
+        }
+        break;
+    case RAW_LT_COMMIT:
+        break;
+    case RAW_LT_ABORT:
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to downgrade new fd (share byte)");
+            break;
+        }
+        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
+        if (ret) {
+            error_report("Failed to restore lock on old fd (share byte)");
+        }
+        break;
+    }
+    return ret;
+}
+
+static int raw_lt_write_share_to_read(RawLockTransOp op,
+                                      int old_lock_fd, int new_lock_fd,
+                                      BDRVRawLockMode old_lock,
+                                      BDRVRawLockMode new_lock,
+                                      Error **errp)
+{
+    int ret = 0;
+
+    assert(old_lock == RAW_L_WRITE_SHARE_RW);
+    assert(new_lock == RAW_L_READ);
+    /*
+     *        lock byte "no other writer"      lock byte "write"
+     * old                0                           S
+     * new                S                           0
+     *
+     * (0 = unlocked; S = shared; X = exclusive.)
+     */
+    switch (op) {
+    case RAW_LT_PREPARE:
+        /* Make sure there are no other writers. */
+        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_WRITE, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock old fd (write byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock new fd (share byte)");
+            break;
+        }
+        break;
+    case RAW_LT_COMMIT:
+        break;
+    case RAW_LT_ABORT:
+        break;
+    }
+    return ret;
+}
+
+static int raw_lt_write_share_to_write(RawLockTransOp op,
+                                       int old_lock_fd, int new_lock_fd,
+                                       BDRVRawLockMode old_lock,
+                                       BDRVRawLockMode new_lock,
+                                       Error **errp)
+{
+    int ret = 0;
+
+    assert(old_lock == RAW_L_WRITE_SHARE_RW);
+    assert(new_lock == RAW_L_WRITE);
+    /*
+     *        lock byte "no other writer"      lock byte "write"
+     * old                0                           S
+     * new                X                           X
+     *
+     * (0 = unlocked; S = shared; X = exclusive.)
+     */
+    switch (op) {
+    case RAW_LT_PREPARE:
+        /* Make sure there are no other writers. */
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock new fd (share byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock new fd (write byte)");
+            break;
+        }
+        ret = qemu_unlock_fd(old_lock_fd, RAW_LOCK_BYTE_WRITE, 1);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to unlock old fd (write byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to upgrade new fd (write byte)");
+            break;
+        }
+        break;
+    case RAW_LT_COMMIT:
+        break;
+    case RAW_LT_ABORT:
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to downgrade new fd (write byte)");
+            break;
+        }
+        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to restore old fd (write byte)");
+            break;
+        }
+        break;
+    }
+    return ret;
+}
+
+static int raw_lt_write_to_read(RawLockTransOp op,
+                                int old_lock_fd, int new_lock_fd,
+                                BDRVRawLockMode old_lock,
+                                BDRVRawLockMode new_lock,
+                                Error **errp)
+{
+    int ret = 0;
+
+    assert(old_lock == RAW_L_WRITE);
+    assert(new_lock == RAW_L_READ);
+    /*
+     *        lock byte "no other writer"      lock byte "write"
+     * old                X                           X
+     * new                S                           0
+     *
+     * (0 = unlocked; S = shared; X = exclusive.)
+     */
+    switch (op) {
+    case RAW_LT_PREPARE:
+        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to downgrade old fd (share byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to lock new fd (share byte)");
+            break;
+        }
+        break;
+    case RAW_LT_COMMIT:
+        break;
+    case RAW_LT_ABORT:
+        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, true);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to restore old fd (share byte)");
+        }
+        break;
+    }
+    return ret;
+}
+
+static int raw_lt_write_to_write_share(RawLockTransOp op,
+                                       int old_lock_fd, int new_lock_fd,
+                                       BDRVRawLockMode old_lock,
+                                       BDRVRawLockMode new_lock,
+                                       Error **errp)
+{
+    int ret = 0;
+
+    assert(old_lock == RAW_L_WRITE);
+    assert(new_lock == RAW_L_WRITE_SHARE_RW);
+    /*
+     *        lock byte "no other writer"      lock byte "write"
+     * old                X                           X
+     * new                0                           S
+     *
+     * (0 = unlocked; S = shared; X = exclusive.)
+     */
+    switch (op) {
+    case RAW_LT_PREPARE:
+        break;
+    case RAW_LT_COMMIT:
+        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
+        if (ret) {
+            error_report("Failed to downgrade old fd (share byte)");
+            break;
+        }
+        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
+        if (ret) {
+            error_report("Failed to unlock new fd (share byte)");
+            break;
+        }
+        break;
+    case RAW_LT_ABORT:
+        break;
+    }
+    return ret;
+}
+
+/**
+ * Transactionally moving between possible locking states is tricky and must be
+ * done carefully. That is mostly because downgrading an exclusive lock to
+ * shared or unlocked is not guaranteed to be revertible. As a result, in such
+ * cases we have to defer the downgrading to "commit", given that no revert will
+ * happen after that point, and that downgrading a lock should never fail.
+ *
+ * On the other hand, upgrading a lock (e.g. from unlocked or shared to
+ * exclusive lock) must happen in "prepare" because it may fail.
+ *
+ * Manage the operation matrix with this state transition table to make
+ * fulfilling above conditions easier.
+ */
+static const struct RawReopenFuncRecord {
+    BDRVRawLockMode old_lock;
+    BDRVRawLockMode new_lock;
+    RawReopenFunc func;
+    bool need_lock_fd;
+    bool close_old_lock_fd;
+} reopen_functions[] = {
+
+    {RAW_L_READ_SHARE_RW, RAW_L_READ_SHARE_RW, raw_lt_nop, false, false},
+    {RAW_L_READ_SHARE_RW, RAW_L_READ, raw_lt_from_unlock, true},
+    {RAW_L_READ_SHARE_RW, RAW_L_WRITE_SHARE_RW, raw_lt_from_unlock, true},
+    {RAW_L_READ_SHARE_RW, RAW_L_WRITE, raw_lt_from_unlock, true},
+
+    {RAW_L_READ, RAW_L_READ_SHARE_RW, raw_lt_nop, false, true},
+    {RAW_L_READ, RAW_L_READ, raw_lt_nop, false, false},
+    {RAW_L_READ, RAW_L_WRITE_SHARE_RW, raw_lt_read_to_write_share, true},
+    {RAW_L_READ, RAW_L_WRITE, raw_lt_read_to_write, true},
+
+    {RAW_L_WRITE_SHARE_RW, RAW_L_READ_SHARE_RW, raw_lt_nop, false, true},
+    {RAW_L_WRITE_SHARE_RW, RAW_L_READ, raw_lt_write_share_to_read, true},
+    {RAW_L_WRITE_SHARE_RW, RAW_L_WRITE_SHARE_RW, raw_lt_nop, false, false},
+    {RAW_L_WRITE_SHARE_RW, RAW_L_WRITE, raw_lt_write_share_to_write, true},
+
+    {RAW_L_WRITE, RAW_L_READ_SHARE_RW, raw_lt_nop, false, true},
+    {RAW_L_WRITE, RAW_L_READ, raw_lt_write_to_read, true},
+    {RAW_L_WRITE, RAW_L_WRITE_SHARE_RW, raw_lt_write_to_write_share, true},
+    {RAW_L_WRITE, RAW_L_WRITE, raw_lt_nop, false, false},
+};
+
+static int raw_reopen_handle_lock(BDRVReopenState *state,
+                                  RawLockTransOp op,
+                                  Error **errp)
+{
+    BDRVRawReopenState *rs = state->opaque;
+    BDRVRawState *s = state->bs->opaque;
+    BDRVRawLockMode old_lock, new_lock;
+    const struct RawReopenFuncRecord *rec;
+    int ret;
+
+    if (!RAW_LOCK_SUPPORTED) {
+        return 0;
+    }
+    old_lock = s->cur_lock_mode;
+    rs->disable_lock = qdict_get_try_bool(state->options, "disable-lock",
+                                          false);
+    qdict_del(state->options, "disable-lock");
+
+    if (rs->disable_lock) {
+        new_lock = RAW_L_READ_SHARE_RW;
+    } else {
+        new_lock = raw_get_lock_mode(state->flags);
+    }
+
+    for (rec = &reopen_functions[0];
+         rec < &reopen_functions[ARRAY_SIZE(reopen_functions)];
+         rec++) {
+        if (rec->old_lock == old_lock && rec->new_lock == new_lock) {
+            break;
+        }
+    }
+    assert(rec != &reopen_functions[ARRAY_SIZE(reopen_functions)]);
+
+    switch (op) {
+    case RAW_LT_PREPARE:
+        if (rec->need_lock_fd) {
+            ret = raw_open_lockfd(state->bs->exact_filename,
+                                  rs->open_flags, &new_lock, errp);
+            if (ret < 0) {
+                return ret;
+            }
+            rs->lock_fd = ret;
+        } else {
+            rs->lock_fd = -1;
+        }
+        ret = rec->func(op, s->lock_fd, rs->lock_fd, old_lock, new_lock, errp);
+        if (!ret) {
+            return ret;
+        }
+        /* Only succeeded preparation will be reverted by block layer, we
+         * need to clean up this failure manually. */
+        op = RAW_LT_ABORT;
+        /* fall through */
+    case RAW_LT_ABORT:
+        rec->func(op, s->lock_fd, rs->lock_fd, old_lock, new_lock, &error_abort);
+        if (rs->lock_fd >= 0) {
+            qemu_close(rs->lock_fd);
+            rs->lock_fd = -1;
+        }
+        break;
+    case RAW_LT_COMMIT:
+        rec->func(op, s->lock_fd, rs->lock_fd, old_lock, new_lock, &error_abort);
+        if ((rec->need_lock_fd || rec->close_old_lock_fd) && s->lock_fd >= 0) {
+            qemu_close(s->lock_fd);
+            s->lock_fd = -1;
+        }
+        if (rec->need_lock_fd) {
+            s->lock_fd = rs->lock_fd;
+        }
+        s->cur_lock_mode = new_lock;
+        s->disable_lock = rs->disable_lock;
+        break;
+    }
+    return 0;
+}
+
 static int raw_reopen_prepare(BDRVReopenState *state,
                               BlockReopenQueue *queue, Error **errp)
 {
@@ -609,13 +1250,20 @@ static int raw_reopen_prepare(BDRVReopenState *state,
     if (rs->fd != -1) {
         raw_probe_alignment(state->bs, rs->fd, &local_err);
         if (local_err) {
-            qemu_close(rs->fd);
-            rs->fd = -1;
             error_propagate(errp, local_err);
             ret = -EINVAL;
+            goto fail;
         }
     }
+    ret = raw_reopen_handle_lock(state, RAW_LT_PREPARE, errp);
+    if (ret) {
+        goto fail;
+    }
 
+    return 0;
+fail:
+    qemu_close(rs->fd);
+    rs->fd = -1;
     return ret;
 }
 
@@ -626,6 +1274,8 @@ static void raw_reopen_commit(BDRVReopenState *state)
 
     s->open_flags = rs->open_flags;
 
+    raw_reopen_handle_lock(state, RAW_LT_COMMIT, &error_abort);
+
     qemu_close(s->fd);
     s->fd = rs->fd;
 
@@ -643,6 +1293,8 @@ static void raw_reopen_abort(BDRVReopenState *state)
         return;
     }
 
+    raw_reopen_handle_lock(state, RAW_LT_ABORT, &error_abort);
+
     if (rs->fd >= 0) {
         qemu_close(rs->fd);
         rs->fd = -1;
@@ -1332,6 +1984,10 @@ static void raw_close(BlockDriverState *bs)
         qemu_close(s->fd);
         s->fd = -1;
     }
+    if (s->lock_fd >= 0) {
+        qemu_close(s->lock_fd);
+        s->lock_fd = -1;
+    }
 }
 
 static int raw_truncate(BlockDriverState *bs, int64_t offset)
@@ -1832,6 +2488,24 @@ static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
     return 0;
 }
 
+static int raw_inactivate(BlockDriverState *bs)
+{
+    BDRVRawState *s = bs->opaque;
+    int r = 0;
+
+    if (RAW_LOCK_SUPPORTED && s->cur_lock_mode != RAW_L_READ_SHARE_RW) {
+        r = raw_lock_fd(s->lock_fd, RAW_L_READ_SHARE_RW, NULL);
+    }
+    return r;
+}
+
+static void raw_invalidate_cache(BlockDriverState *bs, Error **errp)
+{
+    if (RAW_LOCK_SUPPORTED) {
+        raw_apply_image_lock(bs, bdrv_get_flags(bs), errp);
+    }
+}
+
 static QemuOptsList raw_create_opts = {
     .name = "raw-create-opts",
     .head = QTAILQ_HEAD_INITIALIZER(raw_create_opts.head),
@@ -1885,7 +2559,8 @@ BlockDriver bdrv_file = {
     .bdrv_get_info = raw_get_info,
     .bdrv_get_allocated_file_size
                         = raw_get_allocated_file_size,
-
+    .bdrv_inactivate = raw_inactivate,
+    .bdrv_invalidate_cache = raw_invalidate_cache,
     .create_opts = &raw_create_opts,
 };
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v12 15/16] qcow2: Force "no other writer" lock on bs->file
  2017-01-23 12:30 [Qemu-devel] [PATCH v12 00/16] block: Image locking series Fam Zheng
                   ` (13 preceding siblings ...)
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 14/16] file-posix: Implement image locking Fam Zheng
@ 2017-01-23 12:30 ` Fam Zheng
  2017-02-08 14:33   ` Max Reitz
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 16/16] tests: Add test-image-lock Fam Zheng
  2017-02-05 21:58 ` [Qemu-devel] [PATCH v12 00/16] block: Image locking series Max Reitz
  16 siblings, 1 reply; 39+ messages in thread
From: Fam Zheng @ 2017-01-23 12:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones, Max Reitz

Writing to the same qcow2 file from two QEMU processes at the same time
will never work correctly, so disable it even when the caller specifies
BDRV_O_RDWR.

Other formats are less vulnerable because they don't have internal
snapshots thus qemu-img is less often misused to create live snapshot.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/qcow2.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 96fb8a8..879361a 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1177,6 +1177,17 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
         }
     }
 
+    if ((flags & BDRV_O_SHARE_RW) && (flags & BDRV_O_RDWR)) {
+        /* Shared write is never a good idea for qcow2, override it.
+         * XXX: Use permission propagation and masking mechanism in op blockers
+         * API once it's there. */
+        ret = bdrv_reopen(bs->file->bs, flags & ~BDRV_O_SHARE_RW, &local_err);
+        if (ret) {
+            error_propagate(errp, local_err);
+            goto fail;
+        }
+    }
+
 #ifdef DEBUG_ALLOC
     {
         BdrvCheckResult result = {0};
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH v12 16/16] tests: Add test-image-lock
  2017-01-23 12:30 [Qemu-devel] [PATCH v12 00/16] block: Image locking series Fam Zheng
                   ` (14 preceding siblings ...)
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 15/16] qcow2: Force "no other writer" lock on bs->file Fam Zheng
@ 2017-01-23 12:30 ` Fam Zheng
  2017-02-08 14:52   ` Max Reitz
  2017-02-05 21:58 ` [Qemu-devel] [PATCH v12 00/16] block: Image locking series Max Reitz
  16 siblings, 1 reply; 39+ messages in thread
From: Fam Zheng @ 2017-01-23 12:30 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones, Max Reitz

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/Makefile.include   |   2 +
 tests/test-image-lock.c  | 206 +++++++++++++++++++++++++++++++++++++++++++++++
 tests/test-replication.c |   6 +-
 3 files changed, 211 insertions(+), 3 deletions(-)
 create mode 100644 tests/test-image-lock.c

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 96f5970..7718a9b 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -55,6 +55,7 @@ check-unit-y += tests/test-hbitmap$(EXESUF)
 gcov-files-test-hbitmap-y = blockjob.c
 check-unit-y += tests/test-blockjob$(EXESUF)
 check-unit-y += tests/test-blockjob-txn$(EXESUF)
+check-unit-y += tests/test-image-lock$(EXESUF)
 check-unit-y += tests/test-x86-cpuid$(EXESUF)
 # all code tested by test-x86-cpuid is inside topology.h
 gcov-files-test-x86-cpuid-y =
@@ -516,6 +517,7 @@ tests/test-aio$(EXESUF): tests/test-aio.o $(test-block-obj-y)
 tests/test-throttle$(EXESUF): tests/test-throttle.o $(test-block-obj-y)
 tests/test-blockjob$(EXESUF): tests/test-blockjob.o $(test-block-obj-y) $(test-util-obj-y)
 tests/test-blockjob-txn$(EXESUF): tests/test-blockjob-txn.o $(test-block-obj-y) $(test-util-obj-y)
+tests/test-image-lock$(EXESUF): tests/test-image-lock.o $(test-block-obj-y) $(libqos-obj-y)
 tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(test-block-obj-y)
 tests/test-iov$(EXESUF): tests/test-iov.o $(test-util-obj-y)
 tests/test-hbitmap$(EXESUF): tests/test-hbitmap.o $(test-util-obj-y)
diff --git a/tests/test-image-lock.c b/tests/test-image-lock.c
new file mode 100644
index 0000000..9c929ca
--- /dev/null
+++ b/tests/test-image-lock.c
@@ -0,0 +1,206 @@
+/*
+ * Image lock tests
+ *
+ * Copyright 2016 Red Hat, Inc.
+ *
+ * Authors:
+ *  Fam Zheng <famz@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "qapi/qmp/qbool.h"
+#include "sysemu/block-backend.h"
+
+#define DEBUG_IMAGE_LOCK_TEST 0
+#define DPRINTF(...) do { \
+        if (DEBUG_IMAGE_LOCK_TEST) { \
+            printf(__VA_ARGS__); \
+        } \
+    } while (0)
+
+#define TEST_IMAGE_SIZE 4096
+static char test_image[] = "/tmp/qtest.XXXXXX";
+static int test_image_fd;
+
+static BlockBackend *open_test_image(int flags, bool disable_lock)
+{
+    QDict *opts = qdict_new();
+
+    qdict_set_default_str(opts, "filename", test_image);
+    qdict_set_default_str(opts, "driver", "file");
+    if (disable_lock) {
+        qdict_put(opts, "disable-lock", qbool_from_bool(true));
+    }
+
+    return blk_new_open(NULL, NULL, opts, flags | BDRV_O_ALLOW_RDWR, NULL);
+}
+
+#define RW true
+#define RO false
+#define SHARE true
+#define EXCLU false
+
+static struct CompatData {
+    bool write_1;
+    bool share_1;
+    bool write_2;
+    bool share_2;
+    bool compatible;
+} compat_data[] = {
+    /* Write 1, Share 1, Write 2, Share 2, Compatible. */
+    { RO,       SHARE,   RO,      SHARE,   true,  },
+    { RO,       SHARE,   RO,      EXCLU,   true,  },
+    { RO,       SHARE,   RW,      SHARE,   true,  },
+    { RO,       SHARE,   RW,      EXCLU,   true,  },
+
+    { RO,       EXCLU,   RO,      EXCLU,   true,  },
+    { RO,       EXCLU,   RW,      SHARE,   false, },
+    { RO,       EXCLU,   RW,      EXCLU,   false, },
+
+    { RW,       SHARE,   RW,      SHARE,   true, },
+    { RW,       SHARE,   RW,      EXCLU,   false, },
+
+    { RW,       EXCLU,   RW,      EXCLU,   false, },
+};
+
+/* Test one combination scenario.
+ *
+ * @flags1: The flags of the first blk.
+ * @flags2: The flags of the second blk.
+ * @disable1: The value for raw-posix disable-lock option of the first blk.
+ * @disable2: The value for raw-posix disable-lock option of the second blk.
+ * @from_reopen: Whether or not the first blk should get flags1 from a reopen.
+ * @initial: The source flags from which the blk1 is reopened, only
+ *                effective if @from_reopen is true.
+ */
+static void do_test_compat_one(int flags1, int flags2,
+                               bool disable1, bool disable2,
+                               bool from_reopen, int initial_flags,
+                               bool compatible)
+{
+    BlockBackend *blk1, *blk2;
+
+    DPRINTF("\n===\ndo test compat one\n");
+    DPRINTF("flags %x %x\n", flags1, flags2);
+    DPRINTF("disable %d %d\n", disable1, disable2);
+    DPRINTF("from reopen %d, initial flags %d\n", from_reopen, initial_flags);
+    DPRINTF("compatible %d\n", compatible);
+    if (!from_reopen) {
+        blk1 = open_test_image(flags1, disable1);
+    } else {
+        int ret;
+        blk1 = open_test_image(initial_flags, disable1);
+        BlockReopenQueue *rq = NULL;
+
+        rq = bdrv_reopen_queue(rq, blk_bs(blk1), NULL, flags1);
+        ret = bdrv_reopen_multiple(blk_get_aio_context(blk1), rq, &error_abort);
+        g_assert_cmpint(ret, ==, 0);
+    }
+    g_assert_nonnull(blk1);
+    g_assert_cmphex(blk_get_flags(blk1) & (BDRV_O_SHARE_RW | BDRV_O_RDWR),
+                    ==, flags1);
+    blk2 = open_test_image(flags2, disable2);
+    if (compatible) {
+        g_assert_nonnull(blk2);
+    } else {
+        g_assert_null(blk2);
+    }
+    blk_unref(blk1);
+    blk_unref(blk2);
+}
+
+static void do_test_compat(bool test_disable, bool from_reopen,
+                           int initial_flags)
+{
+    int i;
+    int flags1, flags2;
+
+    for (i = 0; i < ARRAY_SIZE(compat_data); i++) {
+        struct CompatData *data = &compat_data[i];
+        bool compat = data->compatible;
+
+        flags1 = (data->write_1 ? BDRV_O_RDWR : 0) |
+                 (data->share_1 ? BDRV_O_SHARE_RW : 0);
+        flags2 = (data->write_2 ? BDRV_O_RDWR : 0) |
+                 (data->share_2 ? BDRV_O_SHARE_RW : 0);
+        if (!test_disable) {
+            do_test_compat_one(flags1, flags2, false, false,
+                               from_reopen, initial_flags, compat);
+
+            do_test_compat_one(flags2, flags1, false, false,
+                               from_reopen, initial_flags, compat);
+        } else {
+            compat = true;
+            do_test_compat_one(flags1, flags2, true, false,
+                               from_reopen, initial_flags, compat);
+            do_test_compat_one(flags1, flags2, false, true,
+                               from_reopen, initial_flags, compat);
+            do_test_compat_one(flags2, flags1, true, false,
+                               from_reopen, initial_flags, compat);
+            do_test_compat_one(flags2, flags1, false, true,
+                               from_reopen, initial_flags, compat);
+            do_test_compat_one(flags1, flags2, true, true,
+                               from_reopen, initial_flags, compat);
+        }
+    }
+}
+
+static void test_compat(void)
+{
+    do_test_compat(false, false, 0);
+}
+
+static void test_compat_after_reopen(void)
+{
+    do_test_compat(false, true, 0);
+    do_test_compat(false, true, BDRV_O_SHARE_RW);
+    do_test_compat(false, true, BDRV_O_RDWR);
+    do_test_compat(false, true, BDRV_O_RDWR | BDRV_O_SHARE_RW);
+}
+
+static void test_0bytefile(void)
+{
+    int ret;
+    ret = ftruncate(test_image_fd, 0);
+    g_assert_cmpint(ret, ==, 0);
+    do_test_compat(false, false, 0);
+}
+
+static void test_disable(void)
+{
+    do_test_compat(true, false, 0);
+    do_test_compat(true, true, 0);
+    do_test_compat(true, true, BDRV_O_SHARE_RW);
+    do_test_compat(true, true, BDRV_O_RDWR);
+    do_test_compat(true, true, BDRV_O_RDWR | BDRV_O_SHARE_RW);
+}
+
+int main(int argc, char **argv)
+{
+#ifndef F_OFD_SETLK
+    return 0;
+#endif
+    int r;
+    test_image_fd = mkstemp(test_image);
+
+    qemu_init_main_loop(&error_fatal);
+    bdrv_init();
+
+    g_assert(test_image_fd >= 0);
+    r = ftruncate(test_image_fd, TEST_IMAGE_SIZE);
+    g_assert_cmpint(r, ==, 0);
+    g_test_init(&argc, &argv, NULL);
+    g_test_add_func("/image-lock/compat", test_compat);
+    g_test_add_func("/image-lock/compat_after_reopen", test_compat_after_reopen);
+    g_test_add_func("/image-lock/compat_0bytefile", test_0bytefile);
+    g_test_add_func("/image-lock/disable", test_disable);
+    aio_context_acquire(qemu_get_aio_context());
+    r = g_test_run();
+    aio_context_release(qemu_get_aio_context());
+    return r;
+}
diff --git a/tests/test-replication.c b/tests/test-replication.c
index 5bede49..5fb69d2 100644
--- a/tests/test-replication.c
+++ b/tests/test-replication.c
@@ -312,7 +312,7 @@ static BlockBackend *start_secondary(void)
 
     /* add s_local_disk and forge S_LOCAL_DISK_ID */
     cmdline = g_strdup_printf("file.filename=%s,driver=qcow2,"
-                              "file.file.disable-lock=on",
+                              "file.disable-lock=on",
                               s_local_disk);
     opts = qemu_opts_parse_noisily(&qemu_drive_opts, cmdline, false);
     g_free(cmdline);
@@ -334,10 +334,10 @@ static BlockBackend *start_secondary(void)
     /* add S_(ACTIVE/HIDDEN)_DISK and forge S_ID */
     cmdline = g_strdup_printf("driver=replication,mode=secondary,top-id=%s,"
                               "file.driver=qcow2,file.file.filename=%s,"
-                              "file.file.disable-lock=on",
+                              "file.file.disable-lock=on,"
                               "file.backing.driver=qcow2,"
                               "file.backing.file.filename=%s,"
-                              "file.backing.file.disable-lock=on",
+                              "file.backing.file.disable-lock=on,"
                               "file.backing.backing=%s"
                               , S_ID, s_active_disk, s_hidden_disk
                               , S_LOCAL_DISK_ID);
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 01/16] osdep: Add qemu_lock_fd and qemu_unlock_fd
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 01/16] osdep: Add qemu_lock_fd and qemu_unlock_fd Fam Zheng
@ 2017-02-05 21:38   ` Max Reitz
  0 siblings, 0 replies; 39+ messages in thread
From: Max Reitz @ 2017-02-05 21:38 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

[-- Attachment #1: Type: text/plain, Size: 425 bytes --]

On 23.01.2017 13:30, Fam Zheng wrote:
> They are wrappers of POSIX fcntl "file private locking", with a
> convenient "try lock" wrapper implemented with F_OFD_GETLK.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  include/qemu/osdep.h |  3 +++
>  util/osdep.c         | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 51 insertions(+)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 02/16] block: Define BDRV_O_SHARE_RW
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 02/16] block: Define BDRV_O_SHARE_RW Fam Zheng
@ 2017-02-05 21:51   ` Max Reitz
  0 siblings, 0 replies; 39+ messages in thread
From: Max Reitz @ 2017-02-05 21:51 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

[-- Attachment #1: Type: text/plain, Size: 854 bytes --]

On 23.01.2017 13:30, Fam Zheng wrote:
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  include/block/block.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/block/block.h b/include/block/block.h
> index 8b0dcda..243839d 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -97,6 +97,8 @@ typedef struct HDGeometry {
>                                        select an appropriate protocol driver,
>                                        ignoring the format layer */
>  #define BDRV_O_NO_IO       0x10000 /* don't initialize for I/O */
> +#define BDRV_O_SHARE_RW    0x20000 /* accept shared read-write from other users
> +                                      of the image */
>  
>  #define BDRV_O_CACHE_MASK  (BDRV_O_NOCACHE | BDRV_O_NO_FLUSH)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 03/16] qemu-io: Set "share-rw" flag together with read-only
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 03/16] qemu-io: Set "share-rw" flag together with read-only Fam Zheng
@ 2017-02-05 21:52   ` Max Reitz
  2017-02-06  4:52     ` Fam Zheng
  0 siblings, 1 reply; 39+ messages in thread
From: Max Reitz @ 2017-02-05 21:52 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

[-- Attachment #1: Type: text/plain, Size: 394 bytes --]

On 23.01.2017 13:30, Fam Zheng wrote:
> qemu-io is a low level tool to read or modify guest visible data, which
> implies the user knows very well what is being done. Allowing reading
> from a locked image is harmless in most cases, so do it.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  qemu-io.c | 2 ++
>  1 file changed, 2 insertions(+)

What about open_f()?

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 04/16] qemu-img: Set "share-rw" flag in read-only commands
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 04/16] qemu-img: Set "share-rw" flag in read-only commands Fam Zheng
@ 2017-02-05 21:55   ` Max Reitz
  0 siblings, 0 replies; 39+ messages in thread
From: Max Reitz @ 2017-02-05 21:55 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

[-- Attachment #1: Type: text/plain, Size: 428 bytes --]

On 23.01.2017 13:30, Fam Zheng wrote:
> Checking the status of an image when it is being used by guest is
> usually useful, and there is no risk of corrupting data, so don't let
> the upcoming image locking feature limit this use case.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  qemu-img.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 05/16] block: Set "share-rw" flag in drive-backup when sync=none
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 05/16] block: Set "share-rw" flag in drive-backup when sync=none Fam Zheng
@ 2017-02-05 21:57   ` Max Reitz
  0 siblings, 0 replies; 39+ messages in thread
From: Max Reitz @ 2017-02-05 21:57 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

[-- Attachment #1: Type: text/plain, Size: 397 bytes --]

On 23.01.2017 13:30, Fam Zheng wrote:
> In this case we may open the source's backing image chain multiple
> times. Setting share flag means the new open won't try to acquire or
> check any lock, once we implement image locking.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  blockdev.c | 3 +++
>  1 file changed, 3 insertions(+)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 00/16] block: Image locking series
  2017-01-23 12:30 [Qemu-devel] [PATCH v12 00/16] block: Image locking series Fam Zheng
                   ` (15 preceding siblings ...)
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 16/16] tests: Add test-image-lock Fam Zheng
@ 2017-02-05 21:58 ` Max Reitz
  16 siblings, 0 replies; 39+ messages in thread
From: Max Reitz @ 2017-02-05 21:58 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

[-- Attachment #1: Type: text/plain, Size: 796 bytes --]

On 23.01.2017 13:30, Fam Zheng wrote:
> v12: Fix test cases on old systems that doesn't have F_OFD_SETLK, such as RHEL
>      6. [Patchew]
>      Trim the commit message of patch 15 to avoid bitrotting.
> 
> v11: Move lock bytes from 1-2 to 0x10-0x12. [Daniel]
> 
> v10: While we still don't have comprehensive propagation mechanism that will be
> provided by new op blocker system for "permissive modes", the locking enabled
> by default is regardlessly useful and long overdue. So I think we should merge
> this for 2.9 and build user options on top later when the op blocker API
> settles.

Reasonable, but it's worth noting that on my to-review list op blockers
actually have priority over this for 2.9...

(Or rather, they would have if there was a series posted.)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 03/16] qemu-io: Set "share-rw" flag together with read-only
  2017-02-05 21:52   ` Max Reitz
@ 2017-02-06  4:52     ` Fam Zheng
  0 siblings, 0 replies; 39+ messages in thread
From: Fam Zheng @ 2017-02-06  4:52 UTC (permalink / raw)
  To: Max Reitz
  Cc: qemu-devel, Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

On Sun, 02/05 22:52, Max Reitz wrote:
> On 23.01.2017 13:30, Fam Zheng wrote:
> > qemu-io is a low level tool to read or modify guest visible data, which
> > implies the user knows very well what is being done. Allowing reading
> > from a locked image is harmless in most cases, so do it.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  qemu-io.c | 2 ++
> >  1 file changed, 2 insertions(+)
> 
> What about open_f()?

Right, should do the same.

Fam

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 06/16] iotests: 055: Don't attach the drive to vm for drive-backup
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 06/16] iotests: 055: Don't attach the drive to vm for drive-backup Fam Zheng
@ 2017-02-08  0:06   ` Max Reitz
  0 siblings, 0 replies; 39+ messages in thread
From: Max Reitz @ 2017-02-08  0:06 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

[-- Attachment #1: Type: text/plain, Size: 259 bytes --]

On 23.01.2017 13:30, Fam Zheng wrote:
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  tests/qemu-iotests/055 | 32 ++++++++++++++++++--------------
>  1 file changed, 18 insertions(+), 14 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 07/16] iotests: 030: Read-only open image for getting map
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 07/16] iotests: 030: Read-only open image for getting map Fam Zheng
@ 2017-02-08  0:11   ` Max Reitz
  0 siblings, 0 replies; 39+ messages in thread
From: Max Reitz @ 2017-02-08  0:11 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

[-- Attachment #1: Type: text/plain, Size: 300 bytes --]

On 23.01.2017 13:30, Fam Zheng wrote:
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  tests/qemu-iotests/030 | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>

(As always, took a couple of retries to let test 030 pass for me...)


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 08/16] iotests: 087: Don't attach test image twice
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 08/16] iotests: 087: Don't attach test image twice Fam Zheng
@ 2017-02-08  0:14   ` Max Reitz
  0 siblings, 0 replies; 39+ messages in thread
From: Max Reitz @ 2017-02-08  0:14 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

[-- Attachment #1: Type: text/plain, Size: 380 bytes --]

On 23.01.2017 13:30, Fam Zheng wrote:
> The test scenario doesn't require the same image, instead it focuses on
> the duplicated node-name, so use null-co to avoid locking conflict.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  tests/qemu-iotests/087 | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 09/16] iotests: 085: Avoid image locking conflict
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 09/16] iotests: 085: Avoid image locking conflict Fam Zheng
@ 2017-02-08  0:25   ` Max Reitz
  0 siblings, 0 replies; 39+ messages in thread
From: Max Reitz @ 2017-02-08  0:25 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

[-- Attachment #1: Type: text/plain, Size: 3495 bytes --]

On 23.01.2017 13:30, Fam Zheng wrote:
> In the case where we test the expected error when a blockdev-snapshot
> target already has a backing image, backing chain is opened multiple

*the backing chain

> times. This will be a problem when we use image locking, so use a
> different backing file that is not already open.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  tests/qemu-iotests/085     | 32 +++++++++++++++++++-------------
>  tests/qemu-iotests/085.out |  3 ++-
>  2 files changed, 21 insertions(+), 14 deletions(-)
> 
> diff --git a/tests/qemu-iotests/085 b/tests/qemu-iotests/085
> index c53e97f..c91d78d 100755
> --- a/tests/qemu-iotests/085
> +++ b/tests/qemu-iotests/085
> @@ -87,24 +87,26 @@ function create_group_snapshot()
>  }
>  
>  # ${1}: unique identifier for the snapshot filename
> -# ${2}: true: open backing images; false: don't open them (default)
> +# ${2}: extra_params to the blockdev-add command
> +# ${3}: filename
> +function do_blockdev_add()
> +{
> +    cmd="{ 'execute': 'blockdev-add', 'arguments':
> +           { 'driver': 'qcow2', 'node-name': 'snap_${1}', ${2}
> +             'file':
> +             { 'driver': 'file', 'filename': '${3}',
> +               'node-name': 'file_${1}' } } }"
> +    _send_qemu_cmd $h "${cmd}" "return"
> +}
> +
> +# ${1}: unique identifier for the snapshot filename
>  function add_snapshot_image()
>  {
> -    if [ "${2}" = "true" ]; then
> -        extra_params=""
> -    else
> -        extra_params="'backing': '', "
> -    fi
>      base_image="${TEST_DIR}/$((${1}-1))-${snapshot_virt0}"
>      snapshot_file="${TEST_DIR}/${1}-${snapshot_virt0}"
>      _make_test_img -b "${base_image}" "$size"
>      mv "${TEST_IMG}" "${snapshot_file}"
> -    cmd="{ 'execute': 'blockdev-add', 'arguments':
> -           { 'driver': 'qcow2', 'node-name': 'snap_${1}', ${extra_params}
> -             'file':
> -             { 'driver': 'file', 'filename': '${snapshot_file}',
> -               'node-name': 'file_${1}' } } }"
> -    _send_qemu_cmd $h "${cmd}" "return"
> +    do_blockdev_add "$1" "'backing': '', " "${snapshot_file}"
>  }
>  
>  # ${1}: unique identifier for the snapshot filename
> @@ -222,7 +224,11 @@ echo === Invalid command - snapshot node has a backing image ===
>  echo
>  
>  SNAPSHOTS=$((${SNAPSHOTS}+1))
> -add_snapshot_image ${SNAPSHOTS} true
> +
> +_make_test_img "$size"
> +mv "${TEST_IMG}" "${TEST_IMG}.base"
> +_make_test_img -b "${TEST_IMG}.base" "$size"
> +do_blockdev_add ${SNAPSHOTS} "" "${TEST_IMG}"

These files ($TEST_IMG and ${TEST_IMG}.base) are not removed in _cleanup.

Looks good apart from that.

Max

>  blockdev_snapshot ${SNAPSHOTS} error
>  
>  echo
> diff --git a/tests/qemu-iotests/085.out b/tests/qemu-iotests/085.out
> index 08e4bb7..7bbf84d 100644
> --- a/tests/qemu-iotests/085.out
> +++ b/tests/qemu-iotests/085.out
> @@ -78,7 +78,8 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 backing_file=TEST_DIR/
>  
>  === Invalid command - snapshot node has a backing image ===
>  
> -Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 backing_file=TEST_DIR/12-snapshot-v0.IMGFMT
> +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
> +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 backing_file=TEST_DIR/t.IMGFMT.base
>  {"return": {}}
>  {"error": {"class": "GenericError", "desc": "The snapshot already has a backing image"}}
>  
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 10/16] iotests: 091: Quit QEMU before checking image
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 10/16] iotests: 091: Quit QEMU before checking image Fam Zheng
@ 2017-02-08  0:32   ` Max Reitz
  0 siblings, 0 replies; 39+ messages in thread
From: Max Reitz @ 2017-02-08  0:32 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

[-- Attachment #1: Type: text/plain, Size: 913 bytes --]

On 23.01.2017 13:30, Fam Zheng wrote:
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  tests/qemu-iotests/091 | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/tests/qemu-iotests/091 b/tests/qemu-iotests/091
> index 32bbd56..10ac4a8 100755
> --- a/tests/qemu-iotests/091
> +++ b/tests/qemu-iotests/091
> @@ -95,7 +95,9 @@ echo "vm2: qemu process running successfully"
>  echo "vm2: flush io, and quit"
>  _send_qemu_cmd $h2 'qemu-io disk flush' "(qemu)"
>  _send_qemu_cmd $h2 'quit' ""
> +_send_qemu_cmd $h1 'quit' ""
>  
> +wait

And to think I added $wait for _cleanup_qemu when it was so simple...

Well:

Reviewed-by: Max Reitz <mreitz@redhat.com>

(...maybe _cleanup_qemu's output redirection ability is still useful? I
don't know.)

>  echo "Check image pattern"
>  ${QEMU_IO} -c "read -P 0x22 0 4M" "${TEST_IMG}" | _filter_testdir | _filter_qemu_io
>  
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 11/16] iotests: 172: Use separate images for multiple devices
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 11/16] iotests: 172: Use separate images for multiple devices Fam Zheng
@ 2017-02-08  0:43   ` Max Reitz
  0 siblings, 0 replies; 39+ messages in thread
From: Max Reitz @ 2017-02-08  0:43 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

[-- Attachment #1: Type: text/plain, Size: 2992 bytes --]

On 23.01.2017 13:30, Fam Zheng wrote:
> To avoid image lock failures.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  tests/qemu-iotests/172     | 53 ++++++++++++++++++++++++----------------------
>  tests/qemu-iotests/172.out | 50 ++++++++++++++++++++++---------------------
>  2 files changed, 54 insertions(+), 49 deletions(-)
> 
> diff --git a/tests/qemu-iotests/172 b/tests/qemu-iotests/172
> index 1b7d3a1..f9d4cff 100755
> --- a/tests/qemu-iotests/172
> +++ b/tests/qemu-iotests/172
> @@ -86,6 +86,9 @@ size=720k
>  
>  _make_test_img $size
>  
> +TEST_IMG="$TEST_IMG.2" _make_test_img $size
> +TEST_IMG="$TEST_IMG.3" _make_test_img $size

These images should be removed in _cleanup.

Alternatively, you could use the good old driver=null-co,size=$size.

(btw, I'm amazed that size=720k works. Not sure if that is intended, but
I won't complain. It even works in JSON (with 'size': '720k').)

> +
>  # Default drive semantics:
>  #
>  # By default you get a single empty floppy drive. You can override it with
> @@ -105,7 +108,7 @@ echo === Using -fda/-fdb options ===
>  
>  check_floppy_qtree -fda "$TEST_IMG"
>  check_floppy_qtree -fdb "$TEST_IMG"
> -check_floppy_qtree -fda "$TEST_IMG" -fdb "$TEST_IMG"
> +check_floppy_qtree -fda "$TEST_IMG" -fdb "$TEST_IMG.2"
>  
>  
>  echo
> @@ -114,7 +117,7 @@ echo === Using -drive options ===
>  
>  check_floppy_qtree -drive if=floppy,file="$TEST_IMG"
>  check_floppy_qtree -drive if=floppy,file="$TEST_IMG",index=1
> -check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive if=floppy,file="$TEST_IMG",index=1
> +check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive if=floppy,file="$TEST_IMG.2",index=1
>  
>  echo
>  echo
> @@ -122,7 +125,7 @@ echo === Using -drive if=none and -global ===
>  
>  check_floppy_qtree -drive if=none,file="$TEST_IMG" -global isa-fdc.driveA=none0
>  check_floppy_qtree -drive if=none,file="$TEST_IMG" -global isa-fdc.driveB=none0
> -check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" \
> +check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG.2" \
>                     -global isa-fdc.driveA=none0 -global isa-fdc.driveB=none1
>  
>  echo
> @@ -131,7 +134,7 @@ echo === Using -drive if=none and -device ===
>  
>  check_floppy_qtree -drive if=none,file="$TEST_IMG" -device floppy,drive=none0
>  check_floppy_qtree -drive if=none,file="$TEST_IMG" -device floppy,drive=none0,unit=1
> -check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive if=none,file="$TEST_IMG" \
> +check_floppy_qtree -drive if=none,file="$TEST_IMG",file.disable-lock=on -drive if=none,file="$TEST_IMG.2" \

disable-lock is not yet available, and also I don't know why you are
using it here. Is it just an artifact from an earlier version, perhaps?

Rest looks good.

Max

>                     -device floppy,drive=none0 -device floppy,drive=none1,unit=1
>  
>  echo

[...]



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 12/16] tests: Use null-co:// instead of /dev/null as the dummy image
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 12/16] tests: Use null-co:// instead of /dev/null as the dummy image Fam Zheng
@ 2017-02-08  0:45   ` Max Reitz
  0 siblings, 0 replies; 39+ messages in thread
From: Max Reitz @ 2017-02-08  0:45 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

[-- Attachment #1: Type: text/plain, Size: 504 bytes --]

On 23.01.2017 13:30, Fam Zheng wrote:
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  tests/drive_del-test.c    | 2 +-
>  tests/nvme-test.c         | 2 +-
>  tests/usb-hcd-uhci-test.c | 2 +-
>  tests/usb-hcd-xhci-test.c | 2 +-
>  tests/virtio-blk-test.c   | 2 +-
>  tests/virtio-scsi-test.c  | 4 ++--
>  6 files changed, 7 insertions(+), 7 deletions(-)

Let me just say that null(-co) has turned out to be a really useful
driver indeed.

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 13/16] tests: Disable image lock in test-replication
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 13/16] tests: Disable image lock in test-replication Fam Zheng
@ 2017-02-08  0:56   ` Max Reitz
  0 siblings, 0 replies; 39+ messages in thread
From: Max Reitz @ 2017-02-08  0:56 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

[-- Attachment #1: Type: text/plain, Size: 3025 bytes --]

On 23.01.2017 13:30, Fam Zheng wrote:
> The COLO block replication architecture requires one disk to be shared
> between primary and secondary, in the test both processes use posix file
> protocol (instead of over NBD) so it is affected by image locking.
> Disable the lock.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  tests/test-replication.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)

Running these tests (even when fixed) results in a failure before the
next patch because disable-lock is not yet implemented. So just putting
this test fix after the next patch wouldn't be any worse, the test would
fail for a single revision.

I'm more or less fine with this, but you could also just squash this
patch into the next one; it's small enough.

Or you find a way to split up the next patch so this can go in between.

> diff --git a/tests/test-replication.c b/tests/test-replication.c
> index fac2da3..5bede49 100644
> --- a/tests/test-replication.c
> +++ b/tests/test-replication.c
> @@ -179,7 +179,8 @@ static BlockBackend *start_primary(void)
>      char *cmdline;
>  
>      cmdline = g_strdup_printf("driver=replication,mode=primary,node-name=xxx,"
> -                              "file.driver=qcow2,file.file.filename=%s"
> +                              "file.driver=qcow2,file.file.filename=%s,"
> +                              "file.file.disable-lock=on"
>                                , p_local_disk);

That is interesting formatting (albeit pre-existing).

>      opts = qemu_opts_parse_noisily(&qemu_drive_opts, cmdline, false);
>      g_free(cmdline);
> @@ -310,7 +311,9 @@ static BlockBackend *start_secondary(void)
>      Error *local_err = NULL;
>  
>      /* add s_local_disk and forge S_LOCAL_DISK_ID */
> -    cmdline = g_strdup_printf("file.filename=%s,driver=qcow2", s_local_disk);
> +    cmdline = g_strdup_printf("file.filename=%s,driver=qcow2,"
> +                              "file.file.disable-lock=on",
> +                              s_local_disk);
>      opts = qemu_opts_parse_noisily(&qemu_drive_opts, cmdline, false);
>      g_free(cmdline);
>  
> @@ -331,8 +334,10 @@ static BlockBackend *start_secondary(void)
>      /* add S_(ACTIVE/HIDDEN)_DISK and forge S_ID */
>      cmdline = g_strdup_printf("driver=replication,mode=secondary,top-id=%s,"
>                                "file.driver=qcow2,file.file.filename=%s,"
> +                              "file.file.disable-lock=on",

Comma should be before the " (gcc complains about a format string error
when compiling).

>                                "file.backing.driver=qcow2,"
>                                "file.backing.file.filename=%s,"
> +                              "file.backing.file.disable-lock=on",

Same here.

Max

>                                "file.backing.backing=%s"
>                                , S_ID, s_active_disk, s_hidden_disk
>                                , S_LOCAL_DISK_ID);
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 14/16] file-posix: Implement image locking
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 14/16] file-posix: Implement image locking Fam Zheng
@ 2017-02-08  3:05   ` Max Reitz
  2017-02-08  6:00     ` Fam Zheng
  0 siblings, 1 reply; 39+ messages in thread
From: Max Reitz @ 2017-02-08  3:05 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

[-- Attachment #1: Type: text/plain, Size: 15397 bytes --]

On 23.01.2017 13:30, Fam Zheng wrote:
> This implements open flag sensible image locking for local file
> and host device protocol.
> 
> virtlockd in libvirt locks the first byte, so we start looking at the
> file bytes from 1.
> 
> Quoting what was proposed by Kevin Wolf <kwolf@redhat.com>, there are
> four locking modes by combining two bits (BDRV_O_RDWR and
> BDRV_O_SHARE_RW), and implemented by taking two locks.
> 
> The complication is in the transactional reopen.  To make the reopen
> logic managable, and allow better reuse, the code is internally
> organized with a table from old mode to the new one.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/file-posix.c | 681 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 678 insertions(+), 3 deletions(-)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 28b47d9..a8c76d6 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -131,8 +131,45 @@ do { \
>  
>  #define MAX_BLOCKSIZE	4096
>  
> +/* Posix file locking bytes. Libvirt takes byte 0, we start from byte 0x10,
> + * leaving a few more bytes for its future use. */
> +#define RAW_LOCK_BYTE_MIN             0x10
> +#define RAW_LOCK_BYTE_NO_OTHER_WRITER 0x10
> +#define RAW_LOCK_BYTE_WRITE           0x11
> +#ifdef F_OFD_SETLK
> +#define RAW_LOCK_SUPPORTED 1
> +#else
> +#define RAW_LOCK_SUPPORTED 0
> +#endif
> +
> +/*
> + ** reader that can tolerate writers: Don't do anything
> + *
> + ** reader that can't tolerate writers: Take shared lock on byte 1. Test
> + *  byte 2 is unlocked.

Byte 0x10 and 0x11 now -- or you call them byte 0 and byte 1. Or "the
first byte" and "the second byte".

Also, it should probably be "Test whether byte 2 is unlocked" or "Affirm
that byte 2 is unlocked" (this is what my sense of the English language
is telling me, may be wrong).

> + *
> + ** shared writer: Take shared lock on byte 2. Test byte 1 is unlocked.
> + *
> + ** exclusive writer: Take exclusive locks on both bytes.
> + */
> +
> +typedef enum {
> +    /* Read only and accept other writers. */
> +    RAW_L_READ_SHARE_RW,
> +    /* Read only and try to forbid other writers. */
> +    RAW_L_READ,
> +    /* Read write and accept other writers. */
> +    RAW_L_WRITE_SHARE_RW,
> +    /* Read write and try to forbid other writers. */

While fully comprehensible and I didn't nag about this in the last
revision, it isn't real English so let me complain now: May be better as
"Read+write", "Read & write" or "Read/write".

("Read and write" is kind of bad because of the immediate "and" afterwards.)

> +    RAW_L_WRITE,
> +} BDRVRawLockMode;
> +
>  typedef struct BDRVRawState {
>      int fd;
> +    /* A dup of @fd to make manipulating lock easier, especially during reopen,
> +     * where this will accept BDRVRawReopenState.lock_fd. */
> +    int lock_fd;
> +    bool disable_lock;
>      int type;
>      int open_flags;
>      size_t buf_align;

[...]

> @@ -393,10 +487,88 @@ static QemuOptsList raw_runtime_opts = {

[...]

> +static int raw_apply_image_lock(BlockDriverState *bs, int bdrv_flags,
> +                                Error **errp)
> +{
> +    int ret;
> +    BDRVRawState *s = bs->opaque;
> +    BDRVRawLockMode lock_mode;
> +
> +    if (!raw_lock_enabled(bs)) {
> +        return 0;
> +    }
> +    assert(s->cur_lock_mode == RAW_L_READ_SHARE_RW);
> +    lock_mode = raw_get_lock_mode(bdrv_flags);
> +    ret = raw_open_lockfd(bs->exact_filename, s->open_flags, &lock_mode,
> +                          errp);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +    s->lock_fd = ret;
> +    if (lock_mode == RAW_L_READ_SHARE_RW) {
> +        return 0;
> +    }

Not really sure why this needs to be special-cased. It doesn't hurt, but
it doesn't really improve anything either, does it?

> +    ret = raw_lock_fd(s->lock_fd, lock_mode, errp);
> +    if (ret) {
> +        return ret;
> +    }
> +    s->cur_lock_mode = lock_mode;
> +    return 0;
> +}
> +
>  static int raw_open_common(BlockDriverState *bs, QDict *options,
>                             int bdrv_flags, int open_flags, Error **errp)
>  {

[...]

> @@ -538,6 +720,465 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
>      return raw_open_common(bs, options, flags, 0, errp);
>  }
>  
> +typedef enum {
> +    RAW_LT_PREPARE,
> +    RAW_LT_COMMIT,
> +    RAW_LT_ABORT
> +} RawLockTransOp;
> +
> +typedef int (*RawReopenFunc)(RawLockTransOp op,
> +                             int old_lock_fd, int new_lock_fd,
> +                             BDRVRawLockMode old_lock,
> +                             BDRVRawLockMode new_lock,
> +                             Error **errp);
> +
> +static int raw_lt_nop(RawLockTransOp op,
> +                      int old_lock_fd, int new_lock_fd,
> +                      BDRVRawLockMode old_lock,
> +                      BDRVRawLockMode new_lock,
> +                      Error **errp)
> +{
> +    assert(old_lock == new_lock || new_lock == RAW_L_READ_SHARE_RW);
> +    return 0;
> +}
> +
> +static int raw_lt_from_unlock(RawLockTransOp op,
> +                              int old_lock_fd, int new_lock_fd,
> +                              BDRVRawLockMode old_lock,
> +                              BDRVRawLockMode new_lock,
> +                              Error **errp)
> +{
> +    assert(old_lock != new_lock);
> +    assert(old_lock == RAW_L_READ_SHARE_RW);
> +    switch (op) {
> +    case RAW_LT_PREPARE:
> +        return raw_lock_fd(new_lock_fd, new_lock, errp);
> +    case RAW_LT_COMMIT:
> +        break;

Ah, that's what the break was meant for.

(It was one line above in v10.)

> +    case RAW_LT_ABORT:
> +        break;
> +    }
> +
> +    return 0;
> +}
> +
> +static int raw_lt_read_to_write_share(RawLockTransOp op,
> +                                      int old_lock_fd, int new_lock_fd,
> +                                      BDRVRawLockMode old_lock,
> +                                      BDRVRawLockMode new_lock,
> +                                      Error **errp)
> +{
> +    int ret = 0;
> +
> +    assert(old_lock == RAW_L_READ);
> +    assert(new_lock == RAW_L_WRITE_SHARE_RW);
> +
> +    /*
> +     *        lock byte "no other writer"      lock byte "write"
> +     * old                S                           0
> +     * new                0                           S
> +     *
> +     * (0 = unlocked; S = shared; X = exclusive.)
> +     */

Thanks, these comments are nice.

> +    switch (op) {
> +    case RAW_LT_PREPARE:
> +        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, true);
> +        if (ret) {
> +            error_setg_errno(errp, -ret, "Failed to lock new fd (write byte)");
> +            break;
> +        }
> +        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
> +        if (ret) {
> +            error_setg_errno(errp, -ret, "Failed to lock new fd (share byte)");
> +            break;
> +        }
> +        ret = qemu_unlock_fd(old_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1);
> +        if (ret) {
> +            error_setg_errno(errp, -ret, "Failed to unlock old fd (share byte)");
> +            break;
> +        }
> +        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, true);
> +        if (ret) {
> +            error_setg_errno(errp, -ret, "Failed to upgrade new fd (share byte)");
> +            break;
> +        }
> +        ret = qemu_unlock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1);
> +        if (ret) {
> +            /* This is very unlikely, but catch it anyway. */
> +            error_setg_errno(errp, -ret, "Failed to unlock new fd (share byte)");

Let's say we fail here, however unlikely it is...

> +            break;
> +        }
> +        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
> +        if (ret) {
> +            error_setg_errno(errp, -ret, "Failed to downgrade new fd (write byte)");
> +            break;
> +        }
> +        break;
> +    case RAW_LT_COMMIT:
> +        break;
> +    case RAW_LT_ABORT:
> +        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
> +        if (ret) {
> +            error_report("Failed to restore lock on old fd (share byte)");
> +        }

...will this not fail then?

(exclusive lock is still present on new_lock_fd.)

> +        break;
> +    }
> +    return ret;

By the way, couldn't this function use the same logic as
raw_lt_write_share_to_read()? (i.e. lock old_lock_fd's
RAW_LOCK_BYTE_NO_OTHER_WRITER exclusively and then lock new_lock_fd's
RAW_LOCK_BYTE_WRITE in shared mode)

> +}

[...]

> +static int raw_lt_write_share_to_read(RawLockTransOp op,
> +                                      int old_lock_fd, int new_lock_fd,
> +                                      BDRVRawLockMode old_lock,
> +                                      BDRVRawLockMode new_lock,
> +                                      Error **errp)
> +{
> +    int ret = 0;
> +
> +    assert(old_lock == RAW_L_WRITE_SHARE_RW);
> +    assert(new_lock == RAW_L_READ);
> +    /*
> +     *        lock byte "no other writer"      lock byte "write"
> +     * old                0                           S
> +     * new                S                           0
> +     *
> +     * (0 = unlocked; S = shared; X = exclusive.)
> +     */
> +    switch (op) {
> +    case RAW_LT_PREPARE:
> +        /* Make sure there are no other writers. */
> +        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_WRITE, 1, true);
> +        if (ret) {
> +            error_setg_errno(errp, -ret, "Failed to lock old fd (write byte)");
> +            break;
> +        }
> +        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
> +        if (ret) {
> +            error_setg_errno(errp, -ret, "Failed to lock new fd (share byte)");
> +            break;
> +        }
> +        break;
> +    case RAW_LT_COMMIT:
> +        break;
> +    case RAW_LT_ABORT:
> +        break;

Shouldn't the abort path downgrade the exclusive lock on old_lock_fd to
a shared lock?

> +    }
> +    return ret;
> +}

[...]

> +static int raw_lt_write_to_read(RawLockTransOp op,
> +                                int old_lock_fd, int new_lock_fd,
> +                                BDRVRawLockMode old_lock,
> +                                BDRVRawLockMode new_lock,
> +                                Error **errp)
> +{
> +    int ret = 0;
> +
> +    assert(old_lock == RAW_L_WRITE);
> +    assert(new_lock == RAW_L_READ);
> +    /*
> +     *        lock byte "no other writer"      lock byte "write"
> +     * old                X                           X
> +     * new                S                           0
> +     *
> +     * (0 = unlocked; S = shared; X = exclusive.)
> +     */
> +    switch (op) {
> +    case RAW_LT_PREPARE:
> +        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
> +        if (ret) {
> +            error_setg_errno(errp, -ret, "Failed to downgrade old fd (share byte)");
> +            break;
> +        }
> +        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, false);
> +        if (ret) {
> +            error_setg_errno(errp, -ret, "Failed to lock new fd (share byte)");
> +            break;
> +        }
> +        break;
> +    case RAW_LT_COMMIT:
> +        break;
> +    case RAW_LT_ABORT:
> +        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_NO_OTHER_WRITER, 1, true);
> +        if (ret) {
> +            error_setg_errno(errp, -ret, "Failed to restore old fd (share byte)");
> +        }

I think you should release the lock on new_lock_fd first.

> +        break;
> +    }
> +    return ret;
> +}
> +
> +static int raw_lt_write_to_write_share(RawLockTransOp op,
> +                                       int old_lock_fd, int new_lock_fd,
> +                                       BDRVRawLockMode old_lock,
> +                                       BDRVRawLockMode new_lock,
> +                                       Error **errp)
> +{
> +    int ret = 0;
> +
> +    assert(old_lock == RAW_L_WRITE);
> +    assert(new_lock == RAW_L_WRITE_SHARE_RW);
> +    /*
> +     *        lock byte "no other writer"      lock byte "write"
> +     * old                X                           X
> +     * new                0                           S
> +     *
> +     * (0 = unlocked; S = shared; X = exclusive.)
> +     */
> +    switch (op) {
> +    case RAW_LT_PREPARE:
> +        break;
> +    case RAW_LT_COMMIT:
> +        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
> +        if (ret) {
> +            error_report("Failed to downgrade old fd (share byte)");
> +            break;
> +        }
> +        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
> +        if (ret) {
> +            error_report("Failed to unlock new fd (share byte)");
> +            break;
> +        }

The second one is not an "unlock", but a new shared lock. Which brings
me to the point that both of these commands can fail and thus should be
in the prepare path.

(This function should be a mirror of raw_lt_write_to_read, if I'm not
mistaken.)

> +        break;
> +    case RAW_LT_ABORT:
> +        break;
> +    }
> +    return ret;
> +}
> +
> +/**
> + * Transactionally moving between possible locking states is tricky and must be
> + * done carefully. That is mostly because downgrading an exclusive lock to
> + * shared or unlocked is not guaranteed to be revertible. As a result, in such

Interesting. Wiktionary says "revertible" means "able to be reverted",
which sounds reasonable, albeit I'm not sure I have ever heard
"revertible" before.

However, my favorite online dictionary gave me a German word I have
never heard before.

Note that Wiktionary also has the word "revertable" with the same
definition. Of course, it also has "reversible". Now I understand there
is a difference between "to revert" and "to reverse", but maybe
"reversible" is still the better choice considering it has a unique
meaning and scores more than thousand times as many results on Google.

(For anyone wondering, the German word is "heimfällig" and it means
"designated to go back to the original owner" (e.g. after death). It's
apparently related to "anheimfallen", which I do know, which means "to
become someone's property" or "to become a victim of something"
("something" being a process of some sorts, usually, such as a mishap).)

((Apparently "heimfällig" is used in Austria and Swiss, mostly.))

Max

> + * cases we have to defer the downgrading to "commit", given that no revert will
> + * happen after that point, and that downgrading a lock should never fail.
> + *
> + * On the other hand, upgrading a lock (e.g. from unlocked or shared to
> + * exclusive lock) must happen in "prepare" because it may fail.
> + *
> + * Manage the operation matrix with this state transition table to make
> + * fulfilling above conditions easier.
> + */

[...]



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 14/16] file-posix: Implement image locking
  2017-02-08  3:05   ` Max Reitz
@ 2017-02-08  6:00     ` Fam Zheng
  2017-02-08 13:18       ` Max Reitz
  0 siblings, 1 reply; 39+ messages in thread
From: Fam Zheng @ 2017-02-08  6:00 UTC (permalink / raw)
  To: Max Reitz
  Cc: qemu-devel, Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

On Wed, 02/08 04:05, Max Reitz wrote:
> > +static int raw_lt_write_to_write_share(RawLockTransOp op,
> > +                                       int old_lock_fd, int new_lock_fd,
> > +                                       BDRVRawLockMode old_lock,
> > +                                       BDRVRawLockMode new_lock,
> > +                                       Error **errp)
> > +{
> > +    int ret = 0;
> > +
> > +    assert(old_lock == RAW_L_WRITE);
> > +    assert(new_lock == RAW_L_WRITE_SHARE_RW);
> > +    /*
> > +     *        lock byte "no other writer"      lock byte "write"
> > +     * old                X                           X
> > +     * new                0                           S
> > +     *
> > +     * (0 = unlocked; S = shared; X = exclusive.)
> > +     */
> > +    switch (op) {
> > +    case RAW_LT_PREPARE:
> > +        break;
> > +    case RAW_LT_COMMIT:
> > +        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
> > +        if (ret) {
> > +            error_report("Failed to downgrade old fd (share byte)");
> > +            break;
> > +        }
> > +        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
> > +        if (ret) {
> > +            error_report("Failed to unlock new fd (share byte)");
> > +            break;
> > +        }
> 
> The second one is not an "unlock", but a new shared lock.

You are right.

> Which brings
> me to the point that both of these commands can fail and thus should be
> in the prepare path.

We cannot. If we lose the exclusive lock already in prepare, and some other
things fail later in the transaction, abort() may not be able to restore that
lock (another process took a shared lock in between).

The reason for my code is, the lock semantics implies both of these commands can
succeed, so it doesn't hurt if we ignore ret codes here. I'm just trying to
catch the very unlikely abnormalities.

> 
> (This function should be a mirror of raw_lt_write_to_read, if I'm not
> mistaken.)
> 
> > +        break;
> > +    case RAW_LT_ABORT:
> > +        break;
> > +    }
> > +    return ret;
> > +}

Fam

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 14/16] file-posix: Implement image locking
  2017-02-08  6:00     ` Fam Zheng
@ 2017-02-08 13:18       ` Max Reitz
  2017-02-08 13:40         ` Fam Zheng
  0 siblings, 1 reply; 39+ messages in thread
From: Max Reitz @ 2017-02-08 13:18 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

[-- Attachment #1: Type: text/plain, Size: 2336 bytes --]

On 08.02.2017 07:00, Fam Zheng wrote:
> On Wed, 02/08 04:05, Max Reitz wrote:
>>> +static int raw_lt_write_to_write_share(RawLockTransOp op,
>>> +                                       int old_lock_fd, int new_lock_fd,
>>> +                                       BDRVRawLockMode old_lock,
>>> +                                       BDRVRawLockMode new_lock,
>>> +                                       Error **errp)
>>> +{
>>> +    int ret = 0;
>>> +
>>> +    assert(old_lock == RAW_L_WRITE);
>>> +    assert(new_lock == RAW_L_WRITE_SHARE_RW);
>>> +    /*
>>> +     *        lock byte "no other writer"      lock byte "write"
>>> +     * old                X                           X
>>> +     * new                0                           S
>>> +     *
>>> +     * (0 = unlocked; S = shared; X = exclusive.)
>>> +     */
>>> +    switch (op) {
>>> +    case RAW_LT_PREPARE:
>>> +        break;
>>> +    case RAW_LT_COMMIT:
>>> +        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
>>> +        if (ret) {
>>> +            error_report("Failed to downgrade old fd (share byte)");
>>> +            break;
>>> +        }
>>> +        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
>>> +        if (ret) {
>>> +            error_report("Failed to unlock new fd (share byte)");
>>> +            break;
>>> +        }
>>
>> The second one is not an "unlock", but a new shared lock.
> 
> You are right.
> 
>> Which brings
>> me to the point that both of these commands can fail and thus should be
>> in the prepare path.
> 
> We cannot. If we lose the exclusive lock already in prepare, and some other
> things fail later in the transaction, abort() may not be able to restore that
> lock (another process took a shared lock in between).
> 
> The reason for my code is, the lock semantics implies both of these commands can
> succeed, so it doesn't hurt if we ignore ret codes here. I'm just trying to
> catch the very unlikely abnormalities.

Indeed. Well, then raw_lt_write_to_read() should do the same, though.

Max

>> (This function should be a mirror of raw_lt_write_to_read, if I'm not
>> mistaken.)
>>
>>> +        break;
>>> +    case RAW_LT_ABORT:
>>> +        break;
>>> +    }
>>> +    return ret;
>>> +}
> 
> Fam
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 14/16] file-posix: Implement image locking
  2017-02-08 13:18       ` Max Reitz
@ 2017-02-08 13:40         ` Fam Zheng
  0 siblings, 0 replies; 39+ messages in thread
From: Fam Zheng @ 2017-02-08 13:40 UTC (permalink / raw)
  To: Max Reitz
  Cc: qemu-devel, Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

On Wed, 02/08 14:18, Max Reitz wrote:
> On 08.02.2017 07:00, Fam Zheng wrote:
> > On Wed, 02/08 04:05, Max Reitz wrote:
> >>> +static int raw_lt_write_to_write_share(RawLockTransOp op,
> >>> +                                       int old_lock_fd, int new_lock_fd,
> >>> +                                       BDRVRawLockMode old_lock,
> >>> +                                       BDRVRawLockMode new_lock,
> >>> +                                       Error **errp)
> >>> +{
> >>> +    int ret = 0;
> >>> +
> >>> +    assert(old_lock == RAW_L_WRITE);
> >>> +    assert(new_lock == RAW_L_WRITE_SHARE_RW);
> >>> +    /*
> >>> +     *        lock byte "no other writer"      lock byte "write"
> >>> +     * old                X                           X
> >>> +     * new                0                           S
> >>> +     *
> >>> +     * (0 = unlocked; S = shared; X = exclusive.)
> >>> +     */
> >>> +    switch (op) {
> >>> +    case RAW_LT_PREPARE:
> >>> +        break;
> >>> +    case RAW_LT_COMMIT:
> >>> +        ret = qemu_lock_fd(old_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
> >>> +        if (ret) {
> >>> +            error_report("Failed to downgrade old fd (share byte)");
> >>> +            break;
> >>> +        }
> >>> +        ret = qemu_lock_fd(new_lock_fd, RAW_LOCK_BYTE_WRITE, 1, false);
> >>> +        if (ret) {
> >>> +            error_report("Failed to unlock new fd (share byte)");
> >>> +            break;
> >>> +        }
> >>
> >> The second one is not an "unlock", but a new shared lock.
> > 
> > You are right.
> > 
> >> Which brings
> >> me to the point that both of these commands can fail and thus should be
> >> in the prepare path.
> > 
> > We cannot. If we lose the exclusive lock already in prepare, and some other
> > things fail later in the transaction, abort() may not be able to restore that
> > lock (another process took a shared lock in between).
> > 
> > The reason for my code is, the lock semantics implies both of these commands can
> > succeed, so it doesn't hurt if we ignore ret codes here. I'm just trying to
> > catch the very unlikely abnormalities.
> 
> Indeed. Well, then raw_lt_write_to_read() should do the same, though.
> 
> Max

Right, will fix!

Fam

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 15/16] qcow2: Force "no other writer" lock on bs->file
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 15/16] qcow2: Force "no other writer" lock on bs->file Fam Zheng
@ 2017-02-08 14:33   ` Max Reitz
  2017-02-08 15:00     ` Fam Zheng
  0 siblings, 1 reply; 39+ messages in thread
From: Max Reitz @ 2017-02-08 14:33 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

[-- Attachment #1: Type: text/plain, Size: 1937 bytes --]

On 23.01.2017 13:30, Fam Zheng wrote:
> Writing to the same qcow2 file from two QEMU processes at the same time
> will never work correctly, so disable it even when the caller specifies
> BDRV_O_RDWR.
> 
> Other formats are less vulnerable because they don't have internal
> snapshots thus qemu-img is less often misused to create live snapshot.

Hm, OK, reasonable. Also reasonable since we can just wait with those
until we have op blockers.

Which brings me to the op blocker point. I don't know the exact
influence Kevin's patches will have on this series, but I'd imagine they
mostly change where the BDRV_O_SHARE_RW flag comes from or whether we
need that flag at all. Therefore, I personally don't mind the order in
which your series land in master.

> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/qcow2.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 96fb8a8..879361a 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -1177,6 +1177,17 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
>          }
>      }
>  
> +    if ((flags & BDRV_O_SHARE_RW) && (flags & BDRV_O_RDWR)) {
> +        /* Shared write is never a good idea for qcow2, override it.
> +         * XXX: Use permission propagation and masking mechanism in op blockers
> +         * API once it's there. */
> +        ret = bdrv_reopen(bs->file->bs, flags & ~BDRV_O_SHARE_RW, &local_err);
> +        if (ret) {
> +            error_propagate(errp, local_err);
> +            goto fail;
> +        }
> +    }
> +

Good in principle, but I don't think it should be at the bottom of this
function, especially not after "Repair image if dirty". I think it would
be good to put this right at the start of qcow2_open(), actually.

Max

>  #ifdef DEBUG_ALLOC
>      {
>          BdrvCheckResult result = {0};
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 16/16] tests: Add test-image-lock
  2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 16/16] tests: Add test-image-lock Fam Zheng
@ 2017-02-08 14:52   ` Max Reitz
  0 siblings, 0 replies; 39+ messages in thread
From: Max Reitz @ 2017-02-08 14:52 UTC (permalink / raw)
  To: Fam Zheng, qemu-devel
  Cc: Daniel P. Berrange, qemu-block, eblake, Kevin Wolf, rjones

[-- Attachment #1: Type: text/plain, Size: 3155 bytes --]

On 23.01.2017 13:30, Fam Zheng wrote:
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  tests/Makefile.include   |   2 +
>  tests/test-image-lock.c  | 206 +++++++++++++++++++++++++++++++++++++++++++++++
>  tests/test-replication.c |   6 +-
>  3 files changed, 211 insertions(+), 3 deletions(-)
>  create mode 100644 tests/test-image-lock.c

Looks good overall, just some minor issues below.

[...]

> diff --git a/tests/test-image-lock.c b/tests/test-image-lock.c
> new file mode 100644
> index 0000000..9c929ca
> --- /dev/null
> +++ b/tests/test-image-lock.c
> @@ -0,0 +1,206 @@

[...]

> +int main(int argc, char **argv)
> +{
> +#ifndef F_OFD_SETLK
> +    return 0;
> +#endif
> +    int r;
> +    test_image_fd = mkstemp(test_image);
> +
> +    qemu_init_main_loop(&error_fatal);
> +    bdrv_init();
> +
> +    g_assert(test_image_fd >= 0);
> +    r = ftruncate(test_image_fd, TEST_IMAGE_SIZE);
> +    g_assert_cmpint(r, ==, 0);
> +    g_test_init(&argc, &argv, NULL);
> +    g_test_add_func("/image-lock/compat", test_compat);
> +    g_test_add_func("/image-lock/compat_after_reopen", test_compat_after_reopen);
> +    g_test_add_func("/image-lock/compat_0bytefile", test_0bytefile);
> +    g_test_add_func("/image-lock/disable", test_disable);

Maybe test_0bytefile should be the last test since it truncates the test
image to 0 bytes.

> +    aio_context_acquire(qemu_get_aio_context());
> +    r = g_test_run();
> +    aio_context_release(qemu_get_aio_context());

Should test_image be unlinked here?

> +    return r;
> +}


> diff --git a/tests/test-replication.c b/tests/test-replication.c
> index 5bede49..5fb69d2 100644
> --- a/tests/test-replication.c
> +++ b/tests/test-replication.c
> @@ -312,7 +312,7 @@ static BlockBackend *start_secondary(void)
>  
>      /* add s_local_disk and forge S_LOCAL_DISK_ID */
>      cmdline = g_strdup_printf("file.filename=%s,driver=qcow2,"
> -                              "file.file.disable-lock=on",
> +                              "file.disable-lock=on",
>                                s_local_disk);
>      opts = qemu_opts_parse_noisily(&qemu_drive_opts, cmdline, false);
>      g_free(cmdline);
> @@ -334,10 +334,10 @@ static BlockBackend *start_secondary(void)
>      /* add S_(ACTIVE/HIDDEN)_DISK and forge S_ID */
>      cmdline = g_strdup_printf("driver=replication,mode=secondary,top-id=%s,"
>                                "file.driver=qcow2,file.file.filename=%s,"
> -                              "file.file.disable-lock=on",
> +                              "file.file.disable-lock=on,"
>                                "file.backing.driver=qcow2,"
>                                "file.backing.file.filename=%s,"
> -                              "file.backing.file.disable-lock=on",
> +                              "file.backing.file.disable-lock=on,"
>                                "file.backing.backing=%s"
>                                , S_ID, s_active_disk, s_hidden_disk
>                                , S_LOCAL_DISK_ID);
> 

These two hunks should be in patch 13, I presume.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH v12 15/16] qcow2: Force "no other writer" lock on bs->file
  2017-02-08 14:33   ` Max Reitz
@ 2017-02-08 15:00     ` Fam Zheng
  0 siblings, 0 replies; 39+ messages in thread
From: Fam Zheng @ 2017-02-08 15:00 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-devel, Kevin Wolf, rjones, qemu-block

On Wed, 02/08 15:33, Max Reitz wrote:
> On 23.01.2017 13:30, Fam Zheng wrote:
> > Writing to the same qcow2 file from two QEMU processes at the same time
> > will never work correctly, so disable it even when the caller specifies
> > BDRV_O_RDWR.
> > 
> > Other formats are less vulnerable because they don't have internal
> > snapshots thus qemu-img is less often misused to create live snapshot.
> 
> Hm, OK, reasonable. Also reasonable since we can just wait with those
> until we have op blockers.
> 
> Which brings me to the op blocker point. I don't know the exact
> influence Kevin's patches will have on this series, but I'd imagine they
> mostly change where the BDRV_O_SHARE_RW flag comes from or whether we
> need that flag at all. Therefore, I personally don't mind the order in
> which your series land in master.

I agree. I believe BDRV_O_SHARE_RW is replaced by op blocker primitives but I've
not checked yet. Tomorrow I'll take a look at Kevin's branch and see if it is
easy enough to softly base on top of it.

> 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  block/qcow2.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> > 
> > diff --git a/block/qcow2.c b/block/qcow2.c
> > index 96fb8a8..879361a 100644
> > --- a/block/qcow2.c
> > +++ b/block/qcow2.c
> > @@ -1177,6 +1177,17 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
> >          }
> >      }
> >  
> > +    if ((flags & BDRV_O_SHARE_RW) && (flags & BDRV_O_RDWR)) {
> > +        /* Shared write is never a good idea for qcow2, override it.
> > +         * XXX: Use permission propagation and masking mechanism in op blockers
> > +         * API once it's there. */
> > +        ret = bdrv_reopen(bs->file->bs, flags & ~BDRV_O_SHARE_RW, &local_err);
> > +        if (ret) {
> > +            error_propagate(errp, local_err);
> > +            goto fail;
> > +        }
> > +    }
> > +
> 
> Good in principle, but I don't think it should be at the bottom of this
> function, especially not after "Repair image if dirty". I think it would
> be good to put this right at the start of qcow2_open(), actually.

Sounds good!

Fam

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2017-02-08 15:00 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-23 12:30 [Qemu-devel] [PATCH v12 00/16] block: Image locking series Fam Zheng
2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 01/16] osdep: Add qemu_lock_fd and qemu_unlock_fd Fam Zheng
2017-02-05 21:38   ` Max Reitz
2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 02/16] block: Define BDRV_O_SHARE_RW Fam Zheng
2017-02-05 21:51   ` Max Reitz
2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 03/16] qemu-io: Set "share-rw" flag together with read-only Fam Zheng
2017-02-05 21:52   ` Max Reitz
2017-02-06  4:52     ` Fam Zheng
2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 04/16] qemu-img: Set "share-rw" flag in read-only commands Fam Zheng
2017-02-05 21:55   ` Max Reitz
2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 05/16] block: Set "share-rw" flag in drive-backup when sync=none Fam Zheng
2017-02-05 21:57   ` Max Reitz
2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 06/16] iotests: 055: Don't attach the drive to vm for drive-backup Fam Zheng
2017-02-08  0:06   ` Max Reitz
2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 07/16] iotests: 030: Read-only open image for getting map Fam Zheng
2017-02-08  0:11   ` Max Reitz
2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 08/16] iotests: 087: Don't attach test image twice Fam Zheng
2017-02-08  0:14   ` Max Reitz
2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 09/16] iotests: 085: Avoid image locking conflict Fam Zheng
2017-02-08  0:25   ` Max Reitz
2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 10/16] iotests: 091: Quit QEMU before checking image Fam Zheng
2017-02-08  0:32   ` Max Reitz
2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 11/16] iotests: 172: Use separate images for multiple devices Fam Zheng
2017-02-08  0:43   ` Max Reitz
2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 12/16] tests: Use null-co:// instead of /dev/null as the dummy image Fam Zheng
2017-02-08  0:45   ` Max Reitz
2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 13/16] tests: Disable image lock in test-replication Fam Zheng
2017-02-08  0:56   ` Max Reitz
2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 14/16] file-posix: Implement image locking Fam Zheng
2017-02-08  3:05   ` Max Reitz
2017-02-08  6:00     ` Fam Zheng
2017-02-08 13:18       ` Max Reitz
2017-02-08 13:40         ` Fam Zheng
2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 15/16] qcow2: Force "no other writer" lock on bs->file Fam Zheng
2017-02-08 14:33   ` Max Reitz
2017-02-08 15:00     ` Fam Zheng
2017-01-23 12:30 ` [Qemu-devel] [PATCH v12 16/16] tests: Add test-image-lock Fam Zheng
2017-02-08 14:52   ` Max Reitz
2017-02-05 21:58 ` [Qemu-devel] [PATCH v12 00/16] block: Image locking series Max Reitz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.