All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/3] block/file-posix: File locking during creation
@ 2018-04-20 22:09 Max Reitz
  2018-04-20 22:09 ` [Qemu-devel] [PATCH 1/3] block/file-posix: Pass FD to locking helpers Max Reitz
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Max Reitz @ 2018-04-20 22:09 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, Max Reitz, Kevin Wolf

Currently we do not take permissions on a file while it is being
created.  That is a bit sad.  The simplest way to test this is the
following:

    $ qemu-img create -f qcow2 foo.qcow2 64M
    Formatting 'foo.qcow2', fmt=qcow2 size=67108864 cluster_size=65536 lazy_refcounts=off refcount_bits=16
    $ qemu-img convert -f qcow2 -O qcow2 foo.qcow2 foo.qcow2
    qemu-img: foo.qcow2: error while converting qcow2: Failed to get "write" lock
    Is another process using the image?
    $ qemu-img info foo.qcow2
    image: foo.qcow2
    file format: raw
    virtual size: 0 (0 bytes)
    disk size: 0

(See also https://bugzilla.redhat.com/show_bug.cgi?id=1519144)

Here is what's happening: file-posix opens the file with
O_CREAT | O_TRUNC, thus truncating the file to 0.  Then qcow2 tries to
format it, but fails because it needs the WRITE permission to do so.  So
it cannot format it and the file stays empty.

We should actually take a WRITE and a RESIZE permission before we
truncate the file, and this is what this series does.

I personally consider the solution taken here a bit of a hack, but it
works and we don't have any locking in any drivers but file-posix
anyway, so it isn't lacking anything in that regard.  Integrating it in
blockdev-create might be possible, but there are two issues:
(1) It would be harder.
(2) What would we do anyway?  We'd advise protocol drivers to take WRITE
    and RESIZE permissions before they truncate a file to be empty...
    Well, and then they'd do exactly what file-posix is made to do in
    this series.

So basically what I consider a hack could be seen as exactly the right
way to tackle the issue: Protocol drivers have to ensure the correct
permissions (and they have to choose what those are) are applied before
changing any file -- which is what this series implements.


Max Reitz (3):
  block/file-posix: Pass FD to locking helpers
  block/file-posix: File locking during creation
  iotests: Add creation test to 153

 block/file-posix.c         | 68 ++++++++++++++++++++++++++++++++++++----------
 tests/qemu-iotests/153     | 18 ++++++++++++
 tests/qemu-iotests/153.out | 13 +++++++++
 3 files changed, 84 insertions(+), 15 deletions(-)

-- 
2.14.3

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 1/3] block/file-posix: Pass FD to locking helpers
  2018-04-20 22:09 [Qemu-devel] [PATCH 0/3] block/file-posix: File locking during creation Max Reitz
@ 2018-04-20 22:09 ` Max Reitz
  2018-04-27  6:24   ` Fam Zheng
  2018-04-20 22:09 ` [Qemu-devel] [PATCH 2/3] block/file-posix: File locking during creation Max Reitz
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Max Reitz @ 2018-04-20 22:09 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, Max Reitz, Kevin Wolf

raw_apply_lock_bytes() and raw_check_lock_bytes() currently take a
BDRVRawState *, but they only use the lock_fd field.  During image
creation, we do not have a BDRVRawState, but we do have an FD; so if we
want to reuse the functions there, we should modify them to receive only
the FD.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/file-posix.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 3794c0007a..c98a4a1556 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -630,7 +630,7 @@ typedef enum {
  * file; if @unlock == true, also unlock the unneeded bytes.
  * @shared_perm_lock_bits is the mask of all permissions that are NOT shared.
  */
-static int raw_apply_lock_bytes(BDRVRawState *s,
+static int raw_apply_lock_bytes(int fd,
                                 uint64_t perm_lock_bits,
                                 uint64_t shared_perm_lock_bits,
                                 bool unlock, Error **errp)
@@ -641,13 +641,13 @@ static int raw_apply_lock_bytes(BDRVRawState *s,
     PERM_FOREACH(i) {
         int off = RAW_LOCK_PERM_BASE + i;
         if (perm_lock_bits & (1ULL << i)) {
-            ret = qemu_lock_fd(s->lock_fd, off, 1, false);
+            ret = qemu_lock_fd(fd, off, 1, false);
             if (ret) {
                 error_setg(errp, "Failed to lock byte %d", off);
                 return ret;
             }
         } else if (unlock) {
-            ret = qemu_unlock_fd(s->lock_fd, off, 1);
+            ret = qemu_unlock_fd(fd, off, 1);
             if (ret) {
                 error_setg(errp, "Failed to unlock byte %d", off);
                 return ret;
@@ -657,13 +657,13 @@ static int raw_apply_lock_bytes(BDRVRawState *s,
     PERM_FOREACH(i) {
         int off = RAW_LOCK_SHARED_BASE + i;
         if (shared_perm_lock_bits & (1ULL << i)) {
-            ret = qemu_lock_fd(s->lock_fd, off, 1, false);
+            ret = qemu_lock_fd(fd, off, 1, false);
             if (ret) {
                 error_setg(errp, "Failed to lock byte %d", off);
                 return ret;
             }
         } else if (unlock) {
-            ret = qemu_unlock_fd(s->lock_fd, off, 1);
+            ret = qemu_unlock_fd(fd, off, 1);
             if (ret) {
                 error_setg(errp, "Failed to unlock byte %d", off);
                 return ret;
@@ -674,8 +674,7 @@ static int raw_apply_lock_bytes(BDRVRawState *s,
 }
 
 /* Check "unshared" bytes implied by @perm and ~@shared_perm in the file. */
-static int raw_check_lock_bytes(BDRVRawState *s,
-                                uint64_t perm, uint64_t shared_perm,
+static int raw_check_lock_bytes(int fd, uint64_t perm, uint64_t shared_perm,
                                 Error **errp)
 {
     int ret;
@@ -685,7 +684,7 @@ static int raw_check_lock_bytes(BDRVRawState *s,
         int off = RAW_LOCK_SHARED_BASE + i;
         uint64_t p = 1ULL << i;
         if (perm & p) {
-            ret = qemu_lock_fd_test(s->lock_fd, off, 1, true);
+            ret = qemu_lock_fd_test(fd, off, 1, true);
             if (ret) {
                 char *perm_name = bdrv_perm_names(p);
                 error_setg(errp,
@@ -702,7 +701,7 @@ static int raw_check_lock_bytes(BDRVRawState *s,
         int off = RAW_LOCK_PERM_BASE + i;
         uint64_t p = 1ULL << i;
         if (!(shared_perm & p)) {
-            ret = qemu_lock_fd_test(s->lock_fd, off, 1, true);
+            ret = qemu_lock_fd_test(fd, off, 1, true);
             if (ret) {
                 char *perm_name = bdrv_perm_names(p);
                 error_setg(errp,
@@ -739,11 +738,11 @@ static int raw_handle_perm_lock(BlockDriverState *bs,
 
     switch (op) {
     case RAW_PL_PREPARE:
-        ret = raw_apply_lock_bytes(s, s->perm | new_perm,
+        ret = raw_apply_lock_bytes(s->lock_fd, s->perm | new_perm,
                                    ~s->shared_perm | ~new_shared,
                                    false, errp);
         if (!ret) {
-            ret = raw_check_lock_bytes(s, new_perm, new_shared, errp);
+            ret = raw_check_lock_bytes(s->lock_fd, new_perm, new_shared, errp);
             if (!ret) {
                 return 0;
             }
@@ -751,7 +750,8 @@ static int raw_handle_perm_lock(BlockDriverState *bs,
         op = RAW_PL_ABORT;
         /* fall through to unlock bytes. */
     case RAW_PL_ABORT:
-        raw_apply_lock_bytes(s, s->perm, ~s->shared_perm, true, &local_err);
+        raw_apply_lock_bytes(s->lock_fd, s->perm, ~s->shared_perm,
+                             true, &local_err);
         if (local_err) {
             /* Theoretically the above call only unlocks bytes and it cannot
              * fail. Something weird happened, report it.
@@ -760,7 +760,8 @@ static int raw_handle_perm_lock(BlockDriverState *bs,
         }
         break;
     case RAW_PL_COMMIT:
-        raw_apply_lock_bytes(s, new_perm, ~new_shared, true, &local_err);
+        raw_apply_lock_bytes(s->lock_fd, new_perm, ~new_shared,
+                             true, &local_err);
         if (local_err) {
             /* Theoretically the above call only unlocks bytes and it cannot
              * fail. Something weird happened, report it.
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 2/3] block/file-posix: File locking during creation
  2018-04-20 22:09 [Qemu-devel] [PATCH 0/3] block/file-posix: File locking during creation Max Reitz
  2018-04-20 22:09 ` [Qemu-devel] [PATCH 1/3] block/file-posix: Pass FD to locking helpers Max Reitz
@ 2018-04-20 22:09 ` Max Reitz
  2018-04-27  6:22   ` Fam Zheng
  2018-04-20 22:09 ` [Qemu-devel] [PATCH 3/3] iotests: Add creation test to 153 Max Reitz
  2018-04-23 13:19 ` [Qemu-devel] [Qemu-block] [PATCH 0/3] block/file-posix: File locking during creation Alberto Garcia
  3 siblings, 1 reply; 13+ messages in thread
From: Max Reitz @ 2018-04-20 22:09 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, Max Reitz, Kevin Wolf

When creating a file, we should take the WRITE and RESIZE permissions.
We do not need either for the creation itself, but we do need them for
clearing and resizing it.  So we can take the proper permissions by
replacing O_TRUNC with an explicit truncation to 0, and by taking the
appropriate file locks between those two steps.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/file-posix.c | 41 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 39 insertions(+), 2 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index c98a4a1556..ed7932d6e8 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1992,6 +1992,7 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
 {
     BlockdevCreateOptionsFile *file_opts;
     int fd;
+    int perm, shared;
     int result = 0;
 
     /* Validate options and set default values */
@@ -2006,14 +2007,48 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
     }
 
     /* Create file */
-    fd = qemu_open(file_opts->filename, O_RDWR | O_CREAT | O_TRUNC | O_BINARY,
-                   0644);
+    fd = qemu_open(file_opts->filename, O_RDWR | O_CREAT | O_BINARY, 0644);
     if (fd < 0) {
         result = -errno;
         error_setg_errno(errp, -result, "Could not create file");
         goto out;
     }
 
+    /* Take permissions: We want to discard everything, so we need
+     * BLK_PERM_WRITE; and truncation to the desired size requires
+     * BLK_PERM_RESIZE.
+     * On the other hand, we cannot share the RESIZE permission
+     * because we promise that after this function, the file has the
+     * size given in the options.  If someone else were to resize it
+     * concurrently, we could not guarantee that. */
+    perm = BLK_PERM_WRITE | BLK_PERM_RESIZE;
+    shared = BLK_PERM_ALL & ~BLK_PERM_RESIZE;
+
+    /* Step one: Take locks in shared mode */
+    result = raw_apply_lock_bytes(fd, perm, shared, false, errp);
+    if (result < 0) {
+        goto out_close;
+    }
+
+    /* Step two: Try to get them exclusively */
+    result = raw_check_lock_bytes(fd, perm, shared, errp);
+    if (result < 0) {
+        goto out_close;
+    }
+
+    /* Step three: Downgrade them to shared again, and keep
+     *             them that way until we are done */
+    result = raw_apply_lock_bytes(fd, perm, shared, true, errp);
+    if (result < 0) {
+        goto out_close;
+    }
+
+    /* Clear the file by truncating it to 0 */
+    result = raw_regular_truncate(fd, 0, PREALLOC_MODE_OFF, errp);
+    if (result < 0) {
+        goto out_close;
+    }
+
     if (file_opts->nocow) {
 #ifdef __linux__
         /* Set NOCOW flag to solve performance issue on fs like btrfs.
@@ -2029,6 +2064,8 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
 #endif
     }
 
+    /* Resize and potentially preallocate the file to the desired
+     * final size */
     result = raw_regular_truncate(fd, file_opts->size, file_opts->preallocation,
                                   errp);
     if (result < 0) {
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 3/3] iotests: Add creation test to 153
  2018-04-20 22:09 [Qemu-devel] [PATCH 0/3] block/file-posix: File locking during creation Max Reitz
  2018-04-20 22:09 ` [Qemu-devel] [PATCH 1/3] block/file-posix: Pass FD to locking helpers Max Reitz
  2018-04-20 22:09 ` [Qemu-devel] [PATCH 2/3] block/file-posix: File locking during creation Max Reitz
@ 2018-04-20 22:09 ` Max Reitz
  2018-04-27  6:24   ` Fam Zheng
  2018-04-23 13:19 ` [Qemu-devel] [Qemu-block] [PATCH 0/3] block/file-posix: File locking during creation Alberto Garcia
  3 siblings, 1 reply; 13+ messages in thread
From: Max Reitz @ 2018-04-20 22:09 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, Max Reitz, Kevin Wolf

This patch adds a test case to 153 which tries to overwrite an image
(using qemu-img create) while it is in use.  Without the original user
explicitly sharing the necessary permissions (writing and truncation),
this should not be allowed.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/153     | 18 ++++++++++++++++++
 tests/qemu-iotests/153.out | 13 +++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/tests/qemu-iotests/153 b/tests/qemu-iotests/153
index a0fd815483..1be83912f8 100755
--- a/tests/qemu-iotests/153
+++ b/tests/qemu-iotests/153
@@ -137,6 +137,24 @@ for opts1 in "" "read-only=on" "read-only=on,force-share=on"; do
         _run_cmd $QEMU_IMG dd          $L if="${TEST_IMG}" of="${TEST_IMG}.convert" bs=512 count=1
         _run_cmd $QEMU_IMG bench       $L -c 1 "${TEST_IMG}"
         _run_cmd $QEMU_IMG bench       $L -w -c 1 "${TEST_IMG}"
+
+        # qemu-img create does not support -U
+        if [ -z "$L" ]; then
+            _run_cmd $QEMU_IMG create -f $IMGFMT "${TEST_IMG}" \
+                                      -b ${TEST_IMG}.base
+            # Read the file format.  It used to be the case that
+            # file-posix simply truncated the file, but the qcow2
+            # driver then failed to format it because it was unable
+            # to acquire the necessary WRITE permission.  However, the
+            # truncation was already wrong, and the whole process
+            # resulted in the file being completely empty and thus its
+            # format would be detected to be raw.
+            # So we read it here to see that creation either completed
+            # successfully (thus the format is qcow2) or it aborted
+            # before the file was changed at all (thus the format stays
+            # qcow2).
+            _img_info -U | grep 'file format'
+        fi
     done
     _send_qemu_cmd $h "{ 'execute': 'quit', }" ""
     echo
diff --git a/tests/qemu-iotests/153.out b/tests/qemu-iotests/153.out
index bb721cb747..e8eb369cae 100644
--- a/tests/qemu-iotests/153.out
+++ b/tests/qemu-iotests/153.out
@@ -92,6 +92,11 @@ _qemu_img_wrapper bench -w -c 1 TEST_DIR/t.qcow2
 qemu-img: Could not open 'TEST_DIR/t.qcow2': Failed to get "write" lock
 Is another process using the image?
 
+_qemu_img_wrapper create -f qcow2 TEST_DIR/t.qcow2 -b TEST_DIR/t.qcow2.base
+qemu-img: TEST_DIR/t.qcow2: Failed to get "write" lock
+Is another process using the image?
+file format: IMGFMT
+
 == Running utility commands -U ==
 
 _qemu_io_wrapper -U -c read 0 512 TEST_DIR/t.qcow2
@@ -209,6 +214,11 @@ _qemu_img_wrapper bench -w -c 1 TEST_DIR/t.qcow2
 qemu-img: Could not open 'TEST_DIR/t.qcow2': Failed to get "write" lock
 Is another process using the image?
 
+_qemu_img_wrapper create -f qcow2 TEST_DIR/t.qcow2 -b TEST_DIR/t.qcow2.base
+qemu-img: TEST_DIR/t.qcow2: Failed to get "write" lock
+Is another process using the image?
+file format: IMGFMT
+
 == Running utility commands -U ==
 
 _qemu_io_wrapper -U -c read 0 512 TEST_DIR/t.qcow2
@@ -309,6 +319,9 @@ _qemu_img_wrapper bench -c 1 TEST_DIR/t.qcow2
 
 _qemu_img_wrapper bench -w -c 1 TEST_DIR/t.qcow2
 
+_qemu_img_wrapper create -f qcow2 TEST_DIR/t.qcow2 -b TEST_DIR/t.qcow2.base
+file format: IMGFMT
+
 == Running utility commands -U ==
 
 _qemu_io_wrapper -U -c read 0 512 TEST_DIR/t.qcow2
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 0/3] block/file-posix: File locking during creation
  2018-04-20 22:09 [Qemu-devel] [PATCH 0/3] block/file-posix: File locking during creation Max Reitz
                   ` (2 preceding siblings ...)
  2018-04-20 22:09 ` [Qemu-devel] [PATCH 3/3] iotests: Add creation test to 153 Max Reitz
@ 2018-04-23 13:19 ` Alberto Garcia
  2018-04-23 15:56   ` Max Reitz
  3 siblings, 1 reply; 13+ messages in thread
From: Alberto Garcia @ 2018-04-23 13:19 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

On Sat 21 Apr 2018 12:09:10 AM CEST, Max Reitz wrote:
> Currently we do not take permissions on a file while it is being
> created.  That is a bit sad.  The simplest way to test this is the
> following:
>
>     $ qemu-img create -f qcow2 foo.qcow2 64M
>     Formatting 'foo.qcow2', fmt=qcow2 size=67108864 cluster_size=65536 lazy_refcounts=off refcount_bits=16
>     $ qemu-img convert -f qcow2 -O qcow2 foo.qcow2 foo.qcow2
>     qemu-img: foo.qcow2: error while converting qcow2: Failed to get "write" lock
>     Is another process using the image?
>     $ qemu-img info foo.qcow2
>     image: foo.qcow2
>     file format: raw
>     virtual size: 0 (0 bytes)
>     disk size: 0

Not quite the same problem, but here's another example of QEMU creating
an image before checking if it can be created:

$ qemu-img create -o cluster_size=2M -f qcow2 img.qcow2 7E
Formatting 'img.qcow2', fmt=qcow2 size=8070450532247928832 encryption=off cluster_size=2097152 lazy_refcounts=off refcount_bits=16
qemu-img: img.qcow2: The image size is too large for file format 'qcow2' (try using a larger cluster size)

$ qemu-img info img.qcow2 
image: img.qcow2
file format: qcow2
virtual size: 0 (0 bytes)
disk size: 6.0M
cluster_size: 2097152
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

Berto

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 0/3] block/file-posix: File locking during creation
  2018-04-23 13:19 ` [Qemu-devel] [Qemu-block] [PATCH 0/3] block/file-posix: File locking during creation Alberto Garcia
@ 2018-04-23 15:56   ` Max Reitz
  0 siblings, 0 replies; 13+ messages in thread
From: Max Reitz @ 2018-04-23 15:56 UTC (permalink / raw)
  To: Alberto Garcia, qemu-block; +Cc: Kevin Wolf, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2086 bytes --]

On 2018-04-23 15:19, Alberto Garcia wrote:
> On Sat 21 Apr 2018 12:09:10 AM CEST, Max Reitz wrote:
>> Currently we do not take permissions on a file while it is being
>> created.  That is a bit sad.  The simplest way to test this is the
>> following:
>>
>>     $ qemu-img create -f qcow2 foo.qcow2 64M
>>     Formatting 'foo.qcow2', fmt=qcow2 size=67108864 cluster_size=65536 lazy_refcounts=off refcount_bits=16
>>     $ qemu-img convert -f qcow2 -O qcow2 foo.qcow2 foo.qcow2
>>     qemu-img: foo.qcow2: error while converting qcow2: Failed to get "write" lock
>>     Is another process using the image?
>>     $ qemu-img info foo.qcow2
>>     image: foo.qcow2
>>     file format: raw
>>     virtual size: 0 (0 bytes)
>>     disk size: 0
> 
> Not quite the same problem, but here's another example of QEMU creating
> an image before checking if it can be created:
> 
> $ qemu-img create -o cluster_size=2M -f qcow2 img.qcow2 7E
> Formatting 'img.qcow2', fmt=qcow2 size=8070450532247928832 encryption=off cluster_size=2097152 lazy_refcounts=off refcount_bits=16
> qemu-img: img.qcow2: The image size is too large for file format 'qcow2' (try using a larger cluster size)
> 
> $ qemu-img info img.qcow2 
> image: img.qcow2
> file format: qcow2
> virtual size: 0 (0 bytes)
> disk size: 6.0M
> cluster_size: 2097152
> Format specific information:
>     compat: 1.1
>     lazy refcounts: false
>     refcount bits: 16
>     corrupt: false
> 
> Berto

Actually a very different problem, and I think we are not going to
change this behavior -- especially considering the direction
blockdev-create is taking (that is, separating protocol file creation
and formatting).

The main reason why it's a different problem, though, is because it
doesn't mean any data loss.  You just get a stale file lying around.

(You could argue that in any case the file should at least be a raw file
of size 0, because the qcow2 driver should not start formatting until it
has checked all options, but, well...  I don't know how that would help
anyone.)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] block/file-posix: File locking during creation
  2018-04-20 22:09 ` [Qemu-devel] [PATCH 2/3] block/file-posix: File locking during creation Max Reitz
@ 2018-04-27  6:22   ` Fam Zheng
  2018-04-28 11:03     ` Max Reitz
  0 siblings, 1 reply; 13+ messages in thread
From: Fam Zheng @ 2018-04-27  6:22 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-block, Kevin Wolf, qemu-devel

On Sat, 04/21 00:09, Max Reitz wrote:
> When creating a file, we should take the WRITE and RESIZE permissions.
> We do not need either for the creation itself, but we do need them for
> clearing and resizing it.  So we can take the proper permissions by
> replacing O_TRUNC with an explicit truncation to 0, and by taking the
> appropriate file locks between those two steps.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/file-posix.c | 41 +++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 39 insertions(+), 2 deletions(-)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index c98a4a1556..ed7932d6e8 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1992,6 +1992,7 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
>  {
>      BlockdevCreateOptionsFile *file_opts;
>      int fd;
> +    int perm, shared;
>      int result = 0;
>  
>      /* Validate options and set default values */
> @@ -2006,14 +2007,48 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
>      }
>  
>      /* Create file */
> -    fd = qemu_open(file_opts->filename, O_RDWR | O_CREAT | O_TRUNC | O_BINARY,
> -                   0644);
> +    fd = qemu_open(file_opts->filename, O_RDWR | O_CREAT | O_BINARY, 0644);
>      if (fd < 0) {
>          result = -errno;
>          error_setg_errno(errp, -result, "Could not create file");
>          goto out;
>      }
>  
> +    /* Take permissions: We want to discard everything, so we need
> +     * BLK_PERM_WRITE; and truncation to the desired size requires
> +     * BLK_PERM_RESIZE.
> +     * On the other hand, we cannot share the RESIZE permission
> +     * because we promise that after this function, the file has the
> +     * size given in the options.  If someone else were to resize it
> +     * concurrently, we could not guarantee that. */

Strictly speaking, we do close(fd) before this function returns so the lock is
lost and race can happen.

> +    perm = BLK_PERM_WRITE | BLK_PERM_RESIZE;
> +    shared = BLK_PERM_ALL & ~BLK_PERM_RESIZE;
> +
> +    /* Step one: Take locks in shared mode */
> +    result = raw_apply_lock_bytes(fd, perm, shared, false, errp);
> +    if (result < 0) {
> +        goto out_close;
> +    }
> +
> +    /* Step two: Try to get them exclusively */
> +    result = raw_check_lock_bytes(fd, perm, shared, errp);
> +    if (result < 0) {
> +        goto out_close;
> +    }
> +
> +    /* Step three: Downgrade them to shared again, and keep
> +     *             them that way until we are done */
> +    result = raw_apply_lock_bytes(fd, perm, shared, true, errp);

You comment it as "downgrade to shared" but what this actually does in addition
to step one is to "unlock" unneeded bytes which we haven't locked anyway. So I'm
confused why we need this stop. IIUC no downgrading is necessary after step one
and step two: only necessary shared locks are acquired in step one, and step two
is read-only op (F_OFD_GETLK).

> +    if (result < 0) {
> +        goto out_close;
> +    }
> +
> +    /* Clear the file by truncating it to 0 */
> +    result = raw_regular_truncate(fd, 0, PREALLOC_MODE_OFF, errp);
> +    if (result < 0) {
> +        goto out_close;
> +    }
> +
>      if (file_opts->nocow) {
>  #ifdef __linux__
>          /* Set NOCOW flag to solve performance issue on fs like btrfs.
> @@ -2029,6 +2064,8 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
>  #endif
>      }
>  
> +    /* Resize and potentially preallocate the file to the desired
> +     * final size */
>      result = raw_regular_truncate(fd, file_opts->size, file_opts->preallocation,
>                                    errp);
>      if (result < 0) {
> -- 
> 2.14.3
> 
> 

Fam

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] block/file-posix: Pass FD to locking helpers
  2018-04-20 22:09 ` [Qemu-devel] [PATCH 1/3] block/file-posix: Pass FD to locking helpers Max Reitz
@ 2018-04-27  6:24   ` Fam Zheng
  0 siblings, 0 replies; 13+ messages in thread
From: Fam Zheng @ 2018-04-27  6:24 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-block, Kevin Wolf, qemu-devel

On Sat, 04/21 00:09, Max Reitz wrote:
> raw_apply_lock_bytes() and raw_check_lock_bytes() currently take a
> BDRVRawState *, but they only use the lock_fd field.  During image
> creation, we do not have a BDRVRawState, but we do have an FD; so if we
> want to reuse the functions there, we should modify them to receive only
> the FD.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>

Reviewed-by: Fam Zheng <famz@redhat.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] iotests: Add creation test to 153
  2018-04-20 22:09 ` [Qemu-devel] [PATCH 3/3] iotests: Add creation test to 153 Max Reitz
@ 2018-04-27  6:24   ` Fam Zheng
  0 siblings, 0 replies; 13+ messages in thread
From: Fam Zheng @ 2018-04-27  6:24 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-block, Kevin Wolf, qemu-devel

On Sat, 04/21 00:09, Max Reitz wrote:
> This patch adds a test case to 153 which tries to overwrite an image
> (using qemu-img create) while it is in use.  Without the original user
> explicitly sharing the necessary permissions (writing and truncation),
> this should not be allowed.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>

Reviewed-by: Fam Zheng <famz@redhat.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] block/file-posix: File locking during creation
  2018-04-27  6:22   ` Fam Zheng
@ 2018-04-28 11:03     ` Max Reitz
  2018-05-03  6:45       ` Fam Zheng
  0 siblings, 1 reply; 13+ messages in thread
From: Max Reitz @ 2018-04-28 11:03 UTC (permalink / raw)
  To: Fam Zheng; +Cc: qemu-block, Kevin Wolf, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 4380 bytes --]

On 2018-04-27 08:22, Fam Zheng wrote:
> On Sat, 04/21 00:09, Max Reitz wrote:
>> When creating a file, we should take the WRITE and RESIZE permissions.
>> We do not need either for the creation itself, but we do need them for
>> clearing and resizing it.  So we can take the proper permissions by
>> replacing O_TRUNC with an explicit truncation to 0, and by taking the
>> appropriate file locks between those two steps.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>  block/file-posix.c | 41 +++++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 39 insertions(+), 2 deletions(-)
>>
>> diff --git a/block/file-posix.c b/block/file-posix.c
>> index c98a4a1556..ed7932d6e8 100644
>> --- a/block/file-posix.c
>> +++ b/block/file-posix.c
>> @@ -1992,6 +1992,7 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
>>  {
>>      BlockdevCreateOptionsFile *file_opts;
>>      int fd;
>> +    int perm, shared;
>>      int result = 0;
>>  
>>      /* Validate options and set default values */
>> @@ -2006,14 +2007,48 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
>>      }
>>  
>>      /* Create file */
>> -    fd = qemu_open(file_opts->filename, O_RDWR | O_CREAT | O_TRUNC | O_BINARY,
>> -                   0644);
>> +    fd = qemu_open(file_opts->filename, O_RDWR | O_CREAT | O_BINARY, 0644);
>>      if (fd < 0) {
>>          result = -errno;
>>          error_setg_errno(errp, -result, "Could not create file");
>>          goto out;
>>      }
>>  
>> +    /* Take permissions: We want to discard everything, so we need
>> +     * BLK_PERM_WRITE; and truncation to the desired size requires
>> +     * BLK_PERM_RESIZE.
>> +     * On the other hand, we cannot share the RESIZE permission
>> +     * because we promise that after this function, the file has the
>> +     * size given in the options.  If someone else were to resize it
>> +     * concurrently, we could not guarantee that. */
> 
> Strictly speaking, we do close(fd) before this function returns so the lock is
> lost and race can happen.

Right, but then creation from the perspective of file-posix is over.  We
are going to reopen the file for formatting, but that is just a normal
bdrv_open() so it will automatically be locked, no?

>> +    perm = BLK_PERM_WRITE | BLK_PERM_RESIZE;
>> +    shared = BLK_PERM_ALL & ~BLK_PERM_RESIZE;
>> +
>> +    /* Step one: Take locks in shared mode */
>> +    result = raw_apply_lock_bytes(fd, perm, shared, false, errp);
>> +    if (result < 0) {
>> +        goto out_close;
>> +    }
>> +
>> +    /* Step two: Try to get them exclusively */
>> +    result = raw_check_lock_bytes(fd, perm, shared, errp);
>> +    if (result < 0) {
>> +        goto out_close;
>> +    }
>> +
>> +    /* Step three: Downgrade them to shared again, and keep
>> +     *             them that way until we are done */
>> +    result = raw_apply_lock_bytes(fd, perm, shared, true, errp);
> 
> You comment it as "downgrade to shared" but what this actually does in addition
> to step one is to "unlock" unneeded bytes which we haven't locked anyway. So I'm
> confused why we need this stop. IIUC no downgrading is necessary after step one
> and step two: only necessary shared locks are acquired in step one, and step two
> is read-only op (F_OFD_GETLK).

Oops.  For some reason I thought raw_check_lock_bytes() would take the
locks exclusively.  Yes, then we don't need this third step.

Thanks for reviewing!

Max

>> +    if (result < 0) {
>> +        goto out_close;
>> +    }
>> +
>> +    /* Clear the file by truncating it to 0 */
>> +    result = raw_regular_truncate(fd, 0, PREALLOC_MODE_OFF, errp);
>> +    if (result < 0) {
>> +        goto out_close;
>> +    }
>> +
>>      if (file_opts->nocow) {
>>  #ifdef __linux__
>>          /* Set NOCOW flag to solve performance issue on fs like btrfs.
>> @@ -2029,6 +2064,8 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
>>  #endif
>>      }
>>  
>> +    /* Resize and potentially preallocate the file to the desired
>> +     * final size */
>>      result = raw_regular_truncate(fd, file_opts->size, file_opts->preallocation,
>>                                    errp);
>>      if (result < 0) {
>> -- 
>> 2.14.3
>>
>>
> 
> Fam
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] block/file-posix: File locking during creation
  2018-04-28 11:03     ` Max Reitz
@ 2018-05-03  6:45       ` Fam Zheng
  2018-05-04 13:45         ` Max Reitz
  0 siblings, 1 reply; 13+ messages in thread
From: Fam Zheng @ 2018-05-03  6:45 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-block, Kevin Wolf, qemu-devel

On Sat, 04/28 13:03, Max Reitz wrote:
> On 2018-04-27 08:22, Fam Zheng wrote:
> > On Sat, 04/21 00:09, Max Reitz wrote:
> >> When creating a file, we should take the WRITE and RESIZE permissions.
> >> We do not need either for the creation itself, but we do need them for
> >> clearing and resizing it.  So we can take the proper permissions by
> >> replacing O_TRUNC with an explicit truncation to 0, and by taking the
> >> appropriate file locks between those two steps.
> >>
> >> Signed-off-by: Max Reitz <mreitz@redhat.com>
> >> ---
> >>  block/file-posix.c | 41 +++++++++++++++++++++++++++++++++++++++--
> >>  1 file changed, 39 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/block/file-posix.c b/block/file-posix.c
> >> index c98a4a1556..ed7932d6e8 100644
> >> --- a/block/file-posix.c
> >> +++ b/block/file-posix.c
> >> @@ -1992,6 +1992,7 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
> >>  {
> >>      BlockdevCreateOptionsFile *file_opts;
> >>      int fd;
> >> +    int perm, shared;
> >>      int result = 0;
> >>  
> >>      /* Validate options and set default values */
> >> @@ -2006,14 +2007,48 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
> >>      }
> >>  
> >>      /* Create file */
> >> -    fd = qemu_open(file_opts->filename, O_RDWR | O_CREAT | O_TRUNC | O_BINARY,
> >> -                   0644);
> >> +    fd = qemu_open(file_opts->filename, O_RDWR | O_CREAT | O_BINARY, 0644);
> >>      if (fd < 0) {
> >>          result = -errno;
> >>          error_setg_errno(errp, -result, "Could not create file");
> >>          goto out;
> >>      }
> >>  
> >> +    /* Take permissions: We want to discard everything, so we need
> >> +     * BLK_PERM_WRITE; and truncation to the desired size requires
> >> +     * BLK_PERM_RESIZE.
> >> +     * On the other hand, we cannot share the RESIZE permission
> >> +     * because we promise that after this function, the file has the
> >> +     * size given in the options.  If someone else were to resize it
> >> +     * concurrently, we could not guarantee that. */
> > 
> > Strictly speaking, we do close(fd) before this function returns so the lock is
> > lost and race can happen.
> 
> Right, but then creation from the perspective of file-posix is over.  We
> are going to reopen the file for formatting, but that is just a normal
> bdrv_open() so it will automatically be locked, no?

After this function close() but before the following bdrv_open(), another
process can sneak in and steal the lock. So we cannot guarantee the RESIZE lock
does what we really want.

Fam

> 
> >> +    perm = BLK_PERM_WRITE | BLK_PERM_RESIZE;
> >> +    shared = BLK_PERM_ALL & ~BLK_PERM_RESIZE;
> >> +
> >> +    /* Step one: Take locks in shared mode */
> >> +    result = raw_apply_lock_bytes(fd, perm, shared, false, errp);
> >> +    if (result < 0) {
> >> +        goto out_close;
> >> +    }
> >> +
> >> +    /* Step two: Try to get them exclusively */
> >> +    result = raw_check_lock_bytes(fd, perm, shared, errp);
> >> +    if (result < 0) {
> >> +        goto out_close;
> >> +    }
> >> +
> >> +    /* Step three: Downgrade them to shared again, and keep
> >> +     *             them that way until we are done */
> >> +    result = raw_apply_lock_bytes(fd, perm, shared, true, errp);
> > 
> > You comment it as "downgrade to shared" but what this actually does in addition
> > to step one is to "unlock" unneeded bytes which we haven't locked anyway. So I'm
> > confused why we need this stop. IIUC no downgrading is necessary after step one
> > and step two: only necessary shared locks are acquired in step one, and step two
> > is read-only op (F_OFD_GETLK).
> 
> Oops.  For some reason I thought raw_check_lock_bytes() would take the
> locks exclusively.  Yes, then we don't need this third step.
> 
> Thanks for reviewing!
> 
> Max
> 
> >> +    if (result < 0) {
> >> +        goto out_close;
> >> +    }
> >> +
> >> +    /* Clear the file by truncating it to 0 */
> >> +    result = raw_regular_truncate(fd, 0, PREALLOC_MODE_OFF, errp);
> >> +    if (result < 0) {
> >> +        goto out_close;
> >> +    }
> >> +
> >>      if (file_opts->nocow) {
> >>  #ifdef __linux__
> >>          /* Set NOCOW flag to solve performance issue on fs like btrfs.
> >> @@ -2029,6 +2064,8 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
> >>  #endif
> >>      }
> >>  
> >> +    /* Resize and potentially preallocate the file to the desired
> >> +     * final size */
> >>      result = raw_regular_truncate(fd, file_opts->size, file_opts->preallocation,
> >>                                    errp);
> >>      if (result < 0) {
> >> -- 
> >> 2.14.3
> >>
> >>
> > 
> > Fam
> > 
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] block/file-posix: File locking during creation
  2018-05-03  6:45       ` Fam Zheng
@ 2018-05-04 13:45         ` Max Reitz
  2018-05-07  7:23           ` Fam Zheng
  0 siblings, 1 reply; 13+ messages in thread
From: Max Reitz @ 2018-05-04 13:45 UTC (permalink / raw)
  To: Fam Zheng; +Cc: qemu-block, Kevin Wolf, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 5146 bytes --]

On 2018-05-03 08:45, Fam Zheng wrote:
> On Sat, 04/28 13:03, Max Reitz wrote:
>> On 2018-04-27 08:22, Fam Zheng wrote:
>>> On Sat, 04/21 00:09, Max Reitz wrote:
>>>> When creating a file, we should take the WRITE and RESIZE permissions.
>>>> We do not need either for the creation itself, but we do need them for
>>>> clearing and resizing it.  So we can take the proper permissions by
>>>> replacing O_TRUNC with an explicit truncation to 0, and by taking the
>>>> appropriate file locks between those two steps.
>>>>
>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>> ---
>>>>  block/file-posix.c | 41 +++++++++++++++++++++++++++++++++++++++--
>>>>  1 file changed, 39 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/block/file-posix.c b/block/file-posix.c
>>>> index c98a4a1556..ed7932d6e8 100644
>>>> --- a/block/file-posix.c
>>>> +++ b/block/file-posix.c
>>>> @@ -1992,6 +1992,7 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
>>>>  {
>>>>      BlockdevCreateOptionsFile *file_opts;
>>>>      int fd;
>>>> +    int perm, shared;
>>>>      int result = 0;
>>>>  
>>>>      /* Validate options and set default values */
>>>> @@ -2006,14 +2007,48 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
>>>>      }
>>>>  
>>>>      /* Create file */
>>>> -    fd = qemu_open(file_opts->filename, O_RDWR | O_CREAT | O_TRUNC | O_BINARY,
>>>> -                   0644);
>>>> +    fd = qemu_open(file_opts->filename, O_RDWR | O_CREAT | O_BINARY, 0644);
>>>>      if (fd < 0) {
>>>>          result = -errno;
>>>>          error_setg_errno(errp, -result, "Could not create file");
>>>>          goto out;
>>>>      }
>>>>  
>>>> +    /* Take permissions: We want to discard everything, so we need
>>>> +     * BLK_PERM_WRITE; and truncation to the desired size requires
>>>> +     * BLK_PERM_RESIZE.
>>>> +     * On the other hand, we cannot share the RESIZE permission
>>>> +     * because we promise that after this function, the file has the
>>>> +     * size given in the options.  If someone else were to resize it
>>>> +     * concurrently, we could not guarantee that. */
>>>
>>> Strictly speaking, we do close(fd) before this function returns so the lock is
>>> lost and race can happen.
>>
>> Right, but then creation from the perspective of file-posix is over.  We
>> are going to reopen the file for formatting, but that is just a normal
>> bdrv_open() so it will automatically be locked, no?
> 
> After this function close() but before the following bdrv_open(), another
> process can sneak in and steal the lock. So we cannot guarantee the RESIZE lock
> does what we really want.

Right, but I'd argue that is not the purpose of the lock.  If someone
wants the file (or then rather the node) to be a certain size, they'd
have to call bdrv_getlength() on it to check.

But indeed you're right in that not sharing the RESIZE is hypocritical
considering it actually doesn't promise anything.

Anyway, let's consider the issues.  For formatting the file, this
behavior is OK because since Kevin's x-blockdev-create series format
drivers always truncate the file during formatting (that is, once they
have opened the file), and not during creation, so that part is secured.

But we do have issues with e.g. drive-mirror, where we create an image
and then open it.  The opened image may have a different size than what
we wanted to create.  Now in this case I wouldn't consider this a big
problem because drive-mirror is basically deprecated anyway, so there is
no reason to waste resources here; and this applies to all of the QMP
commands that create images and then open them automatically.

In the future, we want users to use blockdev-create and then
blockdev-add in two separate steps.  Then it's up to the user to realize
that the blockdev-add'ed node may have a different length then what they
wanted to achieve in blockdev-create.  Actually, it may just differ
altogether, I don't think qemu is going to make any promises on the
state of the image in the interim.

So...  I suppose there aren't any real issues with not being able to
promise that the image has the intended length immediately after
creating it.  So I guess we can indeed share RESIZE.

But OTOH, it definitely does not feel right to me to share RESIZE.  We
definitely do not want other parties to resize the image while we create it.

So I guess what I would like to do is keep RESIZE not shared and add a
note after the comment:

> Note that after this function, we can no longer guarantee that the
> file is not touched by a third party, so it may be resized then.

Ideally, we'd want the lock to stay active until blockdev-create or
qemu-img create returns, but I don't think that is really worth it.  If
there is a race condition between raw_co_create() returning and those
commands returning (and we don't have a format layer to correct things),
then I can't imagine that it couldn't bite you after those commands have
returned.  (And after those commands, it isn't our problem anymore anyway.)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 2/3] block/file-posix: File locking during creation
  2018-05-04 13:45         ` Max Reitz
@ 2018-05-07  7:23           ` Fam Zheng
  0 siblings, 0 replies; 13+ messages in thread
From: Fam Zheng @ 2018-05-07  7:23 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-block, Kevin Wolf, qemu-devel

On Fri, 05/04 15:45, Max Reitz wrote:
> On 2018-05-03 08:45, Fam Zheng wrote:
> > On Sat, 04/28 13:03, Max Reitz wrote:
> >> On 2018-04-27 08:22, Fam Zheng wrote:
> >>> On Sat, 04/21 00:09, Max Reitz wrote:
> >>>> When creating a file, we should take the WRITE and RESIZE permissions.
> >>>> We do not need either for the creation itself, but we do need them for
> >>>> clearing and resizing it.  So we can take the proper permissions by
> >>>> replacing O_TRUNC with an explicit truncation to 0, and by taking the
> >>>> appropriate file locks between those two steps.
> >>>>
> >>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
> >>>> ---
> >>>>  block/file-posix.c | 41 +++++++++++++++++++++++++++++++++++++++--
> >>>>  1 file changed, 39 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/block/file-posix.c b/block/file-posix.c
> >>>> index c98a4a1556..ed7932d6e8 100644
> >>>> --- a/block/file-posix.c
> >>>> +++ b/block/file-posix.c
> >>>> @@ -1992,6 +1992,7 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
> >>>>  {
> >>>>      BlockdevCreateOptionsFile *file_opts;
> >>>>      int fd;
> >>>> +    int perm, shared;
> >>>>      int result = 0;
> >>>>  
> >>>>      /* Validate options and set default values */
> >>>> @@ -2006,14 +2007,48 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
> >>>>      }
> >>>>  
> >>>>      /* Create file */
> >>>> -    fd = qemu_open(file_opts->filename, O_RDWR | O_CREAT | O_TRUNC | O_BINARY,
> >>>> -                   0644);
> >>>> +    fd = qemu_open(file_opts->filename, O_RDWR | O_CREAT | O_BINARY, 0644);
> >>>>      if (fd < 0) {
> >>>>          result = -errno;
> >>>>          error_setg_errno(errp, -result, "Could not create file");
> >>>>          goto out;
> >>>>      }
> >>>>  
> >>>> +    /* Take permissions: We want to discard everything, so we need
> >>>> +     * BLK_PERM_WRITE; and truncation to the desired size requires
> >>>> +     * BLK_PERM_RESIZE.
> >>>> +     * On the other hand, we cannot share the RESIZE permission
> >>>> +     * because we promise that after this function, the file has the
> >>>> +     * size given in the options.  If someone else were to resize it
> >>>> +     * concurrently, we could not guarantee that. */
> >>>
> >>> Strictly speaking, we do close(fd) before this function returns so the lock is
> >>> lost and race can happen.
> >>
> >> Right, but then creation from the perspective of file-posix is over.  We
> >> are going to reopen the file for formatting, but that is just a normal
> >> bdrv_open() so it will automatically be locked, no?
> > 
> > After this function close() but before the following bdrv_open(), another
> > process can sneak in and steal the lock. So we cannot guarantee the RESIZE lock
> > does what we really want.
> 
> Right, but I'd argue that is not the purpose of the lock.  If someone
> wants the file (or then rather the node) to be a certain size, they'd
> have to call bdrv_getlength() on it to check.
> 
> But indeed you're right in that not sharing the RESIZE is hypocritical
> considering it actually doesn't promise anything.
> 
> Anyway, let's consider the issues.  For formatting the file, this
> behavior is OK because since Kevin's x-blockdev-create series format
> drivers always truncate the file during formatting (that is, once they
> have opened the file), and not during creation, so that part is secured.
> 
> But we do have issues with e.g. drive-mirror, where we create an image
> and then open it.  The opened image may have a different size than what
> we wanted to create.  Now in this case I wouldn't consider this a big
> problem because drive-mirror is basically deprecated anyway, so there is
> no reason to waste resources here; and this applies to all of the QMP
> commands that create images and then open them automatically.
> 
> In the future, we want users to use blockdev-create and then
> blockdev-add in two separate steps.  Then it's up to the user to realize
> that the blockdev-add'ed node may have a different length then what they
> wanted to achieve in blockdev-create.  Actually, it may just differ
> altogether, I don't think qemu is going to make any promises on the
> state of the image in the interim.
> 
> So...  I suppose there aren't any real issues with not being able to
> promise that the image has the intended length immediately after
> creating it.  So I guess we can indeed share RESIZE.
> 
> But OTOH, it definitely does not feel right to me to share RESIZE.  We
> definitely do not want other parties to resize the image while we create it.
> 
> So I guess what I would like to do is keep RESIZE not shared and add a
> note after the comment:
> 
> > Note that after this function, we can no longer guarantee that the
> > file is not touched by a third party, so it may be resized then.
> 
> Ideally, we'd want the lock to stay active until blockdev-create or
> qemu-img create returns, but I don't think that is really worth it.  If
> there is a race condition between raw_co_create() returning and those
> commands returning (and we don't have a format layer to correct things),
> then I can't imagine that it couldn't bite you after those commands have
> returned.  (And after those commands, it isn't our problem anymore anyway.)

Yes, what you said makes total sense to me. Having a comment and _not_ sharing
RESIZE seems the most reasonable.

Technically we could extend the API to be able to hold the fd after
blockdev-create returns by introducing an explicit blockdev-create-cleanup
command, but like you said it is probably not worth it.

Fam

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-05-07  7:23 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-20 22:09 [Qemu-devel] [PATCH 0/3] block/file-posix: File locking during creation Max Reitz
2018-04-20 22:09 ` [Qemu-devel] [PATCH 1/3] block/file-posix: Pass FD to locking helpers Max Reitz
2018-04-27  6:24   ` Fam Zheng
2018-04-20 22:09 ` [Qemu-devel] [PATCH 2/3] block/file-posix: File locking during creation Max Reitz
2018-04-27  6:22   ` Fam Zheng
2018-04-28 11:03     ` Max Reitz
2018-05-03  6:45       ` Fam Zheng
2018-05-04 13:45         ` Max Reitz
2018-05-07  7:23           ` Fam Zheng
2018-04-20 22:09 ` [Qemu-devel] [PATCH 3/3] iotests: Add creation test to 153 Max Reitz
2018-04-27  6:24   ` Fam Zheng
2018-04-23 13:19 ` [Qemu-devel] [Qemu-block] [PATCH 0/3] block/file-posix: File locking during creation Alberto Garcia
2018-04-23 15:56   ` Max Reitz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.