All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/7] block: bdrv_reopen() patches
@ 2012-08-30 18:47 Jeff Cody
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 1/7] block: correctly set the keep_read_only flag Jeff Cody
                   ` (6 more replies)
  0 siblings, 7 replies; 27+ messages in thread
From: Jeff Cody @ 2012-08-30 18:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, eblake, supriyak

These patches are strongly based off Supriya Kannery's original bdrv_reopen()
patches as part of the hostcache series, including the _prepare(), _commit(), and
_abort() structure.

Some additions / changes:

* Added support for multiple image reopen transactionally
* Reopen changes are staged into temporary stashes in prepare(),
  and copied over in commit() (discarded in abort()).
* Driver-level reopen file changes are mainly contained in
  the raw-* files.


TODO: The raw-win32 driver still needs to be finished
TODO: The vmdk driver still needs to be finished

Jeff Cody (7):
  block: correctly set the keep_read_only flag
  block: Framework for reopening files safely
  block: raw-posix image file reopen
  block: raw image file reopen
  block: qed image file reopen
  block: qcow2 image file reopen
  block: qcow image file reopen

 block.c           | 242 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
 block.h           |  16 ++++
 block/qcow.c      |  23 ++++++
 block/qcow2.c     |  22 +++++
 block/qed.c       |  20 +++++
 block/raw-posix.c | 153 ++++++++++++++++++++++++++++++----
 block/raw.c       |  22 +++++
 block_int.h       |  13 +++
 qemu-common.h     |   1 +
 9 files changed, 489 insertions(+), 23 deletions(-)

-- 
1.7.11.2

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH 1/7] block: correctly set the keep_read_only flag
  2012-08-30 18:47 [Qemu-devel] [PATCH 0/7] block: bdrv_reopen() patches Jeff Cody
@ 2012-08-30 18:47 ` Jeff Cody
  2012-09-05 12:47   ` Kevin Wolf
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 2/7] block: Framework for reopening files safely Jeff Cody
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 27+ messages in thread
From: Jeff Cody @ 2012-08-30 18:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, eblake, supriyak

I believe the bs->keep_read_only flag is supposed to reflect
the initial open state of the device. If the device is initially
opened R/O, then commit operations, or reopen operations changing
to R/W, are prohibited.

Currently, the keep_read_only flag is only accurate for the active
layer, and its backing file. Subsequent images end up always having
the keep_read_only flag set.

For instance, what happens now:

[  base  ]  kro = 1, ro = 1
    |
    v
[ snap-1 ]  kro = 1, ro = 1
    |
    v
[ snap-2 ]  kro = 0, ro = 1
    |
    v
[ active ]  kro = 0, ro = 0

What we want:

[  base  ]  kro = 0, ro = 1
    |
    v
[ snap-1 ]  kro = 0, ro = 1
    |
    v
[ snap-2 ]  kro = 0, ro = 1
    |
    v
[ active ]  kro = 0, ro = 0

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block.c | 16 +++++++---------
 block.h |  1 +
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/block.c b/block.c
index 470bdcc..e31b76f 100644
--- a/block.c
+++ b/block.c
@@ -655,7 +655,7 @@ static int bdrv_open_common(BlockDriverState *bs, const char *filename,
      * Clear flags that are internal to the block layer before opening the
      * image.
      */
-    open_flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING);
+    open_flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING | BDRV_O_ALLOW_RDWR);
 
     /*
      * Snapshots should be writable.
@@ -664,8 +664,6 @@ static int bdrv_open_common(BlockDriverState *bs, const char *filename,
         open_flags |= BDRV_O_RDWR;
     }
 
-    bs->keep_read_only = bs->read_only = !(open_flags & BDRV_O_RDWR);
-
     /* Open the image, either directly or using a protocol */
     if (drv->bdrv_file_open) {
         ret = drv->bdrv_file_open(bs, filename, open_flags);
@@ -804,6 +802,12 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
         goto unlink_and_fail;
     }
 
+    if (flags & BDRV_O_RDWR) {
+        flags |= BDRV_O_ALLOW_RDWR;
+    }
+
+    bs->keep_read_only = !(flags & BDRV_O_ALLOW_RDWR);
+
     /* Open the image */
     ret = bdrv_open_common(bs, filename, flags, drv);
     if (ret < 0) {
@@ -833,12 +837,6 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
             bdrv_close(bs);
             return ret;
         }
-        if (bs->is_temporary) {
-            bs->backing_hd->keep_read_only = !(flags & BDRV_O_RDWR);
-        } else {
-            /* base image inherits from "parent" */
-            bs->backing_hd->keep_read_only = bs->keep_read_only;
-        }
     }
 
     if (!bdrv_key_required(bs)) {
diff --git a/block.h b/block.h
index 2e2be11..4d919c2 100644
--- a/block.h
+++ b/block.h
@@ -80,6 +80,7 @@ typedef struct BlockDevOps {
 #define BDRV_O_COPY_ON_READ 0x0400 /* copy read backing sectors into image */
 #define BDRV_O_INCOMING    0x0800  /* consistency hint for incoming migration */
 #define BDRV_O_CHECK       0x1000  /* open solely for consistency check */
+#define BDRV_O_ALLOW_RDWR  0x2000  /* allow reopen to change from r/o to r/w */
 
 #define BDRV_O_CACHE_MASK  (BDRV_O_NOCACHE | BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH)
 
-- 
1.7.11.2

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH 2/7] block: Framework for reopening files safely
  2012-08-30 18:47 [Qemu-devel] [PATCH 0/7] block: bdrv_reopen() patches Jeff Cody
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 1/7] block: correctly set the keep_read_only flag Jeff Cody
@ 2012-08-30 18:47 ` Jeff Cody
  2012-09-05 15:09   ` Kevin Wolf
  2012-09-11 14:57   ` Jeff Cody
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 3/7] block: raw-posix image file reopen Jeff Cody
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 27+ messages in thread
From: Jeff Cody @ 2012-08-30 18:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, eblake, supriyak

This is based heavily on Supriya Kannery's bdrv_reopen()
patch series.

This provides a transactional method to reopen multiple
images files safely.

Image files are queue for reopen via bdrv_reopen_queue(), and the
reopen occurs when bdrv_reopen_multiple() is called.  Changes are
staged in bdrv_reopen_prepare() and in the equivalent driver level
functions.  If any of the staged images fails a prepare, then all
of the images left untouched, and the staged changes for each image
abandoned.

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block.c       | 226 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 block.h       |  15 ++++
 block_int.h   |  13 ++++
 qemu-common.h |   1 +
 4 files changed, 255 insertions(+)

diff --git a/block.c b/block.c
index e31b76f..9470319 100644
--- a/block.c
+++ b/block.c
@@ -857,6 +857,232 @@ unlink_and_fail:
     return ret;
 }
 
+/*
+ * Adds a BlockDriverState to a simple queue for an atomic, transactional
+ * reopen of multiple devices.
+ *
+ * bs_queue can either be an existing BlockReopenQueue that has had QSIMPLE_INIT
+ * already performed, or alternatively may be NULL a new BlockReopenQueue will
+ * be created and initialized. This newly created BlockReopenQueue should be
+ * passed back in for subsequent calls that are intended to be of the same
+ * atomic 'set'.
+ *
+ * bs is the BlockDriverState to add to the reopen queue.
+ *
+ * flags contains the open flags for the associated bs
+ *
+ * returns a pointer to bs_queue, which is either the newly allocated
+ * bs_queue, or the existing bs_queue being used.
+ *
+ */
+BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
+                                    BlockDriverState *bs, int flags)
+{
+    assert(bs != NULL);
+
+    BlockReopenQueueEntry *bs_entry;
+    if (bs_queue == NULL) {
+        bs_queue = g_new0(BlockReopenQueue, 1);
+        QSIMPLEQ_INIT(bs_queue);
+    }
+
+    if (bs->file) {
+        bdrv_reopen_queue(bs_queue, bs->file, flags);
+    }
+
+    bs_entry = g_new0(BlockReopenQueueEntry, 1);
+    QSIMPLEQ_INSERT_TAIL(bs_queue, bs_entry, entry);
+
+    bs_entry->state = g_new0(BDRVReopenState, 1);
+    bs_entry->state->bs = bs;
+    bs_entry->state->flags = flags;
+
+    return bs_queue;
+}
+
+/*
+ * Reopen multiple BlockDriverStates atomically & transactionally.
+ *
+ * The queue passed in (bs_queue) must have been built up previous
+ * via bdrv_reopen_queue().
+ *
+ * Reopens all BDS specified in the queue, with the appropriate
+ * flags.  All devices are prepared for reopen, and failure of any
+ * device will cause all device changes to be abandonded, and intermediate
+ * data cleaned up.
+ *
+ * If all devices prepare successfully, then the changes are committed
+ * to all devices.
+ *
+ */
+int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
+{
+    int ret = -1;
+    BlockReopenQueueEntry *bs_entry;
+    Error *local_err = NULL;
+
+    assert(bs_queue != NULL);
+
+    bdrv_drain_all();
+
+    QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
+        if (bdrv_reopen_prepare(bs_entry->state, &local_err)) {
+            error_propagate(errp, local_err);
+            goto cleanup;
+        }
+        bs_entry->prepared = true;
+    }
+
+    /* If we reach this point, we have success and just need to apply the
+     * changes
+     */
+    QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
+        bdrv_reopen_commit(bs_entry->state);
+    }
+
+    ret = 0;
+
+cleanup:
+    QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
+        if (ret && bs_entry->prepared) {
+            bdrv_reopen_abort(bs_entry->state);
+        }
+        g_free(bs_entry->state);
+        g_free(bs_entry);
+    }
+    g_free(bs_queue);
+    return ret;
+}
+
+
+/* Reopen a single BlockDriverState with the specified flags. */
+int bdrv_reopen(BlockDriverState *bs, int bdrv_flags, Error **errp)
+{
+    int ret = -1;
+    Error *local_err = NULL;
+    BlockReopenQueue *queue = bdrv_reopen_queue(NULL, bs, bdrv_flags);
+
+    ret = bdrv_reopen_multiple(queue, &local_err);
+    if (local_err != NULL) {
+        error_propagate(errp, local_err);
+    }
+    return ret;
+}
+
+
+/*
+ * Prepares a BlockDriverState for reopen. All changes are staged in the
+ * 'reopen_state' field of the BlockDriverState, which must be NULL when
+ * entering (all previous reopens must have completed for the BDS).
+ *
+ * bs is the BlockDriverState to reopen
+ * flags are the new open flags
+ *
+ * Returns 0 on success, non-zero on error.  On error errp will be set
+ * as well.
+ *
+ * On failure, bdrv_reopen_abort() will be called to clean up any data.
+ * It is the responsibility of the caller to then call the abort() or
+ * commit() for any other BDS that have been left in a prepare() state
+ *
+ */
+int bdrv_reopen_prepare(BDRVReopenState *reopen_state, Error **errp)
+{
+    int ret = -1;
+    Error *local_err = NULL;
+    BlockDriver *drv;
+
+    assert(reopen_state != NULL);
+    assert(reopen_state->bs->drv != NULL);
+    drv = reopen_state->bs->drv;
+
+    /* if we are to stay read-only, do not allow permission change
+     * to r/w */
+    if (reopen_state->bs->keep_read_only &&
+        reopen_state->flags & BDRV_O_RDWR) {
+        error_set(errp, QERR_DEVICE_IS_READ_ONLY,
+                  reopen_state->bs->device_name);
+        goto error;
+    }
+
+
+    ret = bdrv_flush(reopen_state->bs);
+    if (ret) {
+        error_set(errp, QERR_IO_ERROR);
+        goto error;
+    }
+
+    if (drv->bdrv_reopen_prepare) {
+        ret = drv->bdrv_reopen_prepare(reopen_state, &local_err);
+        if (ret) {
+            if (local_err != NULL) {
+                error_propagate(errp, local_err);
+            } else {
+                error_set(errp, QERR_OPEN_FILE_FAILED,
+                          reopen_state->bs->filename);
+            }
+            goto error;
+        }
+    } else {
+        /* It is currently mandatory to have a bdrv_reopen_prepare()
+         * handler for each supported drv. */
+        error_set(errp, QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
+                  drv->format_name, reopen_state->bs->device_name,
+                 "reopening of file");
+        ret = -1;
+        goto error;
+    }
+
+    return 0;
+
+error:
+    bdrv_reopen_abort(reopen_state);
+    return ret;
+}
+
+/*
+ * Takes the staged changes for the reopen from bdrv_reopen_prepare(), and
+ * makes them final by swapping the staging BlockDriverState contents into
+ * the active BlockDriverState contents.
+ */
+void bdrv_reopen_commit(BDRVReopenState *reopen_state)
+{
+    BlockDriver *drv;
+
+    assert(reopen_state != NULL);
+    drv = reopen_state->bs->drv;
+    assert(drv != NULL);
+
+    /* If there are any driver level actions to take */
+    if (drv->bdrv_reopen_commit) {
+        drv->bdrv_reopen_commit(reopen_state);
+    }
+
+    /* set BDS specific flags now */
+    reopen_state->bs->open_flags         = reopen_state->flags;
+    reopen_state->bs->enable_write_cache = !!(reopen_state->flags &
+                                              BDRV_O_CACHE_WB);
+    reopen_state->bs->read_only = !(reopen_state->flags & BDRV_O_RDWR);
+}
+
+/*
+ * Abort the reopen, and delete and free the staged changes in
+ * reopen_state
+ */
+void bdrv_reopen_abort(BDRVReopenState *reopen_state)
+{
+    BlockDriver *drv;
+
+    assert(reopen_state != NULL);
+    drv = reopen_state->bs->drv;
+    assert(drv != NULL);
+
+    if (drv->bdrv_reopen_abort) {
+        drv->bdrv_reopen_abort(reopen_state);
+    }
+}
+
+
 void bdrv_close(BlockDriverState *bs)
 {
     bdrv_flush(bs);
diff --git a/block.h b/block.h
index 4d919c2..db812b1 100644
--- a/block.h
+++ b/block.h
@@ -97,6 +97,14 @@ typedef enum {
     BDRV_ACTION_REPORT, BDRV_ACTION_IGNORE, BDRV_ACTION_STOP
 } BlockQMPEventAction;
 
+typedef struct BlockReopenQueueEntry {
+     bool prepared;
+     BDRVReopenState *state;
+     QSIMPLEQ_ENTRY(BlockReopenQueueEntry) entry;
+} BlockReopenQueueEntry;
+
+typedef QSIMPLEQ_HEAD(BlockReopenQueue, BlockReopenQueueEntry) BlockReopenQueue;
+
 void bdrv_iostatus_enable(BlockDriverState *bs);
 void bdrv_iostatus_reset(BlockDriverState *bs);
 void bdrv_iostatus_disable(BlockDriverState *bs);
@@ -131,6 +139,13 @@ int bdrv_parse_cache_flags(const char *mode, int *flags);
 int bdrv_file_open(BlockDriverState **pbs, const char *filename, int flags);
 int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
               BlockDriver *drv);
+BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
+                                    BlockDriverState *bs, int flags);
+int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp);
+int bdrv_reopen(BlockDriverState *bs, int bdrv_flags, Error **errp);
+int bdrv_reopen_prepare(BDRVReopenState *reopen_state, Error **errp);
+void bdrv_reopen_commit(BDRVReopenState *reopen_state);
+void bdrv_reopen_abort(BDRVReopenState *reopen_state);
 void bdrv_close(BlockDriverState *bs);
 int bdrv_attach_dev(BlockDriverState *bs, void *dev);
 void bdrv_attach_dev_nofail(BlockDriverState *bs, void *dev);
diff --git a/block_int.h b/block_int.h
index 4452f6f..7a4e226 100644
--- a/block_int.h
+++ b/block_int.h
@@ -139,6 +139,12 @@ struct BlockDriver {
     int instance_size;
     int (*bdrv_probe)(const uint8_t *buf, int buf_size, const char *filename);
     int (*bdrv_probe_device)(const char *filename);
+
+    /* For handling image reopen for split or non-split files */
+    int (*bdrv_reopen_prepare)(BDRVReopenState *reopen_state, Error **errp);
+    void (*bdrv_reopen_commit)(BDRVReopenState *reopen_state);
+    void (*bdrv_reopen_abort)(BDRVReopenState *reopen_state);
+
     int (*bdrv_open)(BlockDriverState *bs, int flags);
     int (*bdrv_file_open)(BlockDriverState *bs, const char *filename, int flags);
     int (*bdrv_read)(BlockDriverState *bs, int64_t sector_num,
@@ -336,6 +342,13 @@ struct BlockDriverState {
 
     /* long-running background operation */
     BlockJob *job;
+
+};
+
+struct BDRVReopenState {
+    BlockDriverState *bs;
+    int flags;
+    void *opaque;
 };
 
 int get_tmp_filename(char *filename, int size);
diff --git a/qemu-common.h b/qemu-common.h
index e5c2bcd..6a6181c 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -245,6 +245,7 @@ typedef struct NICInfo NICInfo;
 typedef struct HCIInfo HCIInfo;
 typedef struct AudioState AudioState;
 typedef struct BlockDriverState BlockDriverState;
+typedef struct BDRVReopenState BDRVReopenState;
 typedef struct DriveInfo DriveInfo;
 typedef struct DisplayState DisplayState;
 typedef struct DisplayChangeListener DisplayChangeListener;
-- 
1.7.11.2

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH 3/7] block: raw-posix image file reopen
  2012-08-30 18:47 [Qemu-devel] [PATCH 0/7] block: bdrv_reopen() patches Jeff Cody
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 1/7] block: correctly set the keep_read_only flag Jeff Cody
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 2/7] block: Framework for reopening files safely Jeff Cody
@ 2012-08-30 18:47 ` Jeff Cody
  2012-08-30 22:15   ` Eric Blake
  2012-09-05 15:30   ` Kevin Wolf
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 4/7] block: raw " Jeff Cody
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 27+ messages in thread
From: Jeff Cody @ 2012-08-30 18:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, eblake, supriyak

This is derived from the Supriya Kannery's reopen patches.

This contains the raw-posix driver changes for the bdrv_reopen_*
functions.  All changes are staged into a temporary scratch buffer
during the prepare() stage, and copied over to the live structure
during commit().  Upon abort(), all changes are abandoned, and the
live structures are unmodified.

The _prepare() will create an extra fd - either by means of a dup,
if possible, or opening a new fd if not (for instance, access
control changes).  Upon _commit(), the original fd is closed and
the new fd is used.  Upon _abort(), the duplicate/new fd is closed.

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block/raw-posix.c | 153 +++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 139 insertions(+), 14 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 6be20b1..48086d7 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -140,6 +140,15 @@ typedef struct BDRVRawState {
 #endif
 } BDRVRawState;
 
+typedef struct BDRVRawReopenState {
+    BDRVReopenState reopen_state;
+    int fd;
+    int open_flags;
+    uint8_t *aligned_buf;
+    unsigned aligned_buf_size;
+    BDRVRawState *stash_s;
+} BDRVRawReopenState;
+
 static int fd_open(BlockDriverState *bs);
 static int64_t raw_getlength(BlockDriverState *bs);
 
@@ -185,6 +194,28 @@ static int raw_normalize_devicepath(const char **filename)
 }
 #endif
 
+static void raw_parse_flags(int bdrv_flags, int *open_flags)
+{
+    assert(open_flags != NULL);
+
+    *open_flags |= O_BINARY;
+    *open_flags &= ~O_ACCMODE;
+    if (bdrv_flags & BDRV_O_RDWR) {
+        *open_flags |= O_RDWR;
+    } else {
+        *open_flags |= O_RDONLY;
+    }
+
+    /* Use O_DSYNC for write-through caching, no flags for write-back caching,
+     * and O_DIRECT for no caching. */
+    if ((bdrv_flags & BDRV_O_NOCACHE)) {
+        *open_flags |= O_DIRECT;
+    }
+    if (!(bdrv_flags & BDRV_O_CACHE_WB)) {
+        *open_flags |= O_DSYNC;
+    }
+}
+
 static int raw_open_common(BlockDriverState *bs, const char *filename,
                            int bdrv_flags, int open_flags)
 {
@@ -196,20 +227,8 @@ static int raw_open_common(BlockDriverState *bs, const char *filename,
         return ret;
     }
 
-    s->open_flags = open_flags | O_BINARY;
-    s->open_flags &= ~O_ACCMODE;
-    if (bdrv_flags & BDRV_O_RDWR) {
-        s->open_flags |= O_RDWR;
-    } else {
-        s->open_flags |= O_RDONLY;
-    }
-
-    /* Use O_DSYNC for write-through caching, no flags for write-back caching,
-     * and O_DIRECT for no caching. */
-    if ((bdrv_flags & BDRV_O_NOCACHE))
-        s->open_flags |= O_DIRECT;
-    if (!(bdrv_flags & BDRV_O_CACHE_WB))
-        s->open_flags |= O_DSYNC;
+    s->open_flags = open_flags;
+    raw_parse_flags(bdrv_flags, &s->open_flags);
 
     s->fd = -1;
     fd = qemu_open(filename, s->open_flags, 0644);
@@ -283,6 +302,109 @@ static int raw_open(BlockDriverState *bs, const char *filename, int flags)
     return raw_open_common(bs, filename, flags, 0);
 }
 
+static int raw_reopen_prepare(BDRVReopenState *state, Error **errp)
+{
+    BDRVRawState *s;
+    BDRVRawReopenState *raw_s;
+    int ret = 0;
+
+    assert(state != NULL);
+    assert(state->bs != NULL);
+
+    s = state->bs->opaque;
+
+    state->opaque = g_malloc0(sizeof(BDRVRawReopenState));
+    raw_s = state->opaque;
+
+    raw_parse_flags(state->flags, &raw_s->open_flags);
+
+    /*
+     * If we didn't have BDRV_O_NOCACHE set before, we may not have allocated
+     * aligned_buf
+     */
+    if ((state->flags & BDRV_O_NOCACHE)) {
+        /*
+         * Allocate a buffer for read/modify/write cycles.  Choose the size
+         * pessimistically as we don't know the block size yet.
+         */
+        raw_s->aligned_buf_size = 32 * MAX_BLOCKSIZE;
+        raw_s->aligned_buf = qemu_memalign(MAX_BLOCKSIZE,
+                                           raw_s->aligned_buf_size);
+
+        if (raw_s->aligned_buf == NULL) {
+            ret = -1;
+            goto error;
+        }
+    }
+
+    int fcntl_flags = O_APPEND | O_ASYNC | O_NONBLOCK;
+#ifdef O_NOATIME
+    fcntl_flags |= O_NOATIME;
+#endif
+    if ((raw_s->open_flags & ~fcntl_flags) == (s->open_flags & ~fcntl_flags)) {
+        /* dup the original fd */
+        /* TODO: use qemu fcntl wrapper */
+        raw_s->fd = fcntl(s->fd, F_DUPFD_CLOEXEC, 0);
+        if (raw_s->fd == -1) {
+            ret = -1;
+            goto error;
+        }
+        ret = fcntl_setfl(raw_s->fd, raw_s->open_flags);
+    } else {
+        raw_s->fd = qemu_open(state->bs->filename, raw_s->open_flags, 0644);
+        if (raw_s->fd == -1) {
+            ret = -1;
+        }
+    }
+error:
+    return ret;
+}
+
+
+static void raw_reopen_commit(BDRVReopenState *state)
+{
+    BDRVRawReopenState *raw_s = state->opaque;
+    BDRVRawState *s = state->bs->opaque;
+
+    if (raw_s->aligned_buf != NULL) {
+        if (s->aligned_buf) {
+            qemu_vfree(s->aligned_buf);
+        }
+        s->aligned_buf      = raw_s->aligned_buf;
+        s->aligned_buf_size = raw_s->aligned_buf_size;
+    }
+
+    s->open_flags = raw_s->open_flags;
+
+    close(s->fd);
+    s->fd = raw_s->fd;
+
+    g_free(state->opaque);
+    state->opaque = NULL;
+}
+
+
+static void raw_reopen_abort(BDRVReopenState *state)
+{
+    BDRVRawReopenState *raw_s = state->opaque;
+
+     /* nothing to do if NULL, we didn't get far enough */
+    if (raw_s == NULL) {
+        return;
+    }
+
+    if (raw_s->fd >= 0) {
+        close(raw_s->fd);
+        raw_s->fd = -1;
+        if (raw_s->aligned_buf != NULL) {
+            qemu_vfree(raw_s->aligned_buf);
+        }
+    }
+    g_free(state->opaque);
+    state->opaque = NULL;
+}
+
+
 /* XXX: use host sector size if necessary with:
 #ifdef DIOCGSECTORSIZE
         {
@@ -735,6 +857,9 @@ static BlockDriver bdrv_file = {
     .instance_size = sizeof(BDRVRawState),
     .bdrv_probe = NULL, /* no probe for protocols */
     .bdrv_file_open = raw_open,
+    .bdrv_reopen_prepare = raw_reopen_prepare,
+    .bdrv_reopen_commit = raw_reopen_commit,
+    .bdrv_reopen_abort = raw_reopen_abort,
     .bdrv_close = raw_close,
     .bdrv_create = raw_create,
     .bdrv_co_discard = raw_co_discard,
-- 
1.7.11.2

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH 4/7] block: raw image file reopen
  2012-08-30 18:47 [Qemu-devel] [PATCH 0/7] block: bdrv_reopen() patches Jeff Cody
                   ` (2 preceding siblings ...)
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 3/7] block: raw-posix image file reopen Jeff Cody
@ 2012-08-30 18:47 ` Jeff Cody
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 5/7] block: qed " Jeff Cody
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 27+ messages in thread
From: Jeff Cody @ 2012-08-30 18:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, eblake, supriyak

These are the stubs for the file reopen drivers for the raw format.

There is currently nothing that needs to be done by the raw driver
in reopen.

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block/raw.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/block/raw.c b/block/raw.c
index ff34ea4..fa47ff1 100644
--- a/block/raw.c
+++ b/block/raw.c
@@ -9,6 +9,24 @@ static int raw_open(BlockDriverState *bs, int flags)
     return 0;
 }
 
+/* We have nothing to do for raw reopen, stubs just return
+ * success */
+static int raw_reopen_prepare(BDRVReopenState *state, Error **errp)
+{
+    return 0;
+}
+
+static void raw_reopen_commit(BDRVReopenState *state)
+{
+    return;
+}
+
+static void raw_reopen_abort(BDRVReopenState *state)
+{
+    return;
+}
+
+
 static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num,
                                      int nb_sectors, QEMUIOVector *qiov)
 {
@@ -115,6 +133,10 @@ static BlockDriver bdrv_raw = {
     .bdrv_open          = raw_open,
     .bdrv_close         = raw_close,
 
+    .bdrv_reopen_prepare  = raw_reopen_prepare,
+    .bdrv_reopen_commit   = raw_reopen_commit,
+    .bdrv_reopen_abort    = raw_reopen_abort,
+
     .bdrv_co_readv          = raw_co_readv,
     .bdrv_co_writev         = raw_co_writev,
     .bdrv_co_is_allocated   = raw_co_is_allocated,
-- 
1.7.11.2

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH 5/7] block: qed image file reopen
  2012-08-30 18:47 [Qemu-devel] [PATCH 0/7] block: bdrv_reopen() patches Jeff Cody
                   ` (3 preceding siblings ...)
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 4/7] block: raw " Jeff Cody
@ 2012-08-30 18:47 ` Jeff Cody
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 6/7] block: qcow2 " Jeff Cody
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 7/7] block: qcow " Jeff Cody
  6 siblings, 0 replies; 27+ messages in thread
From: Jeff Cody @ 2012-08-30 18:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, eblake, supriyak

These are the stubs for the file reopen drivers for the qed format.

There is currently nothing that needs to be done by the qed driver
in reopen.

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block/qed.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/block/qed.c b/block/qed.c
index a02dbfd..231e86a 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -505,6 +505,23 @@ out:
     return ret;
 }
 
+/* We have nothing to do for QED reopen, stubs just return
+ * success */
+static int bdrv_qed_reopen_prepare(BDRVReopenState *state, Error **errp)
+{
+    return 0;
+}
+
+static void bdrv_qed_reopen_commit(BDRVReopenState *state)
+{
+    return;
+}
+
+static void bdrv_qed_reopen_abort(BDRVReopenState *state)
+{
+    return;
+}
+
 static void bdrv_qed_close(BlockDriverState *bs)
 {
     BDRVQEDState *s = bs->opaque;
@@ -1553,6 +1570,9 @@ static BlockDriver bdrv_qed = {
     .bdrv_rebind              = bdrv_qed_rebind,
     .bdrv_open                = bdrv_qed_open,
     .bdrv_close               = bdrv_qed_close,
+    .bdrv_reopen_prepare      = bdrv_qed_reopen_prepare,
+    .bdrv_reopen_commit       = bdrv_qed_reopen_commit,
+    .bdrv_reopen_abort        = bdrv_qed_reopen_abort,
     .bdrv_create              = bdrv_qed_create,
     .bdrv_co_is_allocated     = bdrv_qed_co_is_allocated,
     .bdrv_make_empty          = bdrv_qed_make_empty,
-- 
1.7.11.2

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH 6/7] block: qcow2 image file reopen
  2012-08-30 18:47 [Qemu-devel] [PATCH 0/7] block: bdrv_reopen() patches Jeff Cody
                   ` (4 preceding siblings ...)
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 5/7] block: qed " Jeff Cody
@ 2012-08-30 18:47 ` Jeff Cody
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 7/7] block: qcow " Jeff Cody
  6 siblings, 0 replies; 27+ messages in thread
From: Jeff Cody @ 2012-08-30 18:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, eblake, supriyak

These are the stubs for the file reopen drivers for the qcow2 format.

There is currently nothing that needs to be done by the qcow2 driver
in reopen.

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block/qcow2.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 8f183f1..d462093 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -52,6 +52,7 @@ typedef struct {
     uint32_t magic;
     uint32_t len;
 } QCowExtension;
+
 #define  QCOW2_EXT_MAGIC_END 0
 #define  QCOW2_EXT_MAGIC_BACKING_FORMAT 0xE2792ACA
 #define  QCOW2_EXT_MAGIC_FEATURE_TABLE 0x6803f857
@@ -558,6 +559,24 @@ static int qcow2_set_key(BlockDriverState *bs, const char *key)
     return 0;
 }
 
+/* We have nothing to do for QCOW2 reopen, stubs just return
+ * success */
+static int qcow2_reopen_prepare(BDRVReopenState *state, Error **errp)
+{
+    return 0;
+}
+
+static void qcow2_reopen_commit(BDRVReopenState *state)
+{
+    return;
+}
+
+static void qcow2_reopen_abort(BDRVReopenState *state)
+{
+    return;
+}
+
+
 static int coroutine_fn qcow2_co_is_allocated(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, int *pnum)
 {
@@ -1679,6 +1698,9 @@ static BlockDriver bdrv_qcow2 = {
     .bdrv_probe         = qcow2_probe,
     .bdrv_open          = qcow2_open,
     .bdrv_close         = qcow2_close,
+    .bdrv_reopen_prepare  = qcow2_reopen_prepare,
+    .bdrv_reopen_commit   = qcow2_reopen_commit,
+    .bdrv_reopen_abort    = qcow2_reopen_abort,
     .bdrv_create        = qcow2_create,
     .bdrv_co_is_allocated = qcow2_co_is_allocated,
     .bdrv_set_key       = qcow2_set_key,
-- 
1.7.11.2

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Qemu-devel] [PATCH 7/7] block: qcow image file reopen
  2012-08-30 18:47 [Qemu-devel] [PATCH 0/7] block: bdrv_reopen() patches Jeff Cody
                   ` (5 preceding siblings ...)
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 6/7] block: qcow2 " Jeff Cody
@ 2012-08-30 18:47 ` Jeff Cody
  6 siblings, 0 replies; 27+ messages in thread
From: Jeff Cody @ 2012-08-30 18:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, stefanha, eblake, supriyak

These are the stubs for the file reopen drivers for the qcow format.

There is currently nothing that needs to be done by the qcow driver
in reopen.

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block/qcow.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/block/qcow.c b/block/qcow.c
index 7b5ab87..f201575 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -197,6 +197,26 @@ static int qcow_open(BlockDriverState *bs, int flags)
     return ret;
 }
 
+
+/* We have nothing to do for QCOW reopen, stubs just return
+ * success */
+static int qcow_reopen_prepare(BDRVReopenState *state, Error **errp)
+{
+    return 0;
+}
+
+static void qcow_reopen_commit(BDRVReopenState *state)
+{
+    return;
+}
+
+static void qcow_reopen_abort(BDRVReopenState *state)
+{
+    return;
+}
+
+
+
 static int qcow_set_key(BlockDriverState *bs, const char *key)
 {
     BDRVQcowState *s = bs->opaque;
@@ -868,6 +888,9 @@ static BlockDriver bdrv_qcow = {
     .bdrv_probe		= qcow_probe,
     .bdrv_open		= qcow_open,
     .bdrv_close		= qcow_close,
+    .bdrv_reopen_prepare = qcow_reopen_prepare,
+    .bdrv_reopen_commit  = qcow_reopen_commit,
+    .bdrv_reopen_abort   = qcow_reopen_abort,
     .bdrv_create	= qcow_create,
 
     .bdrv_co_readv          = qcow_co_readv,
-- 
1.7.11.2

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] block: raw-posix image file reopen
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 3/7] block: raw-posix image file reopen Jeff Cody
@ 2012-08-30 22:15   ` Eric Blake
  2012-08-31 14:42     ` Jeff Cody
  2012-09-05 15:30   ` Kevin Wolf
  1 sibling, 1 reply; 27+ messages in thread
From: Eric Blake @ 2012-08-30 22:15 UTC (permalink / raw)
  To: Jeff Cody; +Cc: kwolf, pbonzini, stefanha, qemu-devel, supriyak

[-- Attachment #1: Type: text/plain, Size: 1500 bytes --]

On 08/30/2012 11:47 AM, Jeff Cody wrote:
> This is derived from the Supriya Kannery's reopen patches.
> 
> This contains the raw-posix driver changes for the bdrv_reopen_*
> functions.  All changes are staged into a temporary scratch buffer
> during the prepare() stage, and copied over to the live structure
> during commit().  Upon abort(), all changes are abandoned, and the
> live structures are unmodified.
> 
> The _prepare() will create an extra fd - either by means of a dup,
> if possible, or opening a new fd if not (for instance, access
> control changes).  Upon _commit(), the original fd is closed and
> the new fd is used.  Upon _abort(), the duplicate/new fd is closed.
> 

> +    if ((raw_s->open_flags & ~fcntl_flags) == (s->open_flags & ~fcntl_flags)) {
> +        /* dup the original fd */
> +        /* TODO: use qemu fcntl wrapper */
> +        raw_s->fd = fcntl(s->fd, F_DUPFD_CLOEXEC, 0);

I assume this TODO has to be fixed to allow compilation on systems that
lack F_DUPFD_CLOEXEC.

> +        if (raw_s->fd == -1) {
> +            ret = -1;
> +            goto error;
> +        }
> +        ret = fcntl_setfl(raw_s->fd, raw_s->open_flags);
> +    } else {
> +        raw_s->fd = qemu_open(state->bs->filename, raw_s->open_flags, 0644);

Is raw_s->open_flags every going to contain O_CREAT, or is the 0644 mode
argument spurious?

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 617 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] block: raw-posix image file reopen
  2012-08-30 22:15   ` Eric Blake
@ 2012-08-31 14:42     ` Jeff Cody
  2012-08-31 14:49       ` Kevin Wolf
  0 siblings, 1 reply; 27+ messages in thread
From: Jeff Cody @ 2012-08-31 14:42 UTC (permalink / raw)
  To: Eric Blake; +Cc: kwolf, pbonzini, stefanha, qemu-devel, supriyak

On 08/30/2012 06:15 PM, Eric Blake wrote:
> On 08/30/2012 11:47 AM, Jeff Cody wrote:
>> This is derived from the Supriya Kannery's reopen patches.
>>
>> This contains the raw-posix driver changes for the bdrv_reopen_*
>> functions.  All changes are staged into a temporary scratch buffer
>> during the prepare() stage, and copied over to the live structure
>> during commit().  Upon abort(), all changes are abandoned, and the
>> live structures are unmodified.
>>
>> The _prepare() will create an extra fd - either by means of a dup,
>> if possible, or opening a new fd if not (for instance, access
>> control changes).  Upon _commit(), the original fd is closed and
>> the new fd is used.  Upon _abort(), the duplicate/new fd is closed.
>>
> 
>> +    if ((raw_s->open_flags & ~fcntl_flags) == (s->open_flags & ~fcntl_flags)) {
>> +        /* dup the original fd */
>> +        /* TODO: use qemu fcntl wrapper */
>> +        raw_s->fd = fcntl(s->fd, F_DUPFD_CLOEXEC, 0);
> 
> I assume this TODO has to be fixed to allow compilation on systems that
> lack F_DUPFD_CLOEXEC.

Yes, either that or add the logic here.

> 
>> +        if (raw_s->fd == -1) {
>> +            ret = -1;
>> +            goto error;
>> +        }
>> +        ret = fcntl_setfl(raw_s->fd, raw_s->open_flags);
>> +    } else {
>> +        raw_s->fd = qemu_open(state->bs->filename, raw_s->open_flags, 0644);
> 
> Is raw_s->open_flags every going to contain O_CREAT, or is the 0644 mode
> argument spurious?
>

Thanks, you are right, it is spurious.  The raw_s->open_flags are
explicitly set via raw_parse_flags(), so we know it will never contain
O_CREAT.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] block: raw-posix image file reopen
  2012-08-31 14:42     ` Jeff Cody
@ 2012-08-31 14:49       ` Kevin Wolf
  2012-08-31 15:10         ` Jeff Cody
  0 siblings, 1 reply; 27+ messages in thread
From: Kevin Wolf @ 2012-08-31 14:49 UTC (permalink / raw)
  To: jcody; +Cc: stefanha, pbonzini, Eric Blake, qemu-devel, supriyak

Am 31.08.2012 16:42, schrieb Jeff Cody:
> On 08/30/2012 06:15 PM, Eric Blake wrote:
>> On 08/30/2012 11:47 AM, Jeff Cody wrote:
>>> This is derived from the Supriya Kannery's reopen patches.
>>>
>>> This contains the raw-posix driver changes for the bdrv_reopen_*
>>> functions.  All changes are staged into a temporary scratch buffer
>>> during the prepare() stage, and copied over to the live structure
>>> during commit().  Upon abort(), all changes are abandoned, and the
>>> live structures are unmodified.
>>>
>>> The _prepare() will create an extra fd - either by means of a dup,
>>> if possible, or opening a new fd if not (for instance, access
>>> control changes).  Upon _commit(), the original fd is closed and
>>> the new fd is used.  Upon _abort(), the duplicate/new fd is closed.
>>>
>>
>>> +    if ((raw_s->open_flags & ~fcntl_flags) == (s->open_flags & ~fcntl_flags)) {
>>> +        /* dup the original fd */
>>> +        /* TODO: use qemu fcntl wrapper */
>>> +        raw_s->fd = fcntl(s->fd, F_DUPFD_CLOEXEC, 0);
>>
>> I assume this TODO has to be fixed to allow compilation on systems that
>> lack F_DUPFD_CLOEXEC.
> 
> Yes, either that or add the logic here.

Would qemu_dup_flags() from osdep.c be the right thing here? It was
introduces with Corey's fd passing series.

>>> +        if (raw_s->fd == -1) {
>>> +            ret = -1;
>>> +            goto error;
>>> +        }
>>> +        ret = fcntl_setfl(raw_s->fd, raw_s->open_flags);
>>> +    } else {
>>> +        raw_s->fd = qemu_open(state->bs->filename, raw_s->open_flags, 0644);
>>
>> Is raw_s->open_flags every going to contain O_CREAT, or is the 0644 mode
>> argument spurious?
> 
> Thanks, you are right, it is spurious.  The raw_s->open_flags are
> explicitly set via raw_parse_flags(), so we know it will never contain
> O_CREAT.

We can probably assert it.

Kevin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] block: raw-posix image file reopen
  2012-08-31 14:49       ` Kevin Wolf
@ 2012-08-31 15:10         ` Jeff Cody
  0 siblings, 0 replies; 27+ messages in thread
From: Jeff Cody @ 2012-08-31 15:10 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: stefanha, pbonzini, Eric Blake, qemu-devel, supriyak

On 08/31/2012 10:49 AM, Kevin Wolf wrote:
> Am 31.08.2012 16:42, schrieb Jeff Cody:
>> On 08/30/2012 06:15 PM, Eric Blake wrote:
>>> On 08/30/2012 11:47 AM, Jeff Cody wrote:
>>>> This is derived from the Supriya Kannery's reopen patches.
>>>>
>>>> This contains the raw-posix driver changes for the bdrv_reopen_*
>>>> functions.  All changes are staged into a temporary scratch buffer
>>>> during the prepare() stage, and copied over to the live structure
>>>> during commit().  Upon abort(), all changes are abandoned, and the
>>>> live structures are unmodified.
>>>>
>>>> The _prepare() will create an extra fd - either by means of a dup,
>>>> if possible, or opening a new fd if not (for instance, access
>>>> control changes).  Upon _commit(), the original fd is closed and
>>>> the new fd is used.  Upon _abort(), the duplicate/new fd is closed.
>>>>
>>>
>>>> +    if ((raw_s->open_flags & ~fcntl_flags) == (s->open_flags & ~fcntl_flags)) {
>>>> +        /* dup the original fd */
>>>> +        /* TODO: use qemu fcntl wrapper */
>>>> +        raw_s->fd = fcntl(s->fd, F_DUPFD_CLOEXEC, 0);
>>>
>>> I assume this TODO has to be fixed to allow compilation on systems that
>>> lack F_DUPFD_CLOEXEC.
>>
>> Yes, either that or add the logic here.
> 
> Would qemu_dup_flags() from osdep.c be the right thing here? It was
> introduces with Corey's fd passing series.

I think so - that is the one I was thinking about.  It would just need
to be made non-static.

> 
>>>> +        if (raw_s->fd == -1) {
>>>> +            ret = -1;
>>>> +            goto error;
>>>> +        }
>>>> +        ret = fcntl_setfl(raw_s->fd, raw_s->open_flags);
>>>> +    } else {
>>>> +        raw_s->fd = qemu_open(state->bs->filename, raw_s->open_flags, 0644);
>>>
>>> Is raw_s->open_flags every going to contain O_CREAT, or is the 0644 mode
>>> argument spurious?
>>
>> Thanks, you are right, it is spurious.  The raw_s->open_flags are
>> explicitly set via raw_parse_flags(), so we know it will never contain
>> O_CREAT.
> 
> We can probably assert it.
>

OK

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 1/7] block: correctly set the keep_read_only flag
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 1/7] block: correctly set the keep_read_only flag Jeff Cody
@ 2012-09-05 12:47   ` Kevin Wolf
  2012-09-05 13:08     ` Jeff Cody
  0 siblings, 1 reply; 27+ messages in thread
From: Kevin Wolf @ 2012-09-05 12:47 UTC (permalink / raw)
  To: Jeff Cody; +Cc: supriyak, pbonzini, eblake, qemu-devel, stefanha

Am 30.08.2012 20:47, schrieb Jeff Cody:
> I believe the bs->keep_read_only flag is supposed to reflect
> the initial open state of the device. If the device is initially
> opened R/O, then commit operations, or reopen operations changing
> to R/W, are prohibited.
> 
> Currently, the keep_read_only flag is only accurate for the active
> layer, and its backing file. Subsequent images end up always having
> the keep_read_only flag set.
> 
> For instance, what happens now:
> 
> [  base  ]  kro = 1, ro = 1
>     |
>     v
> [ snap-1 ]  kro = 1, ro = 1
>     |
>     v
> [ snap-2 ]  kro = 0, ro = 1
>     |
>     v
> [ active ]  kro = 0, ro = 0
> 
> What we want:
> 
> [  base  ]  kro = 0, ro = 1
>     |
>     v
> [ snap-1 ]  kro = 0, ro = 1
>     |
>     v
> [ snap-2 ]  kro = 0, ro = 1
>     |
>     v
> [ active ]  kro = 0, ro = 0
> 
> Signed-off-by: Jeff Cody <jcody@redhat.com>
> ---
>  block.c | 16 +++++++---------
>  block.h |  1 +
>  2 files changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 470bdcc..e31b76f 100644
> --- a/block.c
> +++ b/block.c
> @@ -655,7 +655,7 @@ static int bdrv_open_common(BlockDriverState *bs, const char *filename,
>       * Clear flags that are internal to the block layer before opening the
>       * image.
>       */
> -    open_flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING);
> +    open_flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING | BDRV_O_ALLOW_RDWR);
>  
>      /*
>       * Snapshots should be writable.
> @@ -664,8 +664,6 @@ static int bdrv_open_common(BlockDriverState *bs, const char *filename,
>          open_flags |= BDRV_O_RDWR;
>      }
>  
> -    bs->keep_read_only = bs->read_only = !(open_flags & BDRV_O_RDWR);
> -
>      /* Open the image, either directly or using a protocol */
>      if (drv->bdrv_file_open) {
>          ret = drv->bdrv_file_open(bs, filename, open_flags);
> @@ -804,6 +802,12 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
>          goto unlink_and_fail;
>      }
>  
> +    if (flags & BDRV_O_RDWR) {
> +        flags |= BDRV_O_ALLOW_RDWR;
> +    }
> +
> +    bs->keep_read_only = !(flags & BDRV_O_ALLOW_RDWR);

Do we still need bs->keep_read_only or is it duplicated in
bs->open_flags now? We can convert all users in a follow-up patch.

Kevin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 1/7] block: correctly set the keep_read_only flag
  2012-09-05 12:47   ` Kevin Wolf
@ 2012-09-05 13:08     ` Jeff Cody
  2012-09-05 13:12       ` Kevin Wolf
  0 siblings, 1 reply; 27+ messages in thread
From: Jeff Cody @ 2012-09-05 13:08 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: supriyak, pbonzini, eblake, qemu-devel, stefanha

On 09/05/2012 08:47 AM, Kevin Wolf wrote:
> Am 30.08.2012 20:47, schrieb Jeff Cody:
>> I believe the bs->keep_read_only flag is supposed to reflect
>> the initial open state of the device. If the device is initially
>> opened R/O, then commit operations, or reopen operations changing
>> to R/W, are prohibited.
>>
>> Currently, the keep_read_only flag is only accurate for the active
>> layer, and its backing file. Subsequent images end up always having
>> the keep_read_only flag set.
>>
>> For instance, what happens now:
>>
>> [  base  ]  kro = 1, ro = 1
>>     |
>>     v
>> [ snap-1 ]  kro = 1, ro = 1
>>     |
>>     v
>> [ snap-2 ]  kro = 0, ro = 1
>>     |
>>     v
>> [ active ]  kro = 0, ro = 0
>>
>> What we want:
>>
>> [  base  ]  kro = 0, ro = 1
>>     |
>>     v
>> [ snap-1 ]  kro = 0, ro = 1
>>     |
>>     v
>> [ snap-2 ]  kro = 0, ro = 1
>>     |
>>     v
>> [ active ]  kro = 0, ro = 0
>>
>> Signed-off-by: Jeff Cody <jcody@redhat.com>
>> ---
>>  block.c | 16 +++++++---------
>>  block.h |  1 +
>>  2 files changed, 8 insertions(+), 9 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 470bdcc..e31b76f 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -655,7 +655,7 @@ static int bdrv_open_common(BlockDriverState *bs, const char *filename,
>>       * Clear flags that are internal to the block layer before opening the
>>       * image.
>>       */
>> -    open_flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING);
>> +    open_flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING | BDRV_O_ALLOW_RDWR);
>>  
>>      /*
>>       * Snapshots should be writable.
>> @@ -664,8 +664,6 @@ static int bdrv_open_common(BlockDriverState *bs, const char *filename,
>>          open_flags |= BDRV_O_RDWR;
>>      }
>>  
>> -    bs->keep_read_only = bs->read_only = !(open_flags & BDRV_O_RDWR);
>> -
>>      /* Open the image, either directly or using a protocol */
>>      if (drv->bdrv_file_open) {
>>          ret = drv->bdrv_file_open(bs, filename, open_flags);
>> @@ -804,6 +802,12 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
>>          goto unlink_and_fail;
>>      }
>>  
>> +    if (flags & BDRV_O_RDWR) {
>> +        flags |= BDRV_O_ALLOW_RDWR;
>> +    }
>> +
>> +    bs->keep_read_only = !(flags & BDRV_O_ALLOW_RDWR);
> 
> Do we still need bs->keep_read_only or is it duplicated in
> bs->open_flags now? We can convert all users in a follow-up patch.
>

I think we can ditch keep_read_only, and just use BDRV_O_ALLOW_RDWR.  The
only other user of keep_read_only is the original bdrv_commit(), in
block.c.  And, the natural way to convert that function would be to have
it use bdrv_reopen(), to change from r/o->r/w.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 1/7] block: correctly set the keep_read_only flag
  2012-09-05 13:08     ` Jeff Cody
@ 2012-09-05 13:12       ` Kevin Wolf
  0 siblings, 0 replies; 27+ messages in thread
From: Kevin Wolf @ 2012-09-05 13:12 UTC (permalink / raw)
  To: jcody; +Cc: supriyak, pbonzini, eblake, qemu-devel, stefanha

Am 05.09.2012 15:08, schrieb Jeff Cody:
> On 09/05/2012 08:47 AM, Kevin Wolf wrote:
>> Am 30.08.2012 20:47, schrieb Jeff Cody:
>>> I believe the bs->keep_read_only flag is supposed to reflect
>>> the initial open state of the device. If the device is initially
>>> opened R/O, then commit operations, or reopen operations changing
>>> to R/W, are prohibited.
>>>
>>> Currently, the keep_read_only flag is only accurate for the active
>>> layer, and its backing file. Subsequent images end up always having
>>> the keep_read_only flag set.
>>>
>>> For instance, what happens now:
>>>
>>> [  base  ]  kro = 1, ro = 1
>>>     |
>>>     v
>>> [ snap-1 ]  kro = 1, ro = 1
>>>     |
>>>     v
>>> [ snap-2 ]  kro = 0, ro = 1
>>>     |
>>>     v
>>> [ active ]  kro = 0, ro = 0
>>>
>>> What we want:
>>>
>>> [  base  ]  kro = 0, ro = 1
>>>     |
>>>     v
>>> [ snap-1 ]  kro = 0, ro = 1
>>>     |
>>>     v
>>> [ snap-2 ]  kro = 0, ro = 1
>>>     |
>>>     v
>>> [ active ]  kro = 0, ro = 0
>>>
>>> Signed-off-by: Jeff Cody <jcody@redhat.com>
>>> ---
>>>  block.c | 16 +++++++---------
>>>  block.h |  1 +
>>>  2 files changed, 8 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/block.c b/block.c
>>> index 470bdcc..e31b76f 100644
>>> --- a/block.c
>>> +++ b/block.c
>>> @@ -655,7 +655,7 @@ static int bdrv_open_common(BlockDriverState *bs, const char *filename,
>>>       * Clear flags that are internal to the block layer before opening the
>>>       * image.
>>>       */
>>> -    open_flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING);
>>> +    open_flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING | BDRV_O_ALLOW_RDWR);
>>>  
>>>      /*
>>>       * Snapshots should be writable.
>>> @@ -664,8 +664,6 @@ static int bdrv_open_common(BlockDriverState *bs, const char *filename,
>>>          open_flags |= BDRV_O_RDWR;
>>>      }
>>>  
>>> -    bs->keep_read_only = bs->read_only = !(open_flags & BDRV_O_RDWR);
>>> -
>>>      /* Open the image, either directly or using a protocol */
>>>      if (drv->bdrv_file_open) {
>>>          ret = drv->bdrv_file_open(bs, filename, open_flags);
>>> @@ -804,6 +802,12 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
>>>          goto unlink_and_fail;
>>>      }
>>>  
>>> +    if (flags & BDRV_O_RDWR) {
>>> +        flags |= BDRV_O_ALLOW_RDWR;
>>> +    }
>>> +
>>> +    bs->keep_read_only = !(flags & BDRV_O_ALLOW_RDWR);
>>
>> Do we still need bs->keep_read_only or is it duplicated in
>> bs->open_flags now? We can convert all users in a follow-up patch.
>>
> 
> I think we can ditch keep_read_only, and just use BDRV_O_ALLOW_RDWR.  The
> only other user of keep_read_only is the original bdrv_commit(), in
> block.c.  And, the natural way to convert that function would be to have
> it use bdrv_reopen(), to change from r/o->r/w.

Ah, yes, makes sense. So just using it in patch 2 and then converting
bdrv_commit() should be enough.

Kevin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] block: Framework for reopening files safely
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 2/7] block: Framework for reopening files safely Jeff Cody
@ 2012-09-05 15:09   ` Kevin Wolf
  2012-09-05 16:38     ` Jeff Cody
  2012-09-11 14:57   ` Jeff Cody
  1 sibling, 1 reply; 27+ messages in thread
From: Kevin Wolf @ 2012-09-05 15:09 UTC (permalink / raw)
  To: Jeff Cody; +Cc: supriyak, pbonzini, eblake, qemu-devel, stefanha

Am 30.08.2012 20:47, schrieb Jeff Cody:
> This is based heavily on Supriya Kannery's bdrv_reopen()
> patch series.
> 
> This provides a transactional method to reopen multiple
> images files safely.
> 
> Image files are queue for reopen via bdrv_reopen_queue(), and the
> reopen occurs when bdrv_reopen_multiple() is called.  Changes are
> staged in bdrv_reopen_prepare() and in the equivalent driver level
> functions.  If any of the staged images fails a prepare, then all
> of the images left untouched, and the staged changes for each image
> abandoned.
> 
> Signed-off-by: Jeff Cody <jcody@redhat.com>

> +/*
> + * Reopen multiple BlockDriverStates atomically & transactionally.
> + *
> + * The queue passed in (bs_queue) must have been built up previous
> + * via bdrv_reopen_queue().
> + *
> + * Reopens all BDS specified in the queue, with the appropriate
> + * flags.  All devices are prepared for reopen, and failure of any
> + * device will cause all device changes to be abandonded, and intermediate
> + * data cleaned up.
> + *
> + * If all devices prepare successfully, then the changes are committed
> + * to all devices.
> + *
> + */
> +int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
> +{
> +    int ret = -1;
> +    BlockReopenQueueEntry *bs_entry;
> +    Error *local_err = NULL;
> +
> +    assert(bs_queue != NULL);
> +
> +    bdrv_drain_all();
> +
> +    QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
> +        if (bdrv_reopen_prepare(bs_entry->state, &local_err)) {
> +            error_propagate(errp, local_err);
> +            goto cleanup;
> +        }
> +        bs_entry->prepared = true;
> +    }
> +
> +    /* If we reach this point, we have success and just need to apply the
> +     * changes
> +     */
> +    QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
> +        bdrv_reopen_commit(bs_entry->state);
> +    }
> +
> +    ret = 0;
> +
> +cleanup:
> +    QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
> +        if (ret && bs_entry->prepared) {
> +            bdrv_reopen_abort(bs_entry->state);
> +        }
> +        g_free(bs_entry->state);
> +        g_free(bs_entry);
> +    }

Without QSIMPLEQ_FOREACH_SAFE, isn't this a use after free?

> +    g_free(bs_queue);
> +    return ret;
> +}
> +
> +
> +/* Reopen a single BlockDriverState with the specified flags. */
> +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags, Error **errp)
> +{
> +    int ret = -1;
> +    Error *local_err = NULL;
> +    BlockReopenQueue *queue = bdrv_reopen_queue(NULL, bs, bdrv_flags);
> +
> +    ret = bdrv_reopen_multiple(queue, &local_err);
> +    if (local_err != NULL) {
> +        error_propagate(errp, local_err);
> +    }
> +    return ret;
> +}
> +
> +
> +/*
> + * Prepares a BlockDriverState for reopen. All changes are staged in the
> + * 'reopen_state' field of the BlockDriverState, which must be NULL when
> + * entering (all previous reopens must have completed for the BDS).
> + *
> + * bs is the BlockDriverState to reopen
> + * flags are the new open flags
> + *
> + * Returns 0 on success, non-zero on error.  On error errp will be set
> + * as well.
> + *
> + * On failure, bdrv_reopen_abort() will be called to clean up any data.
> + * It is the responsibility of the caller to then call the abort() or
> + * commit() for any other BDS that have been left in a prepare() state
> + *
> + */
> +int bdrv_reopen_prepare(BDRVReopenState *reopen_state, Error **errp)
> +{
> +    int ret = -1;
> +    Error *local_err = NULL;
> +    BlockDriver *drv;
> +
> +    assert(reopen_state != NULL);
> +    assert(reopen_state->bs->drv != NULL);
> +    drv = reopen_state->bs->drv;
> +
> +    /* if we are to stay read-only, do not allow permission change
> +     * to r/w */
> +    if (reopen_state->bs->keep_read_only &&

Just for completeness, we decided to use the flag here instead of
keep_read_only.

> +        reopen_state->flags & BDRV_O_RDWR) {
> +        error_set(errp, QERR_DEVICE_IS_READ_ONLY,
> +                  reopen_state->bs->device_name);
> +        goto error;
> +    }
> +
> +
> +    ret = bdrv_flush(reopen_state->bs);
> +    if (ret) {
> +        error_set(errp, QERR_IO_ERROR);
> +        goto error;
> +    }

This throws the error code away. Bad. We should probably change
QERR_IO_ERROR so that you can include strerror(-ret).

> +
> +    if (drv->bdrv_reopen_prepare) {
> +        ret = drv->bdrv_reopen_prepare(reopen_state, &local_err);
> +        if (ret) {
> +            if (local_err != NULL) {
> +                error_propagate(errp, local_err);
> +            } else {
> +                error_set(errp, QERR_OPEN_FILE_FAILED,
> +                          reopen_state->bs->filename);
> +            }
> +            goto error;
> +        }
> +    } else {
> +        /* It is currently mandatory to have a bdrv_reopen_prepare()
> +         * handler for each supported drv. */
> +        error_set(errp, QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
> +                  drv->format_name, reopen_state->bs->device_name,
> +                 "reopening of file");
> +        ret = -1;
> +        goto error;
> +    }
> +
> +    return 0;
> +
> +error:
> +    bdrv_reopen_abort(reopen_state);

This is unexpected for me. Shouldn't .bdrv_reopen_prepare() clean up
before returning an error, like any other function does? (Which could
actually be a call to bdrv_reopen_abort() where it makes sense.)

If you use .bdrv_reopen_abort() for it, block drivers must take care to
write this function in way that doesn't assume that
.bdrv_reopen_prepare() has completed. Sounds rather nasty to me.

> +    return ret;
> +}
> +
> +/*
> + * Takes the staged changes for the reopen from bdrv_reopen_prepare(), and
> + * makes them final by swapping the staging BlockDriverState contents into
> + * the active BlockDriverState contents.
> + */
> +void bdrv_reopen_commit(BDRVReopenState *reopen_state)
> +{
> +    BlockDriver *drv;
> +
> +    assert(reopen_state != NULL);
> +    drv = reopen_state->bs->drv;
> +    assert(drv != NULL);
> +
> +    /* If there are any driver level actions to take */
> +    if (drv->bdrv_reopen_commit) {
> +        drv->bdrv_reopen_commit(reopen_state);
> +    }
> +
> +    /* set BDS specific flags now */
> +    reopen_state->bs->open_flags         = reopen_state->flags;
> +    reopen_state->bs->enable_write_cache = !!(reopen_state->flags &
> +                                              BDRV_O_CACHE_WB);
> +    reopen_state->bs->read_only = !(reopen_state->flags & BDRV_O_RDWR);

Hm, I wonder if these three lines can somehow be shared with the normal
bdrv_open so that they stay in sync.

> +}
> +
> +/*
> + * Abort the reopen, and delete and free the staged changes in
> + * reopen_state
> + */
> +void bdrv_reopen_abort(BDRVReopenState *reopen_state)
> +{
> +    BlockDriver *drv;
> +
> +    assert(reopen_state != NULL);
> +    drv = reopen_state->bs->drv;
> +    assert(drv != NULL);
> +
> +    if (drv->bdrv_reopen_abort) {
> +        drv->bdrv_reopen_abort(reopen_state);
> +    }
> +}
> +
> +
>  void bdrv_close(BlockDriverState *bs)
>  {
>      bdrv_flush(bs);
> diff --git a/block.h b/block.h
> index 4d919c2..db812b1 100644
> --- a/block.h
> +++ b/block.h
> @@ -97,6 +97,14 @@ typedef enum {
>      BDRV_ACTION_REPORT, BDRV_ACTION_IGNORE, BDRV_ACTION_STOP
>  } BlockQMPEventAction;
>  
> +typedef struct BlockReopenQueueEntry {
> +     bool prepared;
> +     BDRVReopenState *state;

As discussed on IRC, this can be directly embedded instead of using a
pointer.

> +     QSIMPLEQ_ENTRY(BlockReopenQueueEntry) entry;
> +} BlockReopenQueueEntry;


Kevin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] block: raw-posix image file reopen
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 3/7] block: raw-posix image file reopen Jeff Cody
  2012-08-30 22:15   ` Eric Blake
@ 2012-09-05 15:30   ` Kevin Wolf
  2012-09-05 16:43     ` Jeff Cody
  1 sibling, 1 reply; 27+ messages in thread
From: Kevin Wolf @ 2012-09-05 15:30 UTC (permalink / raw)
  To: Jeff Cody; +Cc: supriyak, pbonzini, eblake, qemu-devel, stefanha

Am 30.08.2012 20:47, schrieb Jeff Cody:
> This is derived from the Supriya Kannery's reopen patches.
> 
> This contains the raw-posix driver changes for the bdrv_reopen_*
> functions.  All changes are staged into a temporary scratch buffer
> during the prepare() stage, and copied over to the live structure
> during commit().  Upon abort(), all changes are abandoned, and the
> live structures are unmodified.
> 
> The _prepare() will create an extra fd - either by means of a dup,
> if possible, or opening a new fd if not (for instance, access
> control changes).  Upon _commit(), the original fd is closed and
> the new fd is used.  Upon _abort(), the duplicate/new fd is closed.
> 
> Signed-off-by: Jeff Cody <jcody@redhat.com>
> ---
>  block/raw-posix.c | 153 +++++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 139 insertions(+), 14 deletions(-)
> 
> diff --git a/block/raw-posix.c b/block/raw-posix.c
> index 6be20b1..48086d7 100644
> --- a/block/raw-posix.c
> +++ b/block/raw-posix.c
> @@ -140,6 +140,15 @@ typedef struct BDRVRawState {
>  #endif
>  } BDRVRawState;
>  
> +typedef struct BDRVRawReopenState {
> +    BDRVReopenState reopen_state;
> +    int fd;
> +    int open_flags;
> +    uint8_t *aligned_buf;
> +    unsigned aligned_buf_size;
> +    BDRVRawState *stash_s;
> +} BDRVRawReopenState;
> +
>  static int fd_open(BlockDriverState *bs);
>  static int64_t raw_getlength(BlockDriverState *bs);
>  
> @@ -185,6 +194,28 @@ static int raw_normalize_devicepath(const char **filename)
>  }
>  #endif
>  
> +static void raw_parse_flags(int bdrv_flags, int *open_flags)
> +{
> +    assert(open_flags != NULL);
> +
> +    *open_flags |= O_BINARY;
> +    *open_flags &= ~O_ACCMODE;
> +    if (bdrv_flags & BDRV_O_RDWR) {
> +        *open_flags |= O_RDWR;
> +    } else {
> +        *open_flags |= O_RDONLY;
> +    }
> +
> +    /* Use O_DSYNC for write-through caching, no flags for write-back caching,
> +     * and O_DIRECT for no caching. */
> +    if ((bdrv_flags & BDRV_O_NOCACHE)) {
> +        *open_flags |= O_DIRECT;
> +    }
> +    if (!(bdrv_flags & BDRV_O_CACHE_WB)) {
> +        *open_flags |= O_DSYNC;
> +    }
> +}

The code motion would ideally be a separate patch.

> +
>  static int raw_open_common(BlockDriverState *bs, const char *filename,
>                             int bdrv_flags, int open_flags)
>  {
> @@ -196,20 +227,8 @@ static int raw_open_common(BlockDriverState *bs, const char *filename,
>          return ret;
>      }
>  
> -    s->open_flags = open_flags | O_BINARY;
> -    s->open_flags &= ~O_ACCMODE;
> -    if (bdrv_flags & BDRV_O_RDWR) {
> -        s->open_flags |= O_RDWR;
> -    } else {
> -        s->open_flags |= O_RDONLY;
> -    }
> -
> -    /* Use O_DSYNC for write-through caching, no flags for write-back caching,
> -     * and O_DIRECT for no caching. */
> -    if ((bdrv_flags & BDRV_O_NOCACHE))
> -        s->open_flags |= O_DIRECT;
> -    if (!(bdrv_flags & BDRV_O_CACHE_WB))
> -        s->open_flags |= O_DSYNC;
> +    s->open_flags = open_flags;
> +    raw_parse_flags(bdrv_flags, &s->open_flags);
>  
>      s->fd = -1;
>      fd = qemu_open(filename, s->open_flags, 0644);
> @@ -283,6 +302,109 @@ static int raw_open(BlockDriverState *bs, const char *filename, int flags)
>      return raw_open_common(bs, filename, flags, 0);
>  }
>  
> +static int raw_reopen_prepare(BDRVReopenState *state, Error **errp)
> +{
> +    BDRVRawState *s;
> +    BDRVRawReopenState *raw_s;
> +    int ret = 0;
> +
> +    assert(state != NULL);
> +    assert(state->bs != NULL);
> +
> +    s = state->bs->opaque;
> +
> +    state->opaque = g_malloc0(sizeof(BDRVRawReopenState));
> +    raw_s = state->opaque;
> +
> +    raw_parse_flags(state->flags, &raw_s->open_flags);
> +
> +    /*
> +     * If we didn't have BDRV_O_NOCACHE set before, we may not have allocated
> +     * aligned_buf
> +     */
> +    if ((state->flags & BDRV_O_NOCACHE)) {
> +        /*
> +         * Allocate a buffer for read/modify/write cycles.  Choose the size
> +         * pessimistically as we don't know the block size yet.
> +         */
> +        raw_s->aligned_buf_size = 32 * MAX_BLOCKSIZE;
> +        raw_s->aligned_buf = qemu_memalign(MAX_BLOCKSIZE,
> +                                           raw_s->aligned_buf_size);
> +
> +        if (raw_s->aligned_buf == NULL) {
> +            ret = -1;
> +            goto error;
> +        }

Even though it's pretty small, I think I would factor this out into a
small static helper to make sure it's kept in sync with raw_open_common().

> +    }
> +
> +    int fcntl_flags = O_APPEND | O_ASYNC | O_NONBLOCK;
> +#ifdef O_NOATIME
> +    fcntl_flags |= O_NOATIME;
> +#endif
> +    if ((raw_s->open_flags & ~fcntl_flags) == (s->open_flags & ~fcntl_flags)) {
> +        /* dup the original fd */
> +        /* TODO: use qemu fcntl wrapper */
> +        raw_s->fd = fcntl(s->fd, F_DUPFD_CLOEXEC, 0);
> +        if (raw_s->fd == -1) {
> +            ret = -1;
> +            goto error;
> +        }
> +        ret = fcntl_setfl(raw_s->fd, raw_s->open_flags);
> +    } else {
> +        raw_s->fd = qemu_open(state->bs->filename, raw_s->open_flags, 0644);
> +        if (raw_s->fd == -1) {
> +            ret = -1;
> +        }

Ignoring this part for now, with qemu_dup_flags() it's going to look a
bit different. In particular, I'm hoping that we don't get a second
fcntl_flags enumeration here, but can just fall back to qemu_open()
whenever qemu_dup_flags() fails.

If we do need to keep fcntl_flags here, we'll probably want to add
O_DIRECT to it.

> +    }
> +error:
> +    return ret;
> +}
> +
> +
> +static void raw_reopen_commit(BDRVReopenState *state)
> +{
> +    BDRVRawReopenState *raw_s = state->opaque;
> +    BDRVRawState *s = state->bs->opaque;
> +
> +    if (raw_s->aligned_buf != NULL) {
> +        if (s->aligned_buf) {
> +            qemu_vfree(s->aligned_buf);
> +        }
> +        s->aligned_buf      = raw_s->aligned_buf;
> +        s->aligned_buf_size = raw_s->aligned_buf_size;
> +    }
> +
> +    s->open_flags = raw_s->open_flags;
> +
> +    close(s->fd);
> +    s->fd = raw_s->fd;
> +
> +    g_free(state->opaque);
> +    state->opaque = NULL;
> +}

I think s->use_aio must be changed as well, it depends on
BDRV_O_NOCACHE. Maybe we even need to do it in prepare, laio_init() may
need to be called.

Kevin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] block: Framework for reopening files safely
  2012-09-05 15:09   ` Kevin Wolf
@ 2012-09-05 16:38     ` Jeff Cody
  0 siblings, 0 replies; 27+ messages in thread
From: Jeff Cody @ 2012-09-05 16:38 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: supriyak, pbonzini, eblake, qemu-devel, stefanha

On 09/05/2012 11:09 AM, Kevin Wolf wrote:
> Am 30.08.2012 20:47, schrieb Jeff Cody:
>> This is based heavily on Supriya Kannery's bdrv_reopen()
>> patch series.
>>
>> This provides a transactional method to reopen multiple
>> images files safely.
>>
>> Image files are queue for reopen via bdrv_reopen_queue(), and the
>> reopen occurs when bdrv_reopen_multiple() is called.  Changes are
>> staged in bdrv_reopen_prepare() and in the equivalent driver level
>> functions.  If any of the staged images fails a prepare, then all
>> of the images left untouched, and the staged changes for each image
>> abandoned.
>>
>> Signed-off-by: Jeff Cody <jcody@redhat.com>
> 
>> +/*
>> + * Reopen multiple BlockDriverStates atomically & transactionally.
>> + *
>> + * The queue passed in (bs_queue) must have been built up previous
>> + * via bdrv_reopen_queue().
>> + *
>> + * Reopens all BDS specified in the queue, with the appropriate
>> + * flags.  All devices are prepared for reopen, and failure of any
>> + * device will cause all device changes to be abandonded, and intermediate
>> + * data cleaned up.
>> + *
>> + * If all devices prepare successfully, then the changes are committed
>> + * to all devices.
>> + *
>> + */
>> +int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
>> +{
>> +    int ret = -1;
>> +    BlockReopenQueueEntry *bs_entry;
>> +    Error *local_err = NULL;
>> +
>> +    assert(bs_queue != NULL);
>> +
>> +    bdrv_drain_all();
>> +
>> +    QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
>> +        if (bdrv_reopen_prepare(bs_entry->state, &local_err)) {
>> +            error_propagate(errp, local_err);
>> +            goto cleanup;
>> +        }
>> +        bs_entry->prepared = true;
>> +    }
>> +
>> +    /* If we reach this point, we have success and just need to apply the
>> +     * changes
>> +     */
>> +    QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
>> +        bdrv_reopen_commit(bs_entry->state);
>> +    }
>> +
>> +    ret = 0;
>> +
>> +cleanup:
>> +    QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
>> +        if (ret && bs_entry->prepared) {
>> +            bdrv_reopen_abort(bs_entry->state);
>> +        }
>> +        g_free(bs_entry->state);
>> +        g_free(bs_entry);
>> +    }
> 
> Without QSIMPLEQ_FOREACH_SAFE, isn't this a use after free?
> 

Yes - that needs to be a QSIMPLEQ_FOREACH_SAFE.


>> +    g_free(bs_queue);
>> +    return ret;
>> +}
>> +
>> +
>> +/* Reopen a single BlockDriverState with the specified flags. */
>> +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags, Error **errp)
>> +{
>> +    int ret = -1;
>> +    Error *local_err = NULL;
>> +    BlockReopenQueue *queue = bdrv_reopen_queue(NULL, bs, bdrv_flags);
>> +
>> +    ret = bdrv_reopen_multiple(queue, &local_err);
>> +    if (local_err != NULL) {
>> +        error_propagate(errp, local_err);
>> +    }
>> +    return ret;
>> +}
>> +
>> +
>> +/*
>> + * Prepares a BlockDriverState for reopen. All changes are staged in the
>> + * 'reopen_state' field of the BlockDriverState, which must be NULL when
>> + * entering (all previous reopens must have completed for the BDS).
>> + *
>> + * bs is the BlockDriverState to reopen
>> + * flags are the new open flags
>> + *
>> + * Returns 0 on success, non-zero on error.  On error errp will be set
>> + * as well.
>> + *
>> + * On failure, bdrv_reopen_abort() will be called to clean up any data.
>> + * It is the responsibility of the caller to then call the abort() or
>> + * commit() for any other BDS that have been left in a prepare() state
>> + *
>> + */
>> +int bdrv_reopen_prepare(BDRVReopenState *reopen_state, Error **errp)
>> +{
>> +    int ret = -1;
>> +    Error *local_err = NULL;
>> +    BlockDriver *drv;
>> +
>> +    assert(reopen_state != NULL);
>> +    assert(reopen_state->bs->drv != NULL);
>> +    drv = reopen_state->bs->drv;
>> +
>> +    /* if we are to stay read-only, do not allow permission change
>> +     * to r/w */
>> +    if (reopen_state->bs->keep_read_only &&
> 
> Just for completeness, we decided to use the flag here instead of
> keep_read_only.
> 
>> +        reopen_state->flags & BDRV_O_RDWR) {
>> +        error_set(errp, QERR_DEVICE_IS_READ_ONLY,
>> +                  reopen_state->bs->device_name);
>> +        goto error;
>> +    }
>> +
>> +
>> +    ret = bdrv_flush(reopen_state->bs);
>> +    if (ret) {
>> +        error_set(errp, QERR_IO_ERROR);
>> +        goto error;
>> +    }
> 
> This throws the error code away. Bad. We should probably change
> QERR_IO_ERROR so that you can include strerror(-ret).
> 

Or, I could use the new error_setg() here, and pass along all relevant
information.

>> +
>> +    if (drv->bdrv_reopen_prepare) {
>> +        ret = drv->bdrv_reopen_prepare(reopen_state, &local_err);
>> +        if (ret) {
>> +            if (local_err != NULL) {
>> +                error_propagate(errp, local_err);
>> +            } else {
>> +                error_set(errp, QERR_OPEN_FILE_FAILED,
>> +                          reopen_state->bs->filename);
>> +            }
>> +            goto error;
>> +        }
>> +    } else {
>> +        /* It is currently mandatory to have a bdrv_reopen_prepare()
>> +         * handler for each supported drv. */
>> +        error_set(errp, QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
>> +                  drv->format_name, reopen_state->bs->device_name,
>> +                 "reopening of file");
>> +        ret = -1;
>> +        goto error;
>> +    }
>> +
>> +    return 0;
>> +
>> +error:
>> +    bdrv_reopen_abort(reopen_state);
> 
> This is unexpected for me. Shouldn't .bdrv_reopen_prepare() clean up
> before returning an error, like any other function does? (Which could
> actually be a call to bdrv_reopen_abort() where it makes sense.)
> 
> If you use .bdrv_reopen_abort() for it, block drivers must take care to
> write this function in way that doesn't assume that
> .bdrv_reopen_prepare() has completed. Sounds rather nasty to me.
> 

Hmm.  Yes, I can see that - and there is no cleanup this function needs
to do itself, so it can just assume that .bdrv_reopen_prepare() will
clean up after itself (which, as you mentioned, will likely be the
driver calling its own .bdrv_reopen_abort()).

Although, I think it should always be good form for the block drivers to
not make such assumptions, if possible.


>> +    return ret;
>> +}
>> +
>> +/*
>> + * Takes the staged changes for the reopen from bdrv_reopen_prepare(), and
>> + * makes them final by swapping the staging BlockDriverState contents into
>> + * the active BlockDriverState contents.
>> + */
>> +void bdrv_reopen_commit(BDRVReopenState *reopen_state)
>> +{
>> +    BlockDriver *drv;
>> +
>> +    assert(reopen_state != NULL);
>> +    drv = reopen_state->bs->drv;
>> +    assert(drv != NULL);
>> +
>> +    /* If there are any driver level actions to take */
>> +    if (drv->bdrv_reopen_commit) {
>> +        drv->bdrv_reopen_commit(reopen_state);
>> +    }
>> +
>> +    /* set BDS specific flags now */
>> +    reopen_state->bs->open_flags         = reopen_state->flags;
>> +    reopen_state->bs->enable_write_cache = !!(reopen_state->flags &
>> +                                              BDRV_O_CACHE_WB);
>> +    reopen_state->bs->read_only = !(reopen_state->flags & BDRV_O_RDWR);
> 
> Hm, I wonder if these three lines can somehow be shared with the normal
> bdrv_open so that they stay in sync.

Sure, especially setting enable_write_cache.

> 
>> +}
>> +
>> +/*
>> + * Abort the reopen, and delete and free the staged changes in
>> + * reopen_state
>> + */
>> +void bdrv_reopen_abort(BDRVReopenState *reopen_state)
>> +{
>> +    BlockDriver *drv;
>> +
>> +    assert(reopen_state != NULL);
>> +    drv = reopen_state->bs->drv;
>> +    assert(drv != NULL);
>> +
>> +    if (drv->bdrv_reopen_abort) {
>> +        drv->bdrv_reopen_abort(reopen_state);
>> +    }
>> +}
>> +
>> +
>>  void bdrv_close(BlockDriverState *bs)
>>  {
>>      bdrv_flush(bs);
>> diff --git a/block.h b/block.h
>> index 4d919c2..db812b1 100644
>> --- a/block.h
>> +++ b/block.h
>> @@ -97,6 +97,14 @@ typedef enum {
>>      BDRV_ACTION_REPORT, BDRV_ACTION_IGNORE, BDRV_ACTION_STOP
>>  } BlockQMPEventAction;
>>  
>> +typedef struct BlockReopenQueueEntry {
>> +     bool prepared;
>> +     BDRVReopenState *state;
> 
> As discussed on IRC, this can be directly embedded instead of using a
> pointer.
> 
>> +     QSIMPLEQ_ENTRY(BlockReopenQueueEntry) entry;
>> +} BlockReopenQueueEntry;
> 
> 
> Kevin
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] block: raw-posix image file reopen
  2012-09-05 15:30   ` Kevin Wolf
@ 2012-09-05 16:43     ` Jeff Cody
  2012-09-06  9:23       ` Kevin Wolf
  0 siblings, 1 reply; 27+ messages in thread
From: Jeff Cody @ 2012-09-05 16:43 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: supriyak, pbonzini, eblake, qemu-devel, stefanha

On 09/05/2012 11:30 AM, Kevin Wolf wrote:
> Am 30.08.2012 20:47, schrieb Jeff Cody:
>> This is derived from the Supriya Kannery's reopen patches.
>>
>> This contains the raw-posix driver changes for the bdrv_reopen_*
>> functions.  All changes are staged into a temporary scratch buffer
>> during the prepare() stage, and copied over to the live structure
>> during commit().  Upon abort(), all changes are abandoned, and the
>> live structures are unmodified.
>>
>> The _prepare() will create an extra fd - either by means of a dup,
>> if possible, or opening a new fd if not (for instance, access
>> control changes).  Upon _commit(), the original fd is closed and
>> the new fd is used.  Upon _abort(), the duplicate/new fd is closed.
>>
>> Signed-off-by: Jeff Cody <jcody@redhat.com>
>> ---
>>  block/raw-posix.c | 153 +++++++++++++++++++++++++++++++++++++++++++++++++-----
>>  1 file changed, 139 insertions(+), 14 deletions(-)
>>
>> diff --git a/block/raw-posix.c b/block/raw-posix.c
>> index 6be20b1..48086d7 100644
>> --- a/block/raw-posix.c
>> +++ b/block/raw-posix.c
>> @@ -140,6 +140,15 @@ typedef struct BDRVRawState {
>>  #endif
>>  } BDRVRawState;
>>  
>> +typedef struct BDRVRawReopenState {
>> +    BDRVReopenState reopen_state;
>> +    int fd;
>> +    int open_flags;
>> +    uint8_t *aligned_buf;
>> +    unsigned aligned_buf_size;
>> +    BDRVRawState *stash_s;
>> +} BDRVRawReopenState;
>> +
>>  static int fd_open(BlockDriverState *bs);
>>  static int64_t raw_getlength(BlockDriverState *bs);
>>  
>> @@ -185,6 +194,28 @@ static int raw_normalize_devicepath(const char **filename)
>>  }
>>  #endif
>>  
>> +static void raw_parse_flags(int bdrv_flags, int *open_flags)
>> +{
>> +    assert(open_flags != NULL);
>> +
>> +    *open_flags |= O_BINARY;
>> +    *open_flags &= ~O_ACCMODE;
>> +    if (bdrv_flags & BDRV_O_RDWR) {
>> +        *open_flags |= O_RDWR;
>> +    } else {
>> +        *open_flags |= O_RDONLY;
>> +    }
>> +
>> +    /* Use O_DSYNC for write-through caching, no flags for write-back caching,
>> +     * and O_DIRECT for no caching. */
>> +    if ((bdrv_flags & BDRV_O_NOCACHE)) {
>> +        *open_flags |= O_DIRECT;
>> +    }
>> +    if (!(bdrv_flags & BDRV_O_CACHE_WB)) {
>> +        *open_flags |= O_DSYNC;
>> +    }
>> +}
> 
> The code motion would ideally be a separate patch.
>

OK

>> +
>>  static int raw_open_common(BlockDriverState *bs, const char *filename,
>>                             int bdrv_flags, int open_flags)
>>  {
>> @@ -196,20 +227,8 @@ static int raw_open_common(BlockDriverState *bs, const char *filename,
>>          return ret;
>>      }
>>  
>> -    s->open_flags = open_flags | O_BINARY;
>> -    s->open_flags &= ~O_ACCMODE;
>> -    if (bdrv_flags & BDRV_O_RDWR) {
>> -        s->open_flags |= O_RDWR;
>> -    } else {
>> -        s->open_flags |= O_RDONLY;
>> -    }
>> -
>> -    /* Use O_DSYNC for write-through caching, no flags for write-back caching,
>> -     * and O_DIRECT for no caching. */
>> -    if ((bdrv_flags & BDRV_O_NOCACHE))
>> -        s->open_flags |= O_DIRECT;
>> -    if (!(bdrv_flags & BDRV_O_CACHE_WB))
>> -        s->open_flags |= O_DSYNC;
>> +    s->open_flags = open_flags;
>> +    raw_parse_flags(bdrv_flags, &s->open_flags);
>>  
>>      s->fd = -1;
>>      fd = qemu_open(filename, s->open_flags, 0644);
>> @@ -283,6 +302,109 @@ static int raw_open(BlockDriverState *bs, const char *filename, int flags)
>>      return raw_open_common(bs, filename, flags, 0);
>>  }
>>  
>> +static int raw_reopen_prepare(BDRVReopenState *state, Error **errp)
>> +{
>> +    BDRVRawState *s;
>> +    BDRVRawReopenState *raw_s;
>> +    int ret = 0;
>> +
>> +    assert(state != NULL);
>> +    assert(state->bs != NULL);
>> +
>> +    s = state->bs->opaque;
>> +
>> +    state->opaque = g_malloc0(sizeof(BDRVRawReopenState));
>> +    raw_s = state->opaque;
>> +
>> +    raw_parse_flags(state->flags, &raw_s->open_flags);
>> +
>> +    /*
>> +     * If we didn't have BDRV_O_NOCACHE set before, we may not have allocated
>> +     * aligned_buf
>> +     */
>> +    if ((state->flags & BDRV_O_NOCACHE)) {
>> +        /*
>> +         * Allocate a buffer for read/modify/write cycles.  Choose the size
>> +         * pessimistically as we don't know the block size yet.
>> +         */
>> +        raw_s->aligned_buf_size = 32 * MAX_BLOCKSIZE;
>> +        raw_s->aligned_buf = qemu_memalign(MAX_BLOCKSIZE,
>> +                                           raw_s->aligned_buf_size);
>> +
>> +        if (raw_s->aligned_buf == NULL) {
>> +            ret = -1;
>> +            goto error;
>> +        }
> 
> Even though it's pretty small, I think I would factor this out into a
> small static helper to make sure it's kept in sync with raw_open_common().
> 

OK, and like your suggestion above, I'll put that in a separate code
motion patch.

>> +    }
>> +
>> +    int fcntl_flags = O_APPEND | O_ASYNC | O_NONBLOCK;
>> +#ifdef O_NOATIME
>> +    fcntl_flags |= O_NOATIME;
>> +#endif
>> +    if ((raw_s->open_flags & ~fcntl_flags) == (s->open_flags & ~fcntl_flags)) {
>> +        /* dup the original fd */
>> +        /* TODO: use qemu fcntl wrapper */
>> +        raw_s->fd = fcntl(s->fd, F_DUPFD_CLOEXEC, 0);
>> +        if (raw_s->fd == -1) {
>> +            ret = -1;
>> +            goto error;
>> +        }
>> +        ret = fcntl_setfl(raw_s->fd, raw_s->open_flags);
>> +    } else {
>> +        raw_s->fd = qemu_open(state->bs->filename, raw_s->open_flags, 0644);
>> +        if (raw_s->fd == -1) {
>> +            ret = -1;
>> +        }
> 
> Ignoring this part for now, with qemu_dup_flags() it's going to look a
> bit different. In particular, I'm hoping that we don't get a second
> fcntl_flags enumeration here, but can just fall back to qemu_open()
> whenever qemu_dup_flags() fails.

That will require modification to qemu_dup_flags()... I believe
qemu_dup_flags() silently filters out fcntl incompatible flags.

Maybe it would be best to create a small helper function in osdep.c, that
fetches the fcntl_flags.  Then qemu_dup_flags() and this function would
use the same helper to fetch fcntl_flags.  The results of that would
determine if we call qemu_dup_flags() or qemu_open().

Although, I do think it makes sense to always try qemu_open() if
qemu_dup_flags() fails for some reason.

> 
> If we do need to keep fcntl_flags here, we'll probably want to add
> O_DIRECT to it.
> 

>> +    }
>> +error:
>> +    return ret;
>> +}
>> +
>> +
>> +static void raw_reopen_commit(BDRVReopenState *state)
>> +{
>> +    BDRVRawReopenState *raw_s = state->opaque;
>> +    BDRVRawState *s = state->bs->opaque;
>> +
>> +    if (raw_s->aligned_buf != NULL) {
>> +        if (s->aligned_buf) {
>> +            qemu_vfree(s->aligned_buf);
>> +        }
>> +        s->aligned_buf      = raw_s->aligned_buf;
>> +        s->aligned_buf_size = raw_s->aligned_buf_size;
>> +    }
>> +
>> +    s->open_flags = raw_s->open_flags;
>> +
>> +    close(s->fd);
>> +    s->fd = raw_s->fd;
>> +
>> +    g_free(state->opaque);
>> +    state->opaque = NULL;
>> +}
> 
> I think s->use_aio must be changed as well, it depends on
> BDRV_O_NOCACHE. Maybe we even need to do it in prepare, laio_init() may
> need to be called.

Yes, good catch, thanks.

> 
> Kevin
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] block: raw-posix image file reopen
  2012-09-05 16:43     ` Jeff Cody
@ 2012-09-06  9:23       ` Kevin Wolf
  2012-09-06 15:34         ` Corey Bryant
  0 siblings, 1 reply; 27+ messages in thread
From: Kevin Wolf @ 2012-09-06  9:23 UTC (permalink / raw)
  To: jcody; +Cc: supriyak, stefanha, Corey Bryant, qemu-devel, pbonzini, eblake

Am 05.09.2012 18:43, schrieb Jeff Cody:
>>> +    }
>>> +
>>> +    int fcntl_flags = O_APPEND | O_ASYNC | O_NONBLOCK;
>>> +#ifdef O_NOATIME
>>> +    fcntl_flags |= O_NOATIME;
>>> +#endif
>>> +    if ((raw_s->open_flags & ~fcntl_flags) == (s->open_flags & ~fcntl_flags)) {
>>> +        /* dup the original fd */
>>> +        /* TODO: use qemu fcntl wrapper */
>>> +        raw_s->fd = fcntl(s->fd, F_DUPFD_CLOEXEC, 0);
>>> +        if (raw_s->fd == -1) {
>>> +            ret = -1;
>>> +            goto error;
>>> +        }
>>> +        ret = fcntl_setfl(raw_s->fd, raw_s->open_flags);
>>> +    } else {
>>> +        raw_s->fd = qemu_open(state->bs->filename, raw_s->open_flags, 0644);
>>> +        if (raw_s->fd == -1) {
>>> +            ret = -1;
>>> +        }
>>
>> Ignoring this part for now, with qemu_dup_flags() it's going to look a
>> bit different. In particular, I'm hoping that we don't get a second
>> fcntl_flags enumeration here, but can just fall back to qemu_open()
>> whenever qemu_dup_flags() fails.
> 
> That will require modification to qemu_dup_flags()... I believe
> qemu_dup_flags() silently filters out fcntl incompatible flags.
> 
> Maybe it would be best to create a small helper function in osdep.c, that
> fetches the fcntl_flags.  Then qemu_dup_flags() and this function would
> use the same helper to fetch fcntl_flags.  The results of that would
> determine if we call qemu_dup_flags() or qemu_open().
> 
> Although, I do think it makes sense to always try qemu_open() if
> qemu_dup_flags() fails for some reason.

If we can modify qemu_dup_flags() to fail if it can't provide the right
set of flags, then I think we should do it - and I think we can. Even
for the existing cases with fd passing it shouldn't break anything, but
only add an additional safety check.

And if touching the function motivates Corey to write some fd passing
test cases so that you can't break it, even better. ;-)

Kevin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] block: raw-posix image file reopen
  2012-09-06  9:23       ` Kevin Wolf
@ 2012-09-06 15:34         ` Corey Bryant
  2012-09-07 10:40           ` Kevin Wolf
  0 siblings, 1 reply; 27+ messages in thread
From: Corey Bryant @ 2012-09-06 15:34 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: supriyak, stefanha, jcody, qemu-devel, pbonzini, eblake



On 09/06/2012 05:23 AM, Kevin Wolf wrote:
> Am 05.09.2012 18:43, schrieb Jeff Cody:
>>>> +    }
>>>> +
>>>> +    int fcntl_flags = O_APPEND | O_ASYNC | O_NONBLOCK;
>>>> +#ifdef O_NOATIME
>>>> +    fcntl_flags |= O_NOATIME;
>>>> +#endif
>>>> +    if ((raw_s->open_flags & ~fcntl_flags) == (s->open_flags & ~fcntl_flags)) {
>>>> +        /* dup the original fd */
>>>> +        /* TODO: use qemu fcntl wrapper */
>>>> +        raw_s->fd = fcntl(s->fd, F_DUPFD_CLOEXEC, 0);
>>>> +        if (raw_s->fd == -1) {
>>>> +            ret = -1;
>>>> +            goto error;
>>>> +        }
>>>> +        ret = fcntl_setfl(raw_s->fd, raw_s->open_flags);
>>>> +    } else {
>>>> +        raw_s->fd = qemu_open(state->bs->filename, raw_s->open_flags, 0644);
>>>> +        if (raw_s->fd == -1) {
>>>> +            ret = -1;
>>>> +        }
>>>
>>> Ignoring this part for now, with qemu_dup_flags() it's going to look a
>>> bit different. In particular, I'm hoping that we don't get a second
>>> fcntl_flags enumeration here, but can just fall back to qemu_open()
>>> whenever qemu_dup_flags() fails.
>>
>> That will require modification to qemu_dup_flags()... I believe
>> qemu_dup_flags() silently filters out fcntl incompatible flags.
>>
>> Maybe it would be best to create a small helper function in osdep.c, that
>> fetches the fcntl_flags.  Then qemu_dup_flags() and this function would
>> use the same helper to fetch fcntl_flags.  The results of that would
>> determine if we call qemu_dup_flags() or qemu_open().
>>
>> Although, I do think it makes sense to always try qemu_open() if
>> qemu_dup_flags() fails for some reason.

I'm curious why you can't always call qemu_open().

Some things to consider so that fd passing doesn't break when a reopen 
occurs.  Mainly all the concerns revolve around how fd passing keeps 
track of references to fd sets (note: adding and removing fd set 
references is all done in qemu_open and qemu_close).

* When reopening, qemu_open needs to be called before qemu_close.  This 
will prevent the reference list for an fdset from becoming empty.  If 
qemu_close is called before qemu_open, the reference list can become 
empty, and the fdset could be cleaned up before the qemu_open.  Then 
qemu_open would fail.

* qemu_open/qemu_close need to be used rather than open/close so that 
the references for fd passing are properly accounted for.

* I don't think you want to call qemu_dup_flags directly since it 
doesn't update the reference list for fd passing.  Only qemu_open and 
qemu_close update the reference list.

>
> If we can modify qemu_dup_flags() to fail if it can't provide the right
> set of flags, then I think we should do it - and I think we can. Even
> for the existing cases with fd passing it shouldn't break anything, but
> only add an additional safety check.
>
> And if touching the function motivates Corey to write some fd passing
> test cases so that you can't break it, even better. ;-)

:) Sorry, I do plan to do this soon.  I've just been side-tracked with 
some other things.

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] block: raw-posix image file reopen
  2012-09-06 15:34         ` Corey Bryant
@ 2012-09-07 10:40           ` Kevin Wolf
  2012-09-07 14:29             ` Corey Bryant
  0 siblings, 1 reply; 27+ messages in thread
From: Kevin Wolf @ 2012-09-07 10:40 UTC (permalink / raw)
  To: Corey Bryant; +Cc: supriyak, stefanha, jcody, qemu-devel, pbonzini, eblake

Am 06.09.2012 17:34, schrieb Corey Bryant:
> 
> 
> On 09/06/2012 05:23 AM, Kevin Wolf wrote:
>> Am 05.09.2012 18:43, schrieb Jeff Cody:
>>>>> +    }
>>>>> +
>>>>> +    int fcntl_flags = O_APPEND | O_ASYNC | O_NONBLOCK;
>>>>> +#ifdef O_NOATIME
>>>>> +    fcntl_flags |= O_NOATIME;
>>>>> +#endif
>>>>> +    if ((raw_s->open_flags & ~fcntl_flags) == (s->open_flags & ~fcntl_flags)) {
>>>>> +        /* dup the original fd */
>>>>> +        /* TODO: use qemu fcntl wrapper */
>>>>> +        raw_s->fd = fcntl(s->fd, F_DUPFD_CLOEXEC, 0);
>>>>> +        if (raw_s->fd == -1) {
>>>>> +            ret = -1;
>>>>> +            goto error;
>>>>> +        }
>>>>> +        ret = fcntl_setfl(raw_s->fd, raw_s->open_flags);
>>>>> +    } else {
>>>>> +        raw_s->fd = qemu_open(state->bs->filename, raw_s->open_flags, 0644);
>>>>> +        if (raw_s->fd == -1) {
>>>>> +            ret = -1;
>>>>> +        }
>>>>
>>>> Ignoring this part for now, with qemu_dup_flags() it's going to look a
>>>> bit different. In particular, I'm hoping that we don't get a second
>>>> fcntl_flags enumeration here, but can just fall back to qemu_open()
>>>> whenever qemu_dup_flags() fails.
>>>
>>> That will require modification to qemu_dup_flags()... I believe
>>> qemu_dup_flags() silently filters out fcntl incompatible flags.
>>>
>>> Maybe it would be best to create a small helper function in osdep.c, that
>>> fetches the fcntl_flags.  Then qemu_dup_flags() and this function would
>>> use the same helper to fetch fcntl_flags.  The results of that would
>>> determine if we call qemu_dup_flags() or qemu_open().
>>>
>>> Although, I do think it makes sense to always try qemu_open() if
>>> qemu_dup_flags() fails for some reason.
> 
> I'm curious why you can't always call qemu_open().

I believe the original reason was that qemu_open() is more likely to
fail, for example if the image file has been renamed/moved/deleted since
the first open. You could still use fcntl() on an existing file
descriptor, but reopening would fail.

> Some things to consider so that fd passing doesn't break when a reopen 
> occurs.  Mainly all the concerns revolve around how fd passing keeps 
> track of references to fd sets (note: adding and removing fd set 
> references is all done in qemu_open and qemu_close).
> 
> * When reopening, qemu_open needs to be called before qemu_close.  This 
> will prevent the reference list for an fdset from becoming empty.  If 
> qemu_close is called before qemu_open, the reference list can become 
> empty, and the fdset could be cleaned up before the qemu_open.  Then 
> qemu_open would fail.

Will automatically be right when we properly implement transactional
semantics.

> * qemu_open/qemu_close need to be used rather than open/close so that 
> the references for fd passing are properly accounted for.

Congratulations, you've just discovered a bug in Jeff's patches. It was
a good idea to CC you. ;-)

> * I don't think you want to call qemu_dup_flags directly since it 
> doesn't update the reference list for fd passing.  Only qemu_open and 
> qemu_close update the reference list.

That's a good point, too. So probably a small wrapper that just updates
the reference list in addition?

>> If we can modify qemu_dup_flags() to fail if it can't provide the right
>> set of flags, then I think we should do it - and I think we can. Even
>> for the existing cases with fd passing it shouldn't break anything, but
>> only add an additional safety check.
>>
>> And if touching the function motivates Corey to write some fd passing
>> test cases so that you can't break it, even better. ;-)
> 
> :) Sorry, I do plan to do this soon.  I've just been side-tracked with 
> some other things.

No problem, it was just such a great opportunity to remind you. ;-)

Kevin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] block: raw-posix image file reopen
  2012-09-07 10:40           ` Kevin Wolf
@ 2012-09-07 14:29             ` Corey Bryant
  0 siblings, 0 replies; 27+ messages in thread
From: Corey Bryant @ 2012-09-07 14:29 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: supriyak, stefanha, jcody, qemu-devel, pbonzini, eblake



On 09/07/2012 06:40 AM, Kevin Wolf wrote:
> Am 06.09.2012 17:34, schrieb Corey Bryant:
>>
>>
>> On 09/06/2012 05:23 AM, Kevin Wolf wrote:
>>> Am 05.09.2012 18:43, schrieb Jeff Cody:
>>>>>> +    }
>>>>>> +
>>>>>> +    int fcntl_flags = O_APPEND | O_ASYNC | O_NONBLOCK;
>>>>>> +#ifdef O_NOATIME
>>>>>> +    fcntl_flags |= O_NOATIME;
>>>>>> +#endif
>>>>>> +    if ((raw_s->open_flags & ~fcntl_flags) == (s->open_flags & ~fcntl_flags)) {
>>>>>> +        /* dup the original fd */
>>>>>> +        /* TODO: use qemu fcntl wrapper */
>>>>>> +        raw_s->fd = fcntl(s->fd, F_DUPFD_CLOEXEC, 0);
>>>>>> +        if (raw_s->fd == -1) {
>>>>>> +            ret = -1;
>>>>>> +            goto error;
>>>>>> +        }
>>>>>> +        ret = fcntl_setfl(raw_s->fd, raw_s->open_flags);
>>>>>> +    } else {
>>>>>> +        raw_s->fd = qemu_open(state->bs->filename, raw_s->open_flags, 0644);
>>>>>> +        if (raw_s->fd == -1) {
>>>>>> +            ret = -1;
>>>>>> +        }
>>>>>
>>>>> Ignoring this part for now, with qemu_dup_flags() it's going to look a
>>>>> bit different. In particular, I'm hoping that we don't get a second
>>>>> fcntl_flags enumeration here, but can just fall back to qemu_open()
>>>>> whenever qemu_dup_flags() fails.
>>>>
>>>> That will require modification to qemu_dup_flags()... I believe
>>>> qemu_dup_flags() silently filters out fcntl incompatible flags.
>>>>
>>>> Maybe it would be best to create a small helper function in osdep.c, that
>>>> fetches the fcntl_flags.  Then qemu_dup_flags() and this function would
>>>> use the same helper to fetch fcntl_flags.  The results of that would
>>>> determine if we call qemu_dup_flags() or qemu_open().
>>>>
>>>> Although, I do think it makes sense to always try qemu_open() if
>>>> qemu_dup_flags() fails for some reason.
>>
>> I'm curious why you can't always call qemu_open().
>
> I believe the original reason was that qemu_open() is more likely to
> fail, for example if the image file has been renamed/moved/deleted since
> the first open. You could still use fcntl() on an existing file
> descriptor, but reopening would fail.
>
>> Some things to consider so that fd passing doesn't break when a reopen
>> occurs.  Mainly all the concerns revolve around how fd passing keeps
>> track of references to fd sets (note: adding and removing fd set
>> references is all done in qemu_open and qemu_close).
>>
>> * When reopening, qemu_open needs to be called before qemu_close.  This
>> will prevent the reference list for an fdset from becoming empty.  If
>> qemu_close is called before qemu_open, the reference list can become
>> empty, and the fdset could be cleaned up before the qemu_open.  Then
>> qemu_open would fail.
>
> Will automatically be right when we properly implement transactional
> semantics.
>
>> * qemu_open/qemu_close need to be used rather than open/close so that
>> the references for fd passing are properly accounted for.
>
> Congratulations, you've just discovered a bug in Jeff's patches. It was
> a good idea to CC you. ;-)
>
>> * I don't think you want to call qemu_dup_flags directly since it
>> doesn't update the reference list for fd passing.  Only qemu_open and
>> qemu_close update the reference list.
>
> That's a good point, too. So probably a small wrapper that just updates
> the reference list in addition?
>

You could do that.  And yes you'd have to add code to add the new dup fd 
to an fdset's dup_fds list if in fact the fd that you dup'd was a member 
of an fdset's dup_fds list (see how qemu_close() and qemu_open() do this).

But wouldn't it be easier to just go through qemu_close() then 
qemu_open() to perform the reopen?  Then you don't have to add this 
extra code to account for fd passing.

-- 
Regards,
Corey

>>> If we can modify qemu_dup_flags() to fail if it can't provide the right
>>> set of flags, then I think we should do it - and I think we can. Even
>>> for the existing cases with fd passing it shouldn't break anything, but
>>> only add an additional safety check.
>>>
>>> And if touching the function motivates Corey to write some fd passing
>>> test cases so that you can't break it, even better. ;-)
>>
>> :) Sorry, I do plan to do this soon.  I've just been side-tracked with
>> some other things.
>
> No problem, it was just such a great opportunity to remind you. ;-)
>
> Kevin
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] block: Framework for reopening files safely
  2012-08-30 18:47 ` [Qemu-devel] [PATCH 2/7] block: Framework for reopening files safely Jeff Cody
  2012-09-05 15:09   ` Kevin Wolf
@ 2012-09-11 14:57   ` Jeff Cody
  2012-09-11 15:14     ` Kevin Wolf
  1 sibling, 1 reply; 27+ messages in thread
From: Jeff Cody @ 2012-09-11 14:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, eblake, supriyak, stefanha

On 08/30/2012 02:47 PM, Jeff Cody wrote:
> This is based heavily on Supriya Kannery's bdrv_reopen()
> patch series.
> 
> This provides a transactional method to reopen multiple
> images files safely.
> 
> Image files are queue for reopen via bdrv_reopen_queue(), and the
> reopen occurs when bdrv_reopen_multiple() is called.  Changes are
> staged in bdrv_reopen_prepare() and in the equivalent driver level
> functions.  If any of the staged images fails a prepare, then all
> of the images left untouched, and the staged changes for each image
> abandoned.
>

Open question (my assumption is yes):

Is it safe to assume that reopen() should always enable BDRV_O_CACHE_WB
(without affecting enable_write_cache), so as to not undo what was done
by Paolo's commit e1e9b0ac?

> Signed-off-by: Jeff Cody <jcody@redhat.com>
> ---
>  block.c       | 226 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  block.h       |  15 ++++
>  block_int.h   |  13 ++++
>  qemu-common.h |   1 +
>  4 files changed, 255 insertions(+)
> 
> diff --git a/block.c b/block.c
> index e31b76f..9470319 100644
> --- a/block.c
> +++ b/block.c
> @@ -857,6 +857,232 @@ unlink_and_fail:
>      return ret;
>  }
>  
> +/*
> + * Adds a BlockDriverState to a simple queue for an atomic, transactional
> + * reopen of multiple devices.
> + *
> + * bs_queue can either be an existing BlockReopenQueue that has had QSIMPLE_INIT
> + * already performed, or alternatively may be NULL a new BlockReopenQueue will
> + * be created and initialized. This newly created BlockReopenQueue should be
> + * passed back in for subsequent calls that are intended to be of the same
> + * atomic 'set'.
> + *
> + * bs is the BlockDriverState to add to the reopen queue.
> + *
> + * flags contains the open flags for the associated bs
> + *
> + * returns a pointer to bs_queue, which is either the newly allocated
> + * bs_queue, or the existing bs_queue being used.
> + *
> + */
> +BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
> +                                    BlockDriverState *bs, int flags)
> +{
> +    assert(bs != NULL);
> +
> +    BlockReopenQueueEntry *bs_entry;
> +    if (bs_queue == NULL) {
> +        bs_queue = g_new0(BlockReopenQueue, 1);
> +        QSIMPLEQ_INIT(bs_queue);
> +    }
> +
> +    if (bs->file) {
> +        bdrv_reopen_queue(bs_queue, bs->file, flags);
> +    }
> +
> +    bs_entry = g_new0(BlockReopenQueueEntry, 1);
> +    QSIMPLEQ_INSERT_TAIL(bs_queue, bs_entry, entry);
> +
> +    bs_entry->state = g_new0(BDRVReopenState, 1);
> +    bs_entry->state->bs = bs;
> +    bs_entry->state->flags = flags;
> +
> +    return bs_queue;
> +}
> +
> +/*
> + * Reopen multiple BlockDriverStates atomically & transactionally.
> + *
> + * The queue passed in (bs_queue) must have been built up previous
> + * via bdrv_reopen_queue().
> + *
> + * Reopens all BDS specified in the queue, with the appropriate
> + * flags.  All devices are prepared for reopen, and failure of any
> + * device will cause all device changes to be abandonded, and intermediate
> + * data cleaned up.
> + *
> + * If all devices prepare successfully, then the changes are committed
> + * to all devices.
> + *
> + */
> +int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
> +{
> +    int ret = -1;
> +    BlockReopenQueueEntry *bs_entry;
> +    Error *local_err = NULL;
> +
> +    assert(bs_queue != NULL);
> +
> +    bdrv_drain_all();
> +
> +    QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
> +        if (bdrv_reopen_prepare(bs_entry->state, &local_err)) {
> +            error_propagate(errp, local_err);
> +            goto cleanup;
> +        }
> +        bs_entry->prepared = true;
> +    }
> +
> +    /* If we reach this point, we have success and just need to apply the
> +     * changes
> +     */
> +    QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
> +        bdrv_reopen_commit(bs_entry->state);
> +    }
> +
> +    ret = 0;
> +
> +cleanup:
> +    QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
> +        if (ret && bs_entry->prepared) {
> +            bdrv_reopen_abort(bs_entry->state);
> +        }
> +        g_free(bs_entry->state);
> +        g_free(bs_entry);
> +    }
> +    g_free(bs_queue);
> +    return ret;
> +}
> +
> +
> +/* Reopen a single BlockDriverState with the specified flags. */
> +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags, Error **errp)
> +{
> +    int ret = -1;
> +    Error *local_err = NULL;
> +    BlockReopenQueue *queue = bdrv_reopen_queue(NULL, bs, bdrv_flags);
> +
> +    ret = bdrv_reopen_multiple(queue, &local_err);
> +    if (local_err != NULL) {
> +        error_propagate(errp, local_err);
> +    }
> +    return ret;
> +}
> +
> +
> +/*
> + * Prepares a BlockDriverState for reopen. All changes are staged in the
> + * 'reopen_state' field of the BlockDriverState, which must be NULL when
> + * entering (all previous reopens must have completed for the BDS).
> + *
> + * bs is the BlockDriverState to reopen
> + * flags are the new open flags
> + *
> + * Returns 0 on success, non-zero on error.  On error errp will be set
> + * as well.
> + *
> + * On failure, bdrv_reopen_abort() will be called to clean up any data.
> + * It is the responsibility of the caller to then call the abort() or
> + * commit() for any other BDS that have been left in a prepare() state
> + *
> + */
> +int bdrv_reopen_prepare(BDRVReopenState *reopen_state, Error **errp)
> +{
> +    int ret = -1;
> +    Error *local_err = NULL;
> +    BlockDriver *drv;
> +
> +    assert(reopen_state != NULL);
> +    assert(reopen_state->bs->drv != NULL);
> +    drv = reopen_state->bs->drv;
> +
> +    /* if we are to stay read-only, do not allow permission change
> +     * to r/w */
> +    if (reopen_state->bs->keep_read_only &&
> +        reopen_state->flags & BDRV_O_RDWR) {
> +        error_set(errp, QERR_DEVICE_IS_READ_ONLY,
> +                  reopen_state->bs->device_name);
> +        goto error;
> +    }
> +
> +
> +    ret = bdrv_flush(reopen_state->bs);
> +    if (ret) {
> +        error_set(errp, QERR_IO_ERROR);
> +        goto error;
> +    }
> +
> +    if (drv->bdrv_reopen_prepare) {
> +        ret = drv->bdrv_reopen_prepare(reopen_state, &local_err);
> +        if (ret) {
> +            if (local_err != NULL) {
> +                error_propagate(errp, local_err);
> +            } else {
> +                error_set(errp, QERR_OPEN_FILE_FAILED,
> +                          reopen_state->bs->filename);
> +            }
> +            goto error;
> +        }
> +    } else {
> +        /* It is currently mandatory to have a bdrv_reopen_prepare()
> +         * handler for each supported drv. */
> +        error_set(errp, QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
> +                  drv->format_name, reopen_state->bs->device_name,
> +                 "reopening of file");
> +        ret = -1;
> +        goto error;
> +    }
> +
> +    return 0;
> +
> +error:
> +    bdrv_reopen_abort(reopen_state);
> +    return ret;
> +}
> +
> +/*
> + * Takes the staged changes for the reopen from bdrv_reopen_prepare(), and
> + * makes them final by swapping the staging BlockDriverState contents into
> + * the active BlockDriverState contents.
> + */
> +void bdrv_reopen_commit(BDRVReopenState *reopen_state)
> +{
> +    BlockDriver *drv;
> +
> +    assert(reopen_state != NULL);
> +    drv = reopen_state->bs->drv;
> +    assert(drv != NULL);
> +
> +    /* If there are any driver level actions to take */
> +    if (drv->bdrv_reopen_commit) {
> +        drv->bdrv_reopen_commit(reopen_state);
> +    }
> +
> +    /* set BDS specific flags now */
> +    reopen_state->bs->open_flags         = reopen_state->flags;
> +    reopen_state->bs->enable_write_cache = !!(reopen_state->flags &
> +                                              BDRV_O_CACHE_WB);
> +    reopen_state->bs->read_only = !(reopen_state->flags & BDRV_O_RDWR);
> +}
> +
> +/*
> + * Abort the reopen, and delete and free the staged changes in
> + * reopen_state
> + */
> +void bdrv_reopen_abort(BDRVReopenState *reopen_state)
> +{
> +    BlockDriver *drv;
> +
> +    assert(reopen_state != NULL);
> +    drv = reopen_state->bs->drv;
> +    assert(drv != NULL);
> +
> +    if (drv->bdrv_reopen_abort) {
> +        drv->bdrv_reopen_abort(reopen_state);
> +    }
> +}
> +
> +
>  void bdrv_close(BlockDriverState *bs)
>  {
>      bdrv_flush(bs);
> diff --git a/block.h b/block.h
> index 4d919c2..db812b1 100644
> --- a/block.h
> +++ b/block.h
> @@ -97,6 +97,14 @@ typedef enum {
>      BDRV_ACTION_REPORT, BDRV_ACTION_IGNORE, BDRV_ACTION_STOP
>  } BlockQMPEventAction;
>  
> +typedef struct BlockReopenQueueEntry {
> +     bool prepared;
> +     BDRVReopenState *state;
> +     QSIMPLEQ_ENTRY(BlockReopenQueueEntry) entry;
> +} BlockReopenQueueEntry;
> +
> +typedef QSIMPLEQ_HEAD(BlockReopenQueue, BlockReopenQueueEntry) BlockReopenQueue;
> +
>  void bdrv_iostatus_enable(BlockDriverState *bs);
>  void bdrv_iostatus_reset(BlockDriverState *bs);
>  void bdrv_iostatus_disable(BlockDriverState *bs);
> @@ -131,6 +139,13 @@ int bdrv_parse_cache_flags(const char *mode, int *flags);
>  int bdrv_file_open(BlockDriverState **pbs, const char *filename, int flags);
>  int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
>                BlockDriver *drv);
> +BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
> +                                    BlockDriverState *bs, int flags);
> +int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp);
> +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags, Error **errp);
> +int bdrv_reopen_prepare(BDRVReopenState *reopen_state, Error **errp);
> +void bdrv_reopen_commit(BDRVReopenState *reopen_state);
> +void bdrv_reopen_abort(BDRVReopenState *reopen_state);
>  void bdrv_close(BlockDriverState *bs);
>  int bdrv_attach_dev(BlockDriverState *bs, void *dev);
>  void bdrv_attach_dev_nofail(BlockDriverState *bs, void *dev);
> diff --git a/block_int.h b/block_int.h
> index 4452f6f..7a4e226 100644
> --- a/block_int.h
> +++ b/block_int.h
> @@ -139,6 +139,12 @@ struct BlockDriver {
>      int instance_size;
>      int (*bdrv_probe)(const uint8_t *buf, int buf_size, const char *filename);
>      int (*bdrv_probe_device)(const char *filename);
> +
> +    /* For handling image reopen for split or non-split files */
> +    int (*bdrv_reopen_prepare)(BDRVReopenState *reopen_state, Error **errp);
> +    void (*bdrv_reopen_commit)(BDRVReopenState *reopen_state);
> +    void (*bdrv_reopen_abort)(BDRVReopenState *reopen_state);
> +
>      int (*bdrv_open)(BlockDriverState *bs, int flags);
>      int (*bdrv_file_open)(BlockDriverState *bs, const char *filename, int flags);
>      int (*bdrv_read)(BlockDriverState *bs, int64_t sector_num,
> @@ -336,6 +342,13 @@ struct BlockDriverState {
>  
>      /* long-running background operation */
>      BlockJob *job;
> +
> +};
> +
> +struct BDRVReopenState {
> +    BlockDriverState *bs;
> +    int flags;
> +    void *opaque;
>  };
>  
>  int get_tmp_filename(char *filename, int size);
> diff --git a/qemu-common.h b/qemu-common.h
> index e5c2bcd..6a6181c 100644
> --- a/qemu-common.h
> +++ b/qemu-common.h
> @@ -245,6 +245,7 @@ typedef struct NICInfo NICInfo;
>  typedef struct HCIInfo HCIInfo;
>  typedef struct AudioState AudioState;
>  typedef struct BlockDriverState BlockDriverState;
> +typedef struct BDRVReopenState BDRVReopenState;
>  typedef struct DriveInfo DriveInfo;
>  typedef struct DisplayState DisplayState;
>  typedef struct DisplayChangeListener DisplayChangeListener;
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] block: Framework for reopening files safely
  2012-09-11 14:57   ` Jeff Cody
@ 2012-09-11 15:14     ` Kevin Wolf
  2012-09-11 15:36       ` Jeff Cody
  0 siblings, 1 reply; 27+ messages in thread
From: Kevin Wolf @ 2012-09-11 15:14 UTC (permalink / raw)
  To: jcody; +Cc: supriyak, pbonzini, eblake, qemu-devel, stefanha

Am 11.09.2012 16:57, schrieb Jeff Cody:
> On 08/30/2012 02:47 PM, Jeff Cody wrote:
>> This is based heavily on Supriya Kannery's bdrv_reopen()
>> patch series.
>>
>> This provides a transactional method to reopen multiple
>> images files safely.
>>
>> Image files are queue for reopen via bdrv_reopen_queue(), and the
>> reopen occurs when bdrv_reopen_multiple() is called.  Changes are
>> staged in bdrv_reopen_prepare() and in the equivalent driver level
>> functions.  If any of the staged images fails a prepare, then all
>> of the images left untouched, and the staged changes for each image
>> abandoned.
>>
> 
> Open question (my assumption is yes):
> 
> Is it safe to assume that reopen() should always enable BDRV_O_CACHE_WB
> (without affecting enable_write_cache), so as to not undo what was done
> by Paolo's commit e1e9b0ac?

I think it makes sense to behave the same as bdrv_open_common(), so I
guess yes. But now I'm wondering if we also need other code from there,
like filtering out BDRV_O_SNAPSHOT/NO_BACKING etc.

Kevin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] block: Framework for reopening files safely
  2012-09-11 15:14     ` Kevin Wolf
@ 2012-09-11 15:36       ` Jeff Cody
  2012-09-11 15:41         ` Kevin Wolf
  0 siblings, 1 reply; 27+ messages in thread
From: Jeff Cody @ 2012-09-11 15:36 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: supriyak, pbonzini, eblake, qemu-devel, stefanha

On 09/11/2012 11:14 AM, Kevin Wolf wrote:
> Am 11.09.2012 16:57, schrieb Jeff Cody:
>> On 08/30/2012 02:47 PM, Jeff Cody wrote:
>>> This is based heavily on Supriya Kannery's bdrv_reopen()
>>> patch series.
>>>
>>> This provides a transactional method to reopen multiple
>>> images files safely.
>>>
>>> Image files are queue for reopen via bdrv_reopen_queue(), and the
>>> reopen occurs when bdrv_reopen_multiple() is called.  Changes are
>>> staged in bdrv_reopen_prepare() and in the equivalent driver level
>>> functions.  If any of the staged images fails a prepare, then all
>>> of the images left untouched, and the staged changes for each image
>>> abandoned.
>>>
>>
>> Open question (my assumption is yes):
>>
>> Is it safe to assume that reopen() should always enable BDRV_O_CACHE_WB
>> (without affecting enable_write_cache), so as to not undo what was done
>> by Paolo's commit e1e9b0ac?
> 
> I think it makes sense to behave the same as bdrv_open_common(), so I
> guess yes. But now I'm wondering if we also need other code from there,
> like filtering out BDRV_O_SNAPSHOT/NO_BACKING etc.
>

I was wondering the same thing w/regards to BDRV_O_SNAPSHOT/NO_BACKING,
but I fell on the side of 'no'. Mainly because the raw drivers (raw-posix,
raw-win32) actively parse the passed flags to determine the actual open
flags, and so spurious flags such as those are ignored.  However,
BDRV_O_CACHE_WB is used in their flag parsing logic, so I think it needs
to be preserved.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] block: Framework for reopening files safely
  2012-09-11 15:36       ` Jeff Cody
@ 2012-09-11 15:41         ` Kevin Wolf
  0 siblings, 0 replies; 27+ messages in thread
From: Kevin Wolf @ 2012-09-11 15:41 UTC (permalink / raw)
  To: jcody; +Cc: supriyak, pbonzini, eblake, qemu-devel, stefanha

Am 11.09.2012 17:36, schrieb Jeff Cody:
> On 09/11/2012 11:14 AM, Kevin Wolf wrote:
>> Am 11.09.2012 16:57, schrieb Jeff Cody:
>>> On 08/30/2012 02:47 PM, Jeff Cody wrote:
>>>> This is based heavily on Supriya Kannery's bdrv_reopen()
>>>> patch series.
>>>>
>>>> This provides a transactional method to reopen multiple
>>>> images files safely.
>>>>
>>>> Image files are queue for reopen via bdrv_reopen_queue(), and the
>>>> reopen occurs when bdrv_reopen_multiple() is called.  Changes are
>>>> staged in bdrv_reopen_prepare() and in the equivalent driver level
>>>> functions.  If any of the staged images fails a prepare, then all
>>>> of the images left untouched, and the staged changes for each image
>>>> abandoned.
>>>>
>>>
>>> Open question (my assumption is yes):
>>>
>>> Is it safe to assume that reopen() should always enable BDRV_O_CACHE_WB
>>> (without affecting enable_write_cache), so as to not undo what was done
>>> by Paolo's commit e1e9b0ac?
>>
>> I think it makes sense to behave the same as bdrv_open_common(), so I
>> guess yes. But now I'm wondering if we also need other code from there,
>> like filtering out BDRV_O_SNAPSHOT/NO_BACKING etc.
>>
> 
> I was wondering the same thing w/regards to BDRV_O_SNAPSHOT/NO_BACKING,
> but I fell on the side of 'no'. Mainly because the raw drivers (raw-posix,
> raw-win32) actively parse the passed flags to determine the actual open
> flags, and so spurious flags such as those are ignored.  However,
> BDRV_O_CACHE_WB is used in their flag parsing logic, so I think it needs
> to be preserved.

That's probably logic that should be removed in raw-posix/win32.c as it
is unused now.

Kevin

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2012-09-11 15:43 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-30 18:47 [Qemu-devel] [PATCH 0/7] block: bdrv_reopen() patches Jeff Cody
2012-08-30 18:47 ` [Qemu-devel] [PATCH 1/7] block: correctly set the keep_read_only flag Jeff Cody
2012-09-05 12:47   ` Kevin Wolf
2012-09-05 13:08     ` Jeff Cody
2012-09-05 13:12       ` Kevin Wolf
2012-08-30 18:47 ` [Qemu-devel] [PATCH 2/7] block: Framework for reopening files safely Jeff Cody
2012-09-05 15:09   ` Kevin Wolf
2012-09-05 16:38     ` Jeff Cody
2012-09-11 14:57   ` Jeff Cody
2012-09-11 15:14     ` Kevin Wolf
2012-09-11 15:36       ` Jeff Cody
2012-09-11 15:41         ` Kevin Wolf
2012-08-30 18:47 ` [Qemu-devel] [PATCH 3/7] block: raw-posix image file reopen Jeff Cody
2012-08-30 22:15   ` Eric Blake
2012-08-31 14:42     ` Jeff Cody
2012-08-31 14:49       ` Kevin Wolf
2012-08-31 15:10         ` Jeff Cody
2012-09-05 15:30   ` Kevin Wolf
2012-09-05 16:43     ` Jeff Cody
2012-09-06  9:23       ` Kevin Wolf
2012-09-06 15:34         ` Corey Bryant
2012-09-07 10:40           ` Kevin Wolf
2012-09-07 14:29             ` Corey Bryant
2012-08-30 18:47 ` [Qemu-devel] [PATCH 4/7] block: raw " Jeff Cody
2012-08-30 18:47 ` [Qemu-devel] [PATCH 5/7] block: qed " Jeff Cody
2012-08-30 18:47 ` [Qemu-devel] [PATCH 6/7] block: qcow2 " Jeff Cody
2012-08-30 18:47 ` [Qemu-devel] [PATCH 7/7] block: qcow " Jeff Cody

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.