All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management
@ 2018-03-10  8:27 John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 01/21] blockjobs: fix set-speed kick John Snow
                   ` (23 more replies)
  0 siblings, 24 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

This series seeks to address two distinct but closely related issues
concerning the job management API.

(1) For jobs that complete when a monitor is not attached and receiving
    events or notifications, there's no way to discern the job's final
    return code. Jobs must remain in the query list until dismissed
    for reliable management.

(2) Jobs that change the block graph structure at an indeterminate point
    after the job starts compete with the management layer that relies
    on that graph structure to issue meaningful commands.

    This structure should change only at the behest of the management
    API, and not asynchronously at unknown points in time. Before a job
    issues such changes, it must rely on explicit and synchronous
    confirmation from the management API.

These changes are implemented by formalizing a State Transition Machine
for the BlockJob subsystem.

Job States:

UNDEFINED       Default state. Internal state only.
CREATED         Job has been created
RUNNING         Job has been started and is running
PAUSED          Job is not ready and has been paused
READY           Job is ready and is running
STANDBY         Job is ready and is paused

WAITING         Job is waiting on peers in transaction
PENDING         Job is waiting on ACK from QMP
ABORTING        Job is aborting or has been cancelled
CONCLUDED       Job has finished and has a retcode available
NULL            Job is being dismantled. Internal state only.

Job Verbs:

CANCEL          Instructs a running job to terminate with error,
                (Except when that job is READY, which produces no error.)
PAUSE           Request a job to pause.
RESUME          Request a job to resume from a pause.
SET-SPEED       Change the speed limiting parameter of a job.
COMPLETE        Ask a READY job to finish and exit.

FINALIZE        Ask a PENDING job to perform its graph finalization.
DISMISS         Finish cleaning up an empty job.

And here's my stab at a diagram:

                 +---------+
                 |UNDEFINED|
                 +--+------+
                    |
                 +--v----+
       +---------+CREATED+-----------------+
       |         +--+----+                 |
       |            |                      |
       |         +--+----+     +------+    |
       +---------+RUNNING<----->PAUSED|    |
       |         +--+-+--+     +------+    |
       |            | |                    |
       |            | +------------------+ |
       |            |                    | |
       |         +--v--+       +-------+ | |
       +---------+READY<------->STANDBY| | |
       |         +--+--+       +-------+ | |
       |            |                    | |
       |         +--v----+               | |
       +---------+WAITING+---------------+ |
       |         +--+----+                 |
       |            |                      |
       |         +--v----+                 |
       +---------+PENDING|                 |
       |         +--+----+                 |
       |            |                      |
    +--v-----+   +--v------+               |
    |ABORTING+--->CONCLUDED|               |
    +--------+   +--+------+               |
                    |                      |
                 +--v-+                    |
                 |NULL+--------------------+
                 +----+

v5:
001/21:[----] [--] 'blockjobs: fix set-speed kick'
002/21:[0001] [FC] 'blockjobs: model single jobs as transactions'
003/21:[down] 'Blockjobs: documentation touchup'
004/21:[0004] [FC] 'blockjobs: add status enum'
005/21:[0004] [FC] 'blockjobs: add state transition table'
006/21:[----] [--] 'iotests: add pause_wait'
007/21:[----] [--] 'blockjobs: add block_job_verb permission table'
008/21:[0004] [FC] 'blockjobs: add ABORTING state'
009/21:[0022] [FC] 'blockjobs: add CONCLUDED state'
010/21:[0025] [FC] 'blockjobs: add NULL state'
011/21:[0025] [FC] 'blockjobs: add block_job_dismiss'
012/21:[0002] [FC] 'blockjobs: ensure abort is called for cancelled jobs'
013/21:[----] [--] 'blockjobs: add commit, abort, clean helpers'
014/21:[0005] [FC] 'blockjobs: add block_job_txn_apply function'
015/21:[0006] [FC] 'blockjobs: add prepare callback'
016/21:[0031] [FC] 'blockjobs: add waiting status'
017/21:[0018] [FC] 'blockjobs: add PENDING status and event'
018/21:[0043] [FC] 'blockjobs: add block-job-finalize'
019/21:[0118] [FC] 'blockjobs: Expose manual property'
020/21:[0036] [FC] 'iotests: test manual job dismissal'
021/21:[down] 'tests/test-blockjob: test cancellations'

Big changes:
- Disallow implicit loopback transitions
- Allow CREATED-->ABORTING transitions; this is modeling how canceling
  jobs that aren't started already works.
- Use block_job_decommission to invalidate job objects instead of using
  a context-less 'unref'
- Add cancellation unit test.
- May have not added all the R-Bs I should have, and likely added some
  I shouldn't have.

02: Removed stale comment
03: New patch touching up comments, replacing the 'manual' property patch.
04: Contextual, removed doc touchup for comments previously added in 03
05: Disallow implicit loopback transitions
    Removed initial state assignment in create
08: Allow CREATED-->ABORTING transition.
    Removed forward reference to @concluded state
09: Re-added doc reference to @concluded
    Replaced block_job_event_concluded with block_job_conclude
    Fixed erroneous transition avoidance for internal jobs
    STM table change fallout from #08
10: Added assertion that jobs with no refcounts are in status NULL.
    Added block_job_decommission for the transition to the NULL state.
    block_job_create now uses block_job_early_fail on error pathways
    block_job_early_fail now uses block_job_decommission
11: block_job_do_dismiss now just calls block_job_decommission,
      see commit message
    added job->auto_dismiss property
    removed extra reference for dismiss functionality, it was an artifact
      of an older implementation and isn't needed
12: Allow ABORTING -> ABORTING transitions (kwolf)
14: Keep the assertion that completed jobs have an RC of 0.
15: Rewrote block_job_prepare to be 35% less stupid (by volume)
    Updated commit message, which is now 59% less wrong.
16: Fixed typos and diagram.
    Removed waiting event entirely.
    STM table change fallout from #08 and #12.
17: Touched up the diagram again.
    Added the MANUAL_FINALIZE and auto_finalize flag and property.
18: Changed commit message.
    Removed special casing for mixed-mode transactions for finalization step.
    block_job_cancel gets a new case for handling the cancellation of
    jobs deferred to the main loop.
19: Fallout from splitting property names, almost entirely different now.
20: Changed property names, job now tests 'dismiss' exclusively again.
21: New patch to test cancellation modes.

V4:
 - All jobs are now transactions.
 - All jobs now transition through states in a uniform way.
 - Verb permissions are now enforced.

V3:
 - Added WAITING and PENDING events
 - Added block_job_finalize verb
 - Added .pending() callback for jobs
 - Tweaked how .commit/.abort work

V2:
 - Added tests!
 - Changed property name (Jeff, Paolo)

RFC / Known problems:
- Still need more tests.
- STANDBY is still a dumb name. See v4's cover letter.
- Mirror needs to be refactored to use the commit/abort/pending/clean callbacks
  to fulfill the promise made by "no graph changes without user authorization"
  that PENDING is supposed to offer.

________________________________________________________________________________

For convenience, this branch is available at:
https://github.com/jnsnow/qemu.git branch block-job-reap
https://github.com/jnsnow/qemu/tree/block-job-reap

This version is tagged block-job-reap-v5:
https://github.com/jnsnow/qemu/releases/tag/block-job-reap-v5

John Snow (21):
  blockjobs: fix set-speed kick
  blockjobs: model single jobs as transactions
  Blockjobs: documentation touchup
  blockjobs: add status enum
  blockjobs: add state transition table
  iotests: add pause_wait
  blockjobs: add block_job_verb permission table
  blockjobs: add ABORTING state
  blockjobs: add CONCLUDED state
  blockjobs: add NULL state
  blockjobs: add block_job_dismiss
  blockjobs: ensure abort is called for cancelled jobs
  blockjobs: add commit, abort, clean helpers
  blockjobs: add block_job_txn_apply function
  blockjobs: add prepare callback
  blockjobs: add waiting status
  blockjobs: add PENDING status and event
  blockjobs: add block-job-finalize
  blockjobs: Expose manual property
  iotests: test manual job dismissal
  tests/test-blockjob: test cancellations

 block/backup.c                |   5 +-
 block/commit.c                |   2 +-
 block/mirror.c                |   2 +-
 block/stream.c                |   2 +-
 block/trace-events            |   7 +
 blockdev.c                    |  69 ++++++++-
 blockjob.c                    | 348 ++++++++++++++++++++++++++++++++++++------
 include/block/blockjob.h      |  61 +++++++-
 include/block/blockjob_int.h  |  17 ++-
 qapi/block-core.json          | 184 +++++++++++++++++++++-
 tests/qemu-iotests/030        |   6 +-
 tests/qemu-iotests/055        |  17 +--
 tests/qemu-iotests/056        | 187 +++++++++++++++++++++++
 tests/qemu-iotests/056.out    |   4 +-
 tests/qemu-iotests/109.out    |  24 +--
 tests/qemu-iotests/iotests.py |  12 +-
 tests/test-bdrv-drain.c       |   5 +-
 tests/test-blockjob-txn.c     |  19 +--
 tests/test-blockjob.c         | 233 +++++++++++++++++++++++++++-
 19 files changed, 1076 insertions(+), 128 deletions(-)

-- 
2.14.3

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 01/21] blockjobs: fix set-speed kick
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-12 18:13   ` Jeff Cody
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 02/21] blockjobs: model single jobs as transactions John Snow
                   ` (22 subsequent siblings)
  23 siblings, 1 reply; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

If speed is '0' it's not actually "less than" the previous speed.
Kick the job in this case too.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
---
 blockjob.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/blockjob.c b/blockjob.c
index 801d29d849..afd92db01f 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -499,7 +499,7 @@ void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
     }
 
     job->speed = speed;
-    if (speed <= old_speed) {
+    if (speed && speed <= old_speed) {
         return;
     }
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 02/21] blockjobs: model single jobs as transactions
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 01/21] blockjobs: fix set-speed kick John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 03/21] Blockjobs: documentation touchup John Snow
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

model all independent jobs as single job transactions.

It's one less case we have to worry about when we add more states to the
transition machine. This way, we can just treat all job lifetimes exactly
the same. This helps tighten assertions of the STM graph and removes some
conditionals that would have been needed in the coming commits adding a
more explicit job lifetime management API.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
---
 block/backup.c               |  3 +--
 block/commit.c               |  2 +-
 block/mirror.c               |  2 +-
 block/stream.c               |  2 +-
 blockjob.c                   | 25 ++++++++++++++++---------
 include/block/blockjob.h     |  1 -
 include/block/blockjob_int.h |  3 ++-
 tests/test-bdrv-drain.c      |  4 ++--
 tests/test-blockjob-txn.c    | 19 +++++++------------
 tests/test-blockjob.c        |  2 +-
 10 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 4a16a37229..7e254dabff 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -621,7 +621,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
     }
 
     /* job->common.len is fixed, so we can't allow resize */
-    job = block_job_create(job_id, &backup_job_driver, bs,
+    job = block_job_create(job_id, &backup_job_driver, txn, bs,
                            BLK_PERM_CONSISTENT_READ,
                            BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE |
                            BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD,
@@ -677,7 +677,6 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
     block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL,
                        &error_abort);
     job->common.len = len;
-    block_job_txn_add_job(txn, &job->common);
 
     return &job->common;
 
diff --git a/block/commit.c b/block/commit.c
index 1943c9c3e1..ab4fa3c3cf 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -289,7 +289,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
         return;
     }
 
-    s = block_job_create(job_id, &commit_job_driver, bs, 0, BLK_PERM_ALL,
+    s = block_job_create(job_id, &commit_job_driver, NULL, bs, 0, BLK_PERM_ALL,
                          speed, BLOCK_JOB_DEFAULT, NULL, NULL, errp);
     if (!s) {
         return;
diff --git a/block/mirror.c b/block/mirror.c
index f5bf620942..76fddb3838 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1166,7 +1166,7 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
     }
 
     /* Make sure that the source is not resized while the job is running */
-    s = block_job_create(job_id, driver, mirror_top_bs,
+    s = block_job_create(job_id, driver, NULL, mirror_top_bs,
                          BLK_PERM_CONSISTENT_READ,
                          BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
                          BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD, speed,
diff --git a/block/stream.c b/block/stream.c
index 499cdacdb0..f3b53f49e2 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -244,7 +244,7 @@ void stream_start(const char *job_id, BlockDriverState *bs,
     /* Prevent concurrent jobs trying to modify the graph structure here, we
      * already have our own plans. Also don't allow resize as the image size is
      * queried only at the job start and then cached. */
-    s = block_job_create(job_id, &stream_job_driver, bs,
+    s = block_job_create(job_id, &stream_job_driver, NULL, bs,
                          BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
                          BLK_PERM_GRAPH_MOD,
                          BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
diff --git a/blockjob.c b/blockjob.c
index afd92db01f..ecc5fcbdf8 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -357,10 +357,8 @@ static void block_job_completed_single(BlockJob *job)
         }
     }
 
-    if (job->txn) {
-        QLIST_REMOVE(job, txn_list);
-        block_job_txn_unref(job->txn);
-    }
+    QLIST_REMOVE(job, txn_list);
+    block_job_txn_unref(job->txn);
     block_job_unref(job);
 }
 
@@ -647,7 +645,7 @@ static void block_job_event_completed(BlockJob *job, const char *msg)
  */
 
 void *block_job_create(const char *job_id, const BlockJobDriver *driver,
-                       BlockDriverState *bs, uint64_t perm,
+                       BlockJobTxn *txn, BlockDriverState *bs, uint64_t perm,
                        uint64_t shared_perm, int64_t speed, int flags,
                        BlockCompletionFunc *cb, void *opaque, Error **errp)
 {
@@ -729,6 +727,17 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
             return NULL;
         }
     }
+
+    /* Single jobs are modeled as single-job transactions for sake of
+     * consolidating the job management logic */
+    if (!txn) {
+        txn = block_job_txn_new();
+        block_job_txn_add_job(txn, job);
+        block_job_txn_unref(txn);
+    } else {
+        block_job_txn_add_job(txn, job);
+    }
+
     return job;
 }
 
@@ -752,13 +761,11 @@ void block_job_early_fail(BlockJob *job)
 
 void block_job_completed(BlockJob *job, int ret)
 {
+    assert(job && job->txn && !job->completed);
     assert(blk_bs(job->blk)->job == job);
-    assert(!job->completed);
     job->completed = true;
     job->ret = ret;
-    if (!job->txn) {
-        block_job_completed_single(job);
-    } else if (ret < 0 || block_job_is_cancelled(job)) {
+    if (ret < 0 || block_job_is_cancelled(job)) {
         block_job_completed_txn_abort(job);
     } else {
         block_job_completed_txn_success(job);
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 00403d9482..29cde3ffe3 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -141,7 +141,6 @@ typedef struct BlockJob {
      */
     QEMUTimer sleep_timer;
 
-    /** Non-NULL if this job is part of a transaction */
     BlockJobTxn *txn;
     QLIST_ENTRY(BlockJob) txn_list;
 } BlockJob;
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index c9b23b0cc9..becaae74c2 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -115,6 +115,7 @@ struct BlockJobDriver {
  * @job_id: The id of the newly-created job, or %NULL to have one
  * generated automatically.
  * @job_type: The class object for the newly-created job.
+ * @txn: The transaction this job belongs to, if any. %NULL otherwise.
  * @bs: The block
  * @perm, @shared_perm: Permissions to request for @bs
  * @speed: The maximum speed, in bytes per second, or 0 for unlimited.
@@ -132,7 +133,7 @@ struct BlockJobDriver {
  * called from a wrapper that is specific to the job type.
  */
 void *block_job_create(const char *job_id, const BlockJobDriver *driver,
-                       BlockDriverState *bs, uint64_t perm,
+                       BlockJobTxn *txn, BlockDriverState *bs, uint64_t perm,
                        uint64_t shared_perm, int64_t speed, int flags,
                        BlockCompletionFunc *cb, void *opaque, Error **errp);
 
diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
index d760e2b243..a7eecf1c1e 100644
--- a/tests/test-bdrv-drain.c
+++ b/tests/test-bdrv-drain.c
@@ -541,8 +541,8 @@ static void test_blockjob_common(enum drain_type drain_type)
     blk_target = blk_new(BLK_PERM_ALL, BLK_PERM_ALL);
     blk_insert_bs(blk_target, target, &error_abort);
 
-    job = block_job_create("job0", &test_job_driver, src, 0, BLK_PERM_ALL, 0,
-                           0, NULL, NULL, &error_abort);
+    job = block_job_create("job0", &test_job_driver, NULL, src, 0, BLK_PERM_ALL,
+                           0, 0, NULL, NULL, &error_abort);
     block_job_add_bdrv(job, "target", target, 0, BLK_PERM_ALL, &error_abort);
     block_job_start(job);
 
diff --git a/tests/test-blockjob-txn.c b/tests/test-blockjob-txn.c
index 3591c9617f..34f09ef8c1 100644
--- a/tests/test-blockjob-txn.c
+++ b/tests/test-blockjob-txn.c
@@ -87,7 +87,7 @@ static const BlockJobDriver test_block_job_driver = {
  */
 static BlockJob *test_block_job_start(unsigned int iterations,
                                       bool use_timer,
-                                      int rc, int *result)
+                                      int rc, int *result, BlockJobTxn *txn)
 {
     BlockDriverState *bs;
     TestBlockJob *s;
@@ -101,7 +101,7 @@ static BlockJob *test_block_job_start(unsigned int iterations,
     g_assert_nonnull(bs);
 
     snprintf(job_id, sizeof(job_id), "job%u", counter++);
-    s = block_job_create(job_id, &test_block_job_driver, bs,
+    s = block_job_create(job_id, &test_block_job_driver, txn, bs,
                          0, BLK_PERM_ALL, 0, BLOCK_JOB_DEFAULT,
                          test_block_job_cb, data, &error_abort);
     s->iterations = iterations;
@@ -120,8 +120,7 @@ static void test_single_job(int expected)
     int result = -EINPROGRESS;
 
     txn = block_job_txn_new();
-    job = test_block_job_start(1, true, expected, &result);
-    block_job_txn_add_job(txn, job);
+    job = test_block_job_start(1, true, expected, &result, txn);
     block_job_start(job);
 
     if (expected == -ECANCELED) {
@@ -160,10 +159,8 @@ static void test_pair_jobs(int expected1, int expected2)
     int result2 = -EINPROGRESS;
 
     txn = block_job_txn_new();
-    job1 = test_block_job_start(1, true, expected1, &result1);
-    block_job_txn_add_job(txn, job1);
-    job2 = test_block_job_start(2, true, expected2, &result2);
-    block_job_txn_add_job(txn, job2);
+    job1 = test_block_job_start(1, true, expected1, &result1, txn);
+    job2 = test_block_job_start(2, true, expected2, &result2, txn);
     block_job_start(job1);
     block_job_start(job2);
 
@@ -224,10 +221,8 @@ static void test_pair_jobs_fail_cancel_race(void)
     int result2 = -EINPROGRESS;
 
     txn = block_job_txn_new();
-    job1 = test_block_job_start(1, true, -ECANCELED, &result1);
-    block_job_txn_add_job(txn, job1);
-    job2 = test_block_job_start(2, false, 0, &result2);
-    block_job_txn_add_job(txn, job2);
+    job1 = test_block_job_start(1, true, -ECANCELED, &result1, txn);
+    job2 = test_block_job_start(2, false, 0, &result2, txn);
     block_job_start(job1);
     block_job_start(job2);
 
diff --git a/tests/test-blockjob.c b/tests/test-blockjob.c
index 23bdf1a932..599e28d732 100644
--- a/tests/test-blockjob.c
+++ b/tests/test-blockjob.c
@@ -30,7 +30,7 @@ static BlockJob *do_test_id(BlockBackend *blk, const char *id,
     BlockJob *job;
     Error *errp = NULL;
 
-    job = block_job_create(id, &test_block_job_driver, blk_bs(blk),
+    job = block_job_create(id, &test_block_job_driver, NULL, blk_bs(blk),
                            0, BLK_PERM_ALL, 0, BLOCK_JOB_DEFAULT, block_job_cb,
                            NULL, &errp);
     if (should_succeed) {
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 03/21] Blockjobs: documentation touchup
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 01/21] blockjobs: fix set-speed kick John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 02/21] blockjobs: model single jobs as transactions John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 04/21] blockjobs: add status enum John Snow
                   ` (20 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

Trivial; Document what the job creation flags do,
and some general tidying.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 include/block/blockjob.h     | 8 ++++----
 include/block/blockjob_int.h | 4 +++-
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 29cde3ffe3..b77fac118d 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -127,12 +127,10 @@ typedef struct BlockJob {
     /** Reference count of the block job */
     int refcnt;
 
-    /* True if this job has reported completion by calling block_job_completed.
-     */
+    /** True when job has reported completion by calling block_job_completed. */
     bool completed;
 
-    /* ret code passed to block_job_completed.
-     */
+    /** ret code passed to block_job_completed. */
     int ret;
 
     /**
@@ -146,7 +144,9 @@ typedef struct BlockJob {
 } BlockJob;
 
 typedef enum BlockJobCreateFlags {
+    /* Default behavior */
     BLOCK_JOB_DEFAULT = 0x00,
+    /* BlockJob is not QMP-created and should not send QMP events */
     BLOCK_JOB_INTERNAL = 0x01,
 } BlockJobCreateFlags;
 
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index becaae74c2..259d49b32a 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -114,11 +114,13 @@ struct BlockJobDriver {
  * block_job_create:
  * @job_id: The id of the newly-created job, or %NULL to have one
  * generated automatically.
- * @job_type: The class object for the newly-created job.
+ * @driver: The class object for the newly-created job.
  * @txn: The transaction this job belongs to, if any. %NULL otherwise.
  * @bs: The block
  * @perm, @shared_perm: Permissions to request for @bs
  * @speed: The maximum speed, in bytes per second, or 0 for unlimited.
+ * @flags: Creation flags for the Block Job.
+ *         See @BlockJobCreateFlags
  * @cb: Completion function for the job.
  * @opaque: Opaque pointer value passed to @cb.
  * @errp: Error object.
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 04/21] blockjobs: add status enum
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (2 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 03/21] Blockjobs: documentation touchup John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 05/21] blockjobs: add state transition table John Snow
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

We're about to add several new states, and booleans are becoming
unwieldly and difficult to reason about. It would help to have a
more explicit bookkeeping of the state of blockjobs. To this end,
add a new "status" field and add our existing states in a redundant
manner alongside the bools they are replacing:

UNDEFINED: Placeholder, default state. Not currently visible to QMP
           unless changes occur in the future to allow creating jobs
           without starting them via QMP.
CREATED:   replaces !!job->co && paused && !busy
RUNNING:   replaces effectively (!paused && busy)
PAUSED:    Nearly redundant with info->paused, which shows pause_count.
           This reports the actual status of the job, which almost always
           matches the paused request status. It differs in that it is
           strictly only true when the job has actually gone dormant.
READY:     replaces job->ready.
STANDBY:   Paused, but job->ready is true.

New state additions in coming commits will not be quite so redundant:

WAITING:   Waiting on transaction. This job has finished all the work
           it can until the transaction converges, fails, or is canceled.
PENDING:   Pending authorization from user. This job has finished all the
           work it can until the job or transaction is finalized via
           block_job_finalize. This implies the transaction has converged
           and left the WAITING phase.
ABORTING:  Job has encountered an error condition and is in the process
           of aborting.
CONCLUDED: Job has ceased all operations and has a return code available
           for query and may be dismissed via block_job_dismiss.
NULL:      Job has been dismissed and (should) be destroyed. Should never
           be visible to QMP.

Some of these states appear somewhat superfluous, but it helps define the
expected flow of a job; so some of the states wind up being synchronous
empty transitions. Importantly, jobs can be in only one of these states
at any given time, which helps code and external users alike reason about
the current condition of a job unambiguously.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 blockjob.c                 |  9 +++++++++
 include/block/blockjob.h   |  3 +++
 qapi/block-core.json       | 31 ++++++++++++++++++++++++++++++-
 tests/qemu-iotests/109.out | 24 ++++++++++++------------
 4 files changed, 54 insertions(+), 13 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index ecc5fcbdf8..719169cccd 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -320,6 +320,7 @@ void block_job_start(BlockJob *job)
     job->pause_count--;
     job->busy = true;
     job->paused = false;
+    job->status = BLOCK_JOB_STATUS_RUNNING;
     bdrv_coroutine_enter(blk_bs(job->blk), job->co);
 }
 
@@ -598,6 +599,7 @@ BlockJobInfo *block_job_query(BlockJob *job, Error **errp)
     info->speed     = job->speed;
     info->io_status = job->iostatus;
     info->ready     = job->ready;
+    info->status    = job->status;
     return info;
 }
 
@@ -700,6 +702,7 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
     job->paused        = true;
     job->pause_count   = 1;
     job->refcnt        = 1;
+    job->status        = BLOCK_JOB_STATUS_CREATED;
     aio_timer_init(qemu_get_aio_context(), &job->sleep_timer,
                    QEMU_CLOCK_REALTIME, SCALE_NS,
                    block_job_sleep_timer_cb, job);
@@ -813,9 +816,14 @@ void coroutine_fn block_job_pause_point(BlockJob *job)
     }
 
     if (block_job_should_pause(job) && !block_job_is_cancelled(job)) {
+        BlockJobStatus status = job->status;
+        job->status = status == BLOCK_JOB_STATUS_READY ? \
+                                BLOCK_JOB_STATUS_STANDBY : \
+                                BLOCK_JOB_STATUS_PAUSED;
         job->paused = true;
         block_job_do_yield(job, -1);
         job->paused = false;
+        job->status = status;
     }
 
     if (job->driver->resume) {
@@ -921,6 +929,7 @@ void block_job_iostatus_reset(BlockJob *job)
 
 void block_job_event_ready(BlockJob *job)
 {
+    job->status = BLOCK_JOB_STATUS_READY;
     job->ready = true;
 
     if (block_job_is_internal(job)) {
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index b77fac118d..b39a2f9521 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -139,6 +139,9 @@ typedef struct BlockJob {
      */
     QEMUTimer sleep_timer;
 
+    /** Current state; See @BlockJobStatus for details. */
+    BlockJobStatus status;
+
     BlockJobTxn *txn;
     QLIST_ENTRY(BlockJob) txn_list;
 } BlockJob;
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 00475f08d4..8dfed5c7da 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -955,6 +955,32 @@
 { 'enum': 'BlockJobType',
   'data': ['commit', 'stream', 'mirror', 'backup'] }
 
+##
+# @BlockJobStatus:
+#
+# Indicates the present state of a given blockjob in its lifetime.
+#
+# @undefined: Erroneous, default state. Should not ever be visible.
+#
+# @created: The job has been created, but not yet started.
+#
+# @running: The job is currently running.
+#
+# @paused: The job is running, but paused. The pause may be requested by
+#          either the QMP user or by internal processes.
+#
+# @ready: The job is running, but is ready for the user to signal completion.
+#         This is used for long-running jobs like mirror that are designed to
+#         run indefinitely.
+#
+# @standby: The job is ready, but paused. This is nearly identical to @paused.
+#           The job may return to @ready or otherwise be canceled.
+#
+# Since: 2.12
+##
+{ 'enum': 'BlockJobStatus',
+  'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby'] }
+
 ##
 # @BlockJobInfo:
 #
@@ -981,12 +1007,15 @@
 #
 # @ready: true if the job may be completed (since 2.2)
 #
+# @status: Current job state/status (since 2.12)
+#
 # Since: 1.1
 ##
 { 'struct': 'BlockJobInfo',
   'data': {'type': 'str', 'device': 'str', 'len': 'int',
            'offset': 'int', 'busy': 'bool', 'paused': 'bool', 'speed': 'int',
-           'io-status': 'BlockDeviceIoStatus', 'ready': 'bool'} }
+           'io-status': 'BlockDeviceIoStatus', 'ready': 'bool',
+           'status': 'BlockJobStatus' } }
 
 ##
 # @query-block-jobs:
diff --git a/tests/qemu-iotests/109.out b/tests/qemu-iotests/109.out
index c189e2833d..d288f2eef6 100644
--- a/tests/qemu-iotests/109.out
+++ b/tests/qemu-iotests/109.out
@@ -19,7 +19,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 1024, "offset": 1024, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 1024, "offset": 1024, "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 1024, "offset": 1024, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 1024, "offset": 1024, "speed": 0, "type": "mirror"}}
@@ -45,7 +45,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 197120, "offset": 197120, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 197120, "offset": 197120, "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 197120, "offset": 197120, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 197120, "offset": 197120, "speed": 0, "type": "mirror"}}
@@ -71,7 +71,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 327680, "offset": 327680, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 327680, "offset": 327680, "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 327680, "offset": 327680, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 327680, "offset": 327680, "speed": 0, "type": "mirror"}}
@@ -97,7 +97,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 1024, "offset": 1024, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 1024, "offset": 1024, "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 1024, "offset": 1024, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 1024, "offset": 1024, "speed": 0, "type": "mirror"}}
@@ -123,7 +123,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 65536, "offset": 65536, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 65536, "offset": 65536, "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 65536, "offset": 65536, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 65536, "offset": 65536, "speed": 0, "type": "mirror"}}
@@ -149,7 +149,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 2560, "offset": 2560, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 2560, "offset": 2560, "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 2560, "offset": 2560, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 2560, "offset": 2560, "speed": 0, "type": "mirror"}}
@@ -174,7 +174,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 2560, "offset": 2560, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 2560, "offset": 2560, "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 2560, "offset": 2560, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 2560, "offset": 2560, "speed": 0, "type": "mirror"}}
@@ -199,7 +199,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 31457280, "offset": 31457280, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 31457280, "offset": 31457280, "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 31457280, "offset": 31457280, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 31457280, "offset": 31457280, "speed": 0, "type": "mirror"}}
@@ -224,7 +224,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 327680, "offset": 327680, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 327680, "offset": 327680, "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 327680, "offset": 327680, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 327680, "offset": 327680, "speed": 0, "type": "mirror"}}
@@ -249,7 +249,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 2048, "offset": 2048, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 2048, "offset": 2048, "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 2048, "offset": 2048, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 2048, "offset": 2048, "speed": 0, "type": "mirror"}}
@@ -265,7 +265,7 @@ Automatically detecting the format is dangerous for raw images, write operations
 Specify the 'raw' format explicitly to remove the restrictions.
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 512, "offset": 512, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 512, "offset": 512, "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 512, "offset": 512, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 512, "offset": 512, "speed": 0, "type": "mirror"}}
@@ -274,7 +274,7 @@ Images are identical.
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 512, "offset": 512, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 512, "offset": 512, "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 512, "offset": 512, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 512, "offset": 512, "speed": 0, "type": "mirror"}}
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 05/21] blockjobs: add state transition table
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (3 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 04/21] blockjobs: add status enum John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 06/21] iotests: add pause_wait John Snow
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

The state transition table has mostly been implied. We're about to make
it a bit more complex, so let's make the STM explicit instead.

Perform state transitions with a function that for now just asserts the
transition is appropriate.

Transitions:
Undefined -> Created: During job initialization.
Created   -> Running: Once the job is started.
                      Jobs cannot transition from "Created" to "Paused"
                      directly, but will instead synchronously transition
                      to running to paused immediately.
Running   -> Paused:  Normal workflow for pauses.
Running   -> Ready:   Normal workflow for jobs reaching their sync point.
                      (e.g. mirror)
Ready     -> Standby: Normal workflow for pausing ready jobs.
Paused    -> Running: Normal resume.
Standby   -> Ready:   Resume of a Standby job.


+---------+
|UNDEFINED|
+--+------+
   |
+--v----+
|CREATED|
+--+----+
   |
+--v----+     +------+
|RUNNING<----->PAUSED|
+--+----+     +------+
   |
+--v--+       +-------+
|READY<------->STANDBY|
+-----+       +-------+


Notably, there is no state presently defined as of this commit that
deals with a job after the "running" or "ready" states, so this table
will be adjusted alongside the commits that introduce those states.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 block/trace-events |  3 +++
 blockjob.c         | 40 +++++++++++++++++++++++++++++++++-------
 2 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/block/trace-events b/block/trace-events
index 02dd80ff0c..b75a0c8409 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -4,6 +4,9 @@
 bdrv_open_common(void *bs, const char *filename, int flags, const char *format_name) "bs %p filename \"%s\" flags 0x%x format_name \"%s\""
 bdrv_lock_medium(void *bs, bool locked) "bs %p locked %d"
 
+# blockjob.c
+block_job_state_transition(void *job,  int ret, const char *legal, const char *s0, const char *s1) "job %p (ret: %d) attempting %s transition (%s-->%s)"
+
 # block/block-backend.c
 blk_co_preadv(void *blk, void *bs, int64_t offset, unsigned int bytes, int flags) "blk %p bs %p offset %"PRId64" bytes %u flags 0x%x"
 blk_co_pwritev(void *blk, void *bs, int64_t offset, unsigned int bytes, int flags) "blk %p bs %p offset %"PRId64" bytes %u flags 0x%x"
diff --git a/blockjob.c b/blockjob.c
index 719169cccd..442426e27b 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -28,6 +28,7 @@
 #include "block/block.h"
 #include "block/blockjob_int.h"
 #include "block/block_int.h"
+#include "block/trace.h"
 #include "sysemu/block-backend.h"
 #include "qapi/error.h"
 #include "qapi/qapi-events-block-core.h"
@@ -41,6 +42,31 @@
  * block_job_enter. */
 static QemuMutex block_job_mutex;
 
+/* BlockJob State Transition Table */
+bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
+                                          /* U, C, R, P, Y, S */
+    /* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0},
+    /* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0},
+    /* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0},
+    /* P: */ [BLOCK_JOB_STATUS_PAUSED]    = {0, 0, 1, 0, 0, 0},
+    /* Y: */ [BLOCK_JOB_STATUS_READY]     = {0, 0, 0, 0, 0, 1},
+    /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0},
+};
+
+static void block_job_state_transition(BlockJob *job, BlockJobStatus s1)
+{
+    BlockJobStatus s0 = job->status;
+    assert(s1 >= 0 && s1 <= BLOCK_JOB_STATUS__MAX);
+    trace_block_job_state_transition(job, job->ret, BlockJobSTT[s0][s1] ?
+                                     "allowed" : "disallowed",
+                                     qapi_enum_lookup(&BlockJobStatus_lookup,
+                                                      s0),
+                                     qapi_enum_lookup(&BlockJobStatus_lookup,
+                                                      s1));
+    assert(BlockJobSTT[s0][s1]);
+    job->status = s1;
+}
+
 static void block_job_lock(void)
 {
     qemu_mutex_lock(&block_job_mutex);
@@ -320,7 +346,7 @@ void block_job_start(BlockJob *job)
     job->pause_count--;
     job->busy = true;
     job->paused = false;
-    job->status = BLOCK_JOB_STATUS_RUNNING;
+    block_job_state_transition(job, BLOCK_JOB_STATUS_RUNNING);
     bdrv_coroutine_enter(blk_bs(job->blk), job->co);
 }
 
@@ -702,7 +728,7 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
     job->paused        = true;
     job->pause_count   = 1;
     job->refcnt        = 1;
-    job->status        = BLOCK_JOB_STATUS_CREATED;
+    block_job_state_transition(job, BLOCK_JOB_STATUS_CREATED);
     aio_timer_init(qemu_get_aio_context(), &job->sleep_timer,
                    QEMU_CLOCK_REALTIME, SCALE_NS,
                    block_job_sleep_timer_cb, job);
@@ -817,13 +843,13 @@ void coroutine_fn block_job_pause_point(BlockJob *job)
 
     if (block_job_should_pause(job) && !block_job_is_cancelled(job)) {
         BlockJobStatus status = job->status;
-        job->status = status == BLOCK_JOB_STATUS_READY ? \
-                                BLOCK_JOB_STATUS_STANDBY : \
-                                BLOCK_JOB_STATUS_PAUSED;
+        block_job_state_transition(job, status == BLOCK_JOB_STATUS_READY ? \
+                                   BLOCK_JOB_STATUS_STANDBY :           \
+                                   BLOCK_JOB_STATUS_PAUSED);
         job->paused = true;
         block_job_do_yield(job, -1);
         job->paused = false;
-        job->status = status;
+        block_job_state_transition(job, status);
     }
 
     if (job->driver->resume) {
@@ -929,7 +955,7 @@ void block_job_iostatus_reset(BlockJob *job)
 
 void block_job_event_ready(BlockJob *job)
 {
-    job->status = BLOCK_JOB_STATUS_READY;
+    block_job_state_transition(job, BLOCK_JOB_STATUS_READY);
     job->ready = true;
 
     if (block_job_is_internal(job)) {
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 06/21] iotests: add pause_wait
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (4 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 05/21] blockjobs: add state transition table John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 07/21] blockjobs: add block_job_verb permission table John Snow
                   ` (17 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

Split out the pause command into the actual pause and the wait.
Not every usage presently needs to resubmit a pause request.

The intent with the next commit will be to explicitly disallow
redundant or meaningless pause/resume requests, so the tests
need to become more judicious to reflect that.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 tests/qemu-iotests/030        |  6 ++----
 tests/qemu-iotests/055        | 17 ++++++-----------
 tests/qemu-iotests/iotests.py | 12 ++++++++----
 3 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/tests/qemu-iotests/030 b/tests/qemu-iotests/030
index 457984b8e9..251883226c 100755
--- a/tests/qemu-iotests/030
+++ b/tests/qemu-iotests/030
@@ -86,11 +86,9 @@ class TestSingleDrive(iotests.QMPTestCase):
         result = self.vm.qmp('block-stream', device='drive0')
         self.assert_qmp(result, 'return', {})
 
-        result = self.vm.qmp('block-job-pause', device='drive0')
-        self.assert_qmp(result, 'return', {})
-
+        self.pause_job('drive0', wait=False)
         self.vm.resume_drive('drive0')
-        self.pause_job('drive0')
+        self.pause_wait('drive0')
 
         result = self.vm.qmp('query-block-jobs')
         offset = self.dictpath(result, 'return[0]/offset')
diff --git a/tests/qemu-iotests/055 b/tests/qemu-iotests/055
index 8a5d9fd269..3437c11507 100755
--- a/tests/qemu-iotests/055
+++ b/tests/qemu-iotests/055
@@ -86,11 +86,9 @@ class TestSingleDrive(iotests.QMPTestCase):
                              target=target, sync='full')
         self.assert_qmp(result, 'return', {})
 
-        result = self.vm.qmp('block-job-pause', device='drive0')
-        self.assert_qmp(result, 'return', {})
-
+        self.pause_job('drive0', wait=False)
         self.vm.resume_drive('drive0')
-        self.pause_job('drive0')
+        self.pause_wait('drive0')
 
         result = self.vm.qmp('query-block-jobs')
         offset = self.dictpath(result, 'return[0]/offset')
@@ -303,13 +301,12 @@ class TestSingleTransaction(iotests.QMPTestCase):
         ])
         self.assert_qmp(result, 'return', {})
 
-        result = self.vm.qmp('block-job-pause', device='drive0')
-        self.assert_qmp(result, 'return', {})
+        self.pause_job('drive0', wait=False)
 
         result = self.vm.qmp('block-job-set-speed', device='drive0', speed=0)
         self.assert_qmp(result, 'return', {})
 
-        self.pause_job('drive0')
+        self.pause_wait('drive0')
 
         result = self.vm.qmp('query-block-jobs')
         offset = self.dictpath(result, 'return[0]/offset')
@@ -534,11 +531,9 @@ class TestDriveCompression(iotests.QMPTestCase):
         result = self.vm.qmp(cmd, device='drive0', sync='full', compress=True, **args)
         self.assert_qmp(result, 'return', {})
 
-        result = self.vm.qmp('block-job-pause', device='drive0')
-        self.assert_qmp(result, 'return', {})
-
+        self.pause_job('drive0', wait=False)
         self.vm.resume_drive('drive0')
-        self.pause_job('drive0')
+        self.pause_wait('drive0')
 
         result = self.vm.qmp('query-block-jobs')
         offset = self.dictpath(result, 'return[0]/offset')
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 1bcc9ca57d..5303bbc8e2 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -473,10 +473,7 @@ class QMPTestCase(unittest.TestCase):
         event = self.wait_until_completed(drive=drive)
         self.assert_qmp(event, 'data/type', 'mirror')
 
-    def pause_job(self, job_id='job0'):
-        result = self.vm.qmp('block-job-pause', device=job_id)
-        self.assert_qmp(result, 'return', {})
-
+    def pause_wait(self, job_id='job0'):
         with Timeout(1, "Timeout waiting for job to pause"):
             while True:
                 result = self.vm.qmp('query-block-jobs')
@@ -484,6 +481,13 @@ class QMPTestCase(unittest.TestCase):
                     if job['device'] == job_id and job['paused'] == True and job['busy'] == False:
                         return job
 
+    def pause_job(self, job_id='job0', wait=True):
+        result = self.vm.qmp('block-job-pause', device=job_id)
+        self.assert_qmp(result, 'return', {})
+        if wait:
+            return self.pause_wait(job_id)
+        return result
+
 
 def notrun(reason):
     '''Skip this test suite'''
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 07/21] blockjobs: add block_job_verb permission table
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (5 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 06/21] iotests: add pause_wait John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 08/21] blockjobs: add ABORTING state John Snow
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

Which commands ("verbs") are appropriate for jobs in which state is
also somewhat burdensome to keep track of.

As of this commit, it looks rather useless, but begins to look more
interesting the more states we add to the STM table.

A recurring theme is that no verb will apply to an 'undefined' job.

Further, it's not presently possible to restrict the "pause" or "resume"
verbs any more than they are in this commit because of the asynchronous
nature of how jobs enter the PAUSED state; justifications for some
seemingly erroneous applications are given below.

=====
Verbs
=====

Cancel:    Any state except undefined.
Pause:     Any state except undefined;
           'created': Requests that the job pauses as it starts.
           'running': Normal usage. (PAUSED)
           'paused':  The job may be paused for internal reasons,
                      but the user may wish to force an indefinite
                      user-pause, so this is allowed.
           'ready':   Normal usage. (STANDBY)
           'standby': Same logic as above.
Resume:    Any state except undefined;
           'created': Will lift a user's pause-on-start request.
           'running': Will lift a pause request before it takes effect.
           'paused':  Normal usage.
           'ready':   Will lift a pause request before it takes effect.
           'standby': Normal usage.
Set-speed: Any state except undefined, though ready may not be meaningful.
Complete:  Only a 'ready' job may accept a complete request.


=======
Changes
=======

(1)

To facilitate "nice" error checking, all five major block-job verb
interfaces in blockjob.c now support an errp parameter:

- block_job_user_cancel is added as a new interface.
- block_job_user_pause gains an errp paramter
- block_job_user_resume gains an errp parameter
- block_job_set_speed already had an errp parameter.
- block_job_complete already had an errp parameter.

(2)

block-job-pause and block-job-resume will no longer no-op when trying
to pause an already paused job, or trying to resume a job that isn't
paused. These functions will now report that they did not perform the
action requested because it was not possible.

iotests have been adjusted to address this new behavior.

(3)

block-job-complete doesn't worry about checking !block_job_started,
because the permission table guards against this.

(4)

test-bdrv-drain's job implementation needs to announce that it is
'ready' now, in order to be completed.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/trace-events       |  1 +
 blockdev.c               | 10 +++----
 blockjob.c               | 71 ++++++++++++++++++++++++++++++++++++++++++------
 include/block/blockjob.h | 13 +++++++--
 qapi/block-core.json     | 20 ++++++++++++++
 tests/test-bdrv-drain.c  |  1 +
 6 files changed, 100 insertions(+), 16 deletions(-)

diff --git a/block/trace-events b/block/trace-events
index b75a0c8409..3fe89f7ea6 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -6,6 +6,7 @@ bdrv_lock_medium(void *bs, bool locked) "bs %p locked %d"
 
 # blockjob.c
 block_job_state_transition(void *job,  int ret, const char *legal, const char *s0, const char *s1) "job %p (ret: %d) attempting %s transition (%s-->%s)"
+block_job_apply_verb(void *job, const char *state, const char *verb, const char *legal) "job %p in state %s; applying verb %s (%s)"
 
 # block/block-backend.c
 blk_co_preadv(void *blk, void *bs, int64_t offset, unsigned int bytes, int flags) "blk %p bs %p offset %"PRId64" bytes %u flags 0x%x"
diff --git a/blockdev.c b/blockdev.c
index 1fbfd3a2c4..f70a783803 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3806,7 +3806,7 @@ void qmp_block_job_cancel(const char *device,
     }
 
     trace_qmp_block_job_cancel(job);
-    block_job_cancel(job);
+    block_job_user_cancel(job, errp);
 out:
     aio_context_release(aio_context);
 }
@@ -3816,12 +3816,12 @@ void qmp_block_job_pause(const char *device, Error **errp)
     AioContext *aio_context;
     BlockJob *job = find_block_job(device, &aio_context, errp);
 
-    if (!job || block_job_user_paused(job)) {
+    if (!job) {
         return;
     }
 
     trace_qmp_block_job_pause(job);
-    block_job_user_pause(job);
+    block_job_user_pause(job, errp);
     aio_context_release(aio_context);
 }
 
@@ -3830,12 +3830,12 @@ void qmp_block_job_resume(const char *device, Error **errp)
     AioContext *aio_context;
     BlockJob *job = find_block_job(device, &aio_context, errp);
 
-    if (!job || !block_job_user_paused(job)) {
+    if (!job) {
         return;
     }
 
     trace_qmp_block_job_resume(job);
-    block_job_user_resume(job);
+    block_job_user_resume(job, errp);
     aio_context_release(aio_context);
 }
 
diff --git a/blockjob.c b/blockjob.c
index 442426e27b..d369c0cb4d 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -53,6 +53,15 @@ bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
     /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0},
 };
 
+bool BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
+                                          /* U, C, R, P, Y, S */
+    [BLOCK_JOB_VERB_CANCEL]               = {0, 1, 1, 1, 1, 1},
+    [BLOCK_JOB_VERB_PAUSE]                = {0, 1, 1, 1, 1, 1},
+    [BLOCK_JOB_VERB_RESUME]               = {0, 1, 1, 1, 1, 1},
+    [BLOCK_JOB_VERB_SET_SPEED]            = {0, 1, 1, 1, 1, 1},
+    [BLOCK_JOB_VERB_COMPLETE]             = {0, 0, 0, 0, 1, 0},
+};
+
 static void block_job_state_transition(BlockJob *job, BlockJobStatus s1)
 {
     BlockJobStatus s0 = job->status;
@@ -67,6 +76,23 @@ static void block_job_state_transition(BlockJob *job, BlockJobStatus s1)
     job->status = s1;
 }
 
+static int block_job_apply_verb(BlockJob *job, BlockJobVerb bv, Error **errp)
+{
+    assert(bv >= 0 && bv <= BLOCK_JOB_VERB__MAX);
+    trace_block_job_apply_verb(job, qapi_enum_lookup(&BlockJobStatus_lookup,
+                                                     job->status),
+                               qapi_enum_lookup(&BlockJobVerb_lookup, bv),
+                               BlockJobVerbTable[bv][job->status] ?
+                               "allowed" : "prohibited");
+    if (BlockJobVerbTable[bv][job->status]) {
+        return 0;
+    }
+    error_setg(errp, "Job '%s' in state '%s' cannot accept command verb '%s'",
+               job->id, qapi_enum_lookup(&BlockJobStatus_lookup, job->status),
+               qapi_enum_lookup(&BlockJobVerb_lookup, bv));
+    return -EPERM;
+}
+
 static void block_job_lock(void)
 {
     qemu_mutex_lock(&block_job_mutex);
@@ -517,6 +543,9 @@ void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
         error_setg(errp, QERR_UNSUPPORTED);
         return;
     }
+    if (block_job_apply_verb(job, BLOCK_JOB_VERB_SET_SPEED, errp)) {
+        return;
+    }
     job->driver->set_speed(job, speed, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
@@ -536,8 +565,10 @@ void block_job_complete(BlockJob *job, Error **errp)
 {
     /* Should not be reachable via external interface for internal jobs */
     assert(job->id);
-    if (job->pause_count || job->cancelled ||
-        !block_job_started(job) || !job->driver->complete) {
+    if (block_job_apply_verb(job, BLOCK_JOB_VERB_COMPLETE, errp)) {
+        return;
+    }
+    if (job->pause_count || job->cancelled || !job->driver->complete) {
         error_setg(errp, "The active block job '%s' cannot be completed",
                    job->id);
         return;
@@ -546,8 +577,15 @@ void block_job_complete(BlockJob *job, Error **errp)
     job->driver->complete(job, errp);
 }
 
-void block_job_user_pause(BlockJob *job)
+void block_job_user_pause(BlockJob *job, Error **errp)
 {
+    if (block_job_apply_verb(job, BLOCK_JOB_VERB_PAUSE, errp)) {
+        return;
+    }
+    if (job->user_paused) {
+        error_setg(errp, "Job is already paused");
+        return;
+    }
     job->user_paused = true;
     block_job_pause(job);
 }
@@ -557,13 +595,19 @@ bool block_job_user_paused(BlockJob *job)
     return job->user_paused;
 }
 
-void block_job_user_resume(BlockJob *job)
+void block_job_user_resume(BlockJob *job, Error **errp)
 {
-    if (job && job->user_paused && job->pause_count > 0) {
-        block_job_iostatus_reset(job);
-        job->user_paused = false;
-        block_job_resume(job);
+    assert(job);
+    if (!job->user_paused || job->pause_count <= 0) {
+        error_setg(errp, "Can't resume a job that was not paused");
+        return;
     }
+    if (block_job_apply_verb(job, BLOCK_JOB_VERB_RESUME, errp)) {
+        return;
+    }
+    block_job_iostatus_reset(job);
+    job->user_paused = false;
+    block_job_resume(job);
 }
 
 void block_job_cancel(BlockJob *job)
@@ -576,6 +620,14 @@ void block_job_cancel(BlockJob *job)
     }
 }
 
+void block_job_user_cancel(BlockJob *job, Error **errp)
+{
+    if (block_job_apply_verb(job, BLOCK_JOB_VERB_CANCEL, errp)) {
+        return;
+    }
+    block_job_cancel(job);
+}
+
 /* A wrapper around block_job_cancel() taking an Error ** parameter so it may be
  * used with block_job_finish_sync() without the need for (rather nasty)
  * function pointer casts there. */
@@ -999,8 +1051,9 @@ BlockErrorAction block_job_error_action(BlockJob *job, BlockdevOnError on_err,
                                         action, &error_abort);
     }
     if (action == BLOCK_ERROR_ACTION_STOP) {
+        block_job_pause(job);
         /* make the pause user visible, which will be resumed from QMP. */
-        block_job_user_pause(job);
+        job->user_paused = true;
         block_job_iostatus_set_err(job, error);
     }
     return action;
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index b39a2f9521..df0a9773d1 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -249,7 +249,7 @@ BlockJobInfo *block_job_query(BlockJob *job, Error **errp);
  * Asynchronously pause the specified job.
  * Do not allow a resume until a matching call to block_job_user_resume.
  */
-void block_job_user_pause(BlockJob *job);
+void block_job_user_pause(BlockJob *job, Error **errp);
 
 /**
  * block_job_paused:
@@ -266,7 +266,16 @@ bool block_job_user_paused(BlockJob *job);
  * Resume the specified job.
  * Must be paired with a preceding block_job_user_pause.
  */
-void block_job_user_resume(BlockJob *job);
+void block_job_user_resume(BlockJob *job, Error **errp);
+
+/**
+ * block_job_user_cancel:
+ * @job: The job to be cancelled.
+ *
+ * Cancels the specified job, but may refuse to do so if the
+ * operation isn't currently meaningful.
+ */
+void block_job_user_cancel(BlockJob *job, Error **errp);
 
 /**
  * block_job_cancel_sync:
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 8dfed5c7da..9e3c74be53 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -955,6 +955,26 @@
 { 'enum': 'BlockJobType',
   'data': ['commit', 'stream', 'mirror', 'backup'] }
 
+##
+# @BlockJobVerb:
+#
+# Represents command verbs that can be applied to a blockjob.
+#
+# @cancel: see @block-job-cancel
+#
+# @pause: see @block-job-pause
+#
+# @resume: see @block-job-resume
+#
+# @set-speed: see @block-job-set-speed
+#
+# @complete: see @block-job-complete
+#
+# Since: 2.12
+##
+{ 'enum': 'BlockJobVerb',
+  'data': ['cancel', 'pause', 'resume', 'set-speed', 'complete' ] }
+
 ##
 # @BlockJobStatus:
 #
diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
index a7eecf1c1e..7673de1062 100644
--- a/tests/test-bdrv-drain.c
+++ b/tests/test-bdrv-drain.c
@@ -505,6 +505,7 @@ static void coroutine_fn test_job_start(void *opaque)
 {
     TestBlockJob *s = opaque;
 
+    block_job_event_ready(&s->common);
     while (!s->should_complete) {
         block_job_sleep_ns(&s->common, 100000);
     }
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 08/21] blockjobs: add ABORTING state
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (6 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 07/21] blockjobs: add block_job_verb permission table John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 09/21] blockjobs: add CONCLUDED state John Snow
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

Add a new state ABORTING.

This makes transitions from normative states to error states explicit
in the STM, and serves as a disambiguation for which states may complete
normally when normal end-states (CONCLUDED) are added in future commits.

Notably, Paused/Standby jobs do not transition directly to aborting,
as they must wake up first and cooperate in their cancellation.

Transitions:
Created -> Aborting: can be cancelled (by the system)
Running -> Aborting: can be cancelled or encounter an error
Ready   -> Aborting: can be cancelled or encounter an error

Verbs:
None. The job must finish cleaning itself up and report its final status.

             +---------+
             |UNDEFINED|
             +--+------+
                |
             +--v----+
   +---------+CREATED|
   |         +--+----+
   |            |
   |         +--v----+     +------+
   +---------+RUNNING<----->PAUSED|
   |         +--+----+     +------+
   |            |
   |         +--v--+       +-------+
   +---------+READY<------->STANDBY|
   |         +-----+       +-------+
   |
+--v-----+
|ABORTING|
+--------+

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
---
 blockjob.c           | 31 ++++++++++++++++++-------------
 qapi/block-core.json |  7 ++++++-
 2 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index d369c0cb4d..fe5b0041f7 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -44,22 +44,23 @@ static QemuMutex block_job_mutex;
 
 /* BlockJob State Transition Table */
 bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
-                                          /* U, C, R, P, Y, S */
-    /* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0},
-    /* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0},
-    /* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0},
-    /* P: */ [BLOCK_JOB_STATUS_PAUSED]    = {0, 0, 1, 0, 0, 0},
-    /* Y: */ [BLOCK_JOB_STATUS_READY]     = {0, 0, 0, 0, 0, 1},
-    /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0},
+                                          /* U, C, R, P, Y, S, X */
+    /* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0},
+    /* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1},
+    /* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1},
+    /* P: */ [BLOCK_JOB_STATUS_PAUSED]    = {0, 0, 1, 0, 0, 0, 0},
+    /* Y: */ [BLOCK_JOB_STATUS_READY]     = {0, 0, 0, 0, 0, 1, 1},
+    /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0},
+    /* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0},
 };
 
 bool BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
-                                          /* U, C, R, P, Y, S */
-    [BLOCK_JOB_VERB_CANCEL]               = {0, 1, 1, 1, 1, 1},
-    [BLOCK_JOB_VERB_PAUSE]                = {0, 1, 1, 1, 1, 1},
-    [BLOCK_JOB_VERB_RESUME]               = {0, 1, 1, 1, 1, 1},
-    [BLOCK_JOB_VERB_SET_SPEED]            = {0, 1, 1, 1, 1, 1},
-    [BLOCK_JOB_VERB_COMPLETE]             = {0, 0, 0, 0, 1, 0},
+                                          /* U, C, R, P, Y, S, X */
+    [BLOCK_JOB_VERB_CANCEL]               = {0, 1, 1, 1, 1, 1, 0},
+    [BLOCK_JOB_VERB_PAUSE]                = {0, 1, 1, 1, 1, 1, 0},
+    [BLOCK_JOB_VERB_RESUME]               = {0, 1, 1, 1, 1, 1, 0},
+    [BLOCK_JOB_VERB_SET_SPEED]            = {0, 1, 1, 1, 1, 1, 0},
+    [BLOCK_JOB_VERB_COMPLETE]             = {0, 0, 0, 0, 1, 0, 0},
 };
 
 static void block_job_state_transition(BlockJob *job, BlockJobStatus s1)
@@ -380,6 +381,10 @@ static void block_job_completed_single(BlockJob *job)
 {
     assert(job->completed);
 
+    if (job->ret || block_job_is_cancelled(job)) {
+        block_job_state_transition(job, BLOCK_JOB_STATUS_ABORTING);
+    }
+
     if (!job->ret) {
         if (job->driver->commit) {
             job->driver->commit(job);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 9e3c74be53..01c16c42dc 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -996,10 +996,15 @@
 # @standby: The job is ready, but paused. This is nearly identical to @paused.
 #           The job may return to @ready or otherwise be canceled.
 #
+# @aborting: The job is in the process of being aborted, and will finish with
+#            an error.
+#            This status may not be visible to the management process.
+#
 # Since: 2.12
 ##
 { 'enum': 'BlockJobStatus',
-  'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby'] }
+  'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby',
+           'aborting' ] }
 
 ##
 # @BlockJobInfo:
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 09/21] blockjobs: add CONCLUDED state
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (7 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 08/21] blockjobs: add ABORTING state John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 10/21] blockjobs: add NULL state John Snow
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

add a new state "CONCLUDED" that identifies a job that has ceased all
operations. The wording was chosen to avoid any phrasing that might
imply success, error, or cancellation. The task has simply ceased all
operation and can never again perform any work.

("finished", "done", and "completed" might all imply success.)

Transitions:
Running  -> Concluded: normal completion
Ready    -> Concluded: normal completion
Aborting -> Concluded: error and cancellations

Verbs:
None as of this commit. (a future commit adds 'dismiss')

             +---------+
             |UNDEFINED|
             +--+------+
                |
             +--v----+
   +---------+CREATED|
   |         +--+----+
   |            |
   |         +--v----+     +------+
   +---------+RUNNING<----->PAUSED|
   |         +--+-+--+     +------+
   |            | |
   |            | +------------------+
   |            |                    |
   |         +--v--+       +-------+ |
   +---------+READY<------->STANDBY| |
   |         +--+--+       +-------+ |
   |            |                    |
+--v-----+   +--v------+             |
|ABORTING+--->CONCLUDED<-------------+
+--------+   +---------+

Signed-off-by: John Snow <jsnow@redhat.com>
---
 blockjob.c           | 39 ++++++++++++++++++++++++---------------
 qapi/block-core.json |  7 +++++--
 2 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index fe5b0041f7..3f730967b3 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -44,23 +44,24 @@ static QemuMutex block_job_mutex;
 
 /* BlockJob State Transition Table */
 bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
-                                          /* U, C, R, P, Y, S, X */
-    /* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0},
-    /* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1},
-    /* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1},
-    /* P: */ [BLOCK_JOB_STATUS_PAUSED]    = {0, 0, 1, 0, 0, 0, 0},
-    /* Y: */ [BLOCK_JOB_STATUS_READY]     = {0, 0, 0, 0, 0, 1, 1},
-    /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0},
-    /* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0},
+                                          /* U, C, R, P, Y, S, X, E */
+    /* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0},
+    /* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1, 0},
+    /* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1},
+    /* P: */ [BLOCK_JOB_STATUS_PAUSED]    = {0, 0, 1, 0, 0, 0, 0, 0},
+    /* Y: */ [BLOCK_JOB_STATUS_READY]     = {0, 0, 0, 0, 0, 1, 1, 1},
+    /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0},
+    /* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1},
+    /* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0},
 };
 
 bool BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
-                                          /* U, C, R, P, Y, S, X */
-    [BLOCK_JOB_VERB_CANCEL]               = {0, 1, 1, 1, 1, 1, 0},
-    [BLOCK_JOB_VERB_PAUSE]                = {0, 1, 1, 1, 1, 1, 0},
-    [BLOCK_JOB_VERB_RESUME]               = {0, 1, 1, 1, 1, 1, 0},
-    [BLOCK_JOB_VERB_SET_SPEED]            = {0, 1, 1, 1, 1, 1, 0},
-    [BLOCK_JOB_VERB_COMPLETE]             = {0, 0, 0, 0, 1, 0, 0},
+                                          /* U, C, R, P, Y, S, X, E */
+    [BLOCK_JOB_VERB_CANCEL]               = {0, 1, 1, 1, 1, 1, 0, 0},
+    [BLOCK_JOB_VERB_PAUSE]                = {0, 1, 1, 1, 1, 1, 0, 0},
+    [BLOCK_JOB_VERB_RESUME]               = {0, 1, 1, 1, 1, 1, 0, 0},
+    [BLOCK_JOB_VERB_SET_SPEED]            = {0, 1, 1, 1, 1, 1, 0, 0},
+    [BLOCK_JOB_VERB_COMPLETE]             = {0, 0, 0, 0, 1, 0, 0, 0},
 };
 
 static void block_job_state_transition(BlockJob *job, BlockJobStatus s1)
@@ -377,6 +378,11 @@ void block_job_start(BlockJob *job)
     bdrv_coroutine_enter(blk_bs(job->blk), job->co);
 }
 
+static void block_job_conclude(BlockJob *job)
+{
+    block_job_state_transition(job, BLOCK_JOB_STATUS_CONCLUDED);
+}
+
 static void block_job_completed_single(BlockJob *job)
 {
     assert(job->completed);
@@ -417,6 +423,7 @@ static void block_job_completed_single(BlockJob *job)
 
     QLIST_REMOVE(job, txn_list);
     block_job_txn_unref(job->txn);
+    block_job_conclude(job);
     block_job_unref(job);
 }
 
@@ -617,7 +624,9 @@ void block_job_user_resume(BlockJob *job, Error **errp)
 
 void block_job_cancel(BlockJob *job)
 {
-    if (block_job_started(job)) {
+    if (job->status == BLOCK_JOB_STATUS_CONCLUDED) {
+        return;
+    } else if (block_job_started(job)) {
         block_job_cancel_async(job);
         block_job_enter(job);
     } else {
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 01c16c42dc..e489660886 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -997,14 +997,17 @@
 #           The job may return to @ready or otherwise be canceled.
 #
 # @aborting: The job is in the process of being aborted, and will finish with
-#            an error.
+#            an error. The job will afterwards report that it is @concluded.
 #            This status may not be visible to the management process.
 #
+# @concluded: The job has finished all work. If manual was set to true, the job
+#             will remain in the query list until it is dismissed.
+#
 # Since: 2.12
 ##
 { 'enum': 'BlockJobStatus',
   'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby',
-           'aborting' ] }
+           'aborting', 'concluded' ] }
 
 ##
 # @BlockJobInfo:
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 10/21] blockjobs: add NULL state
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (8 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 09/21] blockjobs: add CONCLUDED state John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-12 15:28   ` Kevin Wolf
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 11/21] blockjobs: add block_job_dismiss John Snow
                   ` (13 subsequent siblings)
  23 siblings, 1 reply; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

Add a new state that specifically demarcates when we begin to permanently
demolish a job after it has performed all work. This makes the transition
explicit in the STM table and highlights conditions under which a job may
be demolished.

Alongside this state, add a new helper command "block_job_decommission",
which transitions to the NULL state and puts down our implicit reference.
This separates instances in the code for "block_job_unref" which merely
undo a matching "block_job_ref" with instances intended to initiate the
full destruction of the object.

This decommission action also sets a number of fields to make sure that
block internals or external users that are holding a reference to a job
to see when it "finishes" are convinced that the job object is "done."
This is necessary, for instance, to do a block_job_cancel_sync on a
created object which will not make any progress.

Now, all jobs must go through block_job_decommission prior to being
freed, giving us start-to-finish state machine coverage for jobs.


Transitions:
Created   -> Null: Early failure event before the job is started
Concluded -> Null: Standard transition.

Verbs:
None. This should not ever be visible to the monitor.

             +---------+
             |UNDEFINED|
             +--+------+
                |
             +--v----+
   +---------+CREATED+------------------+
   |         +--+----+                  |
   |            |                       |
   |         +--v----+     +------+     |
   +---------+RUNNING<----->PAUSED|     |
   |         +--+-+--+     +------+     |
   |            | |                     |
   |            | +------------------+  |
   |            |                    |  |
   |         +--v--+       +-------+ |  |
   +---------+READY<------->STANDBY| |  |
   |         +--+--+       +-------+ |  |
   |            |                    |  |
+--v-----+   +--v------+             |  |
|ABORTING+--->CONCLUDED<-------------+  |
+--------+   +--+------+                |
                |                       |
             +--v-+                     |
             |NULL<---------------------+
             +----+

Signed-off-by: John Snow <jsnow@redhat.com>
---
 blockjob.c           | 50 ++++++++++++++++++++++++++++++++------------------
 qapi/block-core.json |  5 ++++-
 2 files changed, 36 insertions(+), 19 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 3f730967b3..2ef48075b0 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -44,24 +44,25 @@ static QemuMutex block_job_mutex;
 
 /* BlockJob State Transition Table */
 bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
-                                          /* U, C, R, P, Y, S, X, E */
-    /* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0},
-    /* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1, 0},
-    /* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1},
-    /* P: */ [BLOCK_JOB_STATUS_PAUSED]    = {0, 0, 1, 0, 0, 0, 0, 0},
-    /* Y: */ [BLOCK_JOB_STATUS_READY]     = {0, 0, 0, 0, 0, 1, 1, 1},
-    /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0},
-    /* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1},
-    /* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0},
+                                          /* U, C, R, P, Y, S, X, E, N */
+    /* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0, 0},
+    /* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1, 0, 1},
+    /* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1, 0},
+    /* P: */ [BLOCK_JOB_STATUS_PAUSED]    = {0, 0, 1, 0, 0, 0, 0, 0, 0},
+    /* Y: */ [BLOCK_JOB_STATUS_READY]     = {0, 0, 0, 0, 0, 1, 1, 1, 0},
+    /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0},
+    /* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1, 0},
+    /* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 1},
+    /* N: */ [BLOCK_JOB_STATUS_NULL]      = {0, 0, 0, 0, 0, 0, 0, 0, 0},
 };
 
 bool BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
-                                          /* U, C, R, P, Y, S, X, E */
-    [BLOCK_JOB_VERB_CANCEL]               = {0, 1, 1, 1, 1, 1, 0, 0},
-    [BLOCK_JOB_VERB_PAUSE]                = {0, 1, 1, 1, 1, 1, 0, 0},
-    [BLOCK_JOB_VERB_RESUME]               = {0, 1, 1, 1, 1, 1, 0, 0},
-    [BLOCK_JOB_VERB_SET_SPEED]            = {0, 1, 1, 1, 1, 1, 0, 0},
-    [BLOCK_JOB_VERB_COMPLETE]             = {0, 0, 0, 0, 1, 0, 0, 0},
+                                          /* U, C, R, P, Y, S, X, E, N */
+    [BLOCK_JOB_VERB_CANCEL]               = {0, 1, 1, 1, 1, 1, 0, 0, 0},
+    [BLOCK_JOB_VERB_PAUSE]                = {0, 1, 1, 1, 1, 1, 0, 0, 0},
+    [BLOCK_JOB_VERB_RESUME]               = {0, 1, 1, 1, 1, 1, 0, 0, 0},
+    [BLOCK_JOB_VERB_SET_SPEED]            = {0, 1, 1, 1, 1, 1, 0, 0, 0},
+    [BLOCK_JOB_VERB_COMPLETE]             = {0, 0, 0, 0, 1, 0, 0, 0, 0},
 };
 
 static void block_job_state_transition(BlockJob *job, BlockJobStatus s1)
@@ -225,6 +226,7 @@ static void block_job_detach_aio_context(void *opaque);
 void block_job_unref(BlockJob *job)
 {
     if (--job->refcnt == 0) {
+        assert(job->status == BLOCK_JOB_STATUS_NULL);
         BlockDriverState *bs = blk_bs(job->blk);
         QLIST_REMOVE(job, job_list);
         bs->job = NULL;
@@ -378,6 +380,17 @@ void block_job_start(BlockJob *job)
     bdrv_coroutine_enter(blk_bs(job->blk), job->co);
 }
 
+static void block_job_decommission(BlockJob *job)
+{
+    assert(job);
+    job->completed = true;
+    job->busy = false;
+    job->paused = false;
+    job->deferred_to_main_loop = true;
+    block_job_state_transition(job, BLOCK_JOB_STATUS_NULL);
+    block_job_unref(job);
+}
+
 static void block_job_conclude(BlockJob *job)
 {
     block_job_state_transition(job, BLOCK_JOB_STATUS_CONCLUDED);
@@ -424,7 +437,7 @@ static void block_job_completed_single(BlockJob *job)
     QLIST_REMOVE(job, txn_list);
     block_job_txn_unref(job->txn);
     block_job_conclude(job);
-    block_job_unref(job);
+    block_job_decommission(job);
 }
 
 static void block_job_cancel_async(BlockJob *job)
@@ -817,7 +830,7 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
 
         block_job_set_speed(job, speed, &local_err);
         if (local_err) {
-            block_job_unref(job);
+            block_job_early_fail(job);
             error_propagate(errp, local_err);
             return NULL;
         }
@@ -851,7 +864,8 @@ void block_job_pause_all(void)
 
 void block_job_early_fail(BlockJob *job)
 {
-    block_job_unref(job);
+    assert(job->status == BLOCK_JOB_STATUS_CREATED);
+    block_job_decommission(job);
 }
 
 void block_job_completed(BlockJob *job, int ret)
diff --git a/qapi/block-core.json b/qapi/block-core.json
index e489660886..dc25d1d306 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1003,11 +1003,14 @@
 # @concluded: The job has finished all work. If manual was set to true, the job
 #             will remain in the query list until it is dismissed.
 #
+# @null: The job is in the process of being dismantled. This state should not
+#        ever be visible externally.
+#
 # Since: 2.12
 ##
 { 'enum': 'BlockJobStatus',
   'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby',
-           'aborting', 'concluded' ] }
+           'aborting', 'concluded', 'null' ] }
 
 ##
 # @BlockJobInfo:
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 11/21] blockjobs: add block_job_dismiss
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (9 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 10/21] blockjobs: add NULL state John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 12/21] blockjobs: ensure abort is called for cancelled jobs John Snow
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

For jobs that have reached their CONCLUDED state, prior to having their
last reference put down (meaning jobs that have completed successfully,
unsuccessfully, or have been canceled), allow the user to dismiss the
job's lingering status report via block-job-dismiss.

This gives management APIs the chance to conclusively determine if a job
failed or succeeded, even if the event broadcast was missed.

Note: block_job_do_dismiss and block_job_decommission happen to do
exactly the same thing, but they're called from different semantic
contexts, so both aliases are kept to improve readability.

Note 2: Don't worry about the 0x04 flag definition for AUTO_DISMISS, she
has a friend coming in a future patch to fill the hole where 0x02 is.


Verbs:
Dismiss: operates on CONCLUDED jobs only.
Signed-off-by: John Snow <jsnow@redhat.com>
---
 block/trace-events       |  1 +
 blockdev.c               | 14 ++++++++++++++
 blockjob.c               | 26 ++++++++++++++++++++++++--
 include/block/blockjob.h | 14 ++++++++++++++
 qapi/block-core.json     | 24 +++++++++++++++++++++++-
 5 files changed, 76 insertions(+), 3 deletions(-)

diff --git a/block/trace-events b/block/trace-events
index 3fe89f7ea6..266afd9e99 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -50,6 +50,7 @@ qmp_block_job_cancel(void *job) "job %p"
 qmp_block_job_pause(void *job) "job %p"
 qmp_block_job_resume(void *job) "job %p"
 qmp_block_job_complete(void *job) "job %p"
+qmp_block_job_dismiss(void *job) "job %p"
 qmp_block_stream(void *bs, void *job) "bs %p job %p"
 
 # block/file-win32.c
diff --git a/blockdev.c b/blockdev.c
index f70a783803..9900cbc7dd 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3853,6 +3853,20 @@ void qmp_block_job_complete(const char *device, Error **errp)
     aio_context_release(aio_context);
 }
 
+void qmp_block_job_dismiss(const char *id, Error **errp)
+{
+    AioContext *aio_context;
+    BlockJob *job = find_block_job(id, &aio_context, errp);
+
+    if (!job) {
+        return;
+    }
+
+    trace_qmp_block_job_dismiss(job);
+    block_job_dismiss(&job, errp);
+    aio_context_release(aio_context);
+}
+
 void qmp_change_backing_file(const char *device,
                              const char *image_node_name,
                              const char *backing_file,
diff --git a/blockjob.c b/blockjob.c
index 2ef48075b0..59ac4a13c7 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -63,6 +63,7 @@ bool BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
     [BLOCK_JOB_VERB_RESUME]               = {0, 1, 1, 1, 1, 1, 0, 0, 0},
     [BLOCK_JOB_VERB_SET_SPEED]            = {0, 1, 1, 1, 1, 1, 0, 0, 0},
     [BLOCK_JOB_VERB_COMPLETE]             = {0, 0, 0, 0, 1, 0, 0, 0, 0},
+    [BLOCK_JOB_VERB_DISMISS]              = {0, 0, 0, 0, 0, 0, 0, 1, 0},
 };
 
 static void block_job_state_transition(BlockJob *job, BlockJobStatus s1)
@@ -391,9 +392,17 @@ static void block_job_decommission(BlockJob *job)
     block_job_unref(job);
 }
 
+static void block_job_do_dismiss(BlockJob *job)
+{
+    block_job_decommission(job);
+}
+
 static void block_job_conclude(BlockJob *job)
 {
     block_job_state_transition(job, BLOCK_JOB_STATUS_CONCLUDED);
+    if (job->auto_dismiss || !block_job_started(job)) {
+        block_job_do_dismiss(job);
+    }
 }
 
 static void block_job_completed_single(BlockJob *job)
@@ -437,7 +446,6 @@ static void block_job_completed_single(BlockJob *job)
     QLIST_REMOVE(job, txn_list);
     block_job_txn_unref(job->txn);
     block_job_conclude(job);
-    block_job_decommission(job);
 }
 
 static void block_job_cancel_async(BlockJob *job)
@@ -602,6 +610,19 @@ void block_job_complete(BlockJob *job, Error **errp)
     job->driver->complete(job, errp);
 }
 
+void block_job_dismiss(BlockJob **jobptr, Error **errp)
+{
+    BlockJob *job = *jobptr;
+    /* similarly to _complete, this is QMP-interface only. */
+    assert(job->id);
+    if (block_job_apply_verb(job, BLOCK_JOB_VERB_DISMISS, errp)) {
+        return;
+    }
+
+    block_job_do_dismiss(job);
+    *jobptr = NULL;
+}
+
 void block_job_user_pause(BlockJob *job, Error **errp)
 {
     if (block_job_apply_verb(job, BLOCK_JOB_VERB_PAUSE, errp)) {
@@ -638,7 +659,7 @@ void block_job_user_resume(BlockJob *job, Error **errp)
 void block_job_cancel(BlockJob *job)
 {
     if (job->status == BLOCK_JOB_STATUS_CONCLUDED) {
-        return;
+        block_job_do_dismiss(job);
     } else if (block_job_started(job)) {
         block_job_cancel_async(job);
         block_job_enter(job);
@@ -807,6 +828,7 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
     job->paused        = true;
     job->pause_count   = 1;
     job->refcnt        = 1;
+    job->auto_dismiss  = !(flags & BLOCK_JOB_MANUAL_DISMISS);
     block_job_state_transition(job, BLOCK_JOB_STATUS_CREATED);
     aio_timer_init(qemu_get_aio_context(), &job->sleep_timer,
                    QEMU_CLOCK_REALTIME, SCALE_NS,
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index df0a9773d1..c535829b46 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -142,6 +142,9 @@ typedef struct BlockJob {
     /** Current state; See @BlockJobStatus for details. */
     BlockJobStatus status;
 
+    /** True if this job should automatically dismiss itself */
+    bool auto_dismiss;
+
     BlockJobTxn *txn;
     QLIST_ENTRY(BlockJob) txn_list;
 } BlockJob;
@@ -151,6 +154,8 @@ typedef enum BlockJobCreateFlags {
     BLOCK_JOB_DEFAULT = 0x00,
     /* BlockJob is not QMP-created and should not send QMP events */
     BLOCK_JOB_INTERNAL = 0x01,
+    /* BlockJob requires manual dismiss step */
+    BLOCK_JOB_MANUAL_DISMISS = 0x04,
 } BlockJobCreateFlags;
 
 /**
@@ -234,6 +239,15 @@ void block_job_cancel(BlockJob *job);
  */
 void block_job_complete(BlockJob *job, Error **errp);
 
+/**
+ * block_job_dismiss:
+ * @job: The job to be dismissed.
+ * @errp: Error object.
+ *
+ * Remove a concluded job from the query list.
+ */
+void block_job_dismiss(BlockJob **job, Error **errp);
+
 /**
  * block_job_query:
  * @job: The job to get information about.
diff --git a/qapi/block-core.json b/qapi/block-core.json
index dc25d1d306..e815d7d255 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -970,10 +970,12 @@
 #
 # @complete: see @block-job-complete
 #
+# @dismiss: see @block-job-dismiss
+#
 # Since: 2.12
 ##
 { 'enum': 'BlockJobVerb',
-  'data': ['cancel', 'pause', 'resume', 'set-speed', 'complete' ] }
+  'data': ['cancel', 'pause', 'resume', 'set-speed', 'complete', 'dismiss' ] }
 
 ##
 # @BlockJobStatus:
@@ -2243,6 +2245,26 @@
 ##
 { 'command': 'block-job-complete', 'data': { 'device': 'str' } }
 
+##
+# @block-job-dismiss:
+#
+# For jobs that have already concluded, remove them from the block-job-query
+# list. This command only needs to be run for jobs which were started with
+# QEMU 2.12+ job lifetime management semantics.
+#
+# This command will refuse to operate on any job that has not yet reached
+# its terminal state, BLOCK_JOB_STATUS_CONCLUDED. For jobs that make use of
+# BLOCK_JOB_READY event, block-job-cancel or block-job-complete will still need
+# to be used as appropriate.
+#
+# @id: The job identifier.
+#
+# Returns: Nothing on success
+#
+# Since: 2.12
+##
+{ 'command': 'block-job-dismiss', 'data': { 'id': 'str' } }
+
 ##
 # @BlockdevDiscardOptions:
 #
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 12/21] blockjobs: ensure abort is called for cancelled jobs
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (10 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 11/21] blockjobs: add block_job_dismiss John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 13/21] blockjobs: add commit, abort, clean helpers John Snow
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

Presently, even if a job is canceled post-completion as a result of
a failing peer in a transaction, it will still call .commit because
nothing has updated or changed its return code.

The reason why this does not cause problems currently is because
backup's implementation of .commit checks for cancellation itself.

I'd like to simplify this contract:

(1) Abort is called if the job/transaction fails
(2) Commit is called if the job/transaction succeeds

To this end: A job's return code, if 0, will be forcibly set as
-ECANCELED if that job has already concluded. Remove the now
redundant check in the backup job implementation.

We need to check for cancellation in both block_job_completed
AND block_job_completed_single, because jobs may be cancelled between
those two calls; for instance in transactions. This also necessitates
an ABORTING -> ABORTING transition to be allowed.

The check in block_job_completed could be removed, but there's no
point in starting to attempt to succeed a transaction that we know
in advance will fail.

This does NOT affect mirror jobs that are "canceled" during their
synchronous phase. The mirror job itself forcibly sets the canceled
property to false prior to ceding control, so such cases will invoke
the "commit" callback.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
---
 block/backup.c     |  2 +-
 block/trace-events |  1 +
 blockjob.c         | 21 ++++++++++++++++-----
 3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 7e254dabff..453cd62c24 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -206,7 +206,7 @@ static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
     BdrvDirtyBitmap *bm;
     BlockDriverState *bs = blk_bs(job->common.blk);
 
-    if (ret < 0 || block_job_is_cancelled(&job->common)) {
+    if (ret < 0) {
         /* Merge the successor back into the parent, delete nothing. */
         bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
         assert(bm);
diff --git a/block/trace-events b/block/trace-events
index 266afd9e99..5e531e0310 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -5,6 +5,7 @@ bdrv_open_common(void *bs, const char *filename, int flags, const char *format_n
 bdrv_lock_medium(void *bs, bool locked) "bs %p locked %d"
 
 # blockjob.c
+block_job_completed(void *job, int ret, int jret) "job %p ret %d corrected ret %d"
 block_job_state_transition(void *job,  int ret, const char *legal, const char *s0, const char *s1) "job %p (ret: %d) attempting %s transition (%s-->%s)"
 block_job_apply_verb(void *job, const char *state, const char *verb, const char *legal) "job %p in state %s; applying verb %s (%s)"
 
diff --git a/blockjob.c b/blockjob.c
index 59ac4a13c7..61af628376 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -51,7 +51,7 @@ bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
     /* P: */ [BLOCK_JOB_STATUS_PAUSED]    = {0, 0, 1, 0, 0, 0, 0, 0, 0},
     /* Y: */ [BLOCK_JOB_STATUS_READY]     = {0, 0, 0, 0, 0, 1, 1, 1, 0},
     /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0},
-    /* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1, 0},
+    /* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 1, 1, 0},
     /* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 1},
     /* N: */ [BLOCK_JOB_STATUS_NULL]      = {0, 0, 0, 0, 0, 0, 0, 0, 0},
 };
@@ -405,13 +405,22 @@ static void block_job_conclude(BlockJob *job)
     }
 }
 
+static void block_job_update_rc(BlockJob *job)
+{
+    if (!job->ret && block_job_is_cancelled(job)) {
+        job->ret = -ECANCELED;
+    }
+    if (job->ret) {
+        block_job_state_transition(job, BLOCK_JOB_STATUS_ABORTING);
+    }
+}
+
 static void block_job_completed_single(BlockJob *job)
 {
     assert(job->completed);
 
-    if (job->ret || block_job_is_cancelled(job)) {
-        block_job_state_transition(job, BLOCK_JOB_STATUS_ABORTING);
-    }
+    /* Ensure abort is called for late-transactional failures */
+    block_job_update_rc(job);
 
     if (!job->ret) {
         if (job->driver->commit) {
@@ -896,7 +905,9 @@ void block_job_completed(BlockJob *job, int ret)
     assert(blk_bs(job->blk)->job == job);
     job->completed = true;
     job->ret = ret;
-    if (ret < 0 || block_job_is_cancelled(job)) {
+    block_job_update_rc(job);
+    trace_block_job_completed(job, ret, job->ret);
+    if (job->ret) {
         block_job_completed_txn_abort(job);
     } else {
         block_job_completed_txn_success(job);
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 13/21] blockjobs: add commit, abort, clean helpers
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (11 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 12/21] blockjobs: ensure abort is called for cancelled jobs John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 14/21] blockjobs: add block_job_txn_apply function John Snow
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

The completed_single function is getting a little mucked up with
checking to see which callbacks exist, so let's factor them out.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
---
 blockjob.c | 35 ++++++++++++++++++++++++++---------
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 61af628376..0c64fadc6d 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -415,6 +415,29 @@ static void block_job_update_rc(BlockJob *job)
     }
 }
 
+static void block_job_commit(BlockJob *job)
+{
+    assert(!job->ret);
+    if (job->driver->commit) {
+        job->driver->commit(job);
+    }
+}
+
+static void block_job_abort(BlockJob *job)
+{
+    assert(job->ret);
+    if (job->driver->abort) {
+        job->driver->abort(job);
+    }
+}
+
+static void block_job_clean(BlockJob *job)
+{
+    if (job->driver->clean) {
+        job->driver->clean(job);
+    }
+}
+
 static void block_job_completed_single(BlockJob *job)
 {
     assert(job->completed);
@@ -423,17 +446,11 @@ static void block_job_completed_single(BlockJob *job)
     block_job_update_rc(job);
 
     if (!job->ret) {
-        if (job->driver->commit) {
-            job->driver->commit(job);
-        }
+        block_job_commit(job);
     } else {
-        if (job->driver->abort) {
-            job->driver->abort(job);
-        }
-    }
-    if (job->driver->clean) {
-        job->driver->clean(job);
+        block_job_abort(job);
     }
+    block_job_clean(job);
 
     if (job->cb) {
         job->cb(job->opaque, job->ret);
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 14/21] blockjobs: add block_job_txn_apply function
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (12 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 13/21] blockjobs: add commit, abort, clean helpers John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 15/21] blockjobs: add prepare callback John Snow
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

Simply apply a function transaction-wide.
A few more uses of this in forthcoming patches.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 blockjob.c | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 0c64fadc6d..7e03824751 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -487,6 +487,19 @@ static void block_job_cancel_async(BlockJob *job)
     job->cancelled = true;
 }
 
+static void block_job_txn_apply(BlockJobTxn *txn, void fn(BlockJob *))
+{
+    AioContext *ctx;
+    BlockJob *job, *next;
+
+    QLIST_FOREACH_SAFE(job, &txn->jobs, txn_list, next) {
+        ctx = blk_get_aio_context(job->blk);
+        aio_context_acquire(ctx);
+        fn(job);
+        aio_context_release(ctx);
+    }
+}
+
 static int block_job_finish_sync(BlockJob *job,
                                  void (*finish)(BlockJob *, Error **errp),
                                  Error **errp)
@@ -565,9 +578,8 @@ static void block_job_completed_txn_abort(BlockJob *job)
 
 static void block_job_completed_txn_success(BlockJob *job)
 {
-    AioContext *ctx;
     BlockJobTxn *txn = job->txn;
-    BlockJob *other_job, *next;
+    BlockJob *other_job;
     /*
      * Successful completion, see if there are other running jobs in this
      * txn.
@@ -576,15 +588,10 @@ static void block_job_completed_txn_success(BlockJob *job)
         if (!other_job->completed) {
             return;
         }
-    }
-    /* We are the last completed job, commit the transaction. */
-    QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
-        ctx = blk_get_aio_context(other_job->blk);
-        aio_context_acquire(ctx);
         assert(other_job->ret == 0);
-        block_job_completed_single(other_job);
-        aio_context_release(ctx);
     }
+    /* We are the last completed job, commit the transaction. */
+    block_job_txn_apply(txn, block_job_completed_single);
 }
 
 /* Assumes the block_job_mutex is held */
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 15/21] blockjobs: add prepare callback
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (13 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 14/21] blockjobs: add block_job_txn_apply function John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 16/21] blockjobs: add waiting status John Snow
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

Some jobs upon finalization may need to perform some work that can
still fail. If these jobs are part of a transaction, it's important
that these callbacks fail the entire transaction.

We allow for a new callback in addition to commit/abort/clean that
allows us the opportunity to have fairly late-breaking failures
in the transactional process.

The expected flow is:

- All jobs in a transaction converge to the PENDING state,
  added in a forthcoming commit.
- Upon being finalized, either automatically or explicitly
  by the user, jobs prepare to complete.
- If any job fails preparation, all jobs call .abort.
- Otherwise, they succeed and call .commit.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 blockjob.c                   | 30 +++++++++++++++++++++++++++---
 include/block/blockjob_int.h | 10 ++++++++++
 2 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 7e03824751..1395d8eed1 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -415,6 +415,14 @@ static void block_job_update_rc(BlockJob *job)
     }
 }
 
+static int block_job_prepare(BlockJob *job)
+{
+    if (job->ret == 0 && job->driver->prepare) {
+        job->ret = job->driver->prepare(job);
+    }
+    return job->ret;
+}
+
 static void block_job_commit(BlockJob *job)
 {
     assert(!job->ret);
@@ -438,7 +446,7 @@ static void block_job_clean(BlockJob *job)
     }
 }
 
-static void block_job_completed_single(BlockJob *job)
+static int block_job_completed_single(BlockJob *job)
 {
     assert(job->completed);
 
@@ -472,6 +480,7 @@ static void block_job_completed_single(BlockJob *job)
     QLIST_REMOVE(job, txn_list);
     block_job_txn_unref(job->txn);
     block_job_conclude(job);
+    return 0;
 }
 
 static void block_job_cancel_async(BlockJob *job)
@@ -487,17 +496,22 @@ static void block_job_cancel_async(BlockJob *job)
     job->cancelled = true;
 }
 
-static void block_job_txn_apply(BlockJobTxn *txn, void fn(BlockJob *))
+static int block_job_txn_apply(BlockJobTxn *txn, int fn(BlockJob *))
 {
     AioContext *ctx;
     BlockJob *job, *next;
+    int rc;
 
     QLIST_FOREACH_SAFE(job, &txn->jobs, txn_list, next) {
         ctx = blk_get_aio_context(job->blk);
         aio_context_acquire(ctx);
-        fn(job);
+        rc = fn(job);
         aio_context_release(ctx);
+        if (rc) {
+            break;
+        }
     }
+    return rc;
 }
 
 static int block_job_finish_sync(BlockJob *job,
@@ -580,6 +594,8 @@ static void block_job_completed_txn_success(BlockJob *job)
 {
     BlockJobTxn *txn = job->txn;
     BlockJob *other_job;
+    int rc = 0;
+
     /*
      * Successful completion, see if there are other running jobs in this
      * txn.
@@ -590,6 +606,14 @@ static void block_job_completed_txn_success(BlockJob *job)
         }
         assert(other_job->ret == 0);
     }
+
+    /* Jobs may require some prep-work to complete without failure */
+    rc = block_job_txn_apply(txn, block_job_prepare);
+    if (rc) {
+        block_job_completed_txn_abort(job);
+        return;
+    }
+
     /* We are the last completed job, commit the transaction. */
     block_job_txn_apply(txn, block_job_completed_single);
 }
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index 259d49b32a..642adce68b 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -53,6 +53,16 @@ struct BlockJobDriver {
      */
     void (*complete)(BlockJob *job, Error **errp);
 
+    /**
+     * If the callback is not NULL, prepare will be invoked when all the jobs
+     * belonging to the same transaction complete; or upon this job's completion
+     * if it is not in a transaction.
+     *
+     * This callback will not be invoked if the job has already failed.
+     * If it fails, abort and then clean will be called.
+     */
+    int (*prepare)(BlockJob *job);
+
     /**
      * If the callback is not NULL, it will be invoked when all the jobs
      * belonging to the same transaction complete; or upon this job's
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 16/21] blockjobs: add waiting status
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (14 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 15/21] blockjobs: add prepare callback John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 17/21] blockjobs: add PENDING status and event John Snow
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

For jobs that are stuck waiting on others in a transaction, it would
be nice to know that they are no longer "running" in that sense, but
instead are waiting on other jobs in the transaction.

Jobs that are "waiting" in this sense cannot be meaningfully altered
any longer as they have left their running loop. The only meaningful
user verb for jobs in this state is "cancel," which will cancel the
whole transaction, too.

Transitions:
Running -> Waiting:   Normal transition.
Ready   -> Waiting:   Normal transition.
Waiting -> Aborting:  Transactional cancellation.
Waiting -> Concluded: Normal transition.

Removed Transitions:
Running -> Concluded: Jobs must go to WAITING first.
Ready   -> Concluded: Jobs must go to WAITING first.

Verbs:
Cancel: Can be applied to WAITING jobs.

             +---------+
             |UNDEFINED|
             +--+------+
                |
             +--v----+
   +---------+CREATED+-----------------+
   |         +--+----+                 |
   |            |                      |
   |         +--v----+     +------+    |
   +---------+RUNNING<----->PAUSED|    |
   |         +--+-+--+     +------+    |
   |            | |                    |
   |            | +------------------+ |
   |            |                    | |
   |         +--v--+       +-------+ | |
   +---------+READY<------->STANDBY| | |
   |         +--+--+       +-------+ | |
   |            |                    | |
   |         +--v----+               | |
   +---------+WAITING<---------------+ |
   |         +--+----+                 |
   |            |                      |
+--v-----+   +--v------+               |
|ABORTING+--->CONCLUDED|               |
+--------+   +--+------+               |
                |                      |
             +--v-+                    |
             |NULL<--------------------+
             +----+

Signed-off-by: John Snow <jsnow@redhat.com>
---
 blockjob.c           | 37 ++++++++++++++++++++-----------------
 qapi/block-core.json |  6 +++++-
 2 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 1395d8eed1..996278ed9c 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -44,26 +44,27 @@ static QemuMutex block_job_mutex;
 
 /* BlockJob State Transition Table */
 bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
-                                          /* U, C, R, P, Y, S, X, E, N */
-    /* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0, 0},
-    /* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1, 0, 1},
-    /* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1, 0},
-    /* P: */ [BLOCK_JOB_STATUS_PAUSED]    = {0, 0, 1, 0, 0, 0, 0, 0, 0},
-    /* Y: */ [BLOCK_JOB_STATUS_READY]     = {0, 0, 0, 0, 0, 1, 1, 1, 0},
-    /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0},
-    /* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 1, 1, 0},
-    /* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 1},
-    /* N: */ [BLOCK_JOB_STATUS_NULL]      = {0, 0, 0, 0, 0, 0, 0, 0, 0},
+                                          /* U, C, R, P, Y, S, W, X, E, N */
+    /* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0, 0, 0},
+    /* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 0, 1, 0, 1},
+    /* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1, 0, 0},
+    /* P: */ [BLOCK_JOB_STATUS_PAUSED]    = {0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
+    /* Y: */ [BLOCK_JOB_STATUS_READY]     = {0, 0, 0, 0, 0, 1, 1, 1, 0, 0},
+    /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0, 0},
+    /* W: */ [BLOCK_JOB_STATUS_WAITING]   = {0, 0, 0, 0, 0, 0, 0, 1, 1, 0},
+    /* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1, 1, 0},
+    /* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 1},
+    /* N: */ [BLOCK_JOB_STATUS_NULL]      = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
 };
 
 bool BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
-                                          /* U, C, R, P, Y, S, X, E, N */
-    [BLOCK_JOB_VERB_CANCEL]               = {0, 1, 1, 1, 1, 1, 0, 0, 0},
-    [BLOCK_JOB_VERB_PAUSE]                = {0, 1, 1, 1, 1, 1, 0, 0, 0},
-    [BLOCK_JOB_VERB_RESUME]               = {0, 1, 1, 1, 1, 1, 0, 0, 0},
-    [BLOCK_JOB_VERB_SET_SPEED]            = {0, 1, 1, 1, 1, 1, 0, 0, 0},
-    [BLOCK_JOB_VERB_COMPLETE]             = {0, 0, 0, 0, 1, 0, 0, 0, 0},
-    [BLOCK_JOB_VERB_DISMISS]              = {0, 0, 0, 0, 0, 0, 0, 1, 0},
+                                          /* U, C, R, P, Y, S, W, X, E, N */
+    [BLOCK_JOB_VERB_CANCEL]               = {0, 1, 1, 1, 1, 1, 1, 0, 0, 0},
+    [BLOCK_JOB_VERB_PAUSE]                = {0, 1, 1, 1, 1, 1, 0, 0, 0, 0},
+    [BLOCK_JOB_VERB_RESUME]               = {0, 1, 1, 1, 1, 1, 0, 0, 0, 0},
+    [BLOCK_JOB_VERB_SET_SPEED]            = {0, 1, 1, 1, 1, 1, 0, 0, 0, 0},
+    [BLOCK_JOB_VERB_COMPLETE]             = {0, 0, 0, 0, 1, 0, 0, 0, 0, 0},
+    [BLOCK_JOB_VERB_DISMISS]              = {0, 0, 0, 0, 0, 0, 0, 0, 1, 0},
 };
 
 static void block_job_state_transition(BlockJob *job, BlockJobStatus s1)
@@ -596,6 +597,8 @@ static void block_job_completed_txn_success(BlockJob *job)
     BlockJob *other_job;
     int rc = 0;
 
+    block_job_state_transition(job, BLOCK_JOB_STATUS_WAITING);
+
     /*
      * Successful completion, see if there are other running jobs in this
      * txn.
diff --git a/qapi/block-core.json b/qapi/block-core.json
index e815d7d255..374272014f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -998,6 +998,10 @@
 # @standby: The job is ready, but paused. This is nearly identical to @paused.
 #           The job may return to @ready or otherwise be canceled.
 #
+# @waiting: The job is waiting for other jobs in the transaction to converge
+#           to the waiting state. This status will likely not be visible for
+#           the last job in a transaction.
+#
 # @aborting: The job is in the process of being aborted, and will finish with
 #            an error. The job will afterwards report that it is @concluded.
 #            This status may not be visible to the management process.
@@ -1012,7 +1016,7 @@
 ##
 { 'enum': 'BlockJobStatus',
   'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby',
-           'aborting', 'concluded', 'null' ] }
+           'waiting', 'aborting', 'concluded', 'null' ] }
 
 ##
 # @BlockJobInfo:
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 17/21] blockjobs: add PENDING status and event
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (15 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 16/21] blockjobs: add waiting status John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 18/21] blockjobs: add block-job-finalize John Snow
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

For jobs utilizing the new manual workflow, we intend to prohibit
them from modifying the block graph until the management layer provides
an explicit ACK via block-job-finalize to move the process forward.

To distinguish this runstate from "ready" or "waiting," we add a new
"pending" event and status.

For now, the transition from PENDING to CONCLUDED/ABORTING is automatic,
but a future commit will add the explicit block-job-finalize step.

Transitions:
Waiting -> Pending:   Normal transition.
Pending -> Concluded: Normal transition.
Pending -> Aborting:  Late transactional failures and cancellations.

Removed Transitions:
Waiting -> Concluded: Jobs must go to PENDING first.

Verbs:
Cancel: Can be applied to a pending job.

             +---------+
             |UNDEFINED|
             +--+------+
                |
             +--v----+
   +---------+CREATED+-----------------+
   |         +--+----+                 |
   |            |                      |
   |         +--+----+     +------+    |
   +---------+RUNNING<----->PAUSED|    |
   |         +--+-+--+     +------+    |
   |            | |                    |
   |            | +------------------+ |
   |            |                    | |
   |         +--v--+       +-------+ | |
   +---------+READY<------->STANDBY| | |
   |         +--+--+       +-------+ | |
   |            |                    | |
   |         +--v----+               | |
   +---------+WAITING<---------------+ |
   |         +--+----+                 |
   |            |                      |
   |         +--v----+                 |
   +---------+PENDING|                 |
   |         +--+----+                 |
   |            |                      |
+--v-----+   +--v------+               |
|ABORTING+--->CONCLUDED|               |
+--------+   +--+------+               |
                |                      |
             +--v-+                    |
             |NULL<--------------------+
             +----+

Signed-off-by: John Snow <jsnow@redhat.com>
---
 blockjob.c               | 67 +++++++++++++++++++++++++++++++-----------------
 include/block/blockjob.h |  5 ++++
 qapi/block-core.json     | 31 +++++++++++++++++++++-
 3 files changed, 78 insertions(+), 25 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 996278ed9c..3880a89678 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -44,27 +44,28 @@ static QemuMutex block_job_mutex;
 
 /* BlockJob State Transition Table */
 bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
-                                          /* U, C, R, P, Y, S, W, X, E, N */
-    /* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0, 0, 0},
-    /* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 0, 1, 0, 1},
-    /* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1, 0, 0},
-    /* P: */ [BLOCK_JOB_STATUS_PAUSED]    = {0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
-    /* Y: */ [BLOCK_JOB_STATUS_READY]     = {0, 0, 0, 0, 0, 1, 1, 1, 0, 0},
-    /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0, 0},
-    /* W: */ [BLOCK_JOB_STATUS_WAITING]   = {0, 0, 0, 0, 0, 0, 0, 1, 1, 0},
-    /* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1, 1, 0},
-    /* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 1},
-    /* N: */ [BLOCK_JOB_STATUS_NULL]      = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
+                                          /* U, C, R, P, Y, S, W, D, X, E, N */
+    /* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0},
+    /* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1},
+    /* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0},
+    /* P: */ [BLOCK_JOB_STATUS_PAUSED]    = {0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0},
+    /* Y: */ [BLOCK_JOB_STATUS_READY]     = {0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0},
+    /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0},
+    /* W: */ [BLOCK_JOB_STATUS_WAITING]   = {0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0},
+    /* D: */ [BLOCK_JOB_STATUS_PENDING]   = {0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0},
+    /* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0},
+    /* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1},
+    /* N: */ [BLOCK_JOB_STATUS_NULL]      = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
 };
 
 bool BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
-                                          /* U, C, R, P, Y, S, W, X, E, N */
-    [BLOCK_JOB_VERB_CANCEL]               = {0, 1, 1, 1, 1, 1, 1, 0, 0, 0},
-    [BLOCK_JOB_VERB_PAUSE]                = {0, 1, 1, 1, 1, 1, 0, 0, 0, 0},
-    [BLOCK_JOB_VERB_RESUME]               = {0, 1, 1, 1, 1, 1, 0, 0, 0, 0},
-    [BLOCK_JOB_VERB_SET_SPEED]            = {0, 1, 1, 1, 1, 1, 0, 0, 0, 0},
-    [BLOCK_JOB_VERB_COMPLETE]             = {0, 0, 0, 0, 1, 0, 0, 0, 0, 0},
-    [BLOCK_JOB_VERB_DISMISS]              = {0, 0, 0, 0, 0, 0, 0, 0, 1, 0},
+                                          /* U, C, R, P, Y, S, W, D, X, E, N */
+    [BLOCK_JOB_VERB_CANCEL]               = {0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0},
+    [BLOCK_JOB_VERB_PAUSE]                = {0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0},
+    [BLOCK_JOB_VERB_RESUME]               = {0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0},
+    [BLOCK_JOB_VERB_SET_SPEED]            = {0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0},
+    [BLOCK_JOB_VERB_COMPLETE]             = {0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0},
+    [BLOCK_JOB_VERB_DISMISS]              = {0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0},
 };
 
 static void block_job_state_transition(BlockJob *job, BlockJobStatus s1)
@@ -115,6 +116,7 @@ static void __attribute__((__constructor__)) block_job_init(void)
 
 static void block_job_event_cancelled(BlockJob *job);
 static void block_job_event_completed(BlockJob *job, const char *msg);
+static int block_job_event_pending(BlockJob *job);
 static void block_job_enter_cond(BlockJob *job, bool(*fn)(BlockJob *job));
 
 /* Transactional group of block jobs */
@@ -497,17 +499,21 @@ static void block_job_cancel_async(BlockJob *job)
     job->cancelled = true;
 }
 
-static int block_job_txn_apply(BlockJobTxn *txn, int fn(BlockJob *))
+static int block_job_txn_apply(BlockJobTxn *txn, int fn(BlockJob *), bool lock)
 {
     AioContext *ctx;
     BlockJob *job, *next;
     int rc;
 
     QLIST_FOREACH_SAFE(job, &txn->jobs, txn_list, next) {
-        ctx = blk_get_aio_context(job->blk);
-        aio_context_acquire(ctx);
+        if (lock) {
+            ctx = blk_get_aio_context(job->blk);
+            aio_context_acquire(ctx);
+        }
         rc = fn(job);
-        aio_context_release(ctx);
+        if (lock) {
+            aio_context_release(ctx);
+        }
         if (rc) {
             break;
         }
@@ -611,14 +617,15 @@ static void block_job_completed_txn_success(BlockJob *job)
     }
 
     /* Jobs may require some prep-work to complete without failure */
-    rc = block_job_txn_apply(txn, block_job_prepare);
+    rc = block_job_txn_apply(txn, block_job_prepare, true);
     if (rc) {
         block_job_completed_txn_abort(job);
         return;
     }
 
     /* We are the last completed job, commit the transaction. */
-    block_job_txn_apply(txn, block_job_completed_single);
+    block_job_txn_apply(txn, block_job_event_pending, false);
+    block_job_txn_apply(txn, block_job_completed_single, true);
 }
 
 /* Assumes the block_job_mutex is held */
@@ -827,6 +834,17 @@ static void block_job_event_completed(BlockJob *job, const char *msg)
                                         &error_abort);
 }
 
+static int block_job_event_pending(BlockJob *job)
+{
+    block_job_state_transition(job, BLOCK_JOB_STATUS_PENDING);
+    if (!job->auto_finalize && !block_job_is_internal(job)) {
+        qapi_event_send_block_job_pending(job->driver->job_type,
+                                          job->id,
+                                          &error_abort);
+    }
+    return 0;
+}
+
 /*
  * API for block job drivers and the block layer.  These functions are
  * declared in blockjob_int.h.
@@ -888,6 +906,7 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
     job->paused        = true;
     job->pause_count   = 1;
     job->refcnt        = 1;
+    job->auto_finalize = !(flags & BLOCK_JOB_MANUAL_FINALIZE);
     job->auto_dismiss  = !(flags & BLOCK_JOB_MANUAL_DISMISS);
     block_job_state_transition(job, BLOCK_JOB_STATUS_CREATED);
     aio_timer_init(qemu_get_aio_context(), &job->sleep_timer,
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index c535829b46..7c8d51effa 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -142,6 +142,9 @@ typedef struct BlockJob {
     /** Current state; See @BlockJobStatus for details. */
     BlockJobStatus status;
 
+    /** True if this job should automatically finalize itself */
+    bool auto_finalize;
+
     /** True if this job should automatically dismiss itself */
     bool auto_dismiss;
 
@@ -154,6 +157,8 @@ typedef enum BlockJobCreateFlags {
     BLOCK_JOB_DEFAULT = 0x00,
     /* BlockJob is not QMP-created and should not send QMP events */
     BLOCK_JOB_INTERNAL = 0x01,
+    /* BlockJob requires manual finalize step */
+    BLOCK_JOB_MANUAL_FINALIZE = 0x02,
     /* BlockJob requires manual dismiss step */
     BLOCK_JOB_MANUAL_DISMISS = 0x04,
 } BlockJobCreateFlags;
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 374272014f..73584ff3f5 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1002,6 +1002,11 @@
 #           to the waiting state. This status will likely not be visible for
 #           the last job in a transaction.
 #
+# @pending: The job has finished its work, but has finalization steps that it
+#           needs to make prior to completing. These changes may require
+#           manual intervention by the management process if manual was set
+#           to true. These changes may still fail.
+#
 # @aborting: The job is in the process of being aborted, and will finish with
 #            an error. The job will afterwards report that it is @concluded.
 #            This status may not be visible to the management process.
@@ -1016,7 +1021,7 @@
 ##
 { 'enum': 'BlockJobStatus',
   'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby',
-           'waiting', 'aborting', 'concluded', 'null' ] }
+           'waiting', 'pending', 'aborting', 'concluded', 'null' ] }
 
 ##
 # @BlockJobInfo:
@@ -3940,6 +3945,30 @@
             'offset': 'int',
             'speed' : 'int' } }
 
+##
+# @BLOCK_JOB_PENDING:
+#
+# Emitted when a block job is awaiting explicit authorization to finalize graph
+# changes via @block-job-finalize. If this job is part of a transaction, it will
+# not emit this event until the transaction has converged first.
+#
+# @type: job type
+#
+# @id: The job identifier.
+#
+# Since: 2.12
+#
+# Example:
+#
+# <- { "event": "BLOCK_JOB_WAITING",
+#      "data": { "device": "drive0", "type": "mirror" },
+#      "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
+#
+##
+{ 'event': 'BLOCK_JOB_PENDING',
+  'data': { 'type'  : 'BlockJobType',
+            'id'    : 'str' } }
+
 ##
 # @PreallocMode:
 #
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 18/21] blockjobs: add block-job-finalize
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (16 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 17/21] blockjobs: add PENDING status and event John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 19/21] blockjobs: Expose manual property John Snow
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

Instead of automatically transitioning from PENDING to CONCLUDED, gate
the .prepare() and .commit() phases behind an explicit acknowledgement
provided by the QMP monitor if auto_finalize = false has been requested.

This allows us to perform graph changes in prepare and/or commit so that
graph changes do not occur autonomously without knowledge of the
controlling management layer.

Transactions that have reached the "PENDING" state together can all be
moved to invoke their finalization methods by issuing block_job_finalize
to any one job in the transaction.

Jobs in a transaction with mixed job->auto_finalize settings will all
remain stuck in the "PENDING" state, as if the entire transaction was
specified with auto_finalize = false. Jobs that specified
auto_finalize = true, however, will still not emit the PENDING event.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 block/trace-events       |  1 +
 blockdev.c               | 14 +++++++++++
 blockjob.c               | 60 +++++++++++++++++++++++++++++++++++-------------
 include/block/blockjob.h | 17 ++++++++++++++
 qapi/block-core.json     | 23 ++++++++++++++++++-
 5 files changed, 98 insertions(+), 17 deletions(-)

diff --git a/block/trace-events b/block/trace-events
index 5e531e0310..a81b66ff36 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -51,6 +51,7 @@ qmp_block_job_cancel(void *job) "job %p"
 qmp_block_job_pause(void *job) "job %p"
 qmp_block_job_resume(void *job) "job %p"
 qmp_block_job_complete(void *job) "job %p"
+qmp_block_job_finalize(void *job) "job %p"
 qmp_block_job_dismiss(void *job) "job %p"
 qmp_block_stream(void *bs, void *job) "bs %p job %p"
 
diff --git a/blockdev.c b/blockdev.c
index 9900cbc7dd..efd3ab2e99 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3853,6 +3853,20 @@ void qmp_block_job_complete(const char *device, Error **errp)
     aio_context_release(aio_context);
 }
 
+void qmp_block_job_finalize(const char *id, Error **errp)
+{
+    AioContext *aio_context;
+    BlockJob *job = find_block_job(id, &aio_context, errp);
+
+    if (!job) {
+        return;
+    }
+
+    trace_qmp_block_job_finalize(job);
+    block_job_finalize(job, errp);
+    aio_context_release(aio_context);
+}
+
 void qmp_block_job_dismiss(const char *id, Error **errp)
 {
     AioContext *aio_context;
diff --git a/blockjob.c b/blockjob.c
index 3880a89678..4b73cb0263 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -65,6 +65,7 @@ bool BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
     [BLOCK_JOB_VERB_RESUME]               = {0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0},
     [BLOCK_JOB_VERB_SET_SPEED]            = {0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0},
     [BLOCK_JOB_VERB_COMPLETE]             = {0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0},
+    [BLOCK_JOB_VERB_FINALIZE]             = {0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0},
     [BLOCK_JOB_VERB_DISMISS]              = {0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0},
 };
 
@@ -449,7 +450,7 @@ static void block_job_clean(BlockJob *job)
     }
 }
 
-static int block_job_completed_single(BlockJob *job)
+static int block_job_finalize_single(BlockJob *job)
 {
     assert(job->completed);
 
@@ -590,18 +591,36 @@ static void block_job_completed_txn_abort(BlockJob *job)
             assert(other_job->cancelled);
             block_job_finish_sync(other_job, NULL, NULL);
         }
-        block_job_completed_single(other_job);
+        block_job_finalize_single(other_job);
         aio_context_release(ctx);
     }
 
     block_job_txn_unref(txn);
 }
 
+static int block_job_needs_finalize(BlockJob *job)
+{
+    return !job->auto_finalize;
+}
+
+static void block_job_do_finalize(BlockJob *job)
+{
+    int rc;
+    assert(job && job->txn);
+
+    /* prepare the transaction to complete */
+    rc = block_job_txn_apply(job->txn, block_job_prepare, true);
+    if (rc) {
+        block_job_completed_txn_abort(job);
+    } else {
+        block_job_txn_apply(job->txn, block_job_finalize_single, true);
+    }
+}
+
 static void block_job_completed_txn_success(BlockJob *job)
 {
     BlockJobTxn *txn = job->txn;
     BlockJob *other_job;
-    int rc = 0;
 
     block_job_state_transition(job, BLOCK_JOB_STATUS_WAITING);
 
@@ -616,16 +635,12 @@ static void block_job_completed_txn_success(BlockJob *job)
         assert(other_job->ret == 0);
     }
 
-    /* Jobs may require some prep-work to complete without failure */
-    rc = block_job_txn_apply(txn, block_job_prepare, true);
-    if (rc) {
-        block_job_completed_txn_abort(job);
-        return;
-    }
-
-    /* We are the last completed job, commit the transaction. */
     block_job_txn_apply(txn, block_job_event_pending, false);
-    block_job_txn_apply(txn, block_job_completed_single, true);
+
+    /* If no jobs need manual finalization, automatically do so */
+    if (block_job_txn_apply(txn, block_job_needs_finalize, false) == 0) {
+        block_job_do_finalize(job);
+    }
 }
 
 /* Assumes the block_job_mutex is held */
@@ -677,6 +692,15 @@ void block_job_complete(BlockJob *job, Error **errp)
     job->driver->complete(job, errp);
 }
 
+void block_job_finalize(BlockJob *job, Error **errp)
+{
+    assert(job && job->id && job->txn);
+    if (block_job_apply_verb(job, BLOCK_JOB_VERB_FINALIZE, errp)) {
+        return;
+    }
+    block_job_do_finalize(job);
+}
+
 void block_job_dismiss(BlockJob **jobptr, Error **errp)
 {
     BlockJob *job = *jobptr;
@@ -727,11 +751,15 @@ void block_job_cancel(BlockJob *job)
 {
     if (job->status == BLOCK_JOB_STATUS_CONCLUDED) {
         block_job_do_dismiss(job);
-    } else if (block_job_started(job)) {
-        block_job_cancel_async(job);
-        block_job_enter(job);
-    } else {
+        return;
+    }
+    block_job_cancel_async(job);
+    if (!block_job_started(job)) {
         block_job_completed(job, -ECANCELED);
+    } else if (job->deferred_to_main_loop) {
+        block_job_completed_txn_abort(job);
+    } else {
+        block_job_enter(job);
     }
 }
 
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 7c8d51effa..978274ed2b 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -244,6 +244,23 @@ void block_job_cancel(BlockJob *job);
  */
 void block_job_complete(BlockJob *job, Error **errp);
 
+
+/**
+ * block_job_finalize:
+ * @job: The job to fully commit and finish.
+ * @errp: Error object.
+ *
+ * For jobs that have finished their work and are pending
+ * awaiting explicit acknowledgement to commit their work,
+ * This will commit that work.
+ *
+ * FIXME: Make the below statement universally true:
+ * For jobs that support the manual workflow mode, all graph
+ * changes that occur as a result will occur after this command
+ * and before a successful reply.
+ */
+void block_job_finalize(BlockJob *job, Error **errp);
+
 /**
  * block_job_dismiss:
  * @job: The job to be dismissed.
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 73584ff3f5..961c4a3d0c 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -972,10 +972,13 @@
 #
 # @dismiss: see @block-job-dismiss
 #
+# @finalize: see @block-job-finalize
+#
 # Since: 2.12
 ##
 { 'enum': 'BlockJobVerb',
-  'data': ['cancel', 'pause', 'resume', 'set-speed', 'complete', 'dismiss' ] }
+  'data': ['cancel', 'pause', 'resume', 'set-speed', 'complete', 'dismiss',
+           'finalize' ] }
 
 ##
 # @BlockJobStatus:
@@ -2274,6 +2277,24 @@
 ##
 { 'command': 'block-job-dismiss', 'data': { 'id': 'str' } }
 
+##
+# @block-job-finalize:
+#
+# Once a job that has manual=true reaches the pending state, it can be
+# instructed to finalize any graph changes and do any necessary cleanup
+# via this command.
+# For jobs in a transaction, instructing one job to finalize will force
+# ALL jobs in the transaction to finalize, so it is only necessary to instruct
+# a single member job to finalize.
+#
+# @id: The job identifier.
+#
+# Returns: Nothing on success
+#
+# Since: 2.12
+##
+{ 'command': 'block-job-finalize', 'data': { 'id': 'str' } }
+
 ##
 # @BlockdevDiscardOptions:
 #
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 19/21] blockjobs: Expose manual property
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (17 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 18/21] blockjobs: add block-job-finalize John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 20/21] iotests: test manual job dismissal John Snow
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

Expose the "manual" property via QAPI for the backup-related jobs.
As of this commit, this allows the management API to request the
"concluded" and "dismiss" semantics for backup jobs.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 blockdev.c                 | 31 +++++++++++++++++++++++++++---
 blockjob.c                 |  2 ++
 qapi/block-core.json       | 48 ++++++++++++++++++++++++++++++++++++++--------
 tests/qemu-iotests/109.out | 24 +++++++++++------------
 4 files changed, 82 insertions(+), 23 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index efd3ab2e99..809adbe7f9 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3261,7 +3261,7 @@ static BlockJob *do_drive_backup(DriveBackup *backup, BlockJobTxn *txn,
     AioContext *aio_context;
     QDict *options = NULL;
     Error *local_err = NULL;
-    int flags;
+    int flags, job_flags = BLOCK_JOB_DEFAULT;
     int64_t size;
     bool set_backing_hd = false;
 
@@ -3280,6 +3280,12 @@ static BlockJob *do_drive_backup(DriveBackup *backup, BlockJobTxn *txn,
     if (!backup->has_job_id) {
         backup->job_id = NULL;
     }
+    if (!backup->has_auto_finalize) {
+        backup->auto_finalize = true;
+    }
+    if (!backup->has_auto_dismiss) {
+        backup->auto_dismiss = true;
+    }
     if (!backup->has_compress) {
         backup->compress = false;
     }
@@ -3371,11 +3377,17 @@ static BlockJob *do_drive_backup(DriveBackup *backup, BlockJobTxn *txn,
             goto out;
         }
     }
+    if (!backup->auto_finalize) {
+        job_flags |= BLOCK_JOB_MANUAL_FINALIZE;
+    }
+    if (!backup->auto_dismiss) {
+        job_flags |= BLOCK_JOB_MANUAL_DISMISS;
+    }
 
     job = backup_job_create(backup->job_id, bs, target_bs, backup->speed,
                             backup->sync, bmap, backup->compress,
                             backup->on_source_error, backup->on_target_error,
-                            BLOCK_JOB_DEFAULT, NULL, NULL, txn, &local_err);
+                            job_flags, NULL, NULL, txn, &local_err);
     bdrv_unref(target_bs);
     if (local_err != NULL) {
         error_propagate(errp, local_err);
@@ -3410,6 +3422,7 @@ BlockJob *do_blockdev_backup(BlockdevBackup *backup, BlockJobTxn *txn,
     Error *local_err = NULL;
     AioContext *aio_context;
     BlockJob *job = NULL;
+    int job_flags = BLOCK_JOB_DEFAULT;
 
     if (!backup->has_speed) {
         backup->speed = 0;
@@ -3423,6 +3436,12 @@ BlockJob *do_blockdev_backup(BlockdevBackup *backup, BlockJobTxn *txn,
     if (!backup->has_job_id) {
         backup->job_id = NULL;
     }
+    if (!backup->has_auto_finalize) {
+        backup->auto_finalize = true;
+    }
+    if (!backup->has_auto_dismiss) {
+        backup->auto_dismiss = true;
+    }
     if (!backup->has_compress) {
         backup->compress = false;
     }
@@ -3451,10 +3470,16 @@ BlockJob *do_blockdev_backup(BlockdevBackup *backup, BlockJobTxn *txn,
             goto out;
         }
     }
+    if (!backup->auto_finalize) {
+        job_flags |= BLOCK_JOB_MANUAL_FINALIZE;
+    }
+    if (!backup->auto_dismiss) {
+        job_flags |= BLOCK_JOB_MANUAL_DISMISS;
+    }
     job = backup_job_create(backup->job_id, bs, target_bs, backup->speed,
                             backup->sync, NULL, backup->compress,
                             backup->on_source_error, backup->on_target_error,
-                            BLOCK_JOB_DEFAULT, NULL, NULL, txn, &local_err);
+                            job_flags, NULL, NULL, txn, &local_err);
     if (local_err != NULL) {
         error_propagate(errp, local_err);
     }
diff --git a/blockjob.c b/blockjob.c
index 4b73cb0263..ba538c93dd 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -821,6 +821,8 @@ BlockJobInfo *block_job_query(BlockJob *job, Error **errp)
     info->io_status = job->iostatus;
     info->ready     = job->ready;
     info->status    = job->status;
+    info->auto_finalize = job->auto_finalize;
+    info->auto_dismiss  = job->auto_dismiss;
     return info;
 }
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 961c4a3d0c..9e1534d70d 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1054,13 +1054,20 @@
 #
 # @status: Current job state/status (since 2.12)
 #
+# @auto-finalize: Job will finalize itself when PENDING, moving to
+#                 the CONCLUDED state. (since 2.12)
+#
+# @auto-dismiss: Job will dismiss itself when CONCLUDED, moving to the NULL
+#                state and disappearing from the query list. (since 2.12)
+#
 # Since: 1.1
 ##
 { 'struct': 'BlockJobInfo',
   'data': {'type': 'str', 'device': 'str', 'len': 'int',
            'offset': 'int', 'busy': 'bool', 'paused': 'bool', 'speed': 'int',
            'io-status': 'BlockDeviceIoStatus', 'ready': 'bool',
-           'status': 'BlockJobStatus' } }
+           'status': 'BlockJobStatus',
+           'auto-finalize': 'bool', 'auto-dismiss': 'bool' } }
 
 ##
 # @query-block-jobs:
@@ -1210,6 +1217,18 @@
 #                   default 'report' (no limitations, since this applies to
 #                   a different block device than @device).
 #
+# @auto-finalize: When false, this job will wait in a PENDING state after it has
+#                 finished its work, waiting for @block-job-finalize.
+#                 When true, this job will automatically perform its abort or
+#                 commit actions.
+#                 Defaults to true. (Since 2.12)
+#
+# @auto-dismiss: When false, this job will wait in a CONCLUDED state after it
+#                has completed ceased all work, and wait for @block-job-dismiss.
+#                When true, this job will automatically disappear from the query
+#                list without user intervention.
+#                Defaults to true. (Since 2.12)
+#
 # Note: @on-source-error and @on-target-error only affect background
 # I/O.  If an error occurs during a guest write request, the device's
 # rerror/werror actions will be used.
@@ -1218,10 +1237,12 @@
 ##
 { 'struct': 'DriveBackup',
   'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
-            '*format': 'str', 'sync': 'MirrorSyncMode', '*mode': 'NewImageMode',
-            '*speed': 'int', '*bitmap': 'str', '*compress': 'bool',
+            '*format': 'str', 'sync': 'MirrorSyncMode',
+            '*mode': 'NewImageMode', '*speed': 'int',
+            '*bitmap': 'str', '*compress': 'bool',
             '*on-source-error': 'BlockdevOnError',
-            '*on-target-error': 'BlockdevOnError' } }
+            '*on-target-error': 'BlockdevOnError',
+            '*auto-finalize': 'bool', '*auto-dismiss': 'bool' } }
 
 ##
 # @BlockdevBackup:
@@ -1251,6 +1272,18 @@
 #                   default 'report' (no limitations, since this applies to
 #                   a different block device than @device).
 #
+# @auto-finalize: When false, this job will wait in a PENDING state after it has
+#                 finished its work, waiting for @block-job-finalize.
+#                 When true, this job will automatically perform its abort or
+#                 commit actions.
+#                 Defaults to true. (Since 2.12)
+#
+# @auto-dismiss: When false, this job will wait in a CONCLUDED state after it
+#                has completed ceased all work, and wait for @block-job-dismiss.
+#                When true, this job will automatically disappear from the query
+#                list without user intervention.
+#                Defaults to true. (Since 2.12)
+#
 # Note: @on-source-error and @on-target-error only affect background
 # I/O.  If an error occurs during a guest write request, the device's
 # rerror/werror actions will be used.
@@ -1259,11 +1292,10 @@
 ##
 { 'struct': 'BlockdevBackup',
   'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
-            'sync': 'MirrorSyncMode',
-            '*speed': 'int',
-            '*compress': 'bool',
+            'sync': 'MirrorSyncMode', '*speed': 'int', '*compress': 'bool',
             '*on-source-error': 'BlockdevOnError',
-            '*on-target-error': 'BlockdevOnError' } }
+            '*on-target-error': 'BlockdevOnError',
+            '*auto-finalize': 'bool', '*auto-dismiss': 'bool' } }
 
 ##
 # @blockdev-snapshot-sync:
diff --git a/tests/qemu-iotests/109.out b/tests/qemu-iotests/109.out
index d288f2eef6..8a9b93672b 100644
--- a/tests/qemu-iotests/109.out
+++ b/tests/qemu-iotests/109.out
@@ -19,7 +19,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 1024, "offset": 1024, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 1024, "offset": 1024, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"auto-finalize": true, "io-status": "ok", "device": "src", "auto-dismiss": true, "busy": false, "len": 1024, "offset": 1024, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 1024, "offset": 1024, "speed": 0, "type": "mirror"}}
@@ -45,7 +45,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 197120, "offset": 197120, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 197120, "offset": 197120, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"auto-finalize": true, "io-status": "ok", "device": "src", "auto-dismiss": true, "busy": false, "len": 197120, "offset": 197120, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 197120, "offset": 197120, "speed": 0, "type": "mirror"}}
@@ -71,7 +71,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 327680, "offset": 327680, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 327680, "offset": 327680, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"auto-finalize": true, "io-status": "ok", "device": "src", "auto-dismiss": true, "busy": false, "len": 327680, "offset": 327680, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 327680, "offset": 327680, "speed": 0, "type": "mirror"}}
@@ -97,7 +97,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 1024, "offset": 1024, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 1024, "offset": 1024, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"auto-finalize": true, "io-status": "ok", "device": "src", "auto-dismiss": true, "busy": false, "len": 1024, "offset": 1024, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 1024, "offset": 1024, "speed": 0, "type": "mirror"}}
@@ -123,7 +123,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 65536, "offset": 65536, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 65536, "offset": 65536, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"auto-finalize": true, "io-status": "ok", "device": "src", "auto-dismiss": true, "busy": false, "len": 65536, "offset": 65536, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 65536, "offset": 65536, "speed": 0, "type": "mirror"}}
@@ -149,7 +149,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 2560, "offset": 2560, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 2560, "offset": 2560, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"auto-finalize": true, "io-status": "ok", "device": "src", "auto-dismiss": true, "busy": false, "len": 2560, "offset": 2560, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 2560, "offset": 2560, "speed": 0, "type": "mirror"}}
@@ -174,7 +174,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 2560, "offset": 2560, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 2560, "offset": 2560, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"auto-finalize": true, "io-status": "ok", "device": "src", "auto-dismiss": true, "busy": false, "len": 2560, "offset": 2560, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 2560, "offset": 2560, "speed": 0, "type": "mirror"}}
@@ -199,7 +199,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 31457280, "offset": 31457280, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 31457280, "offset": 31457280, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"auto-finalize": true, "io-status": "ok", "device": "src", "auto-dismiss": true, "busy": false, "len": 31457280, "offset": 31457280, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 31457280, "offset": 31457280, "speed": 0, "type": "mirror"}}
@@ -224,7 +224,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 327680, "offset": 327680, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 327680, "offset": 327680, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"auto-finalize": true, "io-status": "ok", "device": "src", "auto-dismiss": true, "busy": false, "len": 327680, "offset": 327680, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 327680, "offset": 327680, "speed": 0, "type": "mirror"}}
@@ -249,7 +249,7 @@ read 65536/65536 bytes at offset 0
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 2048, "offset": 2048, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 2048, "offset": 2048, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"auto-finalize": true, "io-status": "ok", "device": "src", "auto-dismiss": true, "busy": false, "len": 2048, "offset": 2048, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 2048, "offset": 2048, "speed": 0, "type": "mirror"}}
@@ -265,7 +265,7 @@ Automatically detecting the format is dangerous for raw images, write operations
 Specify the 'raw' format explicitly to remove the restrictions.
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 512, "offset": 512, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 512, "offset": 512, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"auto-finalize": true, "io-status": "ok", "device": "src", "auto-dismiss": true, "busy": false, "len": 512, "offset": 512, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 512, "offset": 512, "speed": 0, "type": "mirror"}}
@@ -274,7 +274,7 @@ Images are identical.
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 512, "offset": 512, "speed": 0, "type": "mirror"}}
-{"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 512, "offset": 512, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": [{"auto-finalize": true, "io-status": "ok", "device": "src", "auto-dismiss": true, "busy": false, "len": 512, "offset": 512, "status": "ready", "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 512, "offset": 512, "speed": 0, "type": "mirror"}}
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 20/21] iotests: test manual job dismissal
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (18 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 19/21] blockjobs: Expose manual property John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 21/21] tests/test-blockjob: test cancellations John Snow
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

Signed-off-by: John Snow <jsnow@redhat.com>
---
 tests/qemu-iotests/056     | 187 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/056.out |   4 +-
 2 files changed, 189 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/056 b/tests/qemu-iotests/056
index 04f2c3c841..223292175a 100755
--- a/tests/qemu-iotests/056
+++ b/tests/qemu-iotests/056
@@ -29,6 +29,26 @@ backing_img = os.path.join(iotests.test_dir, 'backing.img')
 test_img = os.path.join(iotests.test_dir, 'test.img')
 target_img = os.path.join(iotests.test_dir, 'target.img')
 
+def img_create(img, fmt=iotests.imgfmt, size='64M', **kwargs):
+    fullname = os.path.join(iotests.test_dir, '%s.%s' % (img, fmt))
+    optargs = []
+    for k,v in kwargs.iteritems():
+        optargs = optargs + ['-o', '%s=%s' % (k,v)]
+    args = ['create', '-f', fmt] + optargs + [fullname, size]
+    iotests.qemu_img(*args)
+    return fullname
+
+def try_remove(img):
+    try:
+        os.remove(img)
+    except OSError:
+        pass
+
+def io_write_patterns(img, patterns):
+    for pattern in patterns:
+        iotests.qemu_io('-c', 'write -P%s %s %s' % pattern, img)
+
+
 class TestSyncModesNoneAndTop(iotests.QMPTestCase):
     image_len = 64 * 1024 * 1024 # MB
 
@@ -108,5 +128,172 @@ class TestBeforeWriteNotifier(iotests.QMPTestCase):
         event = self.cancel_and_wait()
         self.assert_qmp(event, 'data/type', 'backup')
 
+class BackupTest(iotests.QMPTestCase):
+    def setUp(self):
+        self.vm = iotests.VM()
+        self.test_img = img_create('test')
+        self.dest_img = img_create('dest')
+        self.vm.add_drive(self.test_img)
+        self.vm.launch()
+
+    def tearDown(self):
+        self.vm.shutdown()
+        try_remove(self.test_img)
+        try_remove(self.dest_img)
+
+    def hmp_io_writes(self, drive, patterns):
+        for pattern in patterns:
+            self.vm.hmp_qemu_io(drive, 'write -P%s %s %s' % pattern)
+        self.vm.hmp_qemu_io(drive, 'flush')
+
+    def qmp_backup_and_wait(self, cmd='drive-backup', serror=None,
+                            aerror=None, **kwargs):
+        if not self.qmp_backup(cmd, serror, **kwargs):
+            return False
+        return self.qmp_backup_wait(kwargs['device'], aerror)
+
+    def qmp_backup(self, cmd='drive-backup',
+                   error=None, **kwargs):
+        self.assertTrue('device' in kwargs)
+        res = self.vm.qmp(cmd, **kwargs)
+        if error:
+            self.assert_qmp(res, 'error/desc', error)
+            return False
+        self.assert_qmp(res, 'return', {})
+        return True
+
+    def qmp_backup_wait(self, device, error=None):
+        event = self.vm.event_wait(name="BLOCK_JOB_COMPLETED",
+                                   match={'data': {'device': device}})
+        self.assertNotEqual(event, None)
+        try:
+            failure = self.dictpath(event, 'data/error')
+        except AssertionError:
+            # Backup succeeded.
+            self.assert_qmp(event, 'data/offset', event['data']['len'])
+            return True
+        else:
+            # Failure.
+            self.assert_qmp(event, 'data/error', qerror)
+            return False
+
+    def test_dismiss_false(self):
+        res = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(res, 'return', [])
+        self.qmp_backup_and_wait(device='drive0', format=iotests.imgfmt,
+                                 sync='full', target=self.dest_img,
+                                 auto_dismiss=True)
+        res = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(res, 'return', [])
+
+    def test_dismiss_true(self):
+        res = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(res, 'return', [])
+        self.qmp_backup_and_wait(device='drive0', format=iotests.imgfmt,
+                                 sync='full', target=self.dest_img,
+                                 auto_dismiss=False)
+        res = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(res, 'return[0]/status', 'concluded')
+        res = self.vm.qmp('block-job-dismiss', id='drive0')
+        self.assert_qmp(res, 'return', {})
+        res = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(res, 'return', [])
+
+    def test_dismiss_bad_id(self):
+        res = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(res, 'return', [])
+        res = self.vm.qmp('block-job-dismiss', id='foobar')
+        self.assert_qmp(res, 'error/class', 'DeviceNotActive')
+
+    def test_dismiss_collision(self):
+        res = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(res, 'return', [])
+        self.qmp_backup_and_wait(device='drive0', format=iotests.imgfmt,
+                                 sync='full', target=self.dest_img,
+                                 auto_dismiss=False)
+        res = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(res, 'return[0]/status', 'concluded')
+        # Leave zombie job un-dismissed, observe a failure:
+        res = self.qmp_backup_and_wait(serror='Need a root block node',
+                                       device='drive0', format=iotests.imgfmt,
+                                       sync='full', target=self.dest_img,
+                                       auto_dismiss=False)
+        self.assertEqual(res, False)
+        # OK, dismiss the zombie.
+        res = self.vm.qmp('block-job-dismiss', id='drive0')
+        self.assert_qmp(res, 'return', {})
+        res = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(res, 'return', [])
+        # Ensure it's really gone.
+        self.qmp_backup_and_wait(device='drive0', format=iotests.imgfmt,
+                                 sync='full', target=self.dest_img,
+                                 auto_dismiss=False)
+
+    def dismissal_failure(self, dismissal_opt):
+        res = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(res, 'return', [])
+        # Give blkdebug something to chew on
+        self.hmp_io_writes('drive0',
+                           (('0x9a', 0, 512),
+                           ('0x55', '8M', '352k'),
+                           ('0x78', '15872k', '1M')))
+        # Add destination node via blkdebug
+        res = self.vm.qmp('blockdev-add',
+                          node_name='target0',
+                          driver=iotests.imgfmt,
+                          file={
+                              'driver': 'blkdebug',
+                              'image': {
+                                  'driver': 'file',
+                                  'filename': self.dest_img
+                              },
+                              'inject-error': [{
+                                  'event': 'write_aio',
+                                  'errno': 5,
+                                  'immediately': False,
+                                  'once': True
+                              }],
+                          })
+        self.assert_qmp(res, 'return', {})
+
+        res = self.qmp_backup(cmd='blockdev-backup',
+                              device='drive0', target='target0',
+                              on_target_error='stop',
+                              sync='full',
+                              auto_dismiss=dismissal_opt)
+        self.assertTrue(res)
+        event = self.vm.event_wait(name="BLOCK_JOB_ERROR",
+                                   match={'data': {'device': 'drive0'}})
+        self.assertNotEqual(event, None)
+        # OK, job should be wedged
+        res = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(res, 'return[0]/status', 'paused')
+        res = self.vm.qmp('block-job-dismiss', id='drive0')
+        self.assert_qmp(res, 'error/desc',
+                        "Job 'drive0' in state 'paused' cannot accept"
+                        " command verb 'dismiss'")
+        res = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(res, 'return[0]/status', 'paused')
+        # OK, unstick job and move forward.
+        res = self.vm.qmp('block-job-resume', device='drive0')
+        self.assert_qmp(res, 'return', {})
+        # And now we need to wait for it to conclude;
+        res = self.qmp_backup_wait(device='drive0')
+        self.assertTrue(res)
+        if not dismissal_opt:
+            # Job should now be languishing:
+            res = self.vm.qmp('query-block-jobs')
+            self.assert_qmp(res, 'return[0]/status', 'concluded')
+            res = self.vm.qmp('block-job-dismiss', id='drive0')
+            self.assert_qmp(res, 'return', {})
+            res = self.vm.qmp('query-block-jobs')
+            self.assert_qmp(res, 'return', [])
+
+    def test_dismiss_premature(self):
+        self.dismissal_failure(False)
+
+    def test_dismiss_erroneous(self):
+        self.dismissal_failure(True)
+
 if __name__ == '__main__':
     iotests.main(supported_fmts=['qcow2', 'qed'])
diff --git a/tests/qemu-iotests/056.out b/tests/qemu-iotests/056.out
index 8d7e996700..dae404e278 100644
--- a/tests/qemu-iotests/056.out
+++ b/tests/qemu-iotests/056.out
@@ -1,5 +1,5 @@
-...
+.........
 ----------------------------------------------------------------------
-Ran 3 tests
+Ran 9 tests
 
 OK
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [PATCH v5 21/21] tests/test-blockjob: test cancellations
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (19 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 20/21] iotests: test manual job dismissal John Snow
@ 2018-03-10  8:27 ` John Snow
  2018-03-12 16:23 ` [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management Kevin Wolf
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-10  8:27 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, pkrempa, jtc, qemu-devel, John Snow

Whatever the state a blockjob is in, it should be able to be canceled
by the block layer.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 tests/test-blockjob.c | 233 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 229 insertions(+), 4 deletions(-)

diff --git a/tests/test-blockjob.c b/tests/test-blockjob.c
index 599e28d732..8946bfd37b 100644
--- a/tests/test-blockjob.c
+++ b/tests/test-blockjob.c
@@ -24,14 +24,15 @@ static void block_job_cb(void *opaque, int ret)
 {
 }
 
-static BlockJob *do_test_id(BlockBackend *blk, const char *id,
-                            bool should_succeed)
+static BlockJob *mk_job(BlockBackend *blk, const char *id,
+                        const BlockJobDriver *drv, bool should_succeed,
+                        int flags)
 {
     BlockJob *job;
     Error *errp = NULL;
 
-    job = block_job_create(id, &test_block_job_driver, NULL, blk_bs(blk),
-                           0, BLK_PERM_ALL, 0, BLOCK_JOB_DEFAULT, block_job_cb,
+    job = block_job_create(id, drv, NULL, blk_bs(blk),
+                           0, BLK_PERM_ALL, 0, flags, block_job_cb,
                            NULL, &errp);
     if (should_succeed) {
         g_assert_null(errp);
@@ -50,6 +51,13 @@ static BlockJob *do_test_id(BlockBackend *blk, const char *id,
     return job;
 }
 
+static BlockJob *do_test_id(BlockBackend *blk, const char *id,
+                            bool should_succeed)
+{
+    return mk_job(blk, id, &test_block_job_driver,
+                  should_succeed, BLOCK_JOB_DEFAULT);
+}
+
 /* This creates a BlockBackend (optionally with a name) with a
  * BlockDriverState inserted. */
 static BlockBackend *create_blk(const char *name)
@@ -142,6 +150,216 @@ static void test_job_ids(void)
     destroy_blk(blk[2]);
 }
 
+typedef struct CancelJob {
+    BlockJob common;
+    BlockBackend *blk;
+    bool should_converge;
+    bool should_complete;
+    bool completed;
+} CancelJob;
+
+static void cancel_job_completed(BlockJob *job, void *opaque)
+{
+    CancelJob *s = opaque;
+    s->completed = true;
+    block_job_completed(job, 0);
+}
+
+static void cancel_job_complete(BlockJob *job, Error **errp)
+{
+    CancelJob *s = container_of(job, CancelJob, common);
+    s->should_complete = true;
+}
+
+static void coroutine_fn cancel_job_start(void *opaque)
+{
+    CancelJob *s = opaque;
+
+    while (!s->should_complete) {
+        if (block_job_is_cancelled(&s->common)) {
+            goto defer;
+        }
+
+        if (!s->common.ready && s->should_converge) {
+            block_job_event_ready(&s->common);
+        }
+
+        block_job_sleep_ns(&s->common, 100000);
+    }
+
+ defer:
+    block_job_defer_to_main_loop(&s->common, cancel_job_completed, s);
+}
+
+static const BlockJobDriver test_cancel_driver = {
+    .instance_size = sizeof(CancelJob),
+    .start         = cancel_job_start,
+    .complete      = cancel_job_complete,
+};
+
+static CancelJob *create_common(BlockJob **pjob)
+{
+    BlockBackend *blk;
+    BlockJob *job;
+    CancelJob *s;
+
+    blk = create_blk(NULL);
+    job = mk_job(blk, "Steve", &test_cancel_driver, true,
+                 BLOCK_JOB_MANUAL_FINALIZE | BLOCK_JOB_MANUAL_DISMISS);
+    block_job_ref(job);
+    assert(job->status == BLOCK_JOB_STATUS_CREATED);
+    s = container_of(job, CancelJob, common);
+    s->blk = blk;
+
+    *pjob = job;
+    return s;
+}
+
+static void cancel_common(CancelJob *s)
+{
+    BlockJob *job = &s->common;
+    BlockBackend *blk = s->blk;
+    BlockJobStatus sts = job->status;
+
+    block_job_cancel_sync(job);
+    if ((sts != BLOCK_JOB_STATUS_CREATED) &&
+        (sts != BLOCK_JOB_STATUS_CONCLUDED)) {
+        BlockJob *dummy = job;
+        block_job_dismiss(&dummy, &error_abort);
+    }
+    assert(job->status == BLOCK_JOB_STATUS_NULL);
+    block_job_unref(job);
+    destroy_blk(blk);
+}
+
+static void test_cancel_created(void)
+{
+    BlockJob *job;
+    CancelJob *s;
+
+    s = create_common(&job);
+    cancel_common(s);
+}
+
+static void test_cancel_running(void)
+{
+    BlockJob *job;
+    CancelJob *s;
+
+    s = create_common(&job);
+
+    block_job_start(job);
+    assert(job->status == BLOCK_JOB_STATUS_RUNNING);
+
+    cancel_common(s);
+}
+
+static void test_cancel_paused(void)
+{
+    BlockJob *job;
+    CancelJob *s;
+
+    s = create_common(&job);
+
+    block_job_start(job);
+    assert(job->status == BLOCK_JOB_STATUS_RUNNING);
+
+    block_job_user_pause(job, &error_abort);
+    block_job_enter(job);
+    assert(job->status == BLOCK_JOB_STATUS_PAUSED);
+
+    cancel_common(s);
+}
+
+static void test_cancel_ready(void)
+{
+    BlockJob *job;
+    CancelJob *s;
+
+    s = create_common(&job);
+
+    block_job_start(job);
+    assert(job->status == BLOCK_JOB_STATUS_RUNNING);
+
+    s->should_converge = true;
+    block_job_enter(job);
+    assert(job->status == BLOCK_JOB_STATUS_READY);
+
+    cancel_common(s);
+}
+
+static void test_cancel_standby(void)
+{
+    BlockJob *job;
+    CancelJob *s;
+
+    s = create_common(&job);
+
+    block_job_start(job);
+    assert(job->status == BLOCK_JOB_STATUS_RUNNING);
+
+    s->should_converge = true;
+    block_job_enter(job);
+    assert(job->status == BLOCK_JOB_STATUS_READY);
+
+    block_job_user_pause(job, &error_abort);
+    block_job_enter(job);
+    assert(job->status == BLOCK_JOB_STATUS_STANDBY);
+
+    cancel_common(s);
+}
+
+static void test_cancel_pending(void)
+{
+    BlockJob *job;
+    CancelJob *s;
+
+    s = create_common(&job);
+
+    block_job_start(job);
+    assert(job->status == BLOCK_JOB_STATUS_RUNNING);
+
+    s->should_converge = true;
+    block_job_enter(job);
+    assert(job->status == BLOCK_JOB_STATUS_READY);
+
+    block_job_complete(job, &error_abort);
+    block_job_enter(job);
+    while (!s->completed) {
+        aio_poll(qemu_get_aio_context(), true);
+    }
+    assert(job->status == BLOCK_JOB_STATUS_PENDING);
+
+    cancel_common(s);
+}
+
+static void test_cancel_concluded(void)
+{
+    BlockJob *job;
+    CancelJob *s;
+
+    s = create_common(&job);
+
+    block_job_start(job);
+    assert(job->status == BLOCK_JOB_STATUS_RUNNING);
+
+    s->should_converge = true;
+    block_job_enter(job);
+    assert(job->status == BLOCK_JOB_STATUS_READY);
+
+    block_job_complete(job, &error_abort);
+    block_job_enter(job);
+    while (!s->completed) {
+        aio_poll(qemu_get_aio_context(), true);
+    }
+    assert(job->status == BLOCK_JOB_STATUS_PENDING);
+
+    block_job_finalize(job, &error_abort);
+    assert(job->status == BLOCK_JOB_STATUS_CONCLUDED);
+
+    cancel_common(s);
+}
+
 int main(int argc, char **argv)
 {
     qemu_init_main_loop(&error_abort);
@@ -149,5 +367,12 @@ int main(int argc, char **argv)
 
     g_test_init(&argc, &argv, NULL);
     g_test_add_func("/blockjob/ids", test_job_ids);
+    g_test_add_func("/blockjob/cancel/created", test_cancel_created);
+    g_test_add_func("/blockjob/cancel/running", test_cancel_running);
+    g_test_add_func("/blockjob/cancel/paused", test_cancel_paused);
+    g_test_add_func("/blockjob/cancel/ready", test_cancel_ready);
+    g_test_add_func("/blockjob/cancel/standby", test_cancel_standby);
+    g_test_add_func("/blockjob/cancel/pending", test_cancel_pending);
+    g_test_add_func("/blockjob/cancel/concluded", test_cancel_concluded);
     return g_test_run();
 }
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/21] blockjobs: add NULL state
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 10/21] blockjobs: add NULL state John Snow
@ 2018-03-12 15:28   ` Kevin Wolf
  2018-03-12 15:41     ` John Snow
  0 siblings, 1 reply; 35+ messages in thread
From: Kevin Wolf @ 2018-03-12 15:28 UTC (permalink / raw)
  To: John Snow; +Cc: qemu-block, pkrempa, jtc, qemu-devel

Am 10.03.2018 um 09:27 hat John Snow geschrieben:
> Add a new state that specifically demarcates when we begin to permanently
> demolish a job after it has performed all work. This makes the transition
> explicit in the STM table and highlights conditions under which a job may
> be demolished.
> 
> Alongside this state, add a new helper command "block_job_decommission",
> which transitions to the NULL state and puts down our implicit reference.
> This separates instances in the code for "block_job_unref" which merely
> undo a matching "block_job_ref" with instances intended to initiate the
> full destruction of the object.
> 
> This decommission action also sets a number of fields to make sure that
> block internals or external users that are holding a reference to a job
> to see when it "finishes" are convinced that the job object is "done."
> This is necessary, for instance, to do a block_job_cancel_sync on a
> created object which will not make any progress.
> 
> Now, all jobs must go through block_job_decommission prior to being
> freed, giving us start-to-finish state machine coverage for jobs.
> 
> 
> Transitions:
> Created   -> Null: Early failure event before the job is started
> Concluded -> Null: Standard transition.
> 
> Verbs:
> None. This should not ever be visible to the monitor.
> 
>              +---------+
>              |UNDEFINED|
>              +--+------+
>                 |
>              +--v----+
>    +---------+CREATED+------------------+
>    |         +--+----+                  |
>    |            |                       |
>    |         +--v----+     +------+     |
>    +---------+RUNNING<----->PAUSED|     |
>    |         +--+-+--+     +------+     |
>    |            | |                     |
>    |            | +------------------+  |
>    |            |                    |  |
>    |         +--v--+       +-------+ |  |
>    +---------+READY<------->STANDBY| |  |
>    |         +--+--+       +-------+ |  |
>    |            |                    |  |
> +--v-----+   +--v------+             |  |
> |ABORTING+--->CONCLUDED<-------------+  |
> +--------+   +--+------+                |
>                 |                       |
>              +--v-+                     |
>              |NULL<---------------------+
>              +----+
> 
> Signed-off-by: John Snow <jsnow@redhat.com>

> +static void block_job_decommission(BlockJob *job)
> +{
> +    assert(job);
> +    job->completed = true;
> +    job->busy = false;
> +    job->paused = false;
> +    job->deferred_to_main_loop = true;

Why do we set all of these fields now? I don't see the use of it, and
overwriting fields here potentially makes debugging harder.

Especially for deferred_to_main_loop I might expect an assert() that it
already is true, but shouldn't setting it always be done while actually
deferring to the main loop?

Can we turn all of these assignments into asserts or are there some that
actually aren't already guaranteed, but that we want anyway?

> +    block_job_state_transition(job, BLOCK_JOB_STATUS_NULL);
> +    block_job_unref(job);
> +}

Kevin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/21] blockjobs: add NULL state
  2018-03-12 15:28   ` Kevin Wolf
@ 2018-03-12 15:41     ` John Snow
  2018-03-12 16:07       ` Kevin Wolf
  0 siblings, 1 reply; 35+ messages in thread
From: John Snow @ 2018-03-12 15:41 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, pkrempa, jtc, qemu-devel



On 03/12/2018 11:28 AM, Kevin Wolf wrote:
> Am 10.03.2018 um 09:27 hat John Snow geschrieben:
>> Add a new state that specifically demarcates when we begin to permanently
>> demolish a job after it has performed all work. This makes the transition
>> explicit in the STM table and highlights conditions under which a job may
>> be demolished.
>>
>> Alongside this state, add a new helper command "block_job_decommission",
>> which transitions to the NULL state and puts down our implicit reference.
>> This separates instances in the code for "block_job_unref" which merely
>> undo a matching "block_job_ref" with instances intended to initiate the
>> full destruction of the object.
>>
>> This decommission action also sets a number of fields to make sure that
>> block internals or external users that are holding a reference to a job
>> to see when it "finishes" are convinced that the job object is "done."
>> This is necessary, for instance, to do a block_job_cancel_sync on a
>> created object which will not make any progress.
>>
>> Now, all jobs must go through block_job_decommission prior to being
>> freed, giving us start-to-finish state machine coverage for jobs.
>>
>>
>> Transitions:
>> Created   -> Null: Early failure event before the job is started
>> Concluded -> Null: Standard transition.
>>
>> Verbs:
>> None. This should not ever be visible to the monitor.
>>
>>              +---------+
>>              |UNDEFINED|
>>              +--+------+
>>                 |
>>              +--v----+
>>    +---------+CREATED+------------------+
>>    |         +--+----+                  |
>>    |            |                       |
>>    |         +--v----+     +------+     |
>>    +---------+RUNNING<----->PAUSED|     |
>>    |         +--+-+--+     +------+     |
>>    |            | |                     |
>>    |            | +------------------+  |
>>    |            |                    |  |
>>    |         +--v--+       +-------+ |  |
>>    +---------+READY<------->STANDBY| |  |
>>    |         +--+--+       +-------+ |  |
>>    |            |                    |  |
>> +--v-----+   +--v------+             |  |
>> |ABORTING+--->CONCLUDED<-------------+  |
>> +--------+   +--+------+                |
>>                 |                       |
>>              +--v-+                     |
>>              |NULL<---------------------+
>>              +----+
>>
>> Signed-off-by: John Snow <jsnow@redhat.com>
> 
>> +static void block_job_decommission(BlockJob *job)
>> +{
>> +    assert(job);
>> +    job->completed = true;
>> +    job->busy = false;
>> +    job->paused = false;
>> +    job->deferred_to_main_loop = true;
> 
> Why do we set all of these fields now? I don't see the use of it, and
> overwriting fields here potentially makes debugging harder.
> 
> Especially for deferred_to_main_loop I might expect an assert() that it
> already is true, but shouldn't setting it always be done while actually
> deferring to the main loop?
> 
> Can we turn all of these assignments into asserts or are there some that
> actually aren't already guaranteed, but that we want anyway?
> 
>> +    block_job_state_transition(job, BLOCK_JOB_STATUS_NULL);
>> +    block_job_unref(job);
>> +}
> 
> Kevin
> 

Gonna be real honest; we probably only need to set maybe one field
(job->completed = true) but it was late and I started hitting things
with big hammers.

The problem is that if jobs do not look "done" to functions like
finish_sync, they will loop forever trying to make progress on a job
that doesn't do anything.

I set a bunch of fields here more as a semantic statement than a
necessity, to be really really honest. ("Well, the job definitely has
these properties if it made it here, so let's update these fields to be
correct and the rest of the code will hopefully Do The Right Thing.")

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/21] blockjobs: add NULL state
  2018-03-12 15:41     ` John Snow
@ 2018-03-12 16:07       ` Kevin Wolf
  2018-03-12 16:23         ` John Snow
  0 siblings, 1 reply; 35+ messages in thread
From: Kevin Wolf @ 2018-03-12 16:07 UTC (permalink / raw)
  To: John Snow; +Cc: qemu-block, pkrempa, jtc, qemu-devel

Am 12.03.2018 um 16:41 hat John Snow geschrieben:
> On 03/12/2018 11:28 AM, Kevin Wolf wrote:
> > Am 10.03.2018 um 09:27 hat John Snow geschrieben:
> >> Add a new state that specifically demarcates when we begin to permanently
> >> demolish a job after it has performed all work. This makes the transition
> >> explicit in the STM table and highlights conditions under which a job may
> >> be demolished.
> >>
> >> Alongside this state, add a new helper command "block_job_decommission",
> >> which transitions to the NULL state and puts down our implicit reference.
> >> This separates instances in the code for "block_job_unref" which merely
> >> undo a matching "block_job_ref" with instances intended to initiate the
> >> full destruction of the object.
> >>
> >> This decommission action also sets a number of fields to make sure that
> >> block internals or external users that are holding a reference to a job
> >> to see when it "finishes" are convinced that the job object is "done."
> >> This is necessary, for instance, to do a block_job_cancel_sync on a
> >> created object which will not make any progress.
> >>
> >> Now, all jobs must go through block_job_decommission prior to being
> >> freed, giving us start-to-finish state machine coverage for jobs.
> >>
> >>
> >> Transitions:
> >> Created   -> Null: Early failure event before the job is started
> >> Concluded -> Null: Standard transition.
> >>
> >> Verbs:
> >> None. This should not ever be visible to the monitor.
> >>
> >>              +---------+
> >>              |UNDEFINED|
> >>              +--+------+
> >>                 |
> >>              +--v----+
> >>    +---------+CREATED+------------------+
> >>    |         +--+----+                  |
> >>    |            |                       |
> >>    |         +--v----+     +------+     |
> >>    +---------+RUNNING<----->PAUSED|     |
> >>    |         +--+-+--+     +------+     |
> >>    |            | |                     |
> >>    |            | +------------------+  |
> >>    |            |                    |  |
> >>    |         +--v--+       +-------+ |  |
> >>    +---------+READY<------->STANDBY| |  |
> >>    |         +--+--+       +-------+ |  |
> >>    |            |                    |  |
> >> +--v-----+   +--v------+             |  |
> >> |ABORTING+--->CONCLUDED<-------------+  |
> >> +--------+   +--+------+                |
> >>                 |                       |
> >>              +--v-+                     |
> >>              |NULL<---------------------+
> >>              +----+
> >>
> >> Signed-off-by: John Snow <jsnow@redhat.com>
> > 
> >> +static void block_job_decommission(BlockJob *job)
> >> +{
> >> +    assert(job);
> >> +    job->completed = true;
> >> +    job->busy = false;
> >> +    job->paused = false;
> >> +    job->deferred_to_main_loop = true;
> > 
> > Why do we set all of these fields now? I don't see the use of it, and
> > overwriting fields here potentially makes debugging harder.
> > 
> > Especially for deferred_to_main_loop I might expect an assert() that it
> > already is true, but shouldn't setting it always be done while actually
> > deferring to the main loop?
> > 
> > Can we turn all of these assignments into asserts or are there some that
> > actually aren't already guaranteed, but that we want anyway?
> > 
> >> +    block_job_state_transition(job, BLOCK_JOB_STATUS_NULL);
> >> +    block_job_unref(job);
> >> +}
> > 
> > Kevin
> > 
> 
> Gonna be real honest; we probably only need to set maybe one field
> (job->completed = true) but it was late and I started hitting things
> with big hammers.
> 
> The problem is that if jobs do not look "done" to functions like
> finish_sync, they will loop forever trying to make progress on a job
> that doesn't do anything.
> 
> I set a bunch of fields here more as a semantic statement than a
> necessity, to be really really honest. ("Well, the job definitely has
> these properties if it made it here, so let's update these fields to be
> correct and the rest of the code will hopefully Do The Right Thing.")

So essentially, we want this to be assert(), but currently that breaks
for some reasons and we can't figure out why before the freeze?

I guess that's fair enough, but then it would be good to use the freeze
period to find the offenders and actually turn it into assertions.

Kevin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (20 preceding siblings ...)
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 21/21] tests/test-blockjob: test cancellations John Snow
@ 2018-03-12 16:23 ` Kevin Wolf
  2018-03-12 17:51 ` no-reply
  2018-04-17 13:44 ` Markus Armbruster
  23 siblings, 0 replies; 35+ messages in thread
From: Kevin Wolf @ 2018-03-12 16:23 UTC (permalink / raw)
  To: John Snow; +Cc: qemu-block, pkrempa, jtc, qemu-devel

Am 10.03.2018 um 09:27 hat John Snow geschrieben:
> This series seeks to address two distinct but closely related issues
> concerning the job management API.
> 
> (1) For jobs that complete when a monitor is not attached and receiving
>     events or notifications, there's no way to discern the job's final
>     return code. Jobs must remain in the query list until dismissed
>     for reliable management.
> 
> (2) Jobs that change the block graph structure at an indeterminate point
>     after the job starts compete with the management layer that relies
>     on that graph structure to issue meaningful commands.
> 
>     This structure should change only at the behest of the management
>     API, and not asynchronously at unknown points in time. Before a job
>     issues such changes, it must rely on explicit and synchronous
>     confirmation from the management API.
> 
> These changes are implemented by formalizing a State Transition Machine
> for the BlockJob subsystem.

Thanks, applied to the block branch.

Kevin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/21] blockjobs: add NULL state
  2018-03-12 16:07       ` Kevin Wolf
@ 2018-03-12 16:23         ` John Snow
  0 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2018-03-12 16:23 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, pkrempa, jtc, qemu-devel



On 03/12/2018 12:07 PM, Kevin Wolf wrote:
> Am 12.03.2018 um 16:41 hat John Snow geschrieben:
>> On 03/12/2018 11:28 AM, Kevin Wolf wrote:
>>> Am 10.03.2018 um 09:27 hat John Snow geschrieben:
>>>> Add a new state that specifically demarcates when we begin to permanently
>>>> demolish a job after it has performed all work. This makes the transition
>>>> explicit in the STM table and highlights conditions under which a job may
>>>> be demolished.
>>>>
>>>> Alongside this state, add a new helper command "block_job_decommission",
>>>> which transitions to the NULL state and puts down our implicit reference.
>>>> This separates instances in the code for "block_job_unref" which merely
>>>> undo a matching "block_job_ref" with instances intended to initiate the
>>>> full destruction of the object.
>>>>
>>>> This decommission action also sets a number of fields to make sure that
>>>> block internals or external users that are holding a reference to a job
>>>> to see when it "finishes" are convinced that the job object is "done."
>>>> This is necessary, for instance, to do a block_job_cancel_sync on a
>>>> created object which will not make any progress.
>>>>
>>>> Now, all jobs must go through block_job_decommission prior to being
>>>> freed, giving us start-to-finish state machine coverage for jobs.
>>>>
>>>>
>>>> Transitions:
>>>> Created   -> Null: Early failure event before the job is started
>>>> Concluded -> Null: Standard transition.
>>>>
>>>> Verbs:
>>>> None. This should not ever be visible to the monitor.
>>>>
>>>>              +---------+
>>>>              |UNDEFINED|
>>>>              +--+------+
>>>>                 |
>>>>              +--v----+
>>>>    +---------+CREATED+------------------+
>>>>    |         +--+----+                  |
>>>>    |            |                       |
>>>>    |         +--v----+     +------+     |
>>>>    +---------+RUNNING<----->PAUSED|     |
>>>>    |         +--+-+--+     +------+     |
>>>>    |            | |                     |
>>>>    |            | +------------------+  |
>>>>    |            |                    |  |
>>>>    |         +--v--+       +-------+ |  |
>>>>    +---------+READY<------->STANDBY| |  |
>>>>    |         +--+--+       +-------+ |  |
>>>>    |            |                    |  |
>>>> +--v-----+   +--v------+             |  |
>>>> |ABORTING+--->CONCLUDED<-------------+  |
>>>> +--------+   +--+------+                |
>>>>                 |                       |
>>>>              +--v-+                     |
>>>>              |NULL<---------------------+
>>>>              +----+
>>>>
>>>> Signed-off-by: John Snow <jsnow@redhat.com>
>>>
>>>> +static void block_job_decommission(BlockJob *job)
>>>> +{
>>>> +    assert(job);
>>>> +    job->completed = true;
>>>> +    job->busy = false;
>>>> +    job->paused = false;
>>>> +    job->deferred_to_main_loop = true;
>>>
>>> Why do we set all of these fields now? I don't see the use of it, and
>>> overwriting fields here potentially makes debugging harder.
>>>
>>> Especially for deferred_to_main_loop I might expect an assert() that it
>>> already is true, but shouldn't setting it always be done while actually
>>> deferring to the main loop?
>>>
>>> Can we turn all of these assignments into asserts or are there some that
>>> actually aren't already guaranteed, but that we want anyway?
>>>
>>>> +    block_job_state_transition(job, BLOCK_JOB_STATUS_NULL);
>>>> +    block_job_unref(job);
>>>> +}
>>>
>>> Kevin
>>>
>>
>> Gonna be real honest; we probably only need to set maybe one field
>> (job->completed = true) but it was late and I started hitting things
>> with big hammers.
>>
>> The problem is that if jobs do not look "done" to functions like
>> finish_sync, they will loop forever trying to make progress on a job
>> that doesn't do anything.
>>
>> I set a bunch of fields here more as a semantic statement than a
>> necessity, to be really really honest. ("Well, the job definitely has
>> these properties if it made it here, so let's update these fields to be
>> correct and the rest of the code will hopefully Do The Right Thing.")
> 
> So essentially, we want this to be assert(), but currently that breaks
> for some reasons and we can't figure out why before the freeze?
> 

Nah, I knew exactly why it broke.

> I guess that's fair enough, but then it would be good to use the freeze
> period to find the offenders and actually turn it into assertions.
> 
> Kevin
> 

I appear to be horridly confused, and you haven't seen the intermediate
mess that caused my confusion. A veritable maelstrom of confusion. Mr
Babbage would not be able to rightly comprehend, &c.

Let's give this another shot.

I added that code at a time when my local branch was not calling
block_job_completed, because I declared in v4's STM that a pre-created
job "shall not pass go, and shall not collect $200" -- that CREATED jobs
should either go to RUNNING or NULL.

The discovery here is that directly decommissioning a created job
actually breaks finish_sync because it polls on the completed boolean,
which nothing ever sets. So, under the reasoning I gave you in my last
reply;

"I'm simply setting these booleans based on the facts of the state
machine at this point: we ARE completed, we AREN'T busy, we AREN'T
paused, and we have technically now deferred back to the main loop (we
never entered it.)"

this was enough for finish_sync to get out of the way, but there were
other problems with the approach -- a CREATED job has pre-2.12 called
the abort/commit callbacks, so I went back to the 2.11 style and added
the appropriate transition into the graph. I forgot to check if this
code was still necessary at that point.

So actually, as of right now, these lines in decommission are useless;
but they shouldn't be assertions ... they might fail in a few cases. for
instance, "deferred to main loop" won't be set when we cancel a CREATED job.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (21 preceding siblings ...)
  2018-03-12 16:23 ` [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management Kevin Wolf
@ 2018-03-12 17:51 ` no-reply
  2018-04-17 13:44 ` Markus Armbruster
  23 siblings, 0 replies; 35+ messages in thread
From: no-reply @ 2018-03-12 17:51 UTC (permalink / raw)
  To: jsnow; +Cc: famz, qemu-block, kwolf, pkrempa, jtc, qemu-devel

Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180310082746.24198-1-jsnow@redhat.com
Subject: [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]               patchew/1520692378-1835-1-git-send-email-lidongchen@tencent.com -> patchew/1520692378-1835-1-git-send-email-lidongchen@tencent.com
 * [new tag]               patchew/1520716927-17068-1-git-send-email-zhangckid@gmail.com -> patchew/1520716927-17068-1-git-send-email-zhangckid@gmail.com
 * [new tag]               patchew/1520723090-22130-1-git-send-email-linux@roeck-us.net -> patchew/1520723090-22130-1-git-send-email-linux@roeck-us.net
 * [new tag]               patchew/1520754314-5969-1-git-send-email-zhangckid@gmail.com -> patchew/1520754314-5969-1-git-send-email-zhangckid@gmail.com
 * [new tag]               patchew/1520839658-20499-1-git-send-email-thuth@redhat.com -> patchew/1520839658-20499-1-git-send-email-thuth@redhat.com
 * [new tag]               patchew/152084575964.24079.3708480492746701627.stgit@bahia.lan -> patchew/152084575964.24079.3708480492746701627.stgit@bahia.lan
 * [new tag]               patchew/1520849818-6915-1-git-send-email-anton.nefedov@virtuozzo.com -> patchew/1520849818-6915-1-git-send-email-anton.nefedov@virtuozzo.com
 * [new tag]               patchew/1520850690-23245-1-git-send-email-abdallah.bouassida@lauterbach.com -> patchew/1520850690-23245-1-git-send-email-abdallah.bouassida@lauterbach.com
 t [tag update]            patchew/20180309165212.97144-1-vsementsov@virtuozzo.com -> patchew/20180309165212.97144-1-vsementsov@virtuozzo.com
 * [new tag]               patchew/20180309172713.26318-1-kwolf@redhat.com -> patchew/20180309172713.26318-1-kwolf@redhat.com
 t [tag update]            patchew/20180309175453.41548-1-dgilbert@redhat.com -> patchew/20180309175453.41548-1-dgilbert@redhat.com
 t [tag update]            patchew/20180309182202.31206-1-farosas@linux.vnet.ibm.com -> patchew/20180309182202.31206-1-farosas@linux.vnet.ibm.com
 * [new tag]               patchew/20180310082746.24198-1-jsnow@redhat.com -> patchew/20180310082746.24198-1-jsnow@redhat.com
 * [new tag]               patchew/20180310214554.157155-1-eblake@redhat.com -> patchew/20180310214554.157155-1-eblake@redhat.com
 * [new tag]               patchew/20180311201239.25506-1-nia.alarie@gmail.com -> patchew/20180311201239.25506-1-nia.alarie@gmail.com
 * [new tag]               patchew/20180312094308.21716-1-pbonzini@redhat.com -> patchew/20180312094308.21716-1-pbonzini@redhat.com
 * [new tag]               patchew/20180312104241.24965-1-kraxel@redhat.com -> patchew/20180312104241.24965-1-kraxel@redhat.com
 * [new tag]               patchew/20180312105941.15439-1-kraxel@redhat.com -> patchew/20180312105941.15439-1-kraxel@redhat.com
 * [new tag]               patchew/20180312110532.30967-1-kraxel@redhat.com -> patchew/20180312110532.30967-1-kraxel@redhat.com
Switched to a new branch 'test'
68abd0377d tests/test-blockjob: test cancellations
de58cfc7f7 iotests: test manual job dismissal
f39d922c64 blockjobs: Expose manual property
b488c52fc3 blockjobs: add block-job-finalize
eb5fcb71bb blockjobs: add PENDING status and event
0069db2cf3 blockjobs: add waiting status
9fed33f1b8 blockjobs: add prepare callback
4b3e0a2c89 blockjobs: add block_job_txn_apply function
f1c8da8844 blockjobs: add commit, abort, clean helpers
bde9d79765 blockjobs: ensure abort is called for cancelled jobs
7fb719574c blockjobs: add block_job_dismiss
960357f88d blockjobs: add NULL state
d55cd56bc8 blockjobs: add CONCLUDED state
5c9762922f blockjobs: add ABORTING state
fa1056efe6 blockjobs: add block_job_verb permission table
10f12bcc22 iotests: add pause_wait
0c88b0db61 blockjobs: add state transition table
53301418d9 blockjobs: add status enum
fca226d934 Blockjobs: documentation touchup
d8a93c4755 blockjobs: model single jobs as transactions
158eee13ab blockjobs: fix set-speed kick

=== OUTPUT BEGIN ===
Checking PATCH 1/21: blockjobs: fix set-speed kick...
Checking PATCH 2/21: blockjobs: model single jobs as transactions...
Checking PATCH 3/21: Blockjobs: documentation touchup...
Checking PATCH 4/21: blockjobs: add status enum...
Checking PATCH 5/21: blockjobs: add state transition table...
ERROR: space prohibited before open square bracket '['
#81: FILE: blockjob.c:48:
+    /* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#82: FILE: blockjob.c:49:
+    /* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#83: FILE: blockjob.c:50:
+    /* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0},

ERROR: space prohibited before open square bracket '['
#84: FILE: blockjob.c:51:
+    /* P: */ [BLOCK_JOB_STATUS_PAUSED]    = {0, 0, 1, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#85: FILE: blockjob.c:52:
+    /* Y: */ [BLOCK_JOB_STATUS_READY]     = {0, 0, 0, 0, 0, 1},

ERROR: space prohibited before open square bracket '['
#86: FILE: blockjob.c:53:
+    /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0},

total: 6 errors, 0 warnings, 88 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 6/21: iotests: add pause_wait...
Checking PATCH 7/21: blockjobs: add block_job_verb permission table...
Checking PATCH 8/21: blockjobs: add ABORTING state...
ERROR: space prohibited before open square bracket '['
#64: FILE: blockjob.c:48:
+    /* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#65: FILE: blockjob.c:49:
+    /* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1},

ERROR: space prohibited before open square bracket '['
#66: FILE: blockjob.c:50:
+    /* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1},

ERROR: space prohibited before open square bracket '['
#67: FILE: blockjob.c:51:
+    /* P: */ [BLOCK_JOB_STATUS_PAUSED]    = {0, 0, 1, 0, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#68: FILE: blockjob.c:52:
+    /* Y: */ [BLOCK_JOB_STATUS_READY]     = {0, 0, 0, 0, 0, 1, 1},

ERROR: space prohibited before open square bracket '['
#69: FILE: blockjob.c:53:
+    /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0},

ERROR: space prohibited before open square bracket '['
#70: FILE: blockjob.c:54:
+    /* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0},

total: 7 errors, 0 warnings, 62 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 9/21: blockjobs: add CONCLUDED state...
ERROR: space prohibited before open square bracket '['
#63: FILE: blockjob.c:48:
+    /* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#64: FILE: blockjob.c:49:
+    /* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1, 0},

ERROR: space prohibited before open square bracket '['
#65: FILE: blockjob.c:50:
+    /* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1},

ERROR: space prohibited before open square bracket '['
#66: FILE: blockjob.c:51:
+    /* P: */ [BLOCK_JOB_STATUS_PAUSED]    = {0, 0, 1, 0, 0, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#67: FILE: blockjob.c:52:
+    /* Y: */ [BLOCK_JOB_STATUS_READY]     = {0, 0, 0, 0, 0, 1, 1, 1},

ERROR: space prohibited before open square bracket '['
#68: FILE: blockjob.c:53:
+    /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#69: FILE: blockjob.c:54:
+    /* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1},

ERROR: space prohibited before open square bracket '['
#70: FILE: blockjob.c:55:
+    /* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0},

total: 8 errors, 0 warnings, 85 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 10/21: blockjobs: add NULL state...
ERROR: space prohibited before open square bracket '['
#80: FILE: blockjob.c:48:
+    /* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#81: FILE: blockjob.c:49:
+    /* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1, 0, 1},

ERROR: space prohibited before open square bracket '['
#82: FILE: blockjob.c:50:
+    /* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1, 0},

ERROR: space prohibited before open square bracket '['
#83: FILE: blockjob.c:51:
+    /* P: */ [BLOCK_JOB_STATUS_PAUSED]    = {0, 0, 1, 0, 0, 0, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#84: FILE: blockjob.c:52:
+    /* Y: */ [BLOCK_JOB_STATUS_READY]     = {0, 0, 0, 0, 0, 1, 1, 1, 0},

ERROR: space prohibited before open square bracket '['
#85: FILE: blockjob.c:53:
+    /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#86: FILE: blockjob.c:54:
+    /* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1, 0},

ERROR: space prohibited before open square bracket '['
#87: FILE: blockjob.c:55:
+    /* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 1},

ERROR: space prohibited before open square bracket '['
#88: FILE: blockjob.c:56:
+    /* N: */ [BLOCK_JOB_STATUS_NULL]      = {0, 0, 0, 0, 0, 0, 0, 0, 0},

total: 9 errors, 0 warnings, 104 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 11/21: blockjobs: add block_job_dismiss...
Checking PATCH 12/21: blockjobs: ensure abort is called for cancelled jobs...
ERROR: space prohibited before open square bracket '['
#75: FILE: blockjob.c:54:
+    /* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 1, 1, 0},

total: 1 errors, 0 warnings, 58 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 13/21: blockjobs: add commit, abort, clean helpers...
Checking PATCH 14/21: blockjobs: add block_job_txn_apply function...
Checking PATCH 15/21: blockjobs: add prepare callback...
Checking PATCH 16/21: blockjobs: add waiting status...
ERROR: space prohibited before open square bracket '['
#80: FILE: blockjob.c:48:
+    /* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#81: FILE: blockjob.c:49:
+    /* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 0, 1, 0, 1},

ERROR: space prohibited before open square bracket '['
#82: FILE: blockjob.c:50:
+    /* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1, 0, 0},

ERROR: space prohibited before open square bracket '['
#83: FILE: blockjob.c:51:
+    /* P: */ [BLOCK_JOB_STATUS_PAUSED]    = {0, 0, 1, 0, 0, 0, 0, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#84: FILE: blockjob.c:52:
+    /* Y: */ [BLOCK_JOB_STATUS_READY]     = {0, 0, 0, 0, 0, 1, 1, 1, 0, 0},

ERROR: space prohibited before open square bracket '['
#85: FILE: blockjob.c:53:
+    /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#86: FILE: blockjob.c:54:
+    /* W: */ [BLOCK_JOB_STATUS_WAITING]   = {0, 0, 0, 0, 0, 0, 0, 1, 1, 0},

ERROR: space prohibited before open square bracket '['
#87: FILE: blockjob.c:55:
+    /* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1, 1, 0},

ERROR: space prohibited before open square bracket '['
#88: FILE: blockjob.c:56:
+    /* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 1},

ERROR: space prohibited before open square bracket '['
#89: FILE: blockjob.c:57:
+    /* N: */ [BLOCK_JOB_STATUS_NULL]      = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0},

total: 10 errors, 0 warnings, 70 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 17/21: blockjobs: add PENDING status and event...
ERROR: space prohibited before open square bracket '['
#84: FILE: blockjob.c:48:
+    /* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#85: FILE: blockjob.c:49:
+    /* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1},

ERROR: space prohibited before open square bracket '['
#86: FILE: blockjob.c:50:
+    /* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0},

ERROR: space prohibited before open square bracket '['
#87: FILE: blockjob.c:51:
+    /* P: */ [BLOCK_JOB_STATUS_PAUSED]    = {0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#88: FILE: blockjob.c:52:
+    /* Y: */ [BLOCK_JOB_STATUS_READY]     = {0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0},

ERROR: space prohibited before open square bracket '['
#89: FILE: blockjob.c:53:
+    /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0},

ERROR: space prohibited before open square bracket '['
#90: FILE: blockjob.c:54:
+    /* W: */ [BLOCK_JOB_STATUS_WAITING]   = {0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0},

ERROR: space prohibited before open square bracket '['
#91: FILE: blockjob.c:55:
+    /* D: */ [BLOCK_JOB_STATUS_PENDING]   = {0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0},

ERROR: space prohibited before open square bracket '['
#92: FILE: blockjob.c:56:
+    /* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0},

ERROR: space prohibited before open square bracket '['
#93: FILE: blockjob.c:57:
+    /* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1},

ERROR: space prohibited before open square bracket '['
#94: FILE: blockjob.c:58:
+    /* N: */ [BLOCK_JOB_STATUS_NULL]      = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},

total: 11 errors, 0 warnings, 185 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 18/21: blockjobs: add block-job-finalize...
Checking PATCH 19/21: blockjobs: Expose manual property...
Checking PATCH 20/21: iotests: test manual job dismissal...
Checking PATCH 21/21: tests/test-blockjob: test cancellations...
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@freelists.org

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v5 01/21] blockjobs: fix set-speed kick
  2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 01/21] blockjobs: fix set-speed kick John Snow
@ 2018-03-12 18:13   ` Jeff Cody
  0 siblings, 0 replies; 35+ messages in thread
From: Jeff Cody @ 2018-03-12 18:13 UTC (permalink / raw)
  To: John Snow; +Cc: qemu-block, kwolf, pkrempa, qemu-devel

On Sat, Mar 10, 2018 at 03:27:26AM -0500, John Snow wrote:
> If speed is '0' it's not actually "less than" the previous speed.
> Kick the job in this case too.
> 
> Signed-off-by: John Snow <jsnow@redhat.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  blockjob.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/blockjob.c b/blockjob.c
> index 801d29d849..afd92db01f 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -499,7 +499,7 @@ void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
>      }
>  
>      job->speed = speed;
> -    if (speed <= old_speed) {
> +    if (speed && speed <= old_speed) {
>          return;
>      }
>  
> -- 
> 2.14.3
> 

Reviewed-by: Jeff Cody <jcody@redhat.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management
  2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
                   ` (22 preceding siblings ...)
  2018-03-12 17:51 ` no-reply
@ 2018-04-17 13:44 ` Markus Armbruster
  2018-04-17 13:47   ` Markus Armbruster
  2018-04-17 16:56   ` John Snow
  23 siblings, 2 replies; 35+ messages in thread
From: Markus Armbruster @ 2018-04-17 13:44 UTC (permalink / raw)
  To: John Snow; +Cc: qemu-block, kwolf, pkrempa, jtc, qemu-devel

John Snow <jsnow@redhat.com> writes:

> This series seeks to address two distinct but closely related issues
> concerning the job management API.
>
> (1) For jobs that complete when a monitor is not attached and receiving
>     events or notifications, there's no way to discern the job's final
>     return code. Jobs must remain in the query list until dismissed
>     for reliable management.
>
> (2) Jobs that change the block graph structure at an indeterminate point
>     after the job starts compete with the management layer that relies
>     on that graph structure to issue meaningful commands.
>
>     This structure should change only at the behest of the management
>     API, and not asynchronously at unknown points in time. Before a job
>     issues such changes, it must rely on explicit and synchronous
>     confirmation from the management API.
>
> These changes are implemented by formalizing a State Transition Machine
> for the BlockJob subsystem.
>
> Job States:
>
> UNDEFINED       Default state. Internal state only.
> CREATED         Job has been created
> RUNNING         Job has been started and is running
> PAUSED          Job is not ready and has been paused
> READY           Job is ready and is running
> STANDBY         Job is ready and is paused
>
> WAITING         Job is waiting on peers in transaction
> PENDING         Job is waiting on ACK from QMP
> ABORTING        Job is aborting or has been cancelled
> CONCLUDED       Job has finished and has a retcode available
> NULL            Job is being dismantled. Internal state only.
>
> Job Verbs:
>
> CANCEL          Instructs a running job to terminate with error,
>                 (Except when that job is READY, which produces no error.)
> PAUSE           Request a job to pause.
> RESUME          Request a job to resume from a pause.
> SET-SPEED       Change the speed limiting parameter of a job.
> COMPLETE        Ask a READY job to finish and exit.
>
> FINALIZE        Ask a PENDING job to perform its graph finalization.
> DISMISS         Finish cleaning up an empty job.

For each job verb and job state: what's the new job state?

> And here's my stab at a diagram:
>
>                  +---------+
>                  |UNDEFINED|
>                  +--+------+
>                     |
>                  +--v----+
>        +---------+CREATED+-----------------+
>        |         +--+----+                 |
>        |            |                      |
>        |         +--+----+     +------+    |
>        +---------+RUNNING<----->PAUSED|    |
>        |         +--+-+--+     +------+    |
>        |            | |                    |
>        |            | +------------------+ |
>        |            |                    | |
>        |         +--v--+       +-------+ | |
>        +---------+READY<------->STANDBY| | |
>        |         +--+--+       +-------+ | |
>        |            |                    | |
>        |         +--v----+               | |
>        +---------+WAITING+---------------+ |
>        |         +--+----+                 |
>        |            |                      |
>        |         +--v----+                 |
>        +---------+PENDING|                 |
>        |         +--+----+                 |
>        |            |                      |
>     +--v-----+   +--v------+               |
>     |ABORTING+--->CONCLUDED|               |
>     +--------+   +--+------+               |
>                     |                      |
>                  +--v-+                    |
>                  |NULL+--------------------+
>                  +----+

Is this diagram missing a few arrowheads?  E.g. on the edge between
RUNNING and WAITING.

Might push the limits of ASCII art, but here goes anyway: can we label
the arrows with job verbs?

Can you briefly explain how this state machine addresses (1) and (2)?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management
  2018-04-17 13:44 ` Markus Armbruster
@ 2018-04-17 13:47   ` Markus Armbruster
  2018-04-17 16:56   ` John Snow
  1 sibling, 0 replies; 35+ messages in thread
From: Markus Armbruster @ 2018-04-17 13:47 UTC (permalink / raw)
  To: John Snow; +Cc: qemu-block, kwolf, pkrempa, jtc, qemu-devel

Forgot to mention: yes, I know this has been merged already.  I'm merely
trying to catch up with recent block layer progress.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management
  2018-04-17 13:44 ` Markus Armbruster
  2018-04-17 13:47   ` Markus Armbruster
@ 2018-04-17 16:56   ` John Snow
  2018-04-18  7:25     ` Markus Armbruster
  1 sibling, 1 reply; 35+ messages in thread
From: John Snow @ 2018-04-17 16:56 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: qemu-block, kwolf, pkrempa, jtc, qemu-devel



On 04/17/2018 09:44 AM, Markus Armbruster wrote:
> John Snow <jsnow@redhat.com> writes:
> 
>> This series seeks to address two distinct but closely related issues
>> concerning the job management API.
>>
>> (1) For jobs that complete when a monitor is not attached and receiving
>>     events or notifications, there's no way to discern the job's final
>>     return code. Jobs must remain in the query list until dismissed
>>     for reliable management.
>>
>> (2) Jobs that change the block graph structure at an indeterminate point
>>     after the job starts compete with the management layer that relies
>>     on that graph structure to issue meaningful commands.
>>
>>     This structure should change only at the behest of the management
>>     API, and not asynchronously at unknown points in time. Before a job
>>     issues such changes, it must rely on explicit and synchronous
>>     confirmation from the management API.
>>
>> These changes are implemented by formalizing a State Transition Machine
>> for the BlockJob subsystem.
>>
>> Job States:
>>
>> UNDEFINED       Default state. Internal state only.
>> CREATED         Job has been created
>> RUNNING         Job has been started and is running
>> PAUSED          Job is not ready and has been paused
>> READY           Job is ready and is running
>> STANDBY         Job is ready and is paused
>>
>> WAITING         Job is waiting on peers in transaction
>> PENDING         Job is waiting on ACK from QMP
>> ABORTING        Job is aborting or has been cancelled
>> CONCLUDED       Job has finished and has a retcode available
>> NULL            Job is being dismantled. Internal state only.
>>
>> Job Verbs:
>>

Backporting your quote up here:

> For each job verb and job state: what's the new job state?
>

That's not always 1:1, though I tried to address it in the commit messages.


>> CANCEL          Instructs a running job to terminate with error,
>>                 (Except when that job is READY, which produces no error.)

CANCEL will take a job to either NULL... (this is the early abort
pathway, prior to the job being fully realized.)

...or to ABORTING (from CREATED once it has fully realized the job, or
from RUNNING, READY, WAITING, or PENDING.)

>> PAUSE           Request a job to pause.

issued to RUNNING or READY, transitions to PAUSED or STANDBY respectively.

>> RESUME          Request a job to resume from a pause.

issued to PAUSED or STANDBY, transitions to RUNNING or READY respectively.

>> SET-SPEED       Change the speed limiting parameter of a job.

No run state change.

>> COMPLETE        Ask a READY job to finish and exit.
>>

Issued to a READY job, transitions to WAITING.

>> FINALIZE        Ask a PENDING job to perform its graph finalization.

Issued to a PENDING job, transitions to CONCLUDED.

>> DISMISS         Finish cleaning up an empty job.
> 

Issued to a CONCLUDED job, transitions to NULL.


>> And here's my stab at a diagram:
>>
>>                  +---------+
>>                  |UNDEFINED|
>>                  +--+------+
>>                     |
>>                  +--v----+
>>        +---------+CREATED+-----------------+
>>        |         +--+----+                 |
>>        |            |                      |
>>        |         +--+----+     +------+    |
>>        +---------+RUNNING<----->PAUSED|    |
>>        |         +--+-+--+     +------+    |
>>        |            | |                    |
>>        |            | +------------------+ |
>>        |            |                    | |
>>        |         +--v--+       +-------+ | |
>>        +---------+READY<------->STANDBY| | |
>>        |         +--+--+       +-------+ | |
>>        |            |                    | |
>>        |         +--v----+               | |
>>        +---------+WAITING<---------------+ |
>>        |         +--+----+                 |
>>        |            |                      |
>>        |         +--v----+                 |
>>        +---------+PENDING|                 |
>>        |         +--+----+                 |
>>        |            |                      |
>>     +--v-----+   +--v------+               |
>>     |ABORTING+--->CONCLUDED|               |
>>     +--------+   +--+------+               |
>>                     |                      |
>>                  +--v-+                    |
>>                  |NULL<--------------------+
>>                  +----+
> 
> Is this diagram missing a few arrowheads?  E.g. on the edge between
> RUNNING and WAITING.
> 

Apparently yes. :\

(Secretly fixed up in my reply.)

> Might push the limits of ASCII art, but here goes anyway: can we label
> the arrows with job verbs?
> 

Can you recommend a tool to help me do that? I've been using asciiflow
infinity (http://asciiflow.com) and it's not very good, but I don't have
anything better.

> Can you briefly explain how this state machine addresses (1) and (2)?
> 

(1) The CONCLUDED state allows jobs to persist in the job query list
after they would have disappeared in 2.11-era QEMU. This lets us query
for completion codes and to dismiss the job at our own leisure.

(2) The PENDING state allows jobs to wait in a nearly-completed state,
pending authorization from the QMP client to make graph changes.
Otherwise, the job has to asynchronously perform this cleanup and the
exact point in time is unknowable to the QMP client. By making a PENDING
state and a finalize callback (.prepare), we can make this portion of a
job's task synchronous.



"John, you added more than two states..."

Yup, this was to help simplify the existing state machine, believe it or
not. I modeled all jobs as transactions to eliminate different cleanup
routing and added two new interim states;

- WAITING
- ABORTING

to help make assertions about the valid transitions jobs can make. The
ABORTING state helps make it clear when a job is allowed to fail (and
emit QMP events related to such).

The WAITING state is simply advisory to help a client know that a job is
"finished" but cannot yet receive further instruction because of peers
in a transaction. This helps me to add nice QMP errors for any verbs
issued to such jobs. "Sorry pal, this job is waiting and can't hear you
right now!"

This kept the code cleaner than adding a bunch of very fragile boolean
error-checking pathways in dozens of helper functions to help avoid
illegal instructions on jobs not prepared to receive those instructions.

So these two new states don't help accomplish (1) or (2) strictly, but
they do facilitate the code additions that _do_ a lot less ugly.

--js

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management
  2018-04-17 16:56   ` John Snow
@ 2018-04-18  7:25     ` Markus Armbruster
  2018-04-18 17:29       ` John Snow
  0 siblings, 1 reply; 35+ messages in thread
From: Markus Armbruster @ 2018-04-18  7:25 UTC (permalink / raw)
  To: John Snow; +Cc: kwolf, pkrempa, jtc, qemu-devel, qemu-block

John Snow <jsnow@redhat.com> writes:

> On 04/17/2018 09:44 AM, Markus Armbruster wrote:
>> John Snow <jsnow@redhat.com> writes:
>> 
>>> This series seeks to address two distinct but closely related issues
>>> concerning the job management API.
>>>
>>> (1) For jobs that complete when a monitor is not attached and receiving
>>>     events or notifications, there's no way to discern the job's final
>>>     return code. Jobs must remain in the query list until dismissed
>>>     for reliable management.
>>>
>>> (2) Jobs that change the block graph structure at an indeterminate point
>>>     after the job starts compete with the management layer that relies
>>>     on that graph structure to issue meaningful commands.
>>>
>>>     This structure should change only at the behest of the management
>>>     API, and not asynchronously at unknown points in time. Before a job
>>>     issues such changes, it must rely on explicit and synchronous
>>>     confirmation from the management API.
>>>
>>> These changes are implemented by formalizing a State Transition Machine
>>> for the BlockJob subsystem.
>>>
>>> Job States:
>>>
>>> UNDEFINED       Default state. Internal state only.
>>> CREATED         Job has been created
>>> RUNNING         Job has been started and is running
>>> PAUSED          Job is not ready and has been paused
>>> READY           Job is ready and is running
>>> STANDBY         Job is ready and is paused
>>>
>>> WAITING         Job is waiting on peers in transaction
>>> PENDING         Job is waiting on ACK from QMP
>>> ABORTING        Job is aborting or has been cancelled
>>> CONCLUDED       Job has finished and has a retcode available
>>> NULL            Job is being dismantled. Internal state only.
>>>
>>> Job Verbs:
>>>
>
> Backporting your quote up here:
>
>> For each job verb and job state: what's the new job state?
>>
>
> That's not always 1:1, though I tried to address it in the commit messages.

Let me rephrase my question then.  For each job verb and job state: what
are the possible new job states?  If there's more than one, what's the
condition for each?

I appreciate commit messages explaining that, but having complete state
machine documentation in one place (a comment or in docs/) would be
nice, wouldn't it?

>>> CANCEL          Instructs a running job to terminate with error,
>>>                 (Except when that job is READY, which produces no error.)
>
> CANCEL will take a job to either NULL... (this is the early abort
> pathway, prior to the job being fully realized.)
>
> ...or to ABORTING (from CREATED once it has fully realized the job, or
> from RUNNING, READY, WAITING, or PENDING.)
>
>>> PAUSE           Request a job to pause.
>
> issued to RUNNING or READY, transitions to PAUSED or STANDBY respectively.
>
>>> RESUME          Request a job to resume from a pause.
>
> issued to PAUSED or STANDBY, transitions to RUNNING or READY respectively.
>
>>> SET-SPEED       Change the speed limiting parameter of a job.
>
> No run state change.
>
>>> COMPLETE        Ask a READY job to finish and exit.
>>>
>
> Issued to a READY job, transitions to WAITING.
>
>>> FINALIZE        Ask a PENDING job to perform its graph finalization.
>
> Issued to a PENDING job, transitions to CONCLUDED.
>
>>> DISMISS         Finish cleaning up an empty job.
>> 
>
> Issued to a CONCLUDED job, transitions to NULL.
>
>
>>> And here's my stab at a diagram:
>>>
>>>                  +---------+
>>>                  |UNDEFINED|
>>>                  +--+------+
>>>                     |
>>>                  +--v----+
>>>        +---------+CREATED+-----------------+
>>>        |         +--+----+                 |
>>>        |            |                      |
>>>        |         +--+----+     +------+    |
>>>        +---------+RUNNING<----->PAUSED|    |
>>>        |         +--+-+--+     +------+    |
>>>        |            | |                    |
>>>        |            | +------------------+ |
>>>        |            |                    | |
>>>        |         +--v--+       +-------+ | |
>>>        +---------+READY<------->STANDBY| | |
>>>        |         +--+--+       +-------+ | |
>>>        |            |                    | |
>>>        |         +--v----+               | |
>>>        +---------+WAITING<---------------+ |
>>>        |         +--+----+                 |
>>>        |            |                      |
>>>        |         +--v----+                 |
>>>        +---------+PENDING|                 |
>>>        |         +--+----+                 |
>>>        |            |                      |
>>>     +--v-----+   +--v------+               |
>>>     |ABORTING+--->CONCLUDED|               |
>>>     +--------+   +--+------+               |
>>>                     |                      |
>>>                  +--v-+                    |
>>>                  |NULL<--------------------+
>>>                  +----+
>> 
>> Is this diagram missing a few arrowheads?  E.g. on the edge between
>> RUNNING and WAITING.
>> 
>
> Apparently yes. :\
>
> (Secretly fixed up in my reply.)
>
>> Might push the limits of ASCII art, but here goes anyway: can we label
>> the arrows with job verbs?
>> 
>
> Can you recommend a tool to help me do that? I've been using asciiflow
> infinity (http://asciiflow.com) and it's not very good, but I don't have
> anything better.

I do my ASCII art in Emacs picture-mode.

>> Can you briefly explain how this state machine addresses (1) and (2)?
>> 
>
> (1) The CONCLUDED state allows jobs to persist in the job query list
> after they would have disappeared in 2.11-era QEMU. This lets us query
> for completion codes and to dismiss the job at our own leisure.

Got it.

> (2) The PENDING state allows jobs to wait in a nearly-completed state,
> pending authorization from the QMP client to make graph changes.
> Otherwise, the job has to asynchronously perform this cleanup and the
> exact point in time is unknowable to the QMP client. By making a PENDING
> state and a finalize callback (.prepare), we can make this portion of a
> job's task synchronous.

This provides for jobs modifying the graph on job completion.  It
doesn't provide for jobs modifying the graph while they run.  Fine with
me; we're not aware of a use for messing with the graph in the middle of
a job.

> "John, you added more than two states..."
>
> Yup, this was to help simplify the existing state machine, believe it or
> not. I modeled all jobs as transactions to eliminate different cleanup
> routing and added two new interim states;
>
> - WAITING
> - ABORTING
>
> to help make assertions about the valid transitions jobs can make. The
> ABORTING state helps make it clear when a job is allowed to fail (and
> emit QMP events related to such).
>
> The WAITING state is simply advisory to help a client know that a job is
> "finished" but cannot yet receive further instruction because of peers
> in a transaction. This helps me to add nice QMP errors for any verbs
> issued to such jobs. "Sorry pal, this job is waiting and can't hear you
> right now!"
>
> This kept the code cleaner than adding a bunch of very fragile boolean
> error-checking pathways in dozens of helper functions to help avoid
> illegal instructions on jobs not prepared to receive those instructions.
>
> So these two new states don't help accomplish (1) or (2) strictly, but
> they do facilitate the code additions that _do_ a lot less ugly.

Thanks!

Looks like a fine starting point for in-tree state machine documentation
:)

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management
  2018-04-18  7:25     ` Markus Armbruster
@ 2018-04-18 17:29       ` John Snow
  2018-04-18 17:42         ` Markus Armbruster
  0 siblings, 1 reply; 35+ messages in thread
From: John Snow @ 2018-04-18 17:29 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: kwolf, pkrempa, jtc, qemu-devel, qemu-block



On 04/18/2018 03:25 AM, Markus Armbruster wrote:
> John Snow <jsnow@redhat.com> writes:
> 
>> On 04/17/2018 09:44 AM, Markus Armbruster wrote:
>>> John Snow <jsnow@redhat.com> writes:
>>>
>>>> This series seeks to address two distinct but closely related issues
>>>> concerning the job management API.
>>>>
>>>> (1) For jobs that complete when a monitor is not attached and receiving
>>>>     events or notifications, there's no way to discern the job's final
>>>>     return code. Jobs must remain in the query list until dismissed
>>>>     for reliable management.
>>>>
>>>> (2) Jobs that change the block graph structure at an indeterminate point
>>>>     after the job starts compete with the management layer that relies
>>>>     on that graph structure to issue meaningful commands.
>>>>
>>>>     This structure should change only at the behest of the management
>>>>     API, and not asynchronously at unknown points in time. Before a job
>>>>     issues such changes, it must rely on explicit and synchronous
>>>>     confirmation from the management API.
>>>>
>>>> These changes are implemented by formalizing a State Transition Machine
>>>> for the BlockJob subsystem.
>>>>
>>>> Job States:
>>>>
>>>> UNDEFINED       Default state. Internal state only.
>>>> CREATED         Job has been created
>>>> RUNNING         Job has been started and is running
>>>> PAUSED          Job is not ready and has been paused
>>>> READY           Job is ready and is running
>>>> STANDBY         Job is ready and is paused
>>>>
>>>> WAITING         Job is waiting on peers in transaction
>>>> PENDING         Job is waiting on ACK from QMP
>>>> ABORTING        Job is aborting or has been cancelled
>>>> CONCLUDED       Job has finished and has a retcode available
>>>> NULL            Job is being dismantled. Internal state only.
>>>>
>>>> Job Verbs:
>>>>
>>
>> Backporting your quote up here:
>>
>>> For each job verb and job state: what's the new job state?
>>>
>>
>> That's not always 1:1, though I tried to address it in the commit messages.
> 
> Let me rephrase my question then.  For each job verb and job state: what
> are the possible new job states?  If there's more than one, what's the
> condition for each?
> 

Is my answer below not sufficient? Maybe you're asking "Can you write
this up in a formal document" instead, or did I miss explaining something?

> I appreciate commit messages explaining that, but having complete state
> machine documentation in one place (a comment or in docs/) would be
> nice, wouldn't it?
> 
>>>> CANCEL          Instructs a running job to terminate with error,
>>>>                 (Except when that job is READY, which produces no error.)
>>
>> CANCEL will take a job to either NULL... (this is the early abort
>> pathway, prior to the job being fully realized.)
>>
>> ...or to ABORTING (from CREATED once it has fully realized the job, or
>> from RUNNING, READY, WAITING, or PENDING.)
>>
>>>> PAUSE           Request a job to pause.
>>
>> issued to RUNNING or READY, transitions to PAUSED or STANDBY respectively.
>>
>>>> RESUME          Request a job to resume from a pause.
>>
>> issued to PAUSED or STANDBY, transitions to RUNNING or READY respectively.
>>
>>>> SET-SPEED       Change the speed limiting parameter of a job.
>>
>> No run state change.
>>
>>>> COMPLETE        Ask a READY job to finish and exit.
>>>>
>>
>> Issued to a READY job, transitions to WAITING.
>>
>>>> FINALIZE        Ask a PENDING job to perform its graph finalization.
>>
>> Issued to a PENDING job, transitions to CONCLUDED.
>>
>>>> DISMISS         Finish cleaning up an empty job.
>>>
>>
>> Issued to a CONCLUDED job, transitions to NULL.
>>
>>
>>>> And here's my stab at a diagram:
>>>>
>>>>                  +---------+
>>>>                  |UNDEFINED|
>>>>                  +--+------+
>>>>                     |
>>>>                  +--v----+
>>>>        +---------+CREATED+-----------------+
>>>>        |         +--+----+                 |
>>>>        |            |                      |
>>>>        |         +--+----+     +------+    |
>>>>        +---------+RUNNING<----->PAUSED|    |
>>>>        |         +--+-+--+     +------+    |
>>>>        |            | |                    |
>>>>        |            | +------------------+ |
>>>>        |            |                    | |
>>>>        |         +--v--+       +-------+ | |
>>>>        +---------+READY<------->STANDBY| | |
>>>>        |         +--+--+       +-------+ | |
>>>>        |            |                    | |
>>>>        |         +--v----+               | |
>>>>        +---------+WAITING<---------------+ |
>>>>        |         +--+----+                 |
>>>>        |            |                      |
>>>>        |         +--v----+                 |
>>>>        +---------+PENDING|                 |
>>>>        |         +--+----+                 |
>>>>        |            |                      |
>>>>     +--v-----+   +--v------+               |
>>>>     |ABORTING+--->CONCLUDED|               |
>>>>     +--------+   +--+------+               |
>>>>                     |                      |
>>>>                  +--v-+                    |
>>>>                  |NULL<--------------------+
>>>>                  +----+
>>>
>>> Is this diagram missing a few arrowheads?  E.g. on the edge between
>>> RUNNING and WAITING.
>>>
>>
>> Apparently yes. :\
>>
>> (Secretly fixed up in my reply.)
>>
>>> Might push the limits of ASCII art, but here goes anyway: can we label
>>> the arrows with job verbs?
>>>
>>
>> Can you recommend a tool to help me do that? I've been using asciiflow
>> infinity (http://asciiflow.com) and it's not very good, but I don't have
>> anything better.
> 
> I do my ASCII art in Emacs picture-mode.
> 
>>> Can you briefly explain how this state machine addresses (1) and (2)?
>>>
>>
>> (1) The CONCLUDED state allows jobs to persist in the job query list
>> after they would have disappeared in 2.11-era QEMU. This lets us query
>> for completion codes and to dismiss the job at our own leisure.
> 
> Got it.
> 
>> (2) The PENDING state allows jobs to wait in a nearly-completed state,
>> pending authorization from the QMP client to make graph changes.
>> Otherwise, the job has to asynchronously perform this cleanup and the
>> exact point in time is unknowable to the QMP client. By making a PENDING
>> state and a finalize callback (.prepare), we can make this portion of a
>> job's task synchronous.
> 
> This provides for jobs modifying the graph on job completion.  It
> doesn't provide for jobs modifying the graph while they run.  Fine with
> me; we're not aware of a use for messing with the graph in the middle of
> a job.
> 

I didn't consider this possibility. The concept could in theory be
expanded to arbitrary sync points, but I'm not going to worry about that
until the need arises.

>> "John, you added more than two states..."
>>
>> Yup, this was to help simplify the existing state machine, believe it or
>> not. I modeled all jobs as transactions to eliminate different cleanup
>> routing and added two new interim states;
>>
>> - WAITING
>> - ABORTING
>>
>> to help make assertions about the valid transitions jobs can make. The
>> ABORTING state helps make it clear when a job is allowed to fail (and
>> emit QMP events related to such).
>>
>> The WAITING state is simply advisory to help a client know that a job is
>> "finished" but cannot yet receive further instruction because of peers
>> in a transaction. This helps me to add nice QMP errors for any verbs
>> issued to such jobs. "Sorry pal, this job is waiting and can't hear you
>> right now!"
>>
>> This kept the code cleaner than adding a bunch of very fragile boolean
>> error-checking pathways in dozens of helper functions to help avoid
>> illegal instructions on jobs not prepared to receive those instructions.
>>
>> So these two new states don't help accomplish (1) or (2) strictly, but
>> they do facilitate the code additions that _do_ a lot less ugly.
> 

I really bungled that sentence.

> Thanks!
> 
> Looks like a fine starting point for in-tree state machine documentation
> :)
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management
  2018-04-18 17:29       ` John Snow
@ 2018-04-18 17:42         ` Markus Armbruster
  0 siblings, 0 replies; 35+ messages in thread
From: Markus Armbruster @ 2018-04-18 17:42 UTC (permalink / raw)
  To: John Snow; +Cc: kwolf, pkrempa, jtc, qemu-devel, qemu-block

John Snow <jsnow@redhat.com> writes:

> On 04/18/2018 03:25 AM, Markus Armbruster wrote:
>> John Snow <jsnow@redhat.com> writes:
>> 
>>> On 04/17/2018 09:44 AM, Markus Armbruster wrote:
>>>> John Snow <jsnow@redhat.com> writes:
>>>>
>>>>> This series seeks to address two distinct but closely related issues
>>>>> concerning the job management API.
>>>>>
>>>>> (1) For jobs that complete when a monitor is not attached and receiving
>>>>>     events or notifications, there's no way to discern the job's final
>>>>>     return code. Jobs must remain in the query list until dismissed
>>>>>     for reliable management.
>>>>>
>>>>> (2) Jobs that change the block graph structure at an indeterminate point
>>>>>     after the job starts compete with the management layer that relies
>>>>>     on that graph structure to issue meaningful commands.
>>>>>
>>>>>     This structure should change only at the behest of the management
>>>>>     API, and not asynchronously at unknown points in time. Before a job
>>>>>     issues such changes, it must rely on explicit and synchronous
>>>>>     confirmation from the management API.
>>>>>
>>>>> These changes are implemented by formalizing a State Transition Machine
>>>>> for the BlockJob subsystem.
>>>>>
>>>>> Job States:
>>>>>
>>>>> UNDEFINED       Default state. Internal state only.
>>>>> CREATED         Job has been created
>>>>> RUNNING         Job has been started and is running
>>>>> PAUSED          Job is not ready and has been paused
>>>>> READY           Job is ready and is running
>>>>> STANDBY         Job is ready and is paused
>>>>>
>>>>> WAITING         Job is waiting on peers in transaction
>>>>> PENDING         Job is waiting on ACK from QMP
>>>>> ABORTING        Job is aborting or has been cancelled
>>>>> CONCLUDED       Job has finished and has a retcode available
>>>>> NULL            Job is being dismantled. Internal state only.
>>>>>
>>>>> Job Verbs:
>>>>>
>>>
>>> Backporting your quote up here:
>>>
>>>> For each job verb and job state: what's the new job state?
>>>>
>>>
>>> That's not always 1:1, though I tried to address it in the commit messages.
>> 
>> Let me rephrase my question then.  For each job verb and job state: what
>> are the possible new job states?  If there's more than one, what's the
>> condition for each?
>> 
>
> Is my answer below not sufficient? Maybe you're asking "Can you write
> this up in a formal document" instead, or did I miss explaining something?

Your answer was fine.  I blame my one-pass reply writing.

Nevertheless:

>> I appreciate commit messages explaining that, but having complete state
>> machine documentation in one place (a comment or in docs/) would be
>> nice, wouldn't it?

Pretty-please?

>>>>> CANCEL          Instructs a running job to terminate with error,
>>>>>                 (Except when that job is READY, which produces no error.)
>>>
>>> CANCEL will take a job to either NULL... (this is the early abort
>>> pathway, prior to the job being fully realized.)
>>>
>>> ...or to ABORTING (from CREATED once it has fully realized the job, or
>>> from RUNNING, READY, WAITING, or PENDING.)
>>>
>>>>> PAUSE           Request a job to pause.
>>>
>>> issued to RUNNING or READY, transitions to PAUSED or STANDBY respectively.
>>>
>>>>> RESUME          Request a job to resume from a pause.
>>>
>>> issued to PAUSED or STANDBY, transitions to RUNNING or READY respectively.
>>>
>>>>> SET-SPEED       Change the speed limiting parameter of a job.
>>>
>>> No run state change.
>>>
>>>>> COMPLETE        Ask a READY job to finish and exit.
>>>>>
>>>
>>> Issued to a READY job, transitions to WAITING.
>>>
>>>>> FINALIZE        Ask a PENDING job to perform its graph finalization.
>>>
>>> Issued to a PENDING job, transitions to CONCLUDED.
>>>
>>>>> DISMISS         Finish cleaning up an empty job.
>>>>
>>>
>>> Issued to a CONCLUDED job, transitions to NULL.
>>>
>>>
>>>>> And here's my stab at a diagram:
>>>>>
>>>>>                  +---------+
>>>>>                  |UNDEFINED|
>>>>>                  +--+------+
>>>>>                     |
>>>>>                  +--v----+
>>>>>        +---------+CREATED+-----------------+
>>>>>        |         +--+----+                 |
>>>>>        |            |                      |
>>>>>        |         +--+----+     +------+    |
>>>>>        +---------+RUNNING<----->PAUSED|    |
>>>>>        |         +--+-+--+     +------+    |
>>>>>        |            | |                    |
>>>>>        |            | +------------------+ |
>>>>>        |            |                    | |
>>>>>        |         +--v--+       +-------+ | |
>>>>>        +---------+READY<------->STANDBY| | |
>>>>>        |         +--+--+       +-------+ | |
>>>>>        |            |                    | |
>>>>>        |         +--v----+               | |
>>>>>        +---------+WAITING<---------------+ |
>>>>>        |         +--+----+                 |
>>>>>        |            |                      |
>>>>>        |         +--v----+                 |
>>>>>        +---------+PENDING|                 |
>>>>>        |         +--+----+                 |
>>>>>        |            |                      |
>>>>>     +--v-----+   +--v------+               |
>>>>>     |ABORTING+--->CONCLUDED|               |
>>>>>     +--------+   +--+------+               |
>>>>>                     |                      |
>>>>>                  +--v-+                    |
>>>>>                  |NULL<--------------------+
>>>>>                  +----+
>>>>
>>>> Is this diagram missing a few arrowheads?  E.g. on the edge between
>>>> RUNNING and WAITING.
>>>>
>>>
>>> Apparently yes. :\
>>>
>>> (Secretly fixed up in my reply.)
>>>
>>>> Might push the limits of ASCII art, but here goes anyway: can we label
>>>> the arrows with job verbs?
>>>>
>>>
>>> Can you recommend a tool to help me do that? I've been using asciiflow
>>> infinity (http://asciiflow.com) and it's not very good, but I don't have
>>> anything better.
>> 
>> I do my ASCII art in Emacs picture-mode.
>> 
>>>> Can you briefly explain how this state machine addresses (1) and (2)?
>>>>
>>>
>>> (1) The CONCLUDED state allows jobs to persist in the job query list
>>> after they would have disappeared in 2.11-era QEMU. This lets us query
>>> for completion codes and to dismiss the job at our own leisure.
>> 
>> Got it.
>> 
>>> (2) The PENDING state allows jobs to wait in a nearly-completed state,
>>> pending authorization from the QMP client to make graph changes.
>>> Otherwise, the job has to asynchronously perform this cleanup and the
>>> exact point in time is unknowable to the QMP client. By making a PENDING
>>> state and a finalize callback (.prepare), we can make this portion of a
>>> job's task synchronous.
>> 
>> This provides for jobs modifying the graph on job completion.  It
>> doesn't provide for jobs modifying the graph while they run.  Fine with
>> me; we're not aware of a use for messing with the graph in the middle of
>> a job.
>> 
>
> I didn't consider this possibility. The concept could in theory be
> expanded to arbitrary sync points, but I'm not going to worry about that
> until the need arises.

Makes sense.

>>> "John, you added more than two states..."
>>>
>>> Yup, this was to help simplify the existing state machine, believe it or
>>> not. I modeled all jobs as transactions to eliminate different cleanup
>>> routing and added two new interim states;
>>>
>>> - WAITING
>>> - ABORTING
>>>
>>> to help make assertions about the valid transitions jobs can make. The
>>> ABORTING state helps make it clear when a job is allowed to fail (and
>>> emit QMP events related to such).
>>>
>>> The WAITING state is simply advisory to help a client know that a job is
>>> "finished" but cannot yet receive further instruction because of peers
>>> in a transaction. This helps me to add nice QMP errors for any verbs
>>> issued to such jobs. "Sorry pal, this job is waiting and can't hear you
>>> right now!"
>>>
>>> This kept the code cleaner than adding a bunch of very fragile boolean
>>> error-checking pathways in dozens of helper functions to help avoid
>>> illegal instructions on jobs not prepared to receive those instructions.
>>>
>>> So these two new states don't help accomplish (1) or (2) strictly, but
>>> they do facilitate the code additions that _do_ a lot less ugly.
>> 
>
> I really bungled that sentence.

No problem, I got it anyway :)

>> Thanks!
>> 
>> Looks like a fine starting point for in-tree state machine documentation
>> :)

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2018-04-18 17:42 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-10  8:27 [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 01/21] blockjobs: fix set-speed kick John Snow
2018-03-12 18:13   ` Jeff Cody
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 02/21] blockjobs: model single jobs as transactions John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 03/21] Blockjobs: documentation touchup John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 04/21] blockjobs: add status enum John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 05/21] blockjobs: add state transition table John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 06/21] iotests: add pause_wait John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 07/21] blockjobs: add block_job_verb permission table John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 08/21] blockjobs: add ABORTING state John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 09/21] blockjobs: add CONCLUDED state John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 10/21] blockjobs: add NULL state John Snow
2018-03-12 15:28   ` Kevin Wolf
2018-03-12 15:41     ` John Snow
2018-03-12 16:07       ` Kevin Wolf
2018-03-12 16:23         ` John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 11/21] blockjobs: add block_job_dismiss John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 12/21] blockjobs: ensure abort is called for cancelled jobs John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 13/21] blockjobs: add commit, abort, clean helpers John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 14/21] blockjobs: add block_job_txn_apply function John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 15/21] blockjobs: add prepare callback John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 16/21] blockjobs: add waiting status John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 17/21] blockjobs: add PENDING status and event John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 18/21] blockjobs: add block-job-finalize John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 19/21] blockjobs: Expose manual property John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 20/21] iotests: test manual job dismissal John Snow
2018-03-10  8:27 ` [Qemu-devel] [PATCH v5 21/21] tests/test-blockjob: test cancellations John Snow
2018-03-12 16:23 ` [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management Kevin Wolf
2018-03-12 17:51 ` no-reply
2018-04-17 13:44 ` Markus Armbruster
2018-04-17 13:47   ` Markus Armbruster
2018-04-17 16:56   ` John Snow
2018-04-18  7:25     ` Markus Armbruster
2018-04-18 17:29       ` John Snow
2018-04-18 17:42         ` Markus Armbruster

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.