All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/10] mirror: cancel nbd reconnect
@ 2021-02-05 16:37 Vladimir Sementsov-Ogievskiy
  2021-02-05 16:37 ` [PATCH v2 01/10] block: add new BlockDriver handler: bdrv_cancel_in_flight Vladimir Sementsov-Ogievskiy
                   ` (10 more replies)
  0 siblings, 11 replies; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 16:37 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, fam, stefanha, mreitz, kwolf, vsementsov, jsnow, den

Hi all!

The problem

Assume we have mirror job with nbd target node with enabled reconnect.
Connection failed. So, all current requests to nbd node are waiting for
nbd driver to reconnect. And they will wait for reconnect-delay time
specified in nbd blockdev options. This timeout may be long enough, for
example, we in Virtuozzo use 300 seconds by default.

So, if at this moment user tries to cancel the job, job will wait for
its in-flight requests to finish up to 300 seconds. From the user point
of view, cancelling the job takes a long time. Bad.

Solution

Let's just cancel "waiting for reconnect in in-flight request coroutines"
on mirror (and backup) cancel. Welcome the series below.

v2: wording, rebase on master, add Eric's r-bs, update test-output in
    last commit

Vladimir Sementsov-Ogievskiy (10):
  block: add new BlockDriver handler: bdrv_cancel_in_flight
  block/nbd: implement .bdrv_cancel_in_flight
  block/raw-format: implement .bdrv_cancel_in_flight handler
  job: add .cancel handler for the driver
  block/mirror: implement .cancel job handler
  iotests/264: move to python unittest
  iotests.py: qemu_nbd_popen: remove pid file after use
  iotests/264: add mirror-cancel test-case
  block/backup: implement .cancel job handler
  iotests/264: add backup-cancel test-case

 include/block/block.h         |   3 +
 include/block/block_int.h     |   9 +++
 include/qemu/job.h            |   5 ++
 block/backup.c                |  10 +++
 block/io.c                    |  11 +++
 block/mirror.c                |   9 +++
 block/nbd.c                   |  15 ++++
 block/raw-format.c            |   6 ++
 job.c                         |   3 +
 tests/qemu-iotests/264        | 140 ++++++++++++++++++++++------------
 tests/qemu-iotests/264.out    |  20 ++---
 tests/qemu-iotests/iotests.py |   6 +-
 12 files changed, 172 insertions(+), 65 deletions(-)

-- 
2.29.2



^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2 01/10] block: add new BlockDriver handler: bdrv_cancel_in_flight
  2021-02-05 16:37 [PATCH v2 00/10] mirror: cancel nbd reconnect Vladimir Sementsov-Ogievskiy
@ 2021-02-05 16:37 ` Vladimir Sementsov-Ogievskiy
  2021-02-05 16:37 ` [PATCH v2 02/10] block/nbd: implement .bdrv_cancel_in_flight Vladimir Sementsov-Ogievskiy
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 16:37 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, fam, stefanha, mreitz, kwolf, vsementsov, jsnow, den

It will be used to stop retrying NBD requests on mirror cancel.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 include/block/block.h     |  3 +++
 include/block/block_int.h |  9 +++++++++
 block/io.c                | 11 +++++++++++
 3 files changed, 23 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index 0a9f2c187c..2f2698074e 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -849,4 +849,7 @@ int coroutine_fn bdrv_co_copy_range(BdrvChild *src, int64_t src_offset,
                                     BdrvChild *dst, int64_t dst_offset,
                                     int64_t bytes, BdrvRequestFlags read_flags,
                                     BdrvRequestFlags write_flags);
+
+void bdrv_cancel_in_flight(BlockDriverState *bs);
+
 #endif
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 22a2789d35..88e4111939 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -352,6 +352,15 @@ struct BlockDriver {
         bool want_zero, int64_t offset, int64_t bytes, int64_t *pnum,
         int64_t *map, BlockDriverState **file);
 
+    /*
+     * This informs the driver that we are no longer interested in the result
+     * of in-flight requests, so don't waste the time if possible.
+     *
+     * One example usage is to avoid waiting for an nbd target node reconnect
+     * timeout during job-cancel.
+     */
+    void (*bdrv_cancel_in_flight)(BlockDriverState *bs);
+
     /*
      * Invalidate any cached meta-data.
      */
diff --git a/block/io.c b/block/io.c
index b0435ed670..ca2dca3007 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3460,3 +3460,14 @@ out:
 
     return ret;
 }
+
+void bdrv_cancel_in_flight(BlockDriverState *bs)
+{
+    if (!bs || !bs->drv) {
+        return;
+    }
+
+    if (bs->drv->bdrv_cancel_in_flight) {
+        bs->drv->bdrv_cancel_in_flight(bs);
+    }
+}
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 02/10] block/nbd: implement .bdrv_cancel_in_flight
  2021-02-05 16:37 [PATCH v2 00/10] mirror: cancel nbd reconnect Vladimir Sementsov-Ogievskiy
  2021-02-05 16:37 ` [PATCH v2 01/10] block: add new BlockDriver handler: bdrv_cancel_in_flight Vladimir Sementsov-Ogievskiy
@ 2021-02-05 16:37 ` Vladimir Sementsov-Ogievskiy
  2021-02-12 17:14   ` Eric Blake
  2021-02-05 16:37 ` [PATCH v2 03/10] block/raw-format: implement .bdrv_cancel_in_flight handler Vladimir Sementsov-Ogievskiy
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 16:37 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, fam, stefanha, mreitz, kwolf, vsementsov, jsnow, den

Just stop waiting for connection in existing requests.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/nbd.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index b3cbbeb4b0..c26dc5a54f 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -2458,6 +2458,18 @@ static const char *const nbd_strong_runtime_opts[] = {
     NULL
 };
 
+static void nbd_cancel_in_flight(BlockDriverState *bs)
+{
+    BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
+
+    reconnect_delay_timer_del(s);
+
+    if (s->state == NBD_CLIENT_CONNECTING_WAIT) {
+        s->state = NBD_CLIENT_CONNECTING_NOWAIT;
+        qemu_co_queue_restart_all(&s->free_sema);
+    }
+}
+
 static BlockDriver bdrv_nbd = {
     .format_name                = "nbd",
     .protocol_name              = "nbd",
@@ -2484,6 +2496,7 @@ static BlockDriver bdrv_nbd = {
     .bdrv_co_block_status       = nbd_client_co_block_status,
     .bdrv_dirname               = nbd_dirname,
     .strong_runtime_opts        = nbd_strong_runtime_opts,
+    .bdrv_cancel_in_flight      = nbd_cancel_in_flight,
 };
 
 static BlockDriver bdrv_nbd_tcp = {
@@ -2512,6 +2525,7 @@ static BlockDriver bdrv_nbd_tcp = {
     .bdrv_co_block_status       = nbd_client_co_block_status,
     .bdrv_dirname               = nbd_dirname,
     .strong_runtime_opts        = nbd_strong_runtime_opts,
+    .bdrv_cancel_in_flight      = nbd_cancel_in_flight,
 };
 
 static BlockDriver bdrv_nbd_unix = {
@@ -2540,6 +2554,7 @@ static BlockDriver bdrv_nbd_unix = {
     .bdrv_co_block_status       = nbd_client_co_block_status,
     .bdrv_dirname               = nbd_dirname,
     .strong_runtime_opts        = nbd_strong_runtime_opts,
+    .bdrv_cancel_in_flight      = nbd_cancel_in_flight,
 };
 
 static void bdrv_nbd_init(void)
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 03/10] block/raw-format: implement .bdrv_cancel_in_flight handler
  2021-02-05 16:37 [PATCH v2 00/10] mirror: cancel nbd reconnect Vladimir Sementsov-Ogievskiy
  2021-02-05 16:37 ` [PATCH v2 01/10] block: add new BlockDriver handler: bdrv_cancel_in_flight Vladimir Sementsov-Ogievskiy
  2021-02-05 16:37 ` [PATCH v2 02/10] block/nbd: implement .bdrv_cancel_in_flight Vladimir Sementsov-Ogievskiy
@ 2021-02-05 16:37 ` Vladimir Sementsov-Ogievskiy
  2021-02-05 16:37 ` [PATCH v2 04/10] job: add .cancel handler for the driver Vladimir Sementsov-Ogievskiy
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 16:37 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, fam, stefanha, mreitz, kwolf, vsementsov, jsnow, den

We are going to cancel in-flight requests on mirror nbd target on job
cancel. Still nbd is often used not directly but as raw-format child.
So, add pass-through handler here.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/raw-format.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/block/raw-format.c b/block/raw-format.c
index 42ec50802b..7717578ed6 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -575,6 +575,11 @@ static const char *const raw_strong_runtime_opts[] = {
     NULL
 };
 
+static void raw_cancel_in_flight(BlockDriverState *bs)
+{
+    bdrv_cancel_in_flight(bs->file->bs);
+}
+
 BlockDriver bdrv_raw = {
     .format_name          = "raw",
     .instance_size        = sizeof(BDRVRawState),
@@ -608,6 +613,7 @@ BlockDriver bdrv_raw = {
     .bdrv_has_zero_init   = &raw_has_zero_init,
     .strong_runtime_opts  = raw_strong_runtime_opts,
     .mutable_opts         = mutable_opts,
+    .bdrv_cancel_in_flight = raw_cancel_in_flight,
 };
 
 static void bdrv_raw_init(void)
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 04/10] job: add .cancel handler for the driver
  2021-02-05 16:37 [PATCH v2 00/10] mirror: cancel nbd reconnect Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2021-02-05 16:37 ` [PATCH v2 03/10] block/raw-format: implement .bdrv_cancel_in_flight handler Vladimir Sementsov-Ogievskiy
@ 2021-02-05 16:37 ` Vladimir Sementsov-Ogievskiy
  2021-02-12 17:21   ` Eric Blake
  2021-02-05 16:37 ` [PATCH v2 05/10] block/mirror: implement .cancel job handler Vladimir Sementsov-Ogievskiy
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 16:37 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, fam, stefanha, mreitz, kwolf, vsementsov, jsnow, den

To be used in mirror in the following commit to cancel in-flight io on
target to not waste the time.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/qemu/job.h | 5 +++++
 job.c              | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/include/qemu/job.h b/include/qemu/job.h
index 32aabb1c60..efc6fa7544 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -251,6 +251,11 @@ struct JobDriver {
      */
     void (*clean)(Job *job);
 
+    /**
+     * If the callback is not NULL, it will be invoked in job_cancel_async
+     */
+    void (*cancel)(Job *job);
+
 
     /** Called when the job is freed */
     void (*free)(Job *job);
diff --git a/job.c b/job.c
index 3aaaebafe2..289edee143 100644
--- a/job.c
+++ b/job.c
@@ -715,6 +715,9 @@ static int job_finalize_single(Job *job)
 
 static void job_cancel_async(Job *job, bool force)
 {
+    if (job->driver->cancel) {
+        job->driver->cancel(job);
+    }
     if (job->user_paused) {
         /* Do not call job_enter here, the caller will handle it.  */
         if (job->driver->user_resume) {
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 05/10] block/mirror: implement .cancel job handler
  2021-02-05 16:37 [PATCH v2 00/10] mirror: cancel nbd reconnect Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2021-02-05 16:37 ` [PATCH v2 04/10] job: add .cancel handler for the driver Vladimir Sementsov-Ogievskiy
@ 2021-02-05 16:37 ` Vladimir Sementsov-Ogievskiy
  2021-02-05 16:37 ` [PATCH v2 06/10] iotests/264: move to python unittest Vladimir Sementsov-Ogievskiy
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 16:37 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, fam, stefanha, mreitz, kwolf, vsementsov, jsnow, den

Cancel in-flight io on target to not waste the time.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/mirror.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/block/mirror.c b/block/mirror.c
index 8e1ad6eceb..9faffe4707 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1179,6 +1179,14 @@ static bool mirror_drained_poll(BlockJob *job)
     return !!s->in_flight;
 }
 
+static void mirror_cancel(Job *job)
+{
+    MirrorBlockJob *s = container_of(job, MirrorBlockJob, common.job);
+    BlockDriverState *target = blk_bs(s->target);
+
+    bdrv_cancel_in_flight(target);
+}
+
 static const BlockJobDriver mirror_job_driver = {
     .job_driver = {
         .instance_size          = sizeof(MirrorBlockJob),
@@ -1190,6 +1198,7 @@ static const BlockJobDriver mirror_job_driver = {
         .abort                  = mirror_abort,
         .pause                  = mirror_pause,
         .complete               = mirror_complete,
+        .cancel                 = mirror_cancel,
     },
     .drained_poll           = mirror_drained_poll,
 };
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 06/10] iotests/264: move to python unittest
  2021-02-05 16:37 [PATCH v2 00/10] mirror: cancel nbd reconnect Vladimir Sementsov-Ogievskiy
                   ` (4 preceding siblings ...)
  2021-02-05 16:37 ` [PATCH v2 05/10] block/mirror: implement .cancel job handler Vladimir Sementsov-Ogievskiy
@ 2021-02-05 16:37 ` Vladimir Sementsov-Ogievskiy
  2021-02-05 16:37 ` [PATCH v2 07/10] iotests.py: qemu_nbd_popen: remove pid file after use Vladimir Sementsov-Ogievskiy
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 16:37 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, fam, stefanha, mreitz, kwolf, vsementsov, jsnow, den

We are going to add more test cases, so use the library supporting test
cases.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 tests/qemu-iotests/264     | 93 ++++++++++++++++++++++----------------
 tests/qemu-iotests/264.out | 20 ++------
 2 files changed, 58 insertions(+), 55 deletions(-)

diff --git a/tests/qemu-iotests/264 b/tests/qemu-iotests/264
index e725cefd47..6feeaa4056 100755
--- a/tests/qemu-iotests/264
+++ b/tests/qemu-iotests/264
@@ -20,13 +20,10 @@
 #
 
 import time
+import os
 
 import iotests
-from iotests import qemu_img_create, file_path, qemu_nbd_popen, log
-
-iotests.script_initialize(
-    supported_fmts=['qcow2'],
-)
+from iotests import qemu_img_create, file_path, qemu_nbd_popen
 
 disk_a, disk_b, nbd_sock = file_path('disk_a', 'disk_b', 'nbd-sock')
 nbd_uri = 'nbd+unix:///?socket=' + nbd_sock
@@ -34,46 +31,62 @@ size = 5 * 1024 * 1024
 wait_limit = 3.0
 wait_step = 0.2
 
-qemu_img_create('-f', iotests.imgfmt, disk_a, str(size))
-qemu_img_create('-f', iotests.imgfmt, disk_b, str(size))
 
-with qemu_nbd_popen('-k', nbd_sock, '-f', iotests.imgfmt, disk_b):
-    vm = iotests.VM().add_drive(disk_a)
-    vm.launch()
-    vm.hmp_qemu_io('drive0', 'write 0 {}'.format(size))
+class TestNbdReconnect(iotests.QMPTestCase):
+    def setUp(self):
+        qemu_img_create('-f', iotests.imgfmt, disk_a, str(size))
+        qemu_img_create('-f', iotests.imgfmt, disk_b, str(size))
+        self.vm = iotests.VM().add_drive(disk_a)
+        self.vm.launch()
+        self.vm.hmp_qemu_io('drive0', 'write 0 {}'.format(size))
+
+    def tearDown(self):
+        self.vm.shutdown()
+        os.remove(disk_a)
+        os.remove(disk_b)
+
+    def test(self):
+        with qemu_nbd_popen('-k', nbd_sock, '-f', iotests.imgfmt, disk_b):
+            result = self.vm.qmp('blockdev-add',
+                                 **{'node_name': 'backup0',
+                                    'driver': 'raw',
+                                    'file': {'driver': 'nbd',
+                                             'server': {'type': 'unix',
+                                                        'path': nbd_sock},
+                                             'reconnect-delay': 10}})
+            self.assert_qmp(result, 'return', {})
+            result = self.vm.qmp('blockdev-backup', device='drive0',
+                                 sync='full', target='backup0',
+                                 speed=(1 * 1024 * 1024))
+            self.assert_qmp(result, 'return', {})
+
+            # Wait for some progress
+            t = 0.0
+            while t < wait_limit:
+                jobs = self.vm.qmp('query-block-jobs')['return']
+                if jobs and jobs[0]['offset'] > 0:
+                    break
+                time.sleep(wait_step)
+                t += wait_step
 
-    vm.qmp_log('blockdev-add', filters=[iotests.filter_qmp_testfiles],
-               **{'node_name': 'backup0',
-                  'driver': 'raw',
-                  'file': {'driver': 'nbd',
-                           'server': {'type': 'unix', 'path': nbd_sock},
-                           'reconnect-delay': 10}})
-    vm.qmp_log('blockdev-backup', device='drive0', sync='full',
-               target='backup0', speed=(1 * 1024 * 1024))
+            self.assertTrue(jobs and jobs[0]['offset'] > 0)  # job started
 
-    # Wait for some progress
-    t = 0.0
-    while t < wait_limit:
-        jobs = vm.qmp('query-block-jobs')['return']
-        if jobs and jobs[0]['offset'] > 0:
-            break
-        time.sleep(wait_step)
-        t += wait_step
+        jobs = self.vm.qmp('query-block-jobs')['return']
+        # Check that job is still in progress
+        self.assertTrue(jobs and jobs[0]['offset'] < jobs[0]['len'])
 
-    if jobs and jobs[0]['offset'] > 0:
-        log('Backup job is started')
+        result = self.vm.qmp('block-job-set-speed', device='drive0', speed=0)
+        self.assert_qmp(result, 'return', {})
 
-jobs = vm.qmp('query-block-jobs')['return']
-if jobs and jobs[0]['offset'] < jobs[0]['len']:
-    log('Backup job is still in progress')
+        # Emulate server down time for 1 second
+        time.sleep(1)
 
-vm.qmp_log('block-job-set-speed', device='drive0', speed=0)
+        with qemu_nbd_popen('-k', nbd_sock, '-f', iotests.imgfmt, disk_b):
+            e = self.vm.event_wait('BLOCK_JOB_COMPLETED')
+            self.assertEqual(e['data']['offset'], size)
+            result = self.vm.qmp('blockdev-del', node_name='backup0')
+            self.assert_qmp(result, 'return', {})
 
-# Emulate server down time for 1 second
-time.sleep(1)
 
-with qemu_nbd_popen('-k', nbd_sock, '-f', iotests.imgfmt, disk_b):
-    e = vm.event_wait('BLOCK_JOB_COMPLETED')
-    log('Backup completed: {}'.format(e['data']['offset']))
-    vm.qmp_log('blockdev-del', node_name='backup0')
-    vm.shutdown()
+if __name__ == '__main__':
+    iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/264.out b/tests/qemu-iotests/264.out
index c45b1e81ef..ae1213e6f8 100644
--- a/tests/qemu-iotests/264.out
+++ b/tests/qemu-iotests/264.out
@@ -1,15 +1,5 @@
-Start NBD server
-{"execute": "blockdev-add", "arguments": {"driver": "raw", "file": {"driver": "nbd", "reconnect-delay": 10, "server": {"path": "TEST_DIR/PID-nbd-sock", "type": "unix"}}, "node-name": "backup0"}}
-{"return": {}}
-{"execute": "blockdev-backup", "arguments": {"device": "drive0", "speed": 1048576, "sync": "full", "target": "backup0"}}
-{"return": {}}
-Backup job is started
-Kill NBD server
-Backup job is still in progress
-{"execute": "block-job-set-speed", "arguments": {"device": "drive0", "speed": 0}}
-{"return": {}}
-Start NBD server
-Backup completed: 5242880
-{"execute": "blockdev-del", "arguments": {"node-name": "backup0"}}
-{"return": {}}
-Kill NBD server
+.
+----------------------------------------------------------------------
+Ran 1 tests
+
+OK
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 07/10] iotests.py: qemu_nbd_popen: remove pid file after use
  2021-02-05 16:37 [PATCH v2 00/10] mirror: cancel nbd reconnect Vladimir Sementsov-Ogievskiy
                   ` (5 preceding siblings ...)
  2021-02-05 16:37 ` [PATCH v2 06/10] iotests/264: move to python unittest Vladimir Sementsov-Ogievskiy
@ 2021-02-05 16:37 ` Vladimir Sementsov-Ogievskiy
  2021-02-05 16:37 ` [PATCH v2 08/10] iotests/264: add mirror-cancel test-case Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 16:37 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, fam, stefanha, mreitz, kwolf, vsementsov, jsnow, den

To not interfere with other qemu_nbd_popen() calls in same test.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 tests/qemu-iotests/iotests.py | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 00be68eca3..4e758308f2 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -296,7 +296,9 @@ def qemu_nbd_list_log(*args: str) -> str:
 @contextmanager
 def qemu_nbd_popen(*args):
     '''Context manager running qemu-nbd within the context'''
-    pid_file = file_path("pid")
+    pid_file = file_path("qemu_nbd_popen-nbd-pid-file")
+
+    assert not os.path.exists(pid_file)
 
     cmd = list(qemu_nbd_args)
     cmd.extend(('--persistent', '--pid-file', pid_file))
@@ -314,6 +316,8 @@ def qemu_nbd_popen(*args):
             time.sleep(0.01)
         yield
     finally:
+        if os.path.exists(pid_file):
+            os.remove(pid_file)
         log('Kill NBD server')
         p.kill()
         p.wait()
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 08/10] iotests/264: add mirror-cancel test-case
  2021-02-05 16:37 [PATCH v2 00/10] mirror: cancel nbd reconnect Vladimir Sementsov-Ogievskiy
                   ` (6 preceding siblings ...)
  2021-02-05 16:37 ` [PATCH v2 07/10] iotests.py: qemu_nbd_popen: remove pid file after use Vladimir Sementsov-Ogievskiy
@ 2021-02-05 16:37 ` Vladimir Sementsov-Ogievskiy
  2022-07-21  8:50   ` Thomas Huth
  2021-02-05 16:37 ` [PATCH v2 09/10] block/backup: implement .cancel job handler Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 16:37 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, fam, stefanha, mreitz, kwolf, vsementsov, jsnow, den

Check that cancel doesn't wait for 10s of nbd reconnect timeout.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 tests/qemu-iotests/264     | 38 ++++++++++++++++++++++++++++++--------
 tests/qemu-iotests/264.out |  4 ++--
 2 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/tests/qemu-iotests/264 b/tests/qemu-iotests/264
index 6feeaa4056..347e53add5 100755
--- a/tests/qemu-iotests/264
+++ b/tests/qemu-iotests/264
@@ -27,25 +27,26 @@ from iotests import qemu_img_create, file_path, qemu_nbd_popen
 
 disk_a, disk_b, nbd_sock = file_path('disk_a', 'disk_b', 'nbd-sock')
 nbd_uri = 'nbd+unix:///?socket=' + nbd_sock
-size = 5 * 1024 * 1024
 wait_limit = 3.0
 wait_step = 0.2
 
 
 class TestNbdReconnect(iotests.QMPTestCase):
-    def setUp(self):
-        qemu_img_create('-f', iotests.imgfmt, disk_a, str(size))
-        qemu_img_create('-f', iotests.imgfmt, disk_b, str(size))
+    def init_vm(self, disk_size):
+        qemu_img_create('-f', iotests.imgfmt, disk_a, str(disk_size))
+        qemu_img_create('-f', iotests.imgfmt, disk_b, str(disk_size))
         self.vm = iotests.VM().add_drive(disk_a)
         self.vm.launch()
-        self.vm.hmp_qemu_io('drive0', 'write 0 {}'.format(size))
+        self.vm.hmp_qemu_io('drive0', 'write 0 {}'.format(disk_size))
 
     def tearDown(self):
         self.vm.shutdown()
         os.remove(disk_a)
         os.remove(disk_b)
 
-    def test(self):
+    def start_job(self, job):
+        """Stat job with nbd target and kill the server"""
+        assert job in ('blockdev-backup', 'blockdev-mirror')
         with qemu_nbd_popen('-k', nbd_sock, '-f', iotests.imgfmt, disk_b):
             result = self.vm.qmp('blockdev-add',
                                  **{'node_name': 'backup0',
@@ -55,7 +56,7 @@ class TestNbdReconnect(iotests.QMPTestCase):
                                                         'path': nbd_sock},
                                              'reconnect-delay': 10}})
             self.assert_qmp(result, 'return', {})
-            result = self.vm.qmp('blockdev-backup', device='drive0',
+            result = self.vm.qmp(job, device='drive0',
                                  sync='full', target='backup0',
                                  speed=(1 * 1024 * 1024))
             self.assert_qmp(result, 'return', {})
@@ -73,7 +74,8 @@ class TestNbdReconnect(iotests.QMPTestCase):
 
         jobs = self.vm.qmp('query-block-jobs')['return']
         # Check that job is still in progress
-        self.assertTrue(jobs and jobs[0]['offset'] < jobs[0]['len'])
+        self.assertTrue(jobs)
+        self.assertTrue(jobs[0]['offset'] < jobs[0]['len'])
 
         result = self.vm.qmp('block-job-set-speed', device='drive0', speed=0)
         self.assert_qmp(result, 'return', {})
@@ -81,12 +83,32 @@ class TestNbdReconnect(iotests.QMPTestCase):
         # Emulate server down time for 1 second
         time.sleep(1)
 
+    def test_backup(self):
+        size = 5 * 1024 * 1024
+        self.init_vm(size)
+        self.start_job('blockdev-backup')
+
         with qemu_nbd_popen('-k', nbd_sock, '-f', iotests.imgfmt, disk_b):
             e = self.vm.event_wait('BLOCK_JOB_COMPLETED')
             self.assertEqual(e['data']['offset'], size)
             result = self.vm.qmp('blockdev-del', node_name='backup0')
             self.assert_qmp(result, 'return', {})
 
+    def test_mirror_cancel(self):
+        # Mirror speed limit doesn't work well enough, it seems that mirror
+        # will run many parallel requests anyway. MAX_IN_FLIGHT is 16 and
+        # MAX_IO_BYTES is 1M in mirror.c, so let's use 20M disk.
+        self.init_vm(20 * 1024 * 1024)
+        self.start_job('blockdev-mirror')
+
+        result = self.vm.qmp('block-job-cancel', device='drive0')
+        self.assert_qmp(result, 'return', {})
+
+        start_t = time.time()
+        self.vm.event_wait('BLOCK_JOB_CANCELLED')
+        delta_t = time.time() - start_t
+        self.assertTrue(delta_t < 2.0)
+
 
 if __name__ == '__main__':
     iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/264.out b/tests/qemu-iotests/264.out
index ae1213e6f8..fbc63e62f8 100644
--- a/tests/qemu-iotests/264.out
+++ b/tests/qemu-iotests/264.out
@@ -1,5 +1,5 @@
-.
+..
 ----------------------------------------------------------------------
-Ran 1 tests
+Ran 2 tests
 
 OK
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 09/10] block/backup: implement .cancel job handler
  2021-02-05 16:37 [PATCH v2 00/10] mirror: cancel nbd reconnect Vladimir Sementsov-Ogievskiy
                   ` (7 preceding siblings ...)
  2021-02-05 16:37 ` [PATCH v2 08/10] iotests/264: add mirror-cancel test-case Vladimir Sementsov-Ogievskiy
@ 2021-02-05 16:37 ` Vladimir Sementsov-Ogievskiy
  2021-02-05 16:37 ` [PATCH v2 10/10] iotests/264: add backup-cancel test-case Vladimir Sementsov-Ogievskiy
  2021-02-12 18:21 ` [PATCH v2 00/10] mirror: cancel nbd reconnect Eric Blake
  10 siblings, 0 replies; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 16:37 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, fam, stefanha, mreitz, kwolf, vsementsov, jsnow, den

Cancel in-flight io on target to not waste the time.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/backup.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/block/backup.c b/block/backup.c
index cc525d5544..94e6dcd72e 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -35,6 +35,7 @@ typedef struct BackupBlockJob {
     BlockJob common;
     BlockDriverState *backup_top;
     BlockDriverState *source_bs;
+    BlockDriverState *target_bs;
 
     BdrvDirtyBitmap *sync_bitmap;
 
@@ -329,6 +330,13 @@ static void coroutine_fn backup_set_speed(BlockJob *job, int64_t speed)
     }
 }
 
+static void backup_cancel(Job *job)
+{
+    BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
+
+    bdrv_cancel_in_flight(s->target_bs);
+}
+
 static const BlockJobDriver backup_job_driver = {
     .job_driver = {
         .instance_size          = sizeof(BackupBlockJob),
@@ -340,6 +348,7 @@ static const BlockJobDriver backup_job_driver = {
         .abort                  = backup_abort,
         .clean                  = backup_clean,
         .pause                  = backup_pause,
+        .cancel                 = backup_cancel,
     },
     .set_speed = backup_set_speed,
 };
@@ -528,6 +537,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
 
     job->backup_top = backup_top;
     job->source_bs = bs;
+    job->target_bs = target;
     job->on_source_error = on_source_error;
     job->on_target_error = on_target_error;
     job->sync_mode = sync_mode;
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 10/10] iotests/264: add backup-cancel test-case
  2021-02-05 16:37 [PATCH v2 00/10] mirror: cancel nbd reconnect Vladimir Sementsov-Ogievskiy
                   ` (8 preceding siblings ...)
  2021-02-05 16:37 ` [PATCH v2 09/10] block/backup: implement .cancel job handler Vladimir Sementsov-Ogievskiy
@ 2021-02-05 16:37 ` Vladimir Sementsov-Ogievskiy
  2021-02-12 18:21 ` [PATCH v2 00/10] mirror: cancel nbd reconnect Eric Blake
  10 siblings, 0 replies; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 16:37 UTC (permalink / raw)
  To: qemu-block
  Cc: qemu-devel, eblake, fam, stefanha, mreitz, kwolf, vsementsov, jsnow, den

Check that cancel doesn't wait for 10s of nbd reconnect timeout.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 tests/qemu-iotests/264     | 21 ++++++++++++++-------
 tests/qemu-iotests/264.out |  4 ++--
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/tests/qemu-iotests/264 b/tests/qemu-iotests/264
index 347e53add5..4f96825a22 100755
--- a/tests/qemu-iotests/264
+++ b/tests/qemu-iotests/264
@@ -94,13 +94,7 @@ class TestNbdReconnect(iotests.QMPTestCase):
             result = self.vm.qmp('blockdev-del', node_name='backup0')
             self.assert_qmp(result, 'return', {})
 
-    def test_mirror_cancel(self):
-        # Mirror speed limit doesn't work well enough, it seems that mirror
-        # will run many parallel requests anyway. MAX_IN_FLIGHT is 16 and
-        # MAX_IO_BYTES is 1M in mirror.c, so let's use 20M disk.
-        self.init_vm(20 * 1024 * 1024)
-        self.start_job('blockdev-mirror')
-
+    def cancel_job(self):
         result = self.vm.qmp('block-job-cancel', device='drive0')
         self.assert_qmp(result, 'return', {})
 
@@ -109,6 +103,19 @@ class TestNbdReconnect(iotests.QMPTestCase):
         delta_t = time.time() - start_t
         self.assertTrue(delta_t < 2.0)
 
+    def test_mirror_cancel(self):
+        # Mirror speed limit doesn't work well enough, it seems that mirror
+        # will run many parallel requests anyway. MAX_IN_FLIGHT is 16 and
+        # MAX_IO_BYTES is 1M in mirror.c, so let's use 20M disk.
+        self.init_vm(20 * 1024 * 1024)
+        self.start_job('blockdev-mirror')
+        self.cancel_job()
+
+    def test_backup_cancel(self):
+        self.init_vm(5 * 1024 * 1024)
+        self.start_job('blockdev-backup')
+        self.cancel_job()
+
 
 if __name__ == '__main__':
     iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/264.out b/tests/qemu-iotests/264.out
index fbc63e62f8..8d7e996700 100644
--- a/tests/qemu-iotests/264.out
+++ b/tests/qemu-iotests/264.out
@@ -1,5 +1,5 @@
-..
+...
 ----------------------------------------------------------------------
-Ran 2 tests
+Ran 3 tests
 
 OK
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 02/10] block/nbd: implement .bdrv_cancel_in_flight
  2021-02-05 16:37 ` [PATCH v2 02/10] block/nbd: implement .bdrv_cancel_in_flight Vladimir Sementsov-Ogievskiy
@ 2021-02-12 17:14   ` Eric Blake
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Blake @ 2021-02-12 17:14 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, mreitz, stefanha, den, jsnow

On 2/5/21 10:37 AM, Vladimir Sementsov-Ogievskiy wrote:
> Just stop waiting for connection in existing requests.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/nbd.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/block/nbd.c b/block/nbd.c
> index b3cbbeb4b0..c26dc5a54f 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -2458,6 +2458,18 @@ static const char *const nbd_strong_runtime_opts[] = {
>      NULL
>  };
>  
> +static void nbd_cancel_in_flight(BlockDriverState *bs)
> +{
> +    BDRVNBDState *s = (BDRVNBDState *)bs->opaque;

This cast is not necessary in C, but it doesn't hurt.

> +
> +    reconnect_delay_timer_del(s);
> +
> +    if (s->state == NBD_CLIENT_CONNECTING_WAIT) {
> +        s->state = NBD_CLIENT_CONNECTING_NOWAIT;
> +        qemu_co_queue_restart_all(&s->free_sema);
> +    }
> +}
> +

R-b still stands

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 04/10] job: add .cancel handler for the driver
  2021-02-05 16:37 ` [PATCH v2 04/10] job: add .cancel handler for the driver Vladimir Sementsov-Ogievskiy
@ 2021-02-12 17:21   ` Eric Blake
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Blake @ 2021-02-12 17:21 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, mreitz, stefanha, den, jsnow

On 2/5/21 10:37 AM, Vladimir Sementsov-Ogievskiy wrote:
> To be used in mirror in the following commit to cancel in-flight io on
> target to not waste the time.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  include/qemu/job.h | 5 +++++
>  job.c              | 3 +++
>  2 files changed, 8 insertions(+)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 00/10] mirror: cancel nbd reconnect
  2021-02-05 16:37 [PATCH v2 00/10] mirror: cancel nbd reconnect Vladimir Sementsov-Ogievskiy
                   ` (9 preceding siblings ...)
  2021-02-05 16:37 ` [PATCH v2 10/10] iotests/264: add backup-cancel test-case Vladimir Sementsov-Ogievskiy
@ 2021-02-12 18:21 ` Eric Blake
  10 siblings, 0 replies; 16+ messages in thread
From: Eric Blake @ 2021-02-12 18:21 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: fam, kwolf, qemu-devel, mreitz, stefanha, den, jsnow

On 2/5/21 10:37 AM, Vladimir Sementsov-Ogievskiy wrote:
> Hi all!
> 
> The problem
> 
> Assume we have mirror job with nbd target node with enabled reconnect.
> Connection failed. So, all current requests to nbd node are waiting for
> nbd driver to reconnect. And they will wait for reconnect-delay time
> specified in nbd blockdev options. This timeout may be long enough, for
> example, we in Virtuozzo use 300 seconds by default.
> 
> So, if at this moment user tries to cancel the job, job will wait for
> its in-flight requests to finish up to 300 seconds. From the user point
> of view, cancelling the job takes a long time. Bad.
> 
> Solution
> 
> Let's just cancel "waiting for reconnect in in-flight request coroutines"
> on mirror (and backup) cancel. Welcome the series below.
> 
> v2: wording, rebase on master, add Eric's r-bs, update test-output in
>     last commit

Thanks; I'm queuing this through my NBD tree.

> 
> Vladimir Sementsov-Ogievskiy (10):
>   block: add new BlockDriver handler: bdrv_cancel_in_flight
>   block/nbd: implement .bdrv_cancel_in_flight
>   block/raw-format: implement .bdrv_cancel_in_flight handler
>   job: add .cancel handler for the driver
>   block/mirror: implement .cancel job handler
>   iotests/264: move to python unittest
>   iotests.py: qemu_nbd_popen: remove pid file after use
>   iotests/264: add mirror-cancel test-case
>   block/backup: implement .cancel job handler
>   iotests/264: add backup-cancel test-case
> 
>  include/block/block.h         |   3 +
>  include/block/block_int.h     |   9 +++
>  include/qemu/job.h            |   5 ++
>  block/backup.c                |  10 +++
>  block/io.c                    |  11 +++
>  block/mirror.c                |   9 +++
>  block/nbd.c                   |  15 ++++
>  block/raw-format.c            |   6 ++
>  job.c                         |   3 +
>  tests/qemu-iotests/264        | 140 ++++++++++++++++++++++------------
>  tests/qemu-iotests/264.out    |  20 ++---
>  tests/qemu-iotests/iotests.py |   6 +-
>  12 files changed, 172 insertions(+), 65 deletions(-)
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 08/10] iotests/264: add mirror-cancel test-case
  2021-02-05 16:37 ` [PATCH v2 08/10] iotests/264: add mirror-cancel test-case Vladimir Sementsov-Ogievskiy
@ 2022-07-21  8:50   ` Thomas Huth
  2022-07-21 19:21     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Huth @ 2022-07-21  8:50 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, eblake, stefanha, kwolf, jsnow, den, Hanna Reitz

On 05/02/2021 17.37, Vladimir Sementsov-Ogievskiy wrote:
> Check that cancel doesn't wait for 10s of nbd reconnect timeout.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>   tests/qemu-iotests/264     | 38 ++++++++++++++++++++++++++++++--------
>   tests/qemu-iotests/264.out |  4 ++--
>   2 files changed, 32 insertions(+), 10 deletions(-)
> 
> diff --git a/tests/qemu-iotests/264 b/tests/qemu-iotests/264
> index 6feeaa4056..347e53add5 100755
> --- a/tests/qemu-iotests/264
> +++ b/tests/qemu-iotests/264
> @@ -27,25 +27,26 @@ from iotests import qemu_img_create, file_path, qemu_nbd_popen
>   
>   disk_a, disk_b, nbd_sock = file_path('disk_a', 'disk_b', 'nbd-sock')
>   nbd_uri = 'nbd+unix:///?socket=' + nbd_sock
> -size = 5 * 1024 * 1024
>   wait_limit = 3.0
>   wait_step = 0.2
>   
>   
>   class TestNbdReconnect(iotests.QMPTestCase):
> -    def setUp(self):
> -        qemu_img_create('-f', iotests.imgfmt, disk_a, str(size))
> -        qemu_img_create('-f', iotests.imgfmt, disk_b, str(size))
> +    def init_vm(self, disk_size):
> +        qemu_img_create('-f', iotests.imgfmt, disk_a, str(disk_size))
> +        qemu_img_create('-f', iotests.imgfmt, disk_b, str(disk_size))
>           self.vm = iotests.VM().add_drive(disk_a)
>           self.vm.launch()
> -        self.vm.hmp_qemu_io('drive0', 'write 0 {}'.format(size))
> +        self.vm.hmp_qemu_io('drive0', 'write 0 {}'.format(disk_size))
>   
>       def tearDown(self):
>           self.vm.shutdown()
>           os.remove(disk_a)
>           os.remove(disk_b)
>   
> -    def test(self):
> +    def start_job(self, job):
> +        """Stat job with nbd target and kill the server"""
> +        assert job in ('blockdev-backup', 'blockdev-mirror')
>           with qemu_nbd_popen('-k', nbd_sock, '-f', iotests.imgfmt, disk_b):
>               result = self.vm.qmp('blockdev-add',
>                                    **{'node_name': 'backup0',
> @@ -55,7 +56,7 @@ class TestNbdReconnect(iotests.QMPTestCase):
>                                                           'path': nbd_sock},
>                                                'reconnect-delay': 10}})
>               self.assert_qmp(result, 'return', {})
> -            result = self.vm.qmp('blockdev-backup', device='drive0',
> +            result = self.vm.qmp(job, device='drive0',
>                                    sync='full', target='backup0',
>                                    speed=(1 * 1024 * 1024))
>               self.assert_qmp(result, 'return', {})
> @@ -73,7 +74,8 @@ class TestNbdReconnect(iotests.QMPTestCase):
>   
>           jobs = self.vm.qmp('query-block-jobs')['return']
>           # Check that job is still in progress
> -        self.assertTrue(jobs and jobs[0]['offset'] < jobs[0]['len'])
> +        self.assertTrue(jobs)
> +        self.assertTrue(jobs[0]['offset'] < jobs[0]['len'])
>   
>           result = self.vm.qmp('block-job-set-speed', device='drive0', speed=0)
>           self.assert_qmp(result, 'return', {})
> @@ -81,12 +83,32 @@ class TestNbdReconnect(iotests.QMPTestCase):
>           # Emulate server down time for 1 second
>           time.sleep(1)
>   
> +    def test_backup(self):
> +        size = 5 * 1024 * 1024
> +        self.init_vm(size)
> +        self.start_job('blockdev-backup')
> +
>           with qemu_nbd_popen('-k', nbd_sock, '-f', iotests.imgfmt, disk_b):
>               e = self.vm.event_wait('BLOCK_JOB_COMPLETED')
>               self.assertEqual(e['data']['offset'], size)
>               result = self.vm.qmp('blockdev-del', node_name='backup0')
>               self.assert_qmp(result, 'return', {})
>   
> +    def test_mirror_cancel(self):
> +        # Mirror speed limit doesn't work well enough, it seems that mirror
> +        # will run many parallel requests anyway. MAX_IN_FLIGHT is 16 and
> +        # MAX_IO_BYTES is 1M in mirror.c, so let's use 20M disk.
> +        self.init_vm(20 * 1024 * 1024)
> +        self.start_job('blockdev-mirror')
> +
> +        result = self.vm.qmp('block-job-cancel', device='drive0')
> +        self.assert_qmp(result, 'return', {})
> +
> +        start_t = time.time()
> +        self.vm.event_wait('BLOCK_JOB_CANCELLED')
> +        delta_t = time.time() - start_t
> +        self.assertTrue(delta_t < 2.0)

  Hi!

For what it's worth, I've just run into this assert while running "make 
check -j8 SPEED=slow":

--- /home/thuth/devel/qemu/tests/qemu-iotests/264.out
+++ /home/thuth/tmp/qemu-build/tests/qemu-iotests/scratch/264/264.out.bad
@@ -1,5 +1,15 @@
-...
+..F
+======================================================================
+FAIL: test_mirror_cancel (__main__.TestNbdReconnect)
+----------------------------------------------------------------------
+Traceback (most recent call last):
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/264", line 112, in 
test_mirror_cancel
+    self.cancel_job()
+  File "/home/thuth/devel/qemu/tests/qemu-iotests/264", line 104, in cancel_job
+    self.assertTrue(delta_t < 2.0)
+AssertionError: False is not true
+
  ----------------------------------------------------------------------
  Ran 3 tests

-OK
+FAILED (failures=1)

(test program exited with status code 1)

Maybe 2.0 is not enough on loaded systems? Should we increase the value there?

  Thomas



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 08/10] iotests/264: add mirror-cancel test-case
  2022-07-21  8:50   ` Thomas Huth
@ 2022-07-21 19:21     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-07-21 19:21 UTC (permalink / raw)
  To: Thomas Huth, Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: qemu-devel, eblake, stefanha, kwolf, jsnow, den, Hanna Reitz

On 7/21/22 11:50, Thomas Huth wrote:
> On 05/02/2021 17.37, Vladimir Sementsov-Ogievskiy wrote:
>> Check that cancel doesn't wait for 10s of nbd reconnect timeout.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> Reviewed-by: Eric Blake <eblake@redhat.com>
>> ---
>>   tests/qemu-iotests/264     | 38 ++++++++++++++++++++++++++++++--------
>>   tests/qemu-iotests/264.out |  4 ++--
>>   2 files changed, 32 insertions(+), 10 deletions(-)
>>
>> diff --git a/tests/qemu-iotests/264 b/tests/qemu-iotests/264
>> index 6feeaa4056..347e53add5 100755
>> --- a/tests/qemu-iotests/264
>> +++ b/tests/qemu-iotests/264
>> @@ -27,25 +27,26 @@ from iotests import qemu_img_create, file_path, qemu_nbd_popen
>>   disk_a, disk_b, nbd_sock = file_path('disk_a', 'disk_b', 'nbd-sock')
>>   nbd_uri = 'nbd+unix:///?socket=' + nbd_sock
>> -size = 5 * 1024 * 1024
>>   wait_limit = 3.0
>>   wait_step = 0.2
>>   class TestNbdReconnect(iotests.QMPTestCase):
>> -    def setUp(self):
>> -        qemu_img_create('-f', iotests.imgfmt, disk_a, str(size))
>> -        qemu_img_create('-f', iotests.imgfmt, disk_b, str(size))
>> +    def init_vm(self, disk_size):
>> +        qemu_img_create('-f', iotests.imgfmt, disk_a, str(disk_size))
>> +        qemu_img_create('-f', iotests.imgfmt, disk_b, str(disk_size))
>>           self.vm = iotests.VM().add_drive(disk_a)
>>           self.vm.launch()
>> -        self.vm.hmp_qemu_io('drive0', 'write 0 {}'.format(size))
>> +        self.vm.hmp_qemu_io('drive0', 'write 0 {}'.format(disk_size))
>>       def tearDown(self):
>>           self.vm.shutdown()
>>           os.remove(disk_a)
>>           os.remove(disk_b)
>> -    def test(self):
>> +    def start_job(self, job):
>> +        """Stat job with nbd target and kill the server"""
>> +        assert job in ('blockdev-backup', 'blockdev-mirror')
>>           with qemu_nbd_popen('-k', nbd_sock, '-f', iotests.imgfmt, disk_b):
>>               result = self.vm.qmp('blockdev-add',
>>                                    **{'node_name': 'backup0',
>> @@ -55,7 +56,7 @@ class TestNbdReconnect(iotests.QMPTestCase):
>>                                                           'path': nbd_sock},
>>                                                'reconnect-delay': 10}})
>>               self.assert_qmp(result, 'return', {})
>> -            result = self.vm.qmp('blockdev-backup', device='drive0',
>> +            result = self.vm.qmp(job, device='drive0',
>>                                    sync='full', target='backup0',
>>                                    speed=(1 * 1024 * 1024))
>>               self.assert_qmp(result, 'return', {})
>> @@ -73,7 +74,8 @@ class TestNbdReconnect(iotests.QMPTestCase):
>>           jobs = self.vm.qmp('query-block-jobs')['return']
>>           # Check that job is still in progress
>> -        self.assertTrue(jobs and jobs[0]['offset'] < jobs[0]['len'])
>> +        self.assertTrue(jobs)
>> +        self.assertTrue(jobs[0]['offset'] < jobs[0]['len'])
>>           result = self.vm.qmp('block-job-set-speed', device='drive0', speed=0)
>>           self.assert_qmp(result, 'return', {})
>> @@ -81,12 +83,32 @@ class TestNbdReconnect(iotests.QMPTestCase):
>>           # Emulate server down time for 1 second
>>           time.sleep(1)
>> +    def test_backup(self):
>> +        size = 5 * 1024 * 1024
>> +        self.init_vm(size)
>> +        self.start_job('blockdev-backup')
>> +
>>           with qemu_nbd_popen('-k', nbd_sock, '-f', iotests.imgfmt, disk_b):
>>               e = self.vm.event_wait('BLOCK_JOB_COMPLETED')
>>               self.assertEqual(e['data']['offset'], size)
>>               result = self.vm.qmp('blockdev-del', node_name='backup0')
>>               self.assert_qmp(result, 'return', {})
>> +    def test_mirror_cancel(self):
>> +        # Mirror speed limit doesn't work well enough, it seems that mirror
>> +        # will run many parallel requests anyway. MAX_IN_FLIGHT is 16 and
>> +        # MAX_IO_BYTES is 1M in mirror.c, so let's use 20M disk.
>> +        self.init_vm(20 * 1024 * 1024)
>> +        self.start_job('blockdev-mirror')
>> +
>> +        result = self.vm.qmp('block-job-cancel', device='drive0')
>> +        self.assert_qmp(result, 'return', {})
>> +
>> +        start_t = time.time()
>> +        self.vm.event_wait('BLOCK_JOB_CANCELLED')
>> +        delta_t = time.time() - start_t
>> +        self.assertTrue(delta_t < 2.0)
> 
>   Hi!
> 
> For what it's worth, I've just run into this assert while running "make check -j8 SPEED=slow":
> 
> --- /home/thuth/devel/qemu/tests/qemu-iotests/264.out
> +++ /home/thuth/tmp/qemu-build/tests/qemu-iotests/scratch/264/264.out.bad
> @@ -1,5 +1,15 @@
> -...
> +..F
> +======================================================================
> +FAIL: test_mirror_cancel (__main__.TestNbdReconnect)
> +----------------------------------------------------------------------
> +Traceback (most recent call last):
> +  File "/home/thuth/devel/qemu/tests/qemu-iotests/264", line 112, in test_mirror_cancel
> +    self.cancel_job()
> +  File "/home/thuth/devel/qemu/tests/qemu-iotests/264", line 104, in cancel_job
> +    self.assertTrue(delta_t < 2.0)
> +AssertionError: False is not true
> +
>   ----------------------------------------------------------------------
>   Ran 3 tests
> 
> -OK
> +FAILED (failures=1)
> 
> (test program exited with status code 1)
> 
> Maybe 2.0 is not enough on loaded systems? Should we increase the value there?
> 

Yes I think we can increase it. We have speed set to 1MB/s in the test. Mirror (due to implementation) will copy 16MB at once.. and then it should wait 16 seconds to copy next 4M. So, if fast-cancelling not work, we'll cancel in 19+ seconds. I think, we can safely increase our bound from 2 to 5-7 seconds.

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-07-21 19:22 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-05 16:37 [PATCH v2 00/10] mirror: cancel nbd reconnect Vladimir Sementsov-Ogievskiy
2021-02-05 16:37 ` [PATCH v2 01/10] block: add new BlockDriver handler: bdrv_cancel_in_flight Vladimir Sementsov-Ogievskiy
2021-02-05 16:37 ` [PATCH v2 02/10] block/nbd: implement .bdrv_cancel_in_flight Vladimir Sementsov-Ogievskiy
2021-02-12 17:14   ` Eric Blake
2021-02-05 16:37 ` [PATCH v2 03/10] block/raw-format: implement .bdrv_cancel_in_flight handler Vladimir Sementsov-Ogievskiy
2021-02-05 16:37 ` [PATCH v2 04/10] job: add .cancel handler for the driver Vladimir Sementsov-Ogievskiy
2021-02-12 17:21   ` Eric Blake
2021-02-05 16:37 ` [PATCH v2 05/10] block/mirror: implement .cancel job handler Vladimir Sementsov-Ogievskiy
2021-02-05 16:37 ` [PATCH v2 06/10] iotests/264: move to python unittest Vladimir Sementsov-Ogievskiy
2021-02-05 16:37 ` [PATCH v2 07/10] iotests.py: qemu_nbd_popen: remove pid file after use Vladimir Sementsov-Ogievskiy
2021-02-05 16:37 ` [PATCH v2 08/10] iotests/264: add mirror-cancel test-case Vladimir Sementsov-Ogievskiy
2022-07-21  8:50   ` Thomas Huth
2022-07-21 19:21     ` Vladimir Sementsov-Ogievskiy
2021-02-05 16:37 ` [PATCH v2 09/10] block/backup: implement .cancel job handler Vladimir Sementsov-Ogievskiy
2021-02-05 16:37 ` [PATCH v2 10/10] iotests/264: add backup-cancel test-case Vladimir Sementsov-Ogievskiy
2021-02-12 18:21 ` [PATCH v2 00/10] mirror: cancel nbd reconnect Eric Blake

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.