All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy
@ 2020-07-27 19:42 Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 01/21] qemu-iotests/199: fix style Vladimir Sementsov-Ogievskiy
                   ` (21 more replies)
  0 siblings, 22 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	den, mreitz, jsnow

v4:

01: typo in commit msg
07: reword commit msg, add Eric's r-b
10: add Dr. David's r-b
15: add check for buf_size
    use g_autofree (and fix introduced in v3)
    use QEMU_LOCK_GUARD
17: fix commit msg, add Eric's r-b
20-21: add Eric's t-b

Original idea of bitmaps postcopy migration is that bitmaps are non
critical data, and their loss is not serious problem. So, using postcopy
method on any failure we should just drop unfinished bitmaps and
continue guest execution.

However, it doesn't work so. It crashes, fails, it goes to
postcopy-recovery feature. It does anything except for behavior we want.
These series fixes at least some problems with error handling during
bitmaps migration postcopy.

Vladimir Sementsov-Ogievskiy (21):
  qemu-iotests/199: fix style
  qemu-iotests/199: drop extra constraints
  qemu-iotests/199: better catch postcopy time
  qemu-iotests/199: improve performance: set bitmap by discard
  qemu-iotests/199: change discard patterns
  qemu-iotests/199: increase postcopy period
  migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start
  migration/block-dirty-bitmap: rename state structure types
  migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup
  migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_init
  migration/block-dirty-bitmap: refactor state global variables
  migration/block-dirty-bitmap: rename finish_lock to just lock
  migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete
  migration/block-dirty-bitmap: keep bitmap state for all bitmaps
  migration/block-dirty-bitmap: relax error handling in incoming part
  migration/block-dirty-bitmap: cancel migration on shutdown
  migration/savevm: don't worry if bitmap migration postcopy failed
  qemu-iotests/199: prepare for new test-cases addition
  qemu-iotests/199: check persistent bitmaps
  qemu-iotests/199: add early shutdown case to bitmaps postcopy
  qemu-iotests/199: add source-killed case to bitmaps postcopy

 migration/migration.h          |   3 +-
 migration/block-dirty-bitmap.c | 470 +++++++++++++++++++++------------
 migration/migration.c          |  15 +-
 migration/savevm.c             |  37 ++-
 tests/qemu-iotests/199         | 250 ++++++++++++++----
 tests/qemu-iotests/199.out     |   4 +-
 6 files changed, 545 insertions(+), 234 deletions(-)

-- 
2.21.0



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v4 01/21] qemu-iotests/199: fix style
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 02/21] qemu-iotests/199: drop extra constraints Vladimir Sementsov-Ogievskiy
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

Mostly, satisfy pep8 complaints.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Tested-by: Eric Blake <eblake@redhat.com>
---
 tests/qemu-iotests/199 | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 40774eed74..de9ba8d94c 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -28,8 +28,8 @@ disk_b = os.path.join(iotests.test_dir, 'disk_b')
 size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')
 
-class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
+class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
     def tearDown(self):
         self.vm_a.shutdown()
         self.vm_b.shutdown()
@@ -54,7 +54,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
         result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
                                name='bitmap', granularity=granularity)
-        self.assert_qmp(result, 'return', {});
+        self.assert_qmp(result, 'return', {})
 
         s = 0
         while s < write_size:
@@ -71,7 +71,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
         result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
                                name='bitmap')
-        self.assert_qmp(result, 'return', {});
+        self.assert_qmp(result, 'return', {})
         s = 0
         while s < write_size:
             self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
@@ -104,15 +104,16 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
             self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
             s += 0x10000
 
-        result = self.vm_b.qmp('query-block');
+        result = self.vm_b.qmp('query-block')
         while len(result['return'][0]['dirty-bitmaps']) > 1:
             time.sleep(2)
-            result = self.vm_b.qmp('query-block');
+            result = self.vm_b.qmp('query-block')
 
         result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
                                node='drive0', name='bitmap')
 
-        self.assert_qmp(result, 'return/sha256', sha256);
+        self.assert_qmp(result, 'return/sha256', sha256)
+
 
 if __name__ == '__main__':
     iotests.main(supported_fmts=['qcow2'], supported_cache_modes=['none'],
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 02/21] qemu-iotests/199: drop extra constraints
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 01/21] qemu-iotests/199: fix style Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 03/21] qemu-iotests/199: better catch postcopy time Vladimir Sementsov-Ogievskiy
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

We don't need any specific format constraints here. Still keep qcow2
for two reasons:
1. No extra calls of format-unrelated test
2. Add some check around persistent bitmap in future (require qcow2)

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Tested-by: Eric Blake <eblake@redhat.com>
---
 tests/qemu-iotests/199 | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index de9ba8d94c..dda918450a 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -116,5 +116,4 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 
 if __name__ == '__main__':
-    iotests.main(supported_fmts=['qcow2'], supported_cache_modes=['none'],
-                 supported_protocols=['file'])
+    iotests.main(supported_fmts=['qcow2'])
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 03/21] qemu-iotests/199: better catch postcopy time
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 01/21] qemu-iotests/199: fix style Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 02/21] qemu-iotests/199: drop extra constraints Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 04/21] qemu-iotests/199: improve performance: set bitmap by discard Vladimir Sementsov-Ogievskiy
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

The test aims to test _postcopy_ migration, and wants to do some write
operations during postcopy time.

Test considers migrate status=complete event on source as start of
postcopy. This is completely wrong, completion is completion of the
whole migration process. Let's instead consider destination start as
start of postcopy, and use RESUME event for it.

Next, as migration finish, let's use migration status=complete event on
target, as such method is closer to what libvirt or another user will
do, than tracking number of dirty-bitmaps.

Finally, add a possibility to dump events for debug. And if
set debug to True, we see, that actual postcopy period is very small
relatively to the whole test duration time (~0.2 seconds to >40 seconds
for me). This means, that test is very inefficient in what it supposed
to do. Let's improve it in following commits.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Tested-by: Eric Blake <eblake@redhat.com>
---
 tests/qemu-iotests/199 | 72 +++++++++++++++++++++++++++++++++---------
 1 file changed, 57 insertions(+), 15 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index dda918450a..dd6044768c 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -20,17 +20,43 @@
 
 import os
 import iotests
-import time
 from iotests import qemu_img
 
+debug = False
+
 disk_a = os.path.join(iotests.test_dir, 'disk_a')
 disk_b = os.path.join(iotests.test_dir, 'disk_b')
 size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')
 
 
+def event_seconds(event):
+    return event['timestamp']['seconds'] + \
+        event['timestamp']['microseconds'] / 1000000.0
+
+
+def event_dist(e1, e2):
+    return event_seconds(e2) - event_seconds(e1)
+
+
 class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
     def tearDown(self):
+        if debug:
+            self.vm_a_events += self.vm_a.get_qmp_events()
+            self.vm_b_events += self.vm_b.get_qmp_events()
+            for e in self.vm_a_events:
+                e['vm'] = 'SRC'
+            for e in self.vm_b_events:
+                e['vm'] = 'DST'
+            events = (self.vm_a_events + self.vm_b_events)
+            events = [(e['timestamp']['seconds'],
+                       e['timestamp']['microseconds'],
+                       e['vm'],
+                       e['event'],
+                       e.get('data', '')) for e in events]
+            for e in sorted(events):
+                print('{}.{:06} {} {} {}'.format(*e))
+
         self.vm_a.shutdown()
         self.vm_b.shutdown()
         os.remove(disk_a)
@@ -47,6 +73,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
         self.vm_a.launch()
         self.vm_b.launch()
 
+        # collect received events for debug
+        self.vm_a_events = []
+        self.vm_b_events = []
+
     def test_postcopy(self):
         write_size = 0x40000000
         granularity = 512
@@ -77,15 +107,13 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
             self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
             s += 0x10000
 
-        bitmaps_cap = {'capability': 'dirty-bitmaps', 'state': True}
-        events_cap = {'capability': 'events', 'state': True}
+        caps = [{'capability': 'dirty-bitmaps', 'state': True},
+                {'capability': 'events', 'state': True}]
 
-        result = self.vm_a.qmp('migrate-set-capabilities',
-                               capabilities=[bitmaps_cap, events_cap])
+        result = self.vm_a.qmp('migrate-set-capabilities', capabilities=caps)
         self.assert_qmp(result, 'return', {})
 
-        result = self.vm_b.qmp('migrate-set-capabilities',
-                               capabilities=[bitmaps_cap])
+        result = self.vm_b.qmp('migrate-set-capabilities', capabilities=caps)
         self.assert_qmp(result, 'return', {})
 
         result = self.vm_a.qmp('migrate', uri='exec:cat>' + fifo)
@@ -94,24 +122,38 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
         result = self.vm_a.qmp('migrate-start-postcopy')
         self.assert_qmp(result, 'return', {})
 
-        while True:
-            event = self.vm_a.event_wait('MIGRATION')
-            if event['data']['status'] == 'completed':
-                break
+        event_resume = self.vm_b.event_wait('RESUME')
+        self.vm_b_events.append(event_resume)
 
         s = 0x8000
         while s < write_size:
             self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
             s += 0x10000
 
+        match = {'data': {'status': 'completed'}}
+        event_complete = self.vm_b.event_wait('MIGRATION', match=match)
+        self.vm_b_events.append(event_complete)
+
+        # take queued event, should already been happened
+        event_stop = self.vm_a.event_wait('STOP')
+        self.vm_a_events.append(event_stop)
+
+        downtime = event_dist(event_stop, event_resume)
+        postcopy_time = event_dist(event_resume, event_complete)
+
+        # TODO: assert downtime * 10 < postcopy_time
+        if debug:
+            print('downtime:', downtime)
+            print('postcopy_time:', postcopy_time)
+
+        # Assert that bitmap migration is finished (check that successor bitmap
+        # is removed)
         result = self.vm_b.qmp('query-block')
-        while len(result['return'][0]['dirty-bitmaps']) > 1:
-            time.sleep(2)
-            result = self.vm_b.qmp('query-block')
+        assert len(result['return'][0]['dirty-bitmaps']) == 1
 
+        # Check content of migrated (and updated by new writes) bitmap
         result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
                                node='drive0', name='bitmap')
-
         self.assert_qmp(result, 'return/sha256', sha256)
 
 
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 04/21] qemu-iotests/199: improve performance: set bitmap by discard
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 03/21] qemu-iotests/199: better catch postcopy time Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 05/21] qemu-iotests/199: change discard patterns Vladimir Sementsov-Ogievskiy
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

Discard dirties dirty-bitmap as well as write, but works faster. Let's
use it instead.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Tested-by: Eric Blake <eblake@redhat.com>
---
 tests/qemu-iotests/199 | 31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index dd6044768c..190e820b84 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -67,8 +67,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
         os.mkfifo(fifo)
         qemu_img('create', '-f', iotests.imgfmt, disk_a, size)
         qemu_img('create', '-f', iotests.imgfmt, disk_b, size)
-        self.vm_a = iotests.VM(path_suffix='a').add_drive(disk_a)
-        self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b)
+        self.vm_a = iotests.VM(path_suffix='a').add_drive(disk_a,
+                                                          'discard=unmap')
+        self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b,
+                                                          'discard=unmap')
         self.vm_b.add_incoming("exec: cat '" + fifo + "'")
         self.vm_a.launch()
         self.vm_b.launch()
@@ -78,7 +80,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
         self.vm_b_events = []
 
     def test_postcopy(self):
-        write_size = 0x40000000
+        discard_size = 0x40000000
         granularity = 512
         chunk = 4096
 
@@ -86,25 +88,32 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
                                name='bitmap', granularity=granularity)
         self.assert_qmp(result, 'return', {})
 
+        result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
+                               node='drive0', name='bitmap')
+        empty_sha256 = result['return']['sha256']
+
         s = 0
-        while s < write_size:
-            self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+        while s < discard_size:
+            self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
             s += 0x10000
         s = 0x8000
-        while s < write_size:
-            self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+        while s < discard_size:
+            self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
             s += 0x10000
 
         result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
                                node='drive0', name='bitmap')
         sha256 = result['return']['sha256']
 
+        # Check, that updating the bitmap by discards works
+        assert sha256 != empty_sha256
+
         result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
                                name='bitmap')
         self.assert_qmp(result, 'return', {})
         s = 0
-        while s < write_size:
-            self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+        while s < discard_size:
+            self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
             s += 0x10000
 
         caps = [{'capability': 'dirty-bitmaps', 'state': True},
@@ -126,8 +135,8 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
         self.vm_b_events.append(event_resume)
 
         s = 0x8000
-        while s < write_size:
-            self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+        while s < discard_size:
+            self.vm_b.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
             s += 0x10000
 
         match = {'data': {'status': 'completed'}}
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 05/21] qemu-iotests/199: change discard patterns
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 04/21] qemu-iotests/199: improve performance: set bitmap by discard Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 06/21] qemu-iotests/199: increase postcopy period Vladimir Sementsov-Ogievskiy
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

iotest 199 works too long because of many discard operations. At the
same time, postcopy period is very short, in spite of all these
efforts.

So, let's use less discards (and with more interesting patterns) to
reduce test timing. In the next commit we'll increase postcopy period.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Tested-by: Eric Blake <eblake@redhat.com>
---
 tests/qemu-iotests/199 | 44 +++++++++++++++++++++++++-----------------
 1 file changed, 26 insertions(+), 18 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 190e820b84..da4dae01fb 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -30,6 +30,28 @@ size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')
 
 
+GiB = 1024 * 1024 * 1024
+
+discards1 = (
+    (0, GiB),
+    (2 * GiB + 512 * 5, 512),
+    (3 * GiB + 512 * 5, 512),
+    (100 * GiB, GiB)
+)
+
+discards2 = (
+    (3 * GiB + 512 * 8, 512),
+    (4 * GiB + 512 * 8, 512),
+    (50 * GiB, GiB),
+    (100 * GiB + GiB // 2, GiB)
+)
+
+
+def apply_discards(vm, discards):
+    for d in discards:
+        vm.hmp_qemu_io('drive0', 'discard {} {}'.format(*d))
+
+
 def event_seconds(event):
     return event['timestamp']['seconds'] + \
         event['timestamp']['microseconds'] / 1000000.0
@@ -80,9 +102,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
         self.vm_b_events = []
 
     def test_postcopy(self):
-        discard_size = 0x40000000
         granularity = 512
-        chunk = 4096
 
         result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
                                name='bitmap', granularity=granularity)
@@ -92,14 +112,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
                                node='drive0', name='bitmap')
         empty_sha256 = result['return']['sha256']
 
-        s = 0
-        while s < discard_size:
-            self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-            s += 0x10000
-        s = 0x8000
-        while s < discard_size:
-            self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-            s += 0x10000
+        apply_discards(self.vm_a, discards1 + discards2)
 
         result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
                                node='drive0', name='bitmap')
@@ -111,10 +124,8 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
         result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
                                name='bitmap')
         self.assert_qmp(result, 'return', {})
-        s = 0
-        while s < discard_size:
-            self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-            s += 0x10000
+
+        apply_discards(self.vm_a, discards1)
 
         caps = [{'capability': 'dirty-bitmaps', 'state': True},
                 {'capability': 'events', 'state': True}]
@@ -134,10 +145,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
         event_resume = self.vm_b.event_wait('RESUME')
         self.vm_b_events.append(event_resume)
 
-        s = 0x8000
-        while s < discard_size:
-            self.vm_b.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-            s += 0x10000
+        apply_discards(self.vm_b, discards2)
 
         match = {'data': {'status': 'completed'}}
         event_complete = self.vm_b.event_wait('MIGRATION', match=match)
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 06/21] qemu-iotests/199: increase postcopy period
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (4 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 05/21] qemu-iotests/199: change discard patterns Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 07/21] migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start Vladimir Sementsov-Ogievskiy
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

The test wants to force a bitmap postcopy. Still, the resulting
postcopy period is very small. Let's increase it by adding more
bitmaps to migrate. Also, test disabled bitmaps migration.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Tested-by: Eric Blake <eblake@redhat.com>
---
 tests/qemu-iotests/199 | 58 ++++++++++++++++++++++++++++--------------
 1 file changed, 39 insertions(+), 19 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index da4dae01fb..d8532e49da 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -103,29 +103,45 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
     def test_postcopy(self):
         granularity = 512
+        nb_bitmaps = 15
 
-        result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
-                               name='bitmap', granularity=granularity)
-        self.assert_qmp(result, 'return', {})
+        for i in range(nb_bitmaps):
+            result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
+                                   name='bitmap{}'.format(i),
+                                   granularity=granularity)
+            self.assert_qmp(result, 'return', {})
 
         result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
-                               node='drive0', name='bitmap')
+                               node='drive0', name='bitmap0')
         empty_sha256 = result['return']['sha256']
 
-        apply_discards(self.vm_a, discards1 + discards2)
+        apply_discards(self.vm_a, discards1)
 
         result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
-                               node='drive0', name='bitmap')
-        sha256 = result['return']['sha256']
+                               node='drive0', name='bitmap0')
+        discards1_sha256 = result['return']['sha256']
 
         # Check, that updating the bitmap by discards works
-        assert sha256 != empty_sha256
+        assert discards1_sha256 != empty_sha256
 
-        result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
-                               name='bitmap')
-        self.assert_qmp(result, 'return', {})
+        # We want to calculate resulting sha256. Do it in bitmap0, so, disable
+        # other bitmaps
+        for i in range(1, nb_bitmaps):
+            result = self.vm_a.qmp('block-dirty-bitmap-disable', node='drive0',
+                                   name='bitmap{}'.format(i))
+            self.assert_qmp(result, 'return', {})
 
-        apply_discards(self.vm_a, discards1)
+        apply_discards(self.vm_a, discards2)
+
+        result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
+                               node='drive0', name='bitmap0')
+        all_discards_sha256 = result['return']['sha256']
+
+        # Now, enable some bitmaps, to be updated during migration
+        for i in range(2, nb_bitmaps, 2):
+            result = self.vm_a.qmp('block-dirty-bitmap-enable', node='drive0',
+                                   name='bitmap{}'.format(i))
+            self.assert_qmp(result, 'return', {})
 
         caps = [{'capability': 'dirty-bitmaps', 'state': True},
                 {'capability': 'events', 'state': True}]
@@ -145,6 +161,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
         event_resume = self.vm_b.event_wait('RESUME')
         self.vm_b_events.append(event_resume)
 
+        # enabled bitmaps should be updated
         apply_discards(self.vm_b, discards2)
 
         match = {'data': {'status': 'completed'}}
@@ -158,7 +175,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
         downtime = event_dist(event_stop, event_resume)
         postcopy_time = event_dist(event_resume, event_complete)
 
-        # TODO: assert downtime * 10 < postcopy_time
+        assert downtime * 10 < postcopy_time
         if debug:
             print('downtime:', downtime)
             print('postcopy_time:', postcopy_time)
@@ -166,12 +183,15 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
         # Assert that bitmap migration is finished (check that successor bitmap
         # is removed)
         result = self.vm_b.qmp('query-block')
-        assert len(result['return'][0]['dirty-bitmaps']) == 1
-
-        # Check content of migrated (and updated by new writes) bitmap
-        result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
-                               node='drive0', name='bitmap')
-        self.assert_qmp(result, 'return/sha256', sha256)
+        assert len(result['return'][0]['dirty-bitmaps']) == nb_bitmaps
+
+        # Check content of migrated bitmaps. Still, don't waste time checking
+        # every bitmap
+        for i in range(0, nb_bitmaps, 5):
+            result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
+                                   node='drive0', name='bitmap{}'.format(i))
+            sha256 = discards1_sha256 if i % 2 else all_discards_sha256
+            self.assert_qmp(result, 'return/sha256', sha256)
 
 
 if __name__ == '__main__':
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 07/21] migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (5 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 06/21] qemu-iotests/199: increase postcopy period Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 08/21] migration/block-dirty-bitmap: rename state structure types Vladimir Sementsov-Ogievskiy
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-stable, qemu-devel,
	dgilbert, stefanha, Andrey Shinkevich, den, mreitz, jsnow

Using the _locked version of bdrv_enable_dirty_bitmap to bypass locking
is wrong as we do not already own the mutex.  Moreover, the adjacent
call to bdrv_dirty_bitmap_enable_successor grabs the mutex.

Fixes: 58f72b965e9e1q
Cc: qemu-stable@nongnu.org # v3.0
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 migration/block-dirty-bitmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index b0dbf9eeed..0739f1259e 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -566,7 +566,7 @@ void dirty_bitmap_mig_before_vm_start(void)
         DirtyBitmapLoadBitmapState *b = item->data;
 
         if (b->migrated) {
-            bdrv_enable_dirty_bitmap_locked(b->bitmap);
+            bdrv_enable_dirty_bitmap(b->bitmap);
         } else {
             bdrv_dirty_bitmap_enable_successor(b->bitmap);
         }
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 08/21] migration/block-dirty-bitmap: rename state structure types
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (6 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 07/21] migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 09/21] migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup Vladimir Sementsov-Ogievskiy
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

Rename types to be symmetrical for load/save part and shorter.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 migration/block-dirty-bitmap.c | 70 ++++++++++++++++++----------------
 1 file changed, 37 insertions(+), 33 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 0739f1259e..1d57bff4f6 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -100,23 +100,25 @@
 /* 0x04 was "AUTOLOAD" flags on elder versions, no it is ignored */
 #define DIRTY_BITMAP_MIG_START_FLAG_RESERVED_MASK    0xf8
 
-typedef struct DirtyBitmapMigBitmapState {
+/* State of one bitmap during save process */
+typedef struct SaveBitmapState {
     /* Written during setup phase. */
     BlockDriverState *bs;
     const char *node_name;
     BdrvDirtyBitmap *bitmap;
     uint64_t total_sectors;
     uint64_t sectors_per_chunk;
-    QSIMPLEQ_ENTRY(DirtyBitmapMigBitmapState) entry;
+    QSIMPLEQ_ENTRY(SaveBitmapState) entry;
     uint8_t flags;
 
     /* For bulk phase. */
     bool bulk_completed;
     uint64_t cur_sector;
-} DirtyBitmapMigBitmapState;
+} SaveBitmapState;
 
-typedef struct DirtyBitmapMigState {
-    QSIMPLEQ_HEAD(, DirtyBitmapMigBitmapState) dbms_list;
+/* State of the dirty bitmap migration (DBM) during save process */
+typedef struct DBMSaveState {
+    QSIMPLEQ_HEAD(, SaveBitmapState) dbms_list;
 
     bool bulk_completed;
     bool no_bitmaps;
@@ -124,23 +126,25 @@ typedef struct DirtyBitmapMigState {
     /* for send_bitmap_bits() */
     BlockDriverState *prev_bs;
     BdrvDirtyBitmap *prev_bitmap;
-} DirtyBitmapMigState;
+} DBMSaveState;
 
-typedef struct DirtyBitmapLoadState {
+/* State of the dirty bitmap migration (DBM) during load process */
+typedef struct DBMLoadState {
     uint32_t flags;
     char node_name[256];
     char bitmap_name[256];
     BlockDriverState *bs;
     BdrvDirtyBitmap *bitmap;
-} DirtyBitmapLoadState;
+} DBMLoadState;
 
-static DirtyBitmapMigState dirty_bitmap_mig_state;
+static DBMSaveState dirty_bitmap_mig_state;
 
-typedef struct DirtyBitmapLoadBitmapState {
+/* State of one bitmap during load process */
+typedef struct LoadBitmapState {
     BlockDriverState *bs;
     BdrvDirtyBitmap *bitmap;
     bool migrated;
-} DirtyBitmapLoadBitmapState;
+} LoadBitmapState;
 static GSList *enabled_bitmaps;
 QemuMutex finish_lock;
 
@@ -170,7 +174,7 @@ static void qemu_put_bitmap_flags(QEMUFile *f, uint32_t flags)
     qemu_put_byte(f, flags);
 }
 
-static void send_bitmap_header(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
+static void send_bitmap_header(QEMUFile *f, SaveBitmapState *dbms,
                                uint32_t additional_flags)
 {
     BlockDriverState *bs = dbms->bs;
@@ -199,19 +203,19 @@ static void send_bitmap_header(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
     }
 }
 
-static void send_bitmap_start(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
+static void send_bitmap_start(QEMUFile *f, SaveBitmapState *dbms)
 {
     send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_START);
     qemu_put_be32(f, bdrv_dirty_bitmap_granularity(dbms->bitmap));
     qemu_put_byte(f, dbms->flags);
 }
 
-static void send_bitmap_complete(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
+static void send_bitmap_complete(QEMUFile *f, SaveBitmapState *dbms)
 {
     send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_COMPLETE);
 }
 
-static void send_bitmap_bits(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
+static void send_bitmap_bits(QEMUFile *f, SaveBitmapState *dbms,
                              uint64_t start_sector, uint32_t nr_sectors)
 {
     /* align for buffer_is_zero() */
@@ -257,7 +261,7 @@ static void send_bitmap_bits(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
 /* Called with iothread lock taken.  */
 static void dirty_bitmap_mig_cleanup(void)
 {
-    DirtyBitmapMigBitmapState *dbms;
+    SaveBitmapState *dbms;
 
     while ((dbms = QSIMPLEQ_FIRST(&dirty_bitmap_mig_state.dbms_list)) != NULL) {
         QSIMPLEQ_REMOVE_HEAD(&dirty_bitmap_mig_state.dbms_list, entry);
@@ -271,7 +275,7 @@ static void dirty_bitmap_mig_cleanup(void)
 static int add_bitmaps_to_list(BlockDriverState *bs, const char *bs_name)
 {
     BdrvDirtyBitmap *bitmap;
-    DirtyBitmapMigBitmapState *dbms;
+    SaveBitmapState *dbms;
     Error *local_err = NULL;
 
     FOR_EACH_DIRTY_BITMAP(bs, bitmap) {
@@ -309,7 +313,7 @@ static int add_bitmaps_to_list(BlockDriverState *bs, const char *bs_name)
         bdrv_ref(bs);
         bdrv_dirty_bitmap_set_busy(bitmap, true);
 
-        dbms = g_new0(DirtyBitmapMigBitmapState, 1);
+        dbms = g_new0(SaveBitmapState, 1);
         dbms->bs = bs;
         dbms->node_name = bs_name;
         dbms->bitmap = bitmap;
@@ -334,7 +338,7 @@ static int add_bitmaps_to_list(BlockDriverState *bs, const char *bs_name)
 static int init_dirty_bitmap_migration(void)
 {
     BlockDriverState *bs;
-    DirtyBitmapMigBitmapState *dbms;
+    SaveBitmapState *dbms;
     GHashTable *handled_by_blk = g_hash_table_new(NULL, NULL);
     BlockBackend *blk;
 
@@ -408,7 +412,7 @@ fail:
 }
 
 /* Called with no lock taken.  */
-static void bulk_phase_send_chunk(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
+static void bulk_phase_send_chunk(QEMUFile *f, SaveBitmapState *dbms)
 {
     uint32_t nr_sectors = MIN(dbms->total_sectors - dbms->cur_sector,
                              dbms->sectors_per_chunk);
@@ -424,7 +428,7 @@ static void bulk_phase_send_chunk(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
 /* Called with no lock taken.  */
 static void bulk_phase(QEMUFile *f, bool limit)
 {
-    DirtyBitmapMigBitmapState *dbms;
+    SaveBitmapState *dbms;
 
     QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
         while (!dbms->bulk_completed) {
@@ -461,7 +465,7 @@ static int dirty_bitmap_save_iterate(QEMUFile *f, void *opaque)
 
 static int dirty_bitmap_save_complete(QEMUFile *f, void *opaque)
 {
-    DirtyBitmapMigBitmapState *dbms;
+    SaveBitmapState *dbms;
     trace_dirty_bitmap_save_complete_enter();
 
     if (!dirty_bitmap_mig_state.bulk_completed) {
@@ -486,7 +490,7 @@ static void dirty_bitmap_save_pending(QEMUFile *f, void *opaque,
                                       uint64_t *res_compatible,
                                       uint64_t *res_postcopy_only)
 {
-    DirtyBitmapMigBitmapState *dbms;
+    SaveBitmapState *dbms;
     uint64_t pending = 0;
 
     qemu_mutex_lock_iothread();
@@ -507,7 +511,7 @@ static void dirty_bitmap_save_pending(QEMUFile *f, void *opaque,
 }
 
 /* First occurrence of this bitmap. It should be created if doesn't exist */
-static int dirty_bitmap_load_start(QEMUFile *f, DirtyBitmapLoadState *s)
+static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
 {
     Error *local_err = NULL;
     uint32_t granularity = qemu_get_be32(f);
@@ -538,7 +542,7 @@ static int dirty_bitmap_load_start(QEMUFile *f, DirtyBitmapLoadState *s)
 
     bdrv_disable_dirty_bitmap(s->bitmap);
     if (flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED) {
-        DirtyBitmapLoadBitmapState *b;
+        LoadBitmapState *b;
 
         bdrv_dirty_bitmap_create_successor(s->bitmap, &local_err);
         if (local_err) {
@@ -546,7 +550,7 @@ static int dirty_bitmap_load_start(QEMUFile *f, DirtyBitmapLoadState *s)
             return -EINVAL;
         }
 
-        b = g_new(DirtyBitmapLoadBitmapState, 1);
+        b = g_new(LoadBitmapState, 1);
         b->bs = s->bs;
         b->bitmap = s->bitmap;
         b->migrated = false;
@@ -563,7 +567,7 @@ void dirty_bitmap_mig_before_vm_start(void)
     qemu_mutex_lock(&finish_lock);
 
     for (item = enabled_bitmaps; item; item = g_slist_next(item)) {
-        DirtyBitmapLoadBitmapState *b = item->data;
+        LoadBitmapState *b = item->data;
 
         if (b->migrated) {
             bdrv_enable_dirty_bitmap(b->bitmap);
@@ -580,7 +584,7 @@ void dirty_bitmap_mig_before_vm_start(void)
     qemu_mutex_unlock(&finish_lock);
 }
 
-static void dirty_bitmap_load_complete(QEMUFile *f, DirtyBitmapLoadState *s)
+static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 {
     GSList *item;
     trace_dirty_bitmap_load_complete();
@@ -589,7 +593,7 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DirtyBitmapLoadState *s)
     qemu_mutex_lock(&finish_lock);
 
     for (item = enabled_bitmaps; item; item = g_slist_next(item)) {
-        DirtyBitmapLoadBitmapState *b = item->data;
+        LoadBitmapState *b = item->data;
 
         if (b->bitmap == s->bitmap) {
             b->migrated = true;
@@ -621,7 +625,7 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DirtyBitmapLoadState *s)
     qemu_mutex_unlock(&finish_lock);
 }
 
-static int dirty_bitmap_load_bits(QEMUFile *f, DirtyBitmapLoadState *s)
+static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
 {
     uint64_t first_byte = qemu_get_be64(f) << BDRV_SECTOR_BITS;
     uint64_t nr_bytes = (uint64_t)qemu_get_be32(f) << BDRV_SECTOR_BITS;
@@ -666,7 +670,7 @@ static int dirty_bitmap_load_bits(QEMUFile *f, DirtyBitmapLoadState *s)
     return 0;
 }
 
-static int dirty_bitmap_load_header(QEMUFile *f, DirtyBitmapLoadState *s)
+static int dirty_bitmap_load_header(QEMUFile *f, DBMLoadState *s)
 {
     Error *local_err = NULL;
     bool nothing;
@@ -715,7 +719,7 @@ static int dirty_bitmap_load_header(QEMUFile *f, DirtyBitmapLoadState *s)
 
 static int dirty_bitmap_load(QEMUFile *f, void *opaque, int version_id)
 {
-    static DirtyBitmapLoadState s;
+    static DBMLoadState s;
     int ret = 0;
 
     trace_dirty_bitmap_load_enter();
@@ -753,7 +757,7 @@ static int dirty_bitmap_load(QEMUFile *f, void *opaque, int version_id)
 
 static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
 {
-    DirtyBitmapMigBitmapState *dbms = NULL;
+    SaveBitmapState *dbms = NULL;
     if (init_dirty_bitmap_migration() < 0) {
         return -1;
     }
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 09/21] migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (7 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 08/21] migration/block-dirty-bitmap: rename state structure types Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 10/21] migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_init Vladimir Sementsov-Ogievskiy
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

Rename dirty_bitmap_mig_cleanup to dirty_bitmap_do_save_cleanup, to
stress that it is on save part.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 migration/block-dirty-bitmap.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 1d57bff4f6..01a536d7d3 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -259,7 +259,7 @@ static void send_bitmap_bits(QEMUFile *f, SaveBitmapState *dbms,
 }
 
 /* Called with iothread lock taken.  */
-static void dirty_bitmap_mig_cleanup(void)
+static void dirty_bitmap_do_save_cleanup(void)
 {
     SaveBitmapState *dbms;
 
@@ -406,7 +406,7 @@ static int init_dirty_bitmap_migration(void)
 
 fail:
     g_hash_table_destroy(handled_by_blk);
-    dirty_bitmap_mig_cleanup();
+    dirty_bitmap_do_save_cleanup();
 
     return -1;
 }
@@ -445,7 +445,7 @@ static void bulk_phase(QEMUFile *f, bool limit)
 /* for SaveVMHandlers */
 static void dirty_bitmap_save_cleanup(void *opaque)
 {
-    dirty_bitmap_mig_cleanup();
+    dirty_bitmap_do_save_cleanup();
 }
 
 static int dirty_bitmap_save_iterate(QEMUFile *f, void *opaque)
@@ -480,7 +480,7 @@ static int dirty_bitmap_save_complete(QEMUFile *f, void *opaque)
 
     trace_dirty_bitmap_save_complete_finish();
 
-    dirty_bitmap_mig_cleanup();
+    dirty_bitmap_do_save_cleanup();
     return 0;
 }
 
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 10/21] migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_init
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (8 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 09/21] migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 11/21] migration/block-dirty-bitmap: refactor state global variables Vladimir Sementsov-Ogievskiy
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

No reasons to keep two public init functions.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/migration.h          | 1 -
 migration/block-dirty-bitmap.c | 6 +-----
 migration/migration.c          | 2 --
 3 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/migration/migration.h b/migration/migration.h
index f617960522..ab20c756f5 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -335,7 +335,6 @@ void migrate_send_rp_recv_bitmap(MigrationIncomingState *mis,
 void migrate_send_rp_resume_ack(MigrationIncomingState *mis, uint32_t value);
 
 void dirty_bitmap_mig_before_vm_start(void);
-void init_dirty_bitmap_incoming_migration(void);
 void migrate_add_address(SocketAddress *address);
 
 int foreach_not_ignored_block(RAMBlockIterFunc func, void *opaque);
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 01a536d7d3..4b67e4f4fb 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -148,11 +148,6 @@ typedef struct LoadBitmapState {
 static GSList *enabled_bitmaps;
 QemuMutex finish_lock;
 
-void init_dirty_bitmap_incoming_migration(void)
-{
-    qemu_mutex_init(&finish_lock);
-}
-
 static uint32_t qemu_get_bitmap_flags(QEMUFile *f)
 {
     uint8_t flags = qemu_get_byte(f);
@@ -801,6 +796,7 @@ static SaveVMHandlers savevm_dirty_bitmap_handlers = {
 void dirty_bitmap_mig_init(void)
 {
     QSIMPLEQ_INIT(&dirty_bitmap_mig_state.dbms_list);
+    qemu_mutex_init(&finish_lock);
 
     register_savevm_live("dirty-bitmap", 0, 1,
                          &savevm_dirty_bitmap_handlers,
diff --git a/migration/migration.c b/migration/migration.c
index 2ed9923227..1c61428988 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -165,8 +165,6 @@ void migration_object_init(void)
     qemu_sem_init(&current_incoming->postcopy_pause_sem_dst, 0);
     qemu_sem_init(&current_incoming->postcopy_pause_sem_fault, 0);
 
-    init_dirty_bitmap_incoming_migration();
-
     if (!migration_object_check(current_migration, &err)) {
         error_report_err(err);
         exit(1);
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 11/21] migration/block-dirty-bitmap: refactor state global variables
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (9 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 10/21] migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_init Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 12/21] migration/block-dirty-bitmap: rename finish_lock to just lock Vladimir Sementsov-Ogievskiy
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

Move all state variables into one global struct. Reduce global
variable usage, utilizing opaque pointer where possible.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
---
 migration/block-dirty-bitmap.c | 179 ++++++++++++++++++---------------
 1 file changed, 99 insertions(+), 80 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 4b67e4f4fb..9b39e7aa2b 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -128,6 +128,12 @@ typedef struct DBMSaveState {
     BdrvDirtyBitmap *prev_bitmap;
 } DBMSaveState;
 
+typedef struct LoadBitmapState {
+    BlockDriverState *bs;
+    BdrvDirtyBitmap *bitmap;
+    bool migrated;
+} LoadBitmapState;
+
 /* State of the dirty bitmap migration (DBM) during load process */
 typedef struct DBMLoadState {
     uint32_t flags;
@@ -135,18 +141,17 @@ typedef struct DBMLoadState {
     char bitmap_name[256];
     BlockDriverState *bs;
     BdrvDirtyBitmap *bitmap;
+
+    GSList *enabled_bitmaps;
+    QemuMutex finish_lock;
 } DBMLoadState;
 
-static DBMSaveState dirty_bitmap_mig_state;
+typedef struct DBMState {
+    DBMSaveState save;
+    DBMLoadState load;
+} DBMState;
 
-/* State of one bitmap during load process */
-typedef struct LoadBitmapState {
-    BlockDriverState *bs;
-    BdrvDirtyBitmap *bitmap;
-    bool migrated;
-} LoadBitmapState;
-static GSList *enabled_bitmaps;
-QemuMutex finish_lock;
+static DBMState dbm_state;
 
 static uint32_t qemu_get_bitmap_flags(QEMUFile *f)
 {
@@ -169,21 +174,21 @@ static void qemu_put_bitmap_flags(QEMUFile *f, uint32_t flags)
     qemu_put_byte(f, flags);
 }
 
-static void send_bitmap_header(QEMUFile *f, SaveBitmapState *dbms,
-                               uint32_t additional_flags)
+static void send_bitmap_header(QEMUFile *f, DBMSaveState *s,
+                               SaveBitmapState *dbms, uint32_t additional_flags)
 {
     BlockDriverState *bs = dbms->bs;
     BdrvDirtyBitmap *bitmap = dbms->bitmap;
     uint32_t flags = additional_flags;
     trace_send_bitmap_header_enter();
 
-    if (bs != dirty_bitmap_mig_state.prev_bs) {
-        dirty_bitmap_mig_state.prev_bs = bs;
+    if (bs != s->prev_bs) {
+        s->prev_bs = bs;
         flags |= DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME;
     }
 
-    if (bitmap != dirty_bitmap_mig_state.prev_bitmap) {
-        dirty_bitmap_mig_state.prev_bitmap = bitmap;
+    if (bitmap != s->prev_bitmap) {
+        s->prev_bitmap = bitmap;
         flags |= DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME;
     }
 
@@ -198,19 +203,22 @@ static void send_bitmap_header(QEMUFile *f, SaveBitmapState *dbms,
     }
 }
 
-static void send_bitmap_start(QEMUFile *f, SaveBitmapState *dbms)
+static void send_bitmap_start(QEMUFile *f, DBMSaveState *s,
+                              SaveBitmapState *dbms)
 {
-    send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_START);
+    send_bitmap_header(f, s, dbms, DIRTY_BITMAP_MIG_FLAG_START);
     qemu_put_be32(f, bdrv_dirty_bitmap_granularity(dbms->bitmap));
     qemu_put_byte(f, dbms->flags);
 }
 
-static void send_bitmap_complete(QEMUFile *f, SaveBitmapState *dbms)
+static void send_bitmap_complete(QEMUFile *f, DBMSaveState *s,
+                                 SaveBitmapState *dbms)
 {
-    send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_COMPLETE);
+    send_bitmap_header(f, s, dbms, DIRTY_BITMAP_MIG_FLAG_COMPLETE);
 }
 
-static void send_bitmap_bits(QEMUFile *f, SaveBitmapState *dbms,
+static void send_bitmap_bits(QEMUFile *f, DBMSaveState *s,
+                             SaveBitmapState *dbms,
                              uint64_t start_sector, uint32_t nr_sectors)
 {
     /* align for buffer_is_zero() */
@@ -235,7 +243,7 @@ static void send_bitmap_bits(QEMUFile *f, SaveBitmapState *dbms,
 
     trace_send_bitmap_bits(flags, start_sector, nr_sectors, buf_size);
 
-    send_bitmap_header(f, dbms, flags);
+    send_bitmap_header(f, s, dbms, flags);
 
     qemu_put_be64(f, start_sector);
     qemu_put_be32(f, nr_sectors);
@@ -254,12 +262,12 @@ static void send_bitmap_bits(QEMUFile *f, SaveBitmapState *dbms,
 }
 
 /* Called with iothread lock taken.  */
-static void dirty_bitmap_do_save_cleanup(void)
+static void dirty_bitmap_do_save_cleanup(DBMSaveState *s)
 {
     SaveBitmapState *dbms;
 
-    while ((dbms = QSIMPLEQ_FIRST(&dirty_bitmap_mig_state.dbms_list)) != NULL) {
-        QSIMPLEQ_REMOVE_HEAD(&dirty_bitmap_mig_state.dbms_list, entry);
+    while ((dbms = QSIMPLEQ_FIRST(&s->dbms_list)) != NULL) {
+        QSIMPLEQ_REMOVE_HEAD(&s->dbms_list, entry);
         bdrv_dirty_bitmap_set_busy(dbms->bitmap, false);
         bdrv_unref(dbms->bs);
         g_free(dbms);
@@ -267,7 +275,8 @@ static void dirty_bitmap_do_save_cleanup(void)
 }
 
 /* Called with iothread lock taken. */
-static int add_bitmaps_to_list(BlockDriverState *bs, const char *bs_name)
+static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs,
+                               const char *bs_name)
 {
     BdrvDirtyBitmap *bitmap;
     SaveBitmapState *dbms;
@@ -322,25 +331,24 @@ static int add_bitmaps_to_list(BlockDriverState *bs, const char *bs_name)
             dbms->flags |= DIRTY_BITMAP_MIG_START_FLAG_PERSISTENT;
         }
 
-        QSIMPLEQ_INSERT_TAIL(&dirty_bitmap_mig_state.dbms_list,
-                             dbms, entry);
+        QSIMPLEQ_INSERT_TAIL(&s->dbms_list, dbms, entry);
     }
 
     return 0;
 }
 
 /* Called with iothread lock taken. */
-static int init_dirty_bitmap_migration(void)
+static int init_dirty_bitmap_migration(DBMSaveState *s)
 {
     BlockDriverState *bs;
     SaveBitmapState *dbms;
     GHashTable *handled_by_blk = g_hash_table_new(NULL, NULL);
     BlockBackend *blk;
 
-    dirty_bitmap_mig_state.bulk_completed = false;
-    dirty_bitmap_mig_state.prev_bs = NULL;
-    dirty_bitmap_mig_state.prev_bitmap = NULL;
-    dirty_bitmap_mig_state.no_bitmaps = false;
+    s->bulk_completed = false;
+    s->prev_bs = NULL;
+    s->prev_bitmap = NULL;
+    s->no_bitmaps = false;
 
     /*
      * Use blockdevice name for direct (or filtered) children of named block
@@ -369,7 +377,7 @@ static int init_dirty_bitmap_migration(void)
         }
 
         if (bs && bs->drv && !bs->drv->is_filter) {
-            if (add_bitmaps_to_list(bs, name)) {
+            if (add_bitmaps_to_list(s, bs, name)) {
                 goto fail;
             }
             g_hash_table_add(handled_by_blk, bs);
@@ -381,18 +389,18 @@ static int init_dirty_bitmap_migration(void)
             continue;
         }
 
-        if (add_bitmaps_to_list(bs, bdrv_get_node_name(bs))) {
+        if (add_bitmaps_to_list(s, bs, bdrv_get_node_name(bs))) {
             goto fail;
         }
     }
 
     /* unset migration flags here, to not roll back it */
-    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
+    QSIMPLEQ_FOREACH(dbms, &s->dbms_list, entry) {
         bdrv_dirty_bitmap_skip_store(dbms->bitmap, true);
     }
 
-    if (QSIMPLEQ_EMPTY(&dirty_bitmap_mig_state.dbms_list)) {
-        dirty_bitmap_mig_state.no_bitmaps = true;
+    if (QSIMPLEQ_EMPTY(&s->dbms_list)) {
+        s->no_bitmaps = true;
     }
 
     g_hash_table_destroy(handled_by_blk);
@@ -401,18 +409,19 @@ static int init_dirty_bitmap_migration(void)
 
 fail:
     g_hash_table_destroy(handled_by_blk);
-    dirty_bitmap_do_save_cleanup();
+    dirty_bitmap_do_save_cleanup(s);
 
     return -1;
 }
 
 /* Called with no lock taken.  */
-static void bulk_phase_send_chunk(QEMUFile *f, SaveBitmapState *dbms)
+static void bulk_phase_send_chunk(QEMUFile *f, DBMSaveState *s,
+                                  SaveBitmapState *dbms)
 {
     uint32_t nr_sectors = MIN(dbms->total_sectors - dbms->cur_sector,
                              dbms->sectors_per_chunk);
 
-    send_bitmap_bits(f, dbms, dbms->cur_sector, nr_sectors);
+    send_bitmap_bits(f, s, dbms, dbms->cur_sector, nr_sectors);
 
     dbms->cur_sector += nr_sectors;
     if (dbms->cur_sector >= dbms->total_sectors) {
@@ -421,61 +430,66 @@ static void bulk_phase_send_chunk(QEMUFile *f, SaveBitmapState *dbms)
 }
 
 /* Called with no lock taken.  */
-static void bulk_phase(QEMUFile *f, bool limit)
+static void bulk_phase(QEMUFile *f, DBMSaveState *s, bool limit)
 {
     SaveBitmapState *dbms;
 
-    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
+    QSIMPLEQ_FOREACH(dbms, &s->dbms_list, entry) {
         while (!dbms->bulk_completed) {
-            bulk_phase_send_chunk(f, dbms);
+            bulk_phase_send_chunk(f, s, dbms);
             if (limit && qemu_file_rate_limit(f)) {
                 return;
             }
         }
     }
 
-    dirty_bitmap_mig_state.bulk_completed = true;
+    s->bulk_completed = true;
 }
 
 /* for SaveVMHandlers */
 static void dirty_bitmap_save_cleanup(void *opaque)
 {
-    dirty_bitmap_do_save_cleanup();
+    DBMSaveState *s = &((DBMState *)opaque)->save;
+
+    dirty_bitmap_do_save_cleanup(s);
 }
 
 static int dirty_bitmap_save_iterate(QEMUFile *f, void *opaque)
 {
+    DBMSaveState *s = &((DBMState *)opaque)->save;
+
     trace_dirty_bitmap_save_iterate(migration_in_postcopy());
 
-    if (migration_in_postcopy() && !dirty_bitmap_mig_state.bulk_completed) {
-        bulk_phase(f, true);
+    if (migration_in_postcopy() && !s->bulk_completed) {
+        bulk_phase(f, s, true);
     }
 
     qemu_put_bitmap_flags(f, DIRTY_BITMAP_MIG_FLAG_EOS);
 
-    return dirty_bitmap_mig_state.bulk_completed;
+    return s->bulk_completed;
 }
 
 /* Called with iothread lock taken.  */
 
 static int dirty_bitmap_save_complete(QEMUFile *f, void *opaque)
 {
+    DBMSaveState *s = &((DBMState *)opaque)->save;
     SaveBitmapState *dbms;
     trace_dirty_bitmap_save_complete_enter();
 
-    if (!dirty_bitmap_mig_state.bulk_completed) {
-        bulk_phase(f, false);
+    if (!s->bulk_completed) {
+        bulk_phase(f, s, false);
     }
 
-    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
-        send_bitmap_complete(f, dbms);
+    QSIMPLEQ_FOREACH(dbms, &s->dbms_list, entry) {
+        send_bitmap_complete(f, s, dbms);
     }
 
     qemu_put_bitmap_flags(f, DIRTY_BITMAP_MIG_FLAG_EOS);
 
     trace_dirty_bitmap_save_complete_finish();
 
-    dirty_bitmap_do_save_cleanup();
+    dirty_bitmap_save_cleanup(opaque);
     return 0;
 }
 
@@ -485,12 +499,13 @@ static void dirty_bitmap_save_pending(QEMUFile *f, void *opaque,
                                       uint64_t *res_compatible,
                                       uint64_t *res_postcopy_only)
 {
+    DBMSaveState *s = &((DBMState *)opaque)->save;
     SaveBitmapState *dbms;
     uint64_t pending = 0;
 
     qemu_mutex_lock_iothread();
 
-    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
+    QSIMPLEQ_FOREACH(dbms, &s->dbms_list, entry) {
         uint64_t gran = bdrv_dirty_bitmap_granularity(dbms->bitmap);
         uint64_t sectors = dbms->bulk_completed ? 0 :
                            dbms->total_sectors - dbms->cur_sector;
@@ -549,7 +564,7 @@ static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
         b->bs = s->bs;
         b->bitmap = s->bitmap;
         b->migrated = false;
-        enabled_bitmaps = g_slist_prepend(enabled_bitmaps, b);
+        s->enabled_bitmaps = g_slist_prepend(s->enabled_bitmaps, b);
     }
 
     return 0;
@@ -557,11 +572,12 @@ static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
 
 void dirty_bitmap_mig_before_vm_start(void)
 {
+    DBMLoadState *s = &dbm_state.load;
     GSList *item;
 
-    qemu_mutex_lock(&finish_lock);
+    qemu_mutex_lock(&s->finish_lock);
 
-    for (item = enabled_bitmaps; item; item = g_slist_next(item)) {
+    for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
         LoadBitmapState *b = item->data;
 
         if (b->migrated) {
@@ -573,10 +589,10 @@ void dirty_bitmap_mig_before_vm_start(void)
         g_free(b);
     }
 
-    g_slist_free(enabled_bitmaps);
-    enabled_bitmaps = NULL;
+    g_slist_free(s->enabled_bitmaps);
+    s->enabled_bitmaps = NULL;
 
-    qemu_mutex_unlock(&finish_lock);
+    qemu_mutex_unlock(&s->finish_lock);
 }
 
 static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
@@ -585,9 +601,9 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
     trace_dirty_bitmap_load_complete();
     bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
 
-    qemu_mutex_lock(&finish_lock);
+    qemu_mutex_lock(&s->finish_lock);
 
-    for (item = enabled_bitmaps; item; item = g_slist_next(item)) {
+    for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
         LoadBitmapState *b = item->data;
 
         if (b->bitmap == s->bitmap) {
@@ -598,7 +614,7 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 
     if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
         bdrv_dirty_bitmap_lock(s->bitmap);
-        if (enabled_bitmaps == NULL) {
+        if (s->enabled_bitmaps == NULL) {
             /* in postcopy */
             bdrv_reclaim_dirty_bitmap_locked(s->bitmap, &error_abort);
             bdrv_enable_dirty_bitmap_locked(s->bitmap);
@@ -617,7 +633,7 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
         bdrv_dirty_bitmap_unlock(s->bitmap);
     }
 
-    qemu_mutex_unlock(&finish_lock);
+    qemu_mutex_unlock(&s->finish_lock);
 }
 
 static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
@@ -714,7 +730,7 @@ static int dirty_bitmap_load_header(QEMUFile *f, DBMLoadState *s)
 
 static int dirty_bitmap_load(QEMUFile *f, void *opaque, int version_id)
 {
-    static DBMLoadState s;
+    DBMLoadState *s = &((DBMState *)opaque)->load;
     int ret = 0;
 
     trace_dirty_bitmap_load_enter();
@@ -724,17 +740,17 @@ static int dirty_bitmap_load(QEMUFile *f, void *opaque, int version_id)
     }
 
     do {
-        ret = dirty_bitmap_load_header(f, &s);
+        ret = dirty_bitmap_load_header(f, s);
         if (ret < 0) {
             return ret;
         }
 
-        if (s.flags & DIRTY_BITMAP_MIG_FLAG_START) {
-            ret = dirty_bitmap_load_start(f, &s);
-        } else if (s.flags & DIRTY_BITMAP_MIG_FLAG_COMPLETE) {
-            dirty_bitmap_load_complete(f, &s);
-        } else if (s.flags & DIRTY_BITMAP_MIG_FLAG_BITS) {
-            ret = dirty_bitmap_load_bits(f, &s);
+        if (s->flags & DIRTY_BITMAP_MIG_FLAG_START) {
+            ret = dirty_bitmap_load_start(f, s);
+        } else if (s->flags & DIRTY_BITMAP_MIG_FLAG_COMPLETE) {
+            dirty_bitmap_load_complete(f, s);
+        } else if (s->flags & DIRTY_BITMAP_MIG_FLAG_BITS) {
+            ret = dirty_bitmap_load_bits(f, s);
         }
 
         if (!ret) {
@@ -744,7 +760,7 @@ static int dirty_bitmap_load(QEMUFile *f, void *opaque, int version_id)
         if (ret) {
             return ret;
         }
-    } while (!(s.flags & DIRTY_BITMAP_MIG_FLAG_EOS));
+    } while (!(s->flags & DIRTY_BITMAP_MIG_FLAG_EOS));
 
     trace_dirty_bitmap_load_success();
     return 0;
@@ -752,13 +768,14 @@ static int dirty_bitmap_load(QEMUFile *f, void *opaque, int version_id)
 
 static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
 {
+    DBMSaveState *s = &((DBMState *)opaque)->save;
     SaveBitmapState *dbms = NULL;
-    if (init_dirty_bitmap_migration() < 0) {
+    if (init_dirty_bitmap_migration(s) < 0) {
         return -1;
     }
 
-    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
-        send_bitmap_start(f, dbms);
+    QSIMPLEQ_FOREACH(dbms, &s->dbms_list, entry) {
+        send_bitmap_start(f, s, dbms);
     }
     qemu_put_bitmap_flags(f, DIRTY_BITMAP_MIG_FLAG_EOS);
 
@@ -767,7 +784,9 @@ static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
 
 static bool dirty_bitmap_is_active(void *opaque)
 {
-    return migrate_dirty_bitmaps() && !dirty_bitmap_mig_state.no_bitmaps;
+    DBMSaveState *s = &((DBMState *)opaque)->save;
+
+    return migrate_dirty_bitmaps() && !s->no_bitmaps;
 }
 
 static bool dirty_bitmap_is_active_iterate(void *opaque)
@@ -795,10 +814,10 @@ static SaveVMHandlers savevm_dirty_bitmap_handlers = {
 
 void dirty_bitmap_mig_init(void)
 {
-    QSIMPLEQ_INIT(&dirty_bitmap_mig_state.dbms_list);
-    qemu_mutex_init(&finish_lock);
+    QSIMPLEQ_INIT(&dbm_state.save.dbms_list);
+    qemu_mutex_init(&dbm_state.load.finish_lock);
 
     register_savevm_live("dirty-bitmap", 0, 1,
                          &savevm_dirty_bitmap_handlers,
-                         &dirty_bitmap_mig_state);
+                         &dbm_state);
 }
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 12/21] migration/block-dirty-bitmap: rename finish_lock to just lock
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (10 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 11/21] migration/block-dirty-bitmap: refactor state global variables Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 13/21] migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete Vladimir Sementsov-Ogievskiy
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

finish_lock is bad name, as lock used not only on process end.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
---
 migration/block-dirty-bitmap.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 9b39e7aa2b..9194807b54 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -143,7 +143,7 @@ typedef struct DBMLoadState {
     BdrvDirtyBitmap *bitmap;
 
     GSList *enabled_bitmaps;
-    QemuMutex finish_lock;
+    QemuMutex lock; /* protect enabled_bitmaps */
 } DBMLoadState;
 
 typedef struct DBMState {
@@ -575,7 +575,7 @@ void dirty_bitmap_mig_before_vm_start(void)
     DBMLoadState *s = &dbm_state.load;
     GSList *item;
 
-    qemu_mutex_lock(&s->finish_lock);
+    qemu_mutex_lock(&s->lock);
 
     for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
         LoadBitmapState *b = item->data;
@@ -592,7 +592,7 @@ void dirty_bitmap_mig_before_vm_start(void)
     g_slist_free(s->enabled_bitmaps);
     s->enabled_bitmaps = NULL;
 
-    qemu_mutex_unlock(&s->finish_lock);
+    qemu_mutex_unlock(&s->lock);
 }
 
 static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
@@ -601,7 +601,7 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
     trace_dirty_bitmap_load_complete();
     bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
 
-    qemu_mutex_lock(&s->finish_lock);
+    qemu_mutex_lock(&s->lock);
 
     for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
         LoadBitmapState *b = item->data;
@@ -633,7 +633,7 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
         bdrv_dirty_bitmap_unlock(s->bitmap);
     }
 
-    qemu_mutex_unlock(&s->finish_lock);
+    qemu_mutex_unlock(&s->lock);
 }
 
 static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
@@ -815,7 +815,7 @@ static SaveVMHandlers savevm_dirty_bitmap_handlers = {
 void dirty_bitmap_mig_init(void)
 {
     QSIMPLEQ_INIT(&dbm_state.save.dbms_list);
-    qemu_mutex_init(&dbm_state.load.finish_lock);
+    qemu_mutex_init(&dbm_state.load.lock);
 
     register_savevm_live("dirty-bitmap", 0, 1,
                          &savevm_dirty_bitmap_handlers,
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 13/21] migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (11 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 12/21] migration/block-dirty-bitmap: rename finish_lock to just lock Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 14/21] migration/block-dirty-bitmap: keep bitmap state for all bitmaps Vladimir Sementsov-Ogievskiy
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

bdrv_enable_dirty_bitmap_locked() call does nothing, as if we are in
postcopy, bitmap successor must be enabled, and reclaim operation will
enable the bitmap.

So, actually we need just call _reclaim_ in both if branches, and
making differences only to add an assertion seems not really good. The
logic becomes simple: on load complete we do reclaim and that's all.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
---
 migration/block-dirty-bitmap.c | 25 ++++---------------------
 1 file changed, 4 insertions(+), 21 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 9194807b54..405a259296 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -603,6 +603,10 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 
     qemu_mutex_lock(&s->lock);
 
+    if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
+        bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
+    }
+
     for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
         LoadBitmapState *b = item->data;
 
@@ -612,27 +616,6 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
         }
     }
 
-    if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
-        bdrv_dirty_bitmap_lock(s->bitmap);
-        if (s->enabled_bitmaps == NULL) {
-            /* in postcopy */
-            bdrv_reclaim_dirty_bitmap_locked(s->bitmap, &error_abort);
-            bdrv_enable_dirty_bitmap_locked(s->bitmap);
-        } else {
-            /* target not started, successor must be empty */
-            int64_t count = bdrv_get_dirty_count(s->bitmap);
-            BdrvDirtyBitmap *ret = bdrv_reclaim_dirty_bitmap_locked(s->bitmap,
-                                                                    NULL);
-            /* bdrv_reclaim_dirty_bitmap can fail only on no successor (it
-             * must be) or on merge fail, but merge can't fail when second
-             * bitmap is empty
-             */
-            assert(ret == s->bitmap &&
-                   count == bdrv_get_dirty_count(s->bitmap));
-        }
-        bdrv_dirty_bitmap_unlock(s->bitmap);
-    }
-
     qemu_mutex_unlock(&s->lock);
 }
 
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 14/21] migration/block-dirty-bitmap: keep bitmap state for all bitmaps
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (12 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 13/21] migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 15/21] migration/block-dirty-bitmap: relax error handling in incoming part Vladimir Sementsov-Ogievskiy
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

Keep bitmap state for disabled bitmaps too. Keep the state until the
end of the process. It's needed for the following commit to implement
bitmap postcopy canceling.

To clean-up the new list the following logic is used:
We need two events to consider bitmap migration finished:
1. chunk with DIRTY_BITMAP_MIG_FLAG_COMPLETE flag should be received
2. dirty_bitmap_mig_before_vm_start should be called
These two events may come in any order, so we understand which one is
last, and on the last of them we remove bitmap migration state from the
list.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
---
 migration/block-dirty-bitmap.c | 64 +++++++++++++++++++++++-----------
 1 file changed, 43 insertions(+), 21 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 405a259296..eb4ffeac4d 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -132,6 +132,7 @@ typedef struct LoadBitmapState {
     BlockDriverState *bs;
     BdrvDirtyBitmap *bitmap;
     bool migrated;
+    bool enabled;
 } LoadBitmapState;
 
 /* State of the dirty bitmap migration (DBM) during load process */
@@ -142,8 +143,10 @@ typedef struct DBMLoadState {
     BlockDriverState *bs;
     BdrvDirtyBitmap *bitmap;
 
-    GSList *enabled_bitmaps;
-    QemuMutex lock; /* protect enabled_bitmaps */
+    bool before_vm_start_handled; /* set in dirty_bitmap_mig_before_vm_start */
+
+    GSList *bitmaps;
+    QemuMutex lock; /* protect bitmaps */
 } DBMLoadState;
 
 typedef struct DBMState {
@@ -526,6 +529,7 @@ static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
     Error *local_err = NULL;
     uint32_t granularity = qemu_get_be32(f);
     uint8_t flags = qemu_get_byte(f);
+    LoadBitmapState *b;
 
     if (s->bitmap) {
         error_report("Bitmap with the same name ('%s') already exists on "
@@ -552,45 +556,59 @@ static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
 
     bdrv_disable_dirty_bitmap(s->bitmap);
     if (flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED) {
-        LoadBitmapState *b;
-
         bdrv_dirty_bitmap_create_successor(s->bitmap, &local_err);
         if (local_err) {
             error_report_err(local_err);
             return -EINVAL;
         }
-
-        b = g_new(LoadBitmapState, 1);
-        b->bs = s->bs;
-        b->bitmap = s->bitmap;
-        b->migrated = false;
-        s->enabled_bitmaps = g_slist_prepend(s->enabled_bitmaps, b);
     }
 
+    b = g_new(LoadBitmapState, 1);
+    b->bs = s->bs;
+    b->bitmap = s->bitmap;
+    b->migrated = false;
+    b->enabled = flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED;
+
+    s->bitmaps = g_slist_prepend(s->bitmaps, b);
+
     return 0;
 }
 
-void dirty_bitmap_mig_before_vm_start(void)
+/*
+ * before_vm_start_handle_item
+ *
+ * g_slist_foreach helper
+ *
+ * item is LoadBitmapState*
+ * opaque is DBMLoadState*
+ */
+static void before_vm_start_handle_item(void *item, void *opaque)
 {
-    DBMLoadState *s = &dbm_state.load;
-    GSList *item;
-
-    qemu_mutex_lock(&s->lock);
-
-    for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
-        LoadBitmapState *b = item->data;
+    DBMLoadState *s = opaque;
+    LoadBitmapState *b = item;
 
+    if (b->enabled) {
         if (b->migrated) {
             bdrv_enable_dirty_bitmap(b->bitmap);
         } else {
             bdrv_dirty_bitmap_enable_successor(b->bitmap);
         }
+    }
 
+    if (b->migrated) {
+        s->bitmaps = g_slist_remove(s->bitmaps, b);
         g_free(b);
     }
+}
 
-    g_slist_free(s->enabled_bitmaps);
-    s->enabled_bitmaps = NULL;
+void dirty_bitmap_mig_before_vm_start(void)
+{
+    DBMLoadState *s = &dbm_state.load;
+    qemu_mutex_lock(&s->lock);
+
+    assert(!s->before_vm_start_handled);
+    g_slist_foreach(s->bitmaps, before_vm_start_handle_item, s);
+    s->before_vm_start_handled = true;
 
     qemu_mutex_unlock(&s->lock);
 }
@@ -607,11 +625,15 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
         bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
     }
 
-    for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
+    for (item = s->bitmaps; item; item = g_slist_next(item)) {
         LoadBitmapState *b = item->data;
 
         if (b->bitmap == s->bitmap) {
             b->migrated = true;
+            if (s->before_vm_start_handled) {
+                s->bitmaps = g_slist_remove(s->bitmaps, b);
+                g_free(b);
+            }
             break;
         }
     }
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 15/21] migration/block-dirty-bitmap: relax error handling in incoming part
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (13 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 14/21] migration/block-dirty-bitmap: keep bitmap state for all bitmaps Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 20:14   ` Eric Blake
  2020-07-27 19:42 ` [PATCH v4 16/21] migration/block-dirty-bitmap: cancel migration on shutdown Vladimir Sementsov-Ogievskiy
                   ` (6 subsequent siblings)
  21 siblings, 1 reply; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	den, mreitz, jsnow

Bitmaps data is not critical, and we should not fail the migration (or
use postcopy recovering) because of dirty-bitmaps migration failure.
Instead we should just lose unfinished bitmaps.

Still we have to report io stream violation errors, as they affect the
whole migration stream.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 migration/block-dirty-bitmap.c | 164 +++++++++++++++++++++++++--------
 1 file changed, 127 insertions(+), 37 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index eb4ffeac4d..4e45e79251 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -145,6 +145,15 @@ typedef struct DBMLoadState {
 
     bool before_vm_start_handled; /* set in dirty_bitmap_mig_before_vm_start */
 
+    /*
+     * cancelled
+     * Incoming migration is cancelled for some reason. That means that we
+     * still should read our chunks from migration stream, to not affect other
+     * migration objects (like RAM), but just ignore them and do not touch any
+     * bitmaps or nodes.
+     */
+    bool cancelled;
+
     GSList *bitmaps;
     QemuMutex lock; /* protect bitmaps */
 } DBMLoadState;
@@ -531,6 +540,10 @@ static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
     uint8_t flags = qemu_get_byte(f);
     LoadBitmapState *b;
 
+    if (s->cancelled) {
+        return 0;
+    }
+
     if (s->bitmap) {
         error_report("Bitmap with the same name ('%s') already exists on "
                      "destination", bdrv_dirty_bitmap_name(s->bitmap));
@@ -613,13 +626,47 @@ void dirty_bitmap_mig_before_vm_start(void)
     qemu_mutex_unlock(&s->lock);
 }
 
+static void cancel_incoming_locked(DBMLoadState *s)
+{
+    GSList *item;
+
+    if (s->cancelled) {
+        return;
+    }
+
+    s->cancelled = true;
+    s->bs = NULL;
+    s->bitmap = NULL;
+
+    /* Drop all unfinished bitmaps */
+    for (item = s->bitmaps; item; item = g_slist_next(item)) {
+        LoadBitmapState *b = item->data;
+
+        /*
+         * Bitmap must be unfinished, as finished bitmaps should already be
+         * removed from the list.
+         */
+        assert(!s->before_vm_start_handled || !b->migrated);
+        if (bdrv_dirty_bitmap_has_successor(b->bitmap)) {
+            bdrv_reclaim_dirty_bitmap(b->bitmap, &error_abort);
+        }
+        bdrv_release_dirty_bitmap(b->bitmap);
+    }
+
+    g_slist_free_full(s->bitmaps, g_free);
+    s->bitmaps = NULL;
+}
+
 static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 {
     GSList *item;
     trace_dirty_bitmap_load_complete();
-    bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
 
-    qemu_mutex_lock(&s->lock);
+    if (s->cancelled) {
+        return;
+    }
+
+    bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
 
     if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
         bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
@@ -637,8 +684,6 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
             break;
         }
     }
-
-    qemu_mutex_unlock(&s->lock);
 }
 
 static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
@@ -650,15 +695,46 @@ static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
 
     if (s->flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {
         trace_dirty_bitmap_load_bits_zeroes();
-        bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte, nr_bytes,
-                                             false);
+        if (!s->cancelled) {
+            bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte,
+                                                 nr_bytes, false);
+        }
     } else {
         size_t ret;
-        uint8_t *buf;
+        g_autofree uint8_t *buf = NULL;
         uint64_t buf_size = qemu_get_be64(f);
-        uint64_t needed_size =
-            bdrv_dirty_bitmap_serialization_size(s->bitmap,
-                                                 first_byte, nr_bytes);
+        uint64_t needed_size;
+
+        /*
+         * Actual check for buf_size is done a bit later. We can't do it in
+         * cancelled mode as we don't have the bitmap to check the constraints
+         * (so, we do allocate buffer and read prior to the check). On the other
+         * hand, we shouldn't blindly g_malloc the number from the stream.
+         * Actually one chunk should not be larger thatn CHUNK_SIZE. Let's allow
+         * a bit larger (which means that bitmap migration will fail anyway and
+         * the whole migration will most probably fail soon due to broken
+         * stream).
+         */
+        if (buf_size > 10 * CHUNK_SIZE) {
+            error_report("Bitmap migration stream requests too large buffer "
+                         "size to allocate");
+            return -EIO;
+        }
+
+        buf = g_malloc(buf_size);
+        ret = qemu_get_buffer(f, buf, buf_size);
+        if (ret != buf_size) {
+            error_report("Failed to read bitmap bits");
+            return -EIO;
+        }
+
+        if (s->cancelled) {
+            return 0;
+        }
+
+        needed_size = bdrv_dirty_bitmap_serialization_size(s->bitmap,
+                                                           first_byte,
+                                                           nr_bytes);
 
         if (needed_size > buf_size ||
             buf_size > QEMU_ALIGN_UP(needed_size, 4 * sizeof(long))
@@ -667,20 +743,12 @@ static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
             error_report("Migrated bitmap granularity doesn't "
                          "match the destination bitmap '%s' granularity",
                          bdrv_dirty_bitmap_name(s->bitmap));
-            return -EINVAL;
-        }
-
-        buf = g_malloc(buf_size);
-        ret = qemu_get_buffer(f, buf, buf_size);
-        if (ret != buf_size) {
-            error_report("Failed to read bitmap bits");
-            g_free(buf);
-            return -EIO;
+            cancel_incoming_locked(s);
+            return 0;
         }
 
         bdrv_dirty_bitmap_deserialize_part(s->bitmap, buf, first_byte, nr_bytes,
                                            false);
-        g_free(buf);
     }
 
     return 0;
@@ -700,14 +768,16 @@ static int dirty_bitmap_load_header(QEMUFile *f, DBMLoadState *s)
             error_report("Unable to read node name string");
             return -EINVAL;
         }
-        s->bs = bdrv_lookup_bs(s->node_name, s->node_name, &local_err);
-        if (!s->bs) {
-            error_report_err(local_err);
-            return -EINVAL;
+        if (!s->cancelled) {
+            s->bs = bdrv_lookup_bs(s->node_name, s->node_name, &local_err);
+            if (!s->bs) {
+                error_report_err(local_err);
+                cancel_incoming_locked(s);
+            }
         }
-    } else if (!s->bs && !nothing) {
+    } else if (!s->bs && !nothing && !s->cancelled) {
         error_report("Error: block device name is not set");
-        return -EINVAL;
+        cancel_incoming_locked(s);
     }
 
     if (s->flags & DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME) {
@@ -715,24 +785,38 @@ static int dirty_bitmap_load_header(QEMUFile *f, DBMLoadState *s)
             error_report("Unable to read bitmap name string");
             return -EINVAL;
         }
-        s->bitmap = bdrv_find_dirty_bitmap(s->bs, s->bitmap_name);
-
-        /* bitmap may be NULL here, it wouldn't be an error if it is the
-         * first occurrence of the bitmap */
-        if (!s->bitmap && !(s->flags & DIRTY_BITMAP_MIG_FLAG_START)) {
-            error_report("Error: unknown dirty bitmap "
-                         "'%s' for block device '%s'",
-                         s->bitmap_name, s->node_name);
-            return -EINVAL;
+        if (!s->cancelled) {
+            s->bitmap = bdrv_find_dirty_bitmap(s->bs, s->bitmap_name);
+
+            /*
+             * bitmap may be NULL here, it wouldn't be an error if it is the
+             * first occurrence of the bitmap
+             */
+            if (!s->bitmap && !(s->flags & DIRTY_BITMAP_MIG_FLAG_START)) {
+                error_report("Error: unknown dirty bitmap "
+                             "'%s' for block device '%s'",
+                             s->bitmap_name, s->node_name);
+                cancel_incoming_locked(s);
+            }
         }
-    } else if (!s->bitmap && !nothing) {
+    } else if (!s->bitmap && !nothing && !s->cancelled) {
         error_report("Error: block device name is not set");
-        return -EINVAL;
+        cancel_incoming_locked(s);
     }
 
     return 0;
 }
 
+/*
+ * dirty_bitmap_load
+ *
+ * Load sequence of dirty bitmap chunks. Return error only on fatal io stream
+ * violations. On other errors just cancel bitmaps incoming migration and return
+ * 0.
+ *
+ * Note, than when incoming bitmap migration is canceled, we still must read all
+ * our chunks (and just ignore them), to not affect other migration objects.
+ */
 static int dirty_bitmap_load(QEMUFile *f, void *opaque, int version_id)
 {
     DBMLoadState *s = &((DBMState *)opaque)->load;
@@ -741,12 +825,17 @@ static int dirty_bitmap_load(QEMUFile *f, void *opaque, int version_id)
     trace_dirty_bitmap_load_enter();
 
     if (version_id != 1) {
+        QEMU_LOCK_GUARD(&s->lock);
+        cancel_incoming_locked(s);
         return -EINVAL;
     }
 
     do {
+        QEMU_LOCK_GUARD(&s->lock);
+
         ret = dirty_bitmap_load_header(f, s);
         if (ret < 0) {
+            cancel_incoming_locked(s);
             return ret;
         }
 
@@ -763,6 +852,7 @@ static int dirty_bitmap_load(QEMUFile *f, void *opaque, int version_id)
         }
 
         if (ret) {
+            cancel_incoming_locked(s);
             return ret;
         }
     } while (!(s->flags & DIRTY_BITMAP_MIG_FLAG_EOS));
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 16/21] migration/block-dirty-bitmap: cancel migration on shutdown
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (14 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 15/21] migration/block-dirty-bitmap: relax error handling in incoming part Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 17/21] migration/savevm: don't worry if bitmap migration postcopy failed Vladimir Sementsov-Ogievskiy
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

If target is turned off prior to postcopy finished, target crashes
because busy bitmaps are found at shutdown.
Canceling incoming migration helps, as it removes all unfinished (and
therefore busy) bitmaps.

Similarly on source we crash in bdrv_close_all which asserts that all
bdrv states are removed, because bdrv states involved into dirty bitmap
migration are referenced by it. So, we need to cancel outgoing
migration as well.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
---
 migration/migration.h          |  2 ++
 migration/block-dirty-bitmap.c | 16 ++++++++++++++++
 migration/migration.c          | 13 +++++++++++++
 3 files changed, 31 insertions(+)

diff --git a/migration/migration.h b/migration/migration.h
index ab20c756f5..6c6a931d0d 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -335,6 +335,8 @@ void migrate_send_rp_recv_bitmap(MigrationIncomingState *mis,
 void migrate_send_rp_resume_ack(MigrationIncomingState *mis, uint32_t value);
 
 void dirty_bitmap_mig_before_vm_start(void);
+void dirty_bitmap_mig_cancel_outgoing(void);
+void dirty_bitmap_mig_cancel_incoming(void);
 void migrate_add_address(SocketAddress *address);
 
 int foreach_not_ignored_block(RAMBlockIterFunc func, void *opaque);
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 4e45e79251..36ca8be392 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -657,6 +657,22 @@ static void cancel_incoming_locked(DBMLoadState *s)
     s->bitmaps = NULL;
 }
 
+void dirty_bitmap_mig_cancel_outgoing(void)
+{
+    dirty_bitmap_do_save_cleanup(&dbm_state.save);
+}
+
+void dirty_bitmap_mig_cancel_incoming(void)
+{
+    DBMLoadState *s = &dbm_state.load;
+
+    qemu_mutex_lock(&s->lock);
+
+    cancel_incoming_locked(s);
+
+    qemu_mutex_unlock(&s->lock);
+}
+
 static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 {
     GSList *item;
diff --git a/migration/migration.c b/migration/migration.c
index 1c61428988..8fe36339db 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -188,6 +188,19 @@ void migration_shutdown(void)
      */
     migrate_fd_cancel(current_migration);
     object_unref(OBJECT(current_migration));
+
+    /*
+     * Cancel outgoing migration of dirty bitmaps. It should
+     * at least unref used block nodes.
+     */
+    dirty_bitmap_mig_cancel_outgoing();
+
+    /*
+     * Cancel incoming migration of dirty bitmaps. Dirty bitmaps
+     * are non-critical data, and their loss never considered as
+     * something serious.
+     */
+    dirty_bitmap_mig_cancel_incoming();
 }
 
 /* For outgoing */
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 17/21] migration/savevm: don't worry if bitmap migration postcopy failed
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (15 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 16/21] migration/block-dirty-bitmap: cancel migration on shutdown Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 18/21] qemu-iotests/199: prepare for new test-cases addition Vladimir Sementsov-Ogievskiy
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

First, if only bitmaps postcopy is enabled (and not ram postcopy)
postcopy_pause_incoming crashes on an assertion
assert(mis->to_src_file).

And anyway, bitmaps postcopy is not prepared to be somehow recovered.
The original idea instead is that if bitmaps postcopy failed, we just
lose some bitmaps, which is not critical. So, on failure we just need
to remove unfinished bitmaps and guest should continue execution on
destination.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 migration/savevm.c | 37 ++++++++++++++++++++++++++++++++-----
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 45c9dd9d8a..a843d202b5 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1813,6 +1813,9 @@ static void *postcopy_ram_listen_thread(void *opaque)
     MigrationIncomingState *mis = migration_incoming_get_current();
     QEMUFile *f = mis->from_src_file;
     int load_res;
+    MigrationState *migr = migrate_get_current();
+
+    object_ref(OBJECT(migr));
 
     migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
                                    MIGRATION_STATUS_POSTCOPY_ACTIVE);
@@ -1839,11 +1842,24 @@ static void *postcopy_ram_listen_thread(void *opaque)
 
     trace_postcopy_ram_listen_thread_exit();
     if (load_res < 0) {
-        error_report("%s: loadvm failed: %d", __func__, load_res);
         qemu_file_set_error(f, load_res);
-        migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
-                                       MIGRATION_STATUS_FAILED);
-    } else {
+        dirty_bitmap_mig_cancel_incoming();
+        if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING &&
+            !migrate_postcopy_ram() && migrate_dirty_bitmaps())
+        {
+            error_report("%s: loadvm failed during postcopy: %d. All states "
+                         "are migrated except dirty bitmaps. Some dirty "
+                         "bitmaps may be lost, and present migrated dirty "
+                         "bitmaps are correctly migrated and valid.",
+                         __func__, load_res);
+            load_res = 0; /* prevent further exit() */
+        } else {
+            error_report("%s: loadvm failed: %d", __func__, load_res);
+            migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
+                                           MIGRATION_STATUS_FAILED);
+        }
+    }
+    if (load_res >= 0) {
         /*
          * This looks good, but it's possible that the device loading in the
          * main thread hasn't finished yet, and so we might not be in 'RUN'
@@ -1879,6 +1895,8 @@ static void *postcopy_ram_listen_thread(void *opaque)
     mis->have_listen_thread = false;
     postcopy_state_set(POSTCOPY_INCOMING_END);
 
+    object_unref(OBJECT(migr));
+
     return NULL;
 }
 
@@ -2437,6 +2455,8 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis)
 {
     trace_postcopy_pause_incoming();
 
+    assert(migrate_postcopy_ram());
+
     /* Clear the triggered bit to allow one recovery */
     mis->postcopy_recover_triggered = false;
 
@@ -2521,15 +2541,22 @@ out:
     if (ret < 0) {
         qemu_file_set_error(f, ret);
 
+        /* Cancel bitmaps incoming regardless of recovery */
+        dirty_bitmap_mig_cancel_incoming();
+
         /*
          * If we are during an active postcopy, then we pause instead
          * of bail out to at least keep the VM's dirty data.  Note
          * that POSTCOPY_INCOMING_LISTENING stage is still not enough,
          * during which we're still receiving device states and we
          * still haven't yet started the VM on destination.
+         *
+         * Only RAM postcopy supports recovery. Still, if RAM postcopy is
+         * enabled, canceled bitmaps postcopy will not affect RAM postcopy
+         * recovering.
          */
         if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING &&
-            postcopy_pause_incoming(mis)) {
+            migrate_postcopy_ram() && postcopy_pause_incoming(mis)) {
             /* Reset f to point to the newly created channel */
             f = mis->from_src_file;
             goto retry;
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 18/21] qemu-iotests/199: prepare for new test-cases addition
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (16 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 17/21] migration/savevm: don't worry if bitmap migration postcopy failed Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 19/21] qemu-iotests/199: check persistent bitmaps Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

Move future common part to start_postcopy() method. Move checking
number of bitmaps to check_bitmap().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
---
 tests/qemu-iotests/199 | 36 +++++++++++++++++++++++-------------
 1 file changed, 23 insertions(+), 13 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index d8532e49da..355c0b2885 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -29,6 +29,8 @@ disk_b = os.path.join(iotests.test_dir, 'disk_b')
 size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')
 
+granularity = 512
+nb_bitmaps = 15
 
 GiB = 1024 * 1024 * 1024
 
@@ -61,6 +63,15 @@ def event_dist(e1, e2):
     return event_seconds(e2) - event_seconds(e1)
 
 
+def check_bitmaps(vm, count):
+    result = vm.qmp('query-block')
+
+    if count == 0:
+        assert 'dirty-bitmaps' not in result['return'][0]
+    else:
+        assert len(result['return'][0]['dirty-bitmaps']) == count
+
+
 class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
     def tearDown(self):
         if debug:
@@ -101,10 +112,8 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
         self.vm_a_events = []
         self.vm_b_events = []
 
-    def test_postcopy(self):
-        granularity = 512
-        nb_bitmaps = 15
-
+    def start_postcopy(self):
+        """ Run migration until RESUME event on target. Return this event. """
         for i in range(nb_bitmaps):
             result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
                                    name='bitmap{}'.format(i),
@@ -119,10 +128,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
         result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
                                node='drive0', name='bitmap0')
-        discards1_sha256 = result['return']['sha256']
+        self.discards1_sha256 = result['return']['sha256']
 
         # Check, that updating the bitmap by discards works
-        assert discards1_sha256 != empty_sha256
+        assert self.discards1_sha256 != empty_sha256
 
         # We want to calculate resulting sha256. Do it in bitmap0, so, disable
         # other bitmaps
@@ -135,7 +144,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
         result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
                                node='drive0', name='bitmap0')
-        all_discards_sha256 = result['return']['sha256']
+        self.all_discards_sha256 = result['return']['sha256']
 
         # Now, enable some bitmaps, to be updated during migration
         for i in range(2, nb_bitmaps, 2):
@@ -160,6 +169,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
         event_resume = self.vm_b.event_wait('RESUME')
         self.vm_b_events.append(event_resume)
+        return event_resume
+
+    def test_postcopy_success(self):
+        event_resume = self.start_postcopy()
 
         # enabled bitmaps should be updated
         apply_discards(self.vm_b, discards2)
@@ -180,18 +193,15 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
             print('downtime:', downtime)
             print('postcopy_time:', postcopy_time)
 
-        # Assert that bitmap migration is finished (check that successor bitmap
-        # is removed)
-        result = self.vm_b.qmp('query-block')
-        assert len(result['return'][0]['dirty-bitmaps']) == nb_bitmaps
+        check_bitmaps(self.vm_b, nb_bitmaps)
 
         # Check content of migrated bitmaps. Still, don't waste time checking
         # every bitmap
         for i in range(0, nb_bitmaps, 5):
             result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
                                    node='drive0', name='bitmap{}'.format(i))
-            sha256 = discards1_sha256 if i % 2 else all_discards_sha256
-            self.assert_qmp(result, 'return/sha256', sha256)
+            sha = self.discards1_sha256 if i % 2 else self.all_discards_sha256
+            self.assert_qmp(result, 'return/sha256', sha)
 
 
 if __name__ == '__main__':
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 19/21] qemu-iotests/199: check persistent bitmaps
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (17 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 18/21] qemu-iotests/199: prepare for new test-cases addition Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 20/21] qemu-iotests/199: add early shutdown case to bitmaps postcopy Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

Check that persistent bitmaps are not stored on source and that bitmaps
are persistent on destination.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
---
 tests/qemu-iotests/199 | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 355c0b2885..5fd34f0fcd 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -117,7 +117,8 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
         for i in range(nb_bitmaps):
             result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
                                    name='bitmap{}'.format(i),
-                                   granularity=granularity)
+                                   granularity=granularity,
+                                   persistent=True)
             self.assert_qmp(result, 'return', {})
 
         result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
@@ -193,6 +194,19 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
             print('downtime:', downtime)
             print('postcopy_time:', postcopy_time)
 
+        # check that there are no bitmaps stored on source
+        self.vm_a_events += self.vm_a.get_qmp_events()
+        self.vm_a.shutdown()
+        self.vm_a.launch()
+        check_bitmaps(self.vm_a, 0)
+
+        # check that bitmaps are migrated and persistence works
+        check_bitmaps(self.vm_b, nb_bitmaps)
+        self.vm_b.shutdown()
+        # recreate vm_b, so there is no incoming option, which prevents
+        # loading bitmaps from disk
+        self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b)
+        self.vm_b.launch()
         check_bitmaps(self.vm_b, nb_bitmaps)
 
         # Check content of migrated bitmaps. Still, don't waste time checking
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 20/21] qemu-iotests/199: add early shutdown case to bitmaps postcopy
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (18 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 19/21] qemu-iotests/199: check persistent bitmaps Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:42 ` [PATCH v4 21/21] qemu-iotests/199: add source-killed " Vladimir Sementsov-Ogievskiy
  2020-07-27 19:53 ` [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Eric Blake
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

Previous patches fixed two crashes which may occur on shutdown prior to
bitmaps postcopy finished. Check that it works now.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Tested-by: Eric Blake <eblake@redhat.com>
---
 tests/qemu-iotests/199     | 24 ++++++++++++++++++++++++
 tests/qemu-iotests/199.out |  4 ++--
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 5fd34f0fcd..140930b2b1 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -217,6 +217,30 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
             sha = self.discards1_sha256 if i % 2 else self.all_discards_sha256
             self.assert_qmp(result, 'return/sha256', sha)
 
+    def test_early_shutdown_destination(self):
+        self.start_postcopy()
+
+        self.vm_b_events += self.vm_b.get_qmp_events()
+        self.vm_b.shutdown()
+        # recreate vm_b, so there is no incoming option, which prevents
+        # loading bitmaps from disk
+        self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b)
+        self.vm_b.launch()
+        check_bitmaps(self.vm_b, 0)
+
+        # Bitmaps will be lost if we just shutdown the vm, as they are marked
+        # to skip storing to disk when prepared for migration. And that's
+        # correct, as actual data may be modified in target vm, so we play
+        # safe.
+        # Still, this mark would be taken away if we do 'cont', and bitmaps
+        # become persistent again. (see iotest 169 for such behavior case)
+        result = self.vm_a.qmp('query-status')
+        assert not result['return']['running']
+        self.vm_a_events += self.vm_a.get_qmp_events()
+        self.vm_a.shutdown()
+        self.vm_a.launch()
+        check_bitmaps(self.vm_a, 0)
+
 
 if __name__ == '__main__':
     iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/199.out b/tests/qemu-iotests/199.out
index ae1213e6f8..fbc63e62f8 100644
--- a/tests/qemu-iotests/199.out
+++ b/tests/qemu-iotests/199.out
@@ -1,5 +1,5 @@
-.
+..
 ----------------------------------------------------------------------
-Ran 1 tests
+Ran 2 tests
 
 OK
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 21/21] qemu-iotests/199: add source-killed case to bitmaps postcopy
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (19 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 20/21] qemu-iotests/199: add early shutdown case to bitmaps postcopy Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:42 ` Vladimir Sementsov-Ogievskiy
  2020-07-27 19:53 ` [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Eric Blake
  21 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:42 UTC (permalink / raw)
  To: qemu-block
  Cc: kwolf, fam, vsementsov, quintela, qemu-devel, dgilbert, stefanha,
	Andrey Shinkevich, den, mreitz, jsnow

Previous patches fixes behavior of bitmaps migration, so that errors
are handled by just removing unfinished bitmaps, and not fail or try to
recover postcopy migration. Add corresponding test.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Tested-by: Eric Blake <eblake@redhat.com>
---
 tests/qemu-iotests/199     | 15 +++++++++++++++
 tests/qemu-iotests/199.out |  4 ++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 140930b2b1..58fad872a1 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -241,6 +241,21 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
         self.vm_a.launch()
         check_bitmaps(self.vm_a, 0)
 
+    def test_early_kill_source(self):
+        self.start_postcopy()
+
+        self.vm_a_events = self.vm_a.get_qmp_events()
+        self.vm_a.kill()
+
+        self.vm_a.launch()
+
+        match = {'data': {'status': 'completed'}}
+        e_complete = self.vm_b.event_wait('MIGRATION', match=match)
+        self.vm_b_events.append(e_complete)
+
+        check_bitmaps(self.vm_a, 0)
+        check_bitmaps(self.vm_b, 0)
+
 
 if __name__ == '__main__':
     iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/199.out b/tests/qemu-iotests/199.out
index fbc63e62f8..8d7e996700 100644
--- a/tests/qemu-iotests/199.out
+++ b/tests/qemu-iotests/199.out
@@ -1,5 +1,5 @@
-..
+...
 ----------------------------------------------------------------------
-Ran 2 tests
+Ran 3 tests
 
 OK
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy
  2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
                   ` (20 preceding siblings ...)
  2020-07-27 19:42 ` [PATCH v4 21/21] qemu-iotests/199: add source-killed " Vladimir Sementsov-Ogievskiy
@ 2020-07-27 19:53 ` Eric Blake
  2020-07-27 19:59   ` Vladimir Sementsov-Ogievskiy
  21 siblings, 1 reply; 25+ messages in thread
From: Eric Blake @ 2020-07-27 19:53 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, fam, quintela, qemu-devel, dgilbert, stefanha, den, mreitz, jsnow

On 7/27/20 2:42 PM, Vladimir Sementsov-Ogievskiy wrote:
> v4:
> 
> 01: typo in commit msg
> 07: reword commit msg, add Eric's r-b
> 10: add Dr. David's r-b
> 15: add check for buf_size
>      use g_autofree (and fix introduced in v3)
>      use QEMU_LOCK_GUARD
> 17: fix commit msg, add Eric's r-b
> 20-21: add Eric's t-b

What timing!  I was literally in the middle of composing my pull request 
when this landed in my inbox; I'll refresh my local contents to pick 
this up (and see if you tweaked anything differently than I did).

Therefore, my pull request is now shifted by an hour or two, but will 
still come today ;)

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy
  2020-07-27 19:53 ` [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Eric Blake
@ 2020-07-27 19:59   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 25+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-27 19:59 UTC (permalink / raw)
  To: Eric Blake, qemu-block
  Cc: kwolf, fam, quintela, qemu-devel, dgilbert, stefanha, den, mreitz, jsnow

27.07.2020 22:53, Eric Blake wrote:
> On 7/27/20 2:42 PM, Vladimir Sementsov-Ogievskiy wrote:
>> v4:
>>
>> 01: typo in commit msg
>> 07: reword commit msg, add Eric's r-b
>> 10: add Dr. David's r-b
>> 15: add check for buf_size
>>      use g_autofree (and fix introduced in v3)
>>      use QEMU_LOCK_GUARD
>> 17: fix commit msg, add Eric's r-b
>> 20-21: add Eric's t-b
> 
> What timing!  I was literally in the middle of composing my pull request when this landed in my inbox; I'll refresh my local contents to pick this up (and see if you tweaked anything differently than I did).
> 
> Therefore, my pull request is now shifted by an hour or two, but will still come today ;)
> 

Sorry :). Thanks for your work!

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 15/21] migration/block-dirty-bitmap: relax error handling in incoming part
  2020-07-27 19:42 ` [PATCH v4 15/21] migration/block-dirty-bitmap: relax error handling in incoming part Vladimir Sementsov-Ogievskiy
@ 2020-07-27 20:14   ` Eric Blake
  0 siblings, 0 replies; 25+ messages in thread
From: Eric Blake @ 2020-07-27 20:14 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: kwolf, fam, quintela, qemu-devel, dgilbert, stefanha, den, mreitz, jsnow

On 7/27/20 2:42 PM, Vladimir Sementsov-Ogievskiy wrote:
> Bitmaps data is not critical, and we should not fail the migration (or
> use postcopy recovering) because of dirty-bitmaps migration failure.
> Instead we should just lose unfinished bitmaps.
> 
> Still we have to report io stream violation errors, as they affect the
> whole migration stream.
> 

I'm amending this to also add:

While touching this, tighten code that was previously blindly calling 
malloc on a size read from the migration stream, as a corrupted stream 
(perhaps from a malicious user) should not be able to convince us to 
allocate an inordinate amount of memory.

> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   migration/block-dirty-bitmap.c | 164 +++++++++++++++++++++++++--------
>   1 file changed, 127 insertions(+), 37 deletions(-)
> 

> @@ -650,15 +695,46 @@ static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
>   
>       if (s->flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {
>           trace_dirty_bitmap_load_bits_zeroes();
> -        bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte, nr_bytes,
> -                                             false);
> +        if (!s->cancelled) {
> +            bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte,
> +                                                 nr_bytes, false);
> +        }
>       } else {
>           size_t ret;
> -        uint8_t *buf;
> +        g_autofree uint8_t *buf = NULL;
>           uint64_t buf_size = qemu_get_be64(f);
> -        uint64_t needed_size =
> -            bdrv_dirty_bitmap_serialization_size(s->bitmap,
> -                                                 first_byte, nr_bytes);
> +        uint64_t needed_size;
> +
> +        /*
> +         * Actual check for buf_size is done a bit later. We can't do it in

s/Actual/The actual/

> +         * cancelled mode as we don't have the bitmap to check the constraints
> +         * (so, we do allocate buffer and read prior to the check). On the other
> +         * hand, we shouldn't blindly g_malloc the number from the stream.
> +         * Actually one chunk should not be larger thatn CHUNK_SIZE. Let's allow

than

> +         * a bit larger (which means that bitmap migration will fail anyway and
> +         * the whole migration will most probably fail soon due to broken
> +         * stream).
> +         */
> +        if (buf_size > 10 * CHUNK_SIZE) {
> +            error_report("Bitmap migration stream requests too large buffer "
> +                         "size to allocate");

Bitmap migration stream buffer allocation request is too large

I'll make those touchups.

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2020-07-27 20:16 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-27 19:42 [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 01/21] qemu-iotests/199: fix style Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 02/21] qemu-iotests/199: drop extra constraints Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 03/21] qemu-iotests/199: better catch postcopy time Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 04/21] qemu-iotests/199: improve performance: set bitmap by discard Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 05/21] qemu-iotests/199: change discard patterns Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 06/21] qemu-iotests/199: increase postcopy period Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 07/21] migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 08/21] migration/block-dirty-bitmap: rename state structure types Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 09/21] migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 10/21] migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_init Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 11/21] migration/block-dirty-bitmap: refactor state global variables Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 12/21] migration/block-dirty-bitmap: rename finish_lock to just lock Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 13/21] migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 14/21] migration/block-dirty-bitmap: keep bitmap state for all bitmaps Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 15/21] migration/block-dirty-bitmap: relax error handling in incoming part Vladimir Sementsov-Ogievskiy
2020-07-27 20:14   ` Eric Blake
2020-07-27 19:42 ` [PATCH v4 16/21] migration/block-dirty-bitmap: cancel migration on shutdown Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 17/21] migration/savevm: don't worry if bitmap migration postcopy failed Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 18/21] qemu-iotests/199: prepare for new test-cases addition Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 19/21] qemu-iotests/199: check persistent bitmaps Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 20/21] qemu-iotests/199: add early shutdown case to bitmaps postcopy Vladimir Sementsov-Ogievskiy
2020-07-27 19:42 ` [PATCH v4 21/21] qemu-iotests/199: add source-killed " Vladimir Sementsov-Ogievskiy
2020-07-27 19:53 ` [PATCH v4 for-5.1 00/21] Fix error handling during bitmap postcopy Eric Blake
2020-07-27 19:59   ` Vladimir Sementsov-Ogievskiy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.