All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, mreitz@redhat.com, kwolf@redhat.com,
	vsementsov@virtuozzo.com, jsnow@redhat.com, dem@openvz.org
Subject: [PATCH 3/3] block/mirror: fix active mirror dead-lock in mirror_wait_on_conflicts
Date: Sat,  3 Jul 2021 00:16:36 +0300	[thread overview]
Message-ID: <20210702211636.228981-4-vsementsov@virtuozzo.com> (raw)
In-Reply-To: <20210702211636.228981-1-vsementsov@virtuozzo.com>

It's possible that requests start to wait each other in
mirror_wait_on_conflicts(). To avoid it let's use same technique as in
block/io.c in bdrv_wait_serialising_requests_locked() /
bdrv_find_conflicting_request(): don't wait on intersecting request if
it is already waiting for some other request.

For details of the dead-lock look at testIntersectingActiveIO()
test-case which we actually fixing now.

Fixes: d06107ade0ce74dc39739bac80de84b51ec18546
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---

Note, that I failed to run the test on d06107ade0ce74d, that introduces
active mirror. But it seems that the problem should exit in it too, and
it's better to leave "Fixes:" tag than don't do it.

 block/mirror.c         | 12 ++++++++++++
 tests/qemu-iotests/151 | 18 +++++-------------
 2 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index ad6aac2f95..98fc66eabf 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -107,6 +107,7 @@ struct MirrorOp {
     bool is_in_flight;
     CoQueue waiting_requests;
     Coroutine *co;
+    MirrorOp *waiting_for_op;
 
     QTAILQ_ENTRY(MirrorOp) next;
 };
@@ -159,7 +160,18 @@ static void coroutine_fn mirror_wait_on_conflicts(MirrorOp *self,
             if (ranges_overlap(self_start_chunk, self_nb_chunks,
                                op_start_chunk, op_nb_chunks))
             {
+                /*
+                 * If the operation is already (indirectly) waiting for us, or
+                 * will wait for us as soon as it wakes up, then just go on
+                 * (instead of producing a deadlock in the former case).
+                 */
+                if (op->waiting_for_op) {
+                    continue;
+                }
+
+                self->waiting_for_op = op;
                 qemu_co_queue_wait(&op->waiting_requests, NULL);
+                self->waiting_for_op = NULL;
                 break;
             }
         }
diff --git a/tests/qemu-iotests/151 b/tests/qemu-iotests/151
index ab46c5e8ba..93d14193d0 100755
--- a/tests/qemu-iotests/151
+++ b/tests/qemu-iotests/151
@@ -143,10 +143,6 @@ class TestActiveMirror(iotests.QMPTestCase):
         self.potential_writes_in_flight = False
 
     def testIntersectingActiveIO(self):
-        # FIXME: test-case is dead-locking. To reproduce dead-lock just drop
-        # this return statement
-        return
-
         # Fill the source image
         result = self.vm.hmp_qemu_io('source', 'write -P 1 0 2M')
 
@@ -180,18 +176,14 @@ class TestActiveMirror(iotests.QMPTestCase):
 
         # Now we resumed 1, so 2 and 3 goes to the next iteration of while loop
         # in mirror_wait_on_conflicts(). They don't exit, as bitmap is dirty
-        # due to request 4. And they start to wait: 2 wait for 3, 3 wait for 2
-        # - DEAD LOCK.
-        # Note that it's important that we add request 4 at last: requests are
-        # appended to the list, so we are sure that 4 is last in the list, so 2
-        # and 3 now waits for each other, not for 4.
+        # due to request 4.
+        # In the past at that point 2 and 3 would wait for each other producing
+        # a dead-lock. Now this is fixed and they will wait for request 4.
 
         self.vm.hmp_qemu_io('source', 'resume B')
 
-        # Resuming 4 doesn't help, 2 and 3 already dead-locked
-        # To check the dead-lock run:
-        #    gdb -p $(pidof qemu-system-x86_64) -ex 'set $job=(MirrorBlockJob *)jobs.lh_first' -ex 'p *$job->ops_in_flight.tqh_first' -ex 'p *$job->ops_in_flight.tqh_first->next.tqe_next'
-        # You'll see two MirrorOp objects waiting on each other
+        # After resuming 4, one of 2 and 3 goes first and set in_flight_bitmap,
+        # so the other will wait for it.
 
         result = self.vm.qmp('block-job-set-speed', device='mirror', speed=0)
         self.assert_qmp(result, 'return', {})
-- 
2.29.2



  parent reply	other threads:[~2021-07-02 21:21 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-02 21:16 [PATCH 0/3] Fix active mirror dead-lock Vladimir Sementsov-Ogievskiy
2021-07-02 21:16 ` [PATCH 1/3] block/mirror: set .co for active-write MirrorOp objects Vladimir Sementsov-Ogievskiy
2021-07-02 21:16 ` [PATCH 2/3] iotest 151: add test-case that shows active mirror dead-lock Vladimir Sementsov-Ogievskiy
2021-07-02 21:16 ` Vladimir Sementsov-Ogievskiy [this message]
2021-07-02 22:34 ` [PATCH 0/3] Fix " Vladimir Sementsov-Ogievskiy
2021-07-14 14:08 ` Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210702211636.228981-4-vsementsov@virtuozzo.com \
    --to=vsementsov@virtuozzo.com \
    --cc=dem@openvz.org \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.