From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:55519)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1fYsiD-0006wa-5s for qemu-devel@nongnu.org;
	Fri, 29 Jun 2018 08:41:18 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1fYsiC-0003eu-6R for qemu-devel@nongnu.org;
	Fri, 29 Jun 2018 08:41:17 -0400
From: Denis Plotnikov
Date: Fri, 29 Jun 2018 15:40:50 +0300
Message-Id: <20180629124052.331406-1-dplotnikov@virtuozzo.com>
Subject: [Qemu-devel] [PATCH v0 0/2] Postponed actions
To: kwolf@redhat.com, reitz@redhat.com, stefanha@redhat.com,
	famz@redhat.com, qemu-stable@nongnu.org
Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org

There are cases where a request reaches a block driver state (BDS) when
it should not, producing dangerous race conditions. This misbehaviour
usually happens with storage devices, such as IDE, that are emulated
without eventfd for guest-to-host notifications.

The issue arises when the context is in a "drained" section and does not
expect any request, but a request arrives from a device that does not use
an iothread and whose context is processed by the main loop. Unlike the
iothread event loop, the main loop is not blocked by the "drained"
section. A request that arrives and is processed during the "drained"
section can break the consistency of the block driver state.

This behavior can be observed in the following KVM-based case:

1. Set up a VM with an IDE disk.
2. Inside the VM, start a disk write load on the IDE device, e.g.:
     dd if= of= bs=X count=Y oflag=direct
3. On the host, create a mirroring block job for the IDE device, e.g.:
     drive_mirror
4. On the host, finish the block job, e.g.:
     block_job_complete

After step 4, you may hit the following assertion in mirror_run:

     assert(QLIST_EMPTY(&bs->tracked_requests))

On my setup, the assertion triggers in about one run out of three.

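The race described above can be modelled in a few lines of standalone C.
This is only a toy sketch under stated assumptions, not QEMU code and not
the API added by this series: ToyBDS, toy_submit, toy_drained_begin,
toy_drained_end and toy_scenario are hypothetical names, and a plain
counter stands in for bs->tracked_requests. It shows a request that
arrives during a drained section either being tracked immediately or
being queued and started once the section ends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_POSTPONED 16

/* Hypothetical stand-in for a block driver state; not a QEMU type. */
typedef struct ToyBDS {
    int drained;                      /* nesting depth of drained sections */
    int tracked;                      /* stands in for bs->tracked_requests */
    int postponed[TOY_MAX_POSTPONED]; /* request ids waiting for drain end */
    size_t n_postponed;
} ToyBDS;

/* Start a request: from now on it counts as tracked. */
void toy_start(ToyBDS *bs, int id)
{
    (void)id; /* the id only matters for the postponed queue */
    bs->tracked++;
}

/*
 * Submit a request from a device served by the main loop.
 * postpone == false: the request starts even while drained (the bug).
 * postpone == true:  the request is queued until the drained section ends.
 * Returns false only if the postponed queue is full.
 */
bool toy_submit(ToyBDS *bs, int id, bool postpone)
{
    if (postpone && bs->drained > 0) {
        if (bs->n_postponed == TOY_MAX_POSTPONED) {
            return false;
        }
        bs->postponed[bs->n_postponed++] = id;
        return true;
    }
    toy_start(bs, id);
    return true;
}

void toy_drained_begin(ToyBDS *bs)
{
    bs->drained++;
}

/* Leave a drained section; the outermost exit starts postponed requests. */
void toy_drained_end(ToyBDS *bs)
{
    assert(bs->drained > 0);
    if (--bs->drained > 0) {
        return; /* still inside an outer drained section */
    }
    for (size_t i = 0; i < bs->n_postponed; i++) {
        toy_start(bs, bs->postponed[i]);
    }
    bs->n_postponed = 0;
}

/* Returns how many requests were tracked while the section was drained. */
int toy_scenario(bool postpone)
{
    ToyBDS bs = {0};
    int seen;

    toy_drained_begin(&bs);
    toy_submit(&bs, 1, postpone);
    seen = bs.tracked;       /* what an emptiness check would observe here */
    toy_drained_end(&bs);
    assert(bs.tracked == 1); /* the request is started in both variants */
    return seen;
}
```

In this model, toy_scenario(false) returns 1, which corresponds to a
non-empty tracked list inside the drained section, while
toy_scenario(true) returns 0 because the request is deferred until
toy_drained_end().
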
This patch series introduces a mechanism that postpones requests from
devices not using iothreads until the BDS leaves the "drained" section.
It also modifies the asynchronous block backend infrastructure to use
that mechanism, which fixes the assertion failure for IDE devices.

Denis Plotnikov (2):
  async: add infrastructure for postponed actions
  block: postpone the coroutine executing if the BDS's is drained

 block/block-backend.c | 58 ++++++++++++++++++++++++++++++---------
 include/block/aio.h   | 63 +++++++++++++++++++++++++++++++++++++++++++
 util/async.c          | 33 +++++++++++++++++++++++
 3 files changed, 142 insertions(+), 12 deletions(-)

-- 
2.17.0