From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:46844)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1bpIxF-0004rl-BU
 for qemu-devel@nongnu.org; Wed, 28 Sep 2016 13:47:38 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from )
 id 1bpIxD-0007lK-0u
 for qemu-devel@nongnu.org; Wed, 28 Sep 2016 13:47:36 -0400
References: <1474958276-7715-1-git-send-email-famz@redhat.com>
 <1474958276-7715-6-git-send-email-famz@redhat.com>
From: Max Reitz
Message-ID:
Date: Wed, 28 Sep 2016 19:47:25 +0200
MIME-Version: 1.0
In-Reply-To: <1474958276-7715-6-git-send-email-famz@redhat.com>
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature";
 boundary="0Xxpkbq3SSt7Ruv4J3ne2Vm46HhupU4M4"
Subject: Re: [Qemu-devel] [PATCH v2 5/5] block: keep AioContext pointer in BlockBackend
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Fam Zheng , qemu-devel@nongnu.org
Cc: qemu-block@nongnu.org, Kevin Wolf , qemu-stable@nongnu.org,
 stefanha@redhat.com, pbonzini@redhat.com

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--0Xxpkbq3SSt7Ruv4J3ne2Vm46HhupU4M4
From: Max Reitz
To: Fam Zheng , qemu-devel@nongnu.org
Cc: qemu-block@nongnu.org, Kevin Wolf , qemu-stable@nongnu.org,
 stefanha@redhat.com, pbonzini@redhat.com
Message-ID:
Subject: Re: [PATCH v2 5/5] block: keep AioContext pointer in BlockBackend
References: <1474958276-7715-1-git-send-email-famz@redhat.com>
 <1474958276-7715-6-git-send-email-famz@redhat.com>
In-Reply-To: <1474958276-7715-6-git-send-email-famz@redhat.com>
Content-Type: text/plain; charset=iso-8859-15
Content-Transfer-Encoding: quoted-printable

On 27.09.2016 08:37, Fam Zheng wrote:
> From: Stefan Hajnoczi
>
> blk_get/set_aio_context() delegate to BlockDriverState without storing
> the AioContext pointer in BlockBackend.
>
> There are two flaws:
>
> 1. BlockBackend falls back to the QEMU main loop AioContext when there
>    is no root BlockDriverState. This means the drive loses its
>    AioContext during media change and would break dataplane.
>
> 2. BlockBackend state used from multiple threads has no lock. Race
>    conditions will creep in as functionality is moved from
>    BlockDriverState to BlockBackend due to the absence of a lock. The
>    monitor cannot access BlockBackend state safely while an IOThread is
>    also accessing the state.
>
> Issue #1 can be triggered by "change" on virtio-scsi dataplane, causing
> an assertion failure (virtio-blk is fine because medium change is not
> possible). #2 may be possible with block accounting statistics in
> BlockBackend but I'm not aware of a crash that can be triggered.
>
> This patch stores the AioContext pointer in BlockBackend and puts newly
> inserted BlockDriverStates into the AioContext.
>
> Signed-off-by: Stefan Hajnoczi
> Signed-off-by: Fam Zheng
> ---
>  block/block-backend.c | 24 +++++++++++++++++-------
>  1 file changed, 17 insertions(+), 7 deletions(-)
>
> diff --git a/block/block-backend.c b/block/block-backend.c
> index b71babe..cda67cc 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -31,6 +31,7 @@ static AioContext *blk_aiocb_get_aio_context(BlockAIOCB *acb);
>  struct BlockBackend {
>      char *name;
>      int refcnt;
> +    AioContext *aio_context;
>      BdrvChild *root;
>      DriveInfo *legacy_dinfo;    /* null unless created by drive_new() */
>      QTAILQ_ENTRY(BlockBackend) link;         /* for block_backends */
> @@ -121,6 +122,7 @@ static BlockBackend *blk_new_with_ctx(AioContext *ctx)
>
>      blk = g_new0(BlockBackend, 1);
>      blk->refcnt = 1;
> +    blk->aio_context = ctx;
>      blk_set_enable_write_cache(blk, true);
>
>      qemu_co_queue_init(&blk->public.throttled_reqs[0]);
> @@ -510,6 +512,8 @@ void blk_remove_bs(BlockBackend *blk)
>  void blk_insert_bs(BlockBackend *blk, BlockDriverState *bs)
>  {
>      bdrv_ref(bs);
> +
> +    assert(blk->aio_context == bdrv_get_aio_context(bs));
>      blk->root = bdrv_root_attach_child(bs, "root", &child_root, blk);
>
>      notifier_list_notify(&blk->insert_bs_notifiers, blk);
> @@ -1413,13 +1417,7 @@ void blk_op_unblock_all(BlockBackend *blk, Error *reason)
>
>  AioContext *blk_get_aio_context(BlockBackend *blk)
>  {
> -    BlockDriverState *bs = blk_bs(blk);
> -
> -    if (bs) {
> -        return bdrv_get_aio_context(bs);
> -    } else {
> -        return qemu_get_aio_context();
> -    }
> +    return blk->aio_context;
>  }
>
>  static AioContext *blk_aiocb_get_aio_context(BlockAIOCB *acb)
> @@ -1432,7 +1430,19 @@ void blk_set_aio_context(BlockBackend *blk, AioContext *new_context)
>  {
>      BlockDriverState *bs = blk_bs(blk);
>
> +    blk->aio_context = new_context;
> +
>      if (bs) {
> +        AioContext *ctx = bdrv_get_aio_context(bs);
> +
> +        if (ctx == new_context) {
> +            return;
> +        }
> +        /* Moving context around happens when a block device is
> +         * enabling/disabling data plane, in which case we own the root BDS and
> +         * it cannot be associated with another AioContext. */
> +        assert(ctx == qemu_get_aio_context() ||
> +               new_context == qemu_get_aio_context());

I don't really see the point behind this assertion. I know it's not
currently possible, but you are basically asserting that we do not move
a BDS tree directly from some non-main-loop context to another
non-main-loop context, which in theory sounds completely fine to me.

Based on the "Write code for now and not for the future" rule, I'm fine
with this assertion if you can tell me what good it does us now.

The only thing I can personally imagine is that it's a safeguard that we
don't try to place a BDS tree into some other AioContext while having
ignored that there are still some other BBs attached to it which don't
want to agree on that new AioContext. But I think that should rather be
fixed before patch 2, i.e. as I said we need an infrastructure which can
tell us beforehand (and without failing assertions) whether we can move
a certain BDS tree to some other context.

So whether we can move a certain BB from some context to another depends
on what the frontend supports; I don't think there is a generic answer
we can implement here in the generic BB code. NBD for instance allows
any movement; but devices probably only allow movements they have
initiated themselves (e.g. dataplane will allow exactly what you
describe here with that assertion, and any other device will probably
not allow anything but the main loop).

Max

>          if (blk->public.throttle_state) {
>              throttle_timers_detach_aio_context(&blk->public.throttle_timers);
>          }
>

--0Xxpkbq3SSt7Ruv4J3ne2Vm46HhupU4M4
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQEvBAEBCAAZBQJX7AItEhxtcmVpdHpAcmVkaGF0LmNvbQAKCRD0B9sAYdXPQKwm
CACF0KCfTJd3msuTtMazhvduSyMwXTbQydPFDgwER2g+dgcbE1E7QTx5QRN8IIKg
dn0hnoqOdQhWVo3nUUedZfm182fX3lB4Ngk0nM4xXQ/2dRss7HRGMju+lWF3d44D
u7JmauIiA/KWKuDUljs3ZP86PG1NSn7afIh9ivTijX4lZnj8AsZ8YkiF9wZQe+N7
pzcVIt6uchO+c8rrCin2YYGv9UNrCBMr8bCWOYPrYQTuNll/Kd1VTtf9W5+u4sSH
wqgxLGENrXutk0X9FOqAxTWPqTjgE+aP+48NyHl00v1CXE2Rd54aVBTAorX20qen
cGlM2DUybogPON8ZAE3wi1ED
=vbxX
-----END PGP SIGNATURE-----

--0Xxpkbq3SSt7Ruv4J3ne2Vm46HhupU4M4--
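For illustration only, the "ask beforehand, without failing assertions"
idea from the review could look roughly like the standalone sketch
below. Every name in it (SketchCtx, SketchCanMoveFn, the sketch_policy_*
functions, sketch_tree_can_move) is hypothetical and not part of QEMU;
an AioContext is modelled as an opaque object compared by pointer, and
each attached frontend is modelled as a single callback. The three
policies mirror the cases named in the review: an NBD-like frontend that
accepts any move, a dataplane-like frontend that accepts exactly what
the assertion in the patch permits, and a default device that only
accepts the main loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an AioContext; only pointer identity matters. */
typedef struct SketchCtx { int id; } SketchCtx;

/* Hypothetical per-frontend policy: may the tree move from old_ctx to new_ctx? */
typedef bool (*SketchCanMoveFn)(const SketchCtx *main_ctx,
                                const SketchCtx *old_ctx,
                                const SketchCtx *new_ctx);

/* NBD-like frontend: any movement is acceptable. */
bool sketch_policy_any(const SketchCtx *main_ctx,
                       const SketchCtx *old_ctx,
                       const SketchCtx *new_ctx)
{
    (void)main_ctx;
    (void)old_ctx;
    (void)new_ctx;
    return true;
}

/* Dataplane-like frontend: one side of the move must be the main loop,
 * which is the condition the assertion in the patch encodes. */
bool sketch_policy_via_main(const SketchCtx *main_ctx,
                            const SketchCtx *old_ctx,
                            const SketchCtx *new_ctx)
{
    return old_ctx == main_ctx || new_ctx == main_ctx;
}

/* Default device: only the main loop is an acceptable target. */
bool sketch_policy_main_only(const SketchCtx *main_ctx,
                             const SketchCtx *old_ctx,
                             const SketchCtx *new_ctx)
{
    (void)old_ctx;
    return new_ctx == main_ctx;
}

/* Query every attached frontend first; report false instead of aborting
 * if any one of them refuses the move. */
bool sketch_tree_can_move(const SketchCanMoveFn *policies, size_t n,
                          const SketchCtx *main_ctx,
                          const SketchCtx *old_ctx,
                          const SketchCtx *new_ctx)
{
    assert(policies != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!policies[i](main_ctx, old_ctx, new_ctx)) {
            return false;
        }
    }
    return true;
}
```

A caller would run sketch_tree_can_move() over all frontends attached to
the tree and only then perform the actual move, so a refusal becomes an
ordinary error path rather than an assertion failure.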