From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:45223) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1c1KeP-0007xu-8B for qemu-devel@nongnu.org; Mon, 31 Oct 2016 18:01:54 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1c1KeO-0001Gl-0n for qemu-devel@nongnu.org; Mon, 31 Oct 2016 18:01:53 -0400 References: <1477928314-11184-1-git-send-email-famz@redhat.com> <1477928314-11184-14-git-send-email-famz@redhat.com> From: Eric Blake Message-ID: <43438fac-7c1c-443f-28fc-f49fd33fdfe0@redhat.com> Date: Mon, 31 Oct 2016 17:01:44 -0500 MIME-Version: 1.0 In-Reply-To: <1477928314-11184-14-git-send-email-famz@redhat.com> Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="BNOvgfjBVj3i6JBBMClPptQ1OLBlQJkvI" Subject: Re: [Qemu-devel] [PATCH 13/14] raw-posix: Implement image locking List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Fam Zheng , qemu-devel@nongnu.org Cc: Kevin Wolf , rjones@redhat.com, qemu-block@nongnu.org, Max Reitz This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --BNOvgfjBVj3i6JBBMClPptQ1OLBlQJkvI From: Eric Blake To: Fam Zheng , qemu-devel@nongnu.org Cc: Kevin Wolf , rjones@redhat.com, qemu-block@nongnu.org, Max Reitz Message-ID: <43438fac-7c1c-443f-28fc-f49fd33fdfe0@redhat.com> Subject: Re: [Qemu-devel] [PATCH 13/14] raw-posix: Implement image locking References: <1477928314-11184-1-git-send-email-famz@redhat.com> <1477928314-11184-14-git-send-email-famz@redhat.com> In-Reply-To: <1477928314-11184-14-git-send-email-famz@redhat.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable On 10/31/2016 10:38 AM, Fam Zheng wrote: > This implements open flag sensible image locking for local file > and host device protocol. >=20 > virtlockd in libvirt locks the first byte, so we start looking at the > file bytes from 1. What happens if we try to use a raw file with less than 3 bytes? There's not much to be locked in that case (especially if we round down to sector sizes - the file is effectively empty) - but it's probably a corner case you have to be aware of. >=20 > Quoting what was proposed by Kevin Wolf , there are > four locking modes by combining two bits (BDRV_O_RDWR and > BDRV_O_SHARE_RW), and implemented by taking two locks: >=20 > Lock bytes: >=20 > * byte 1: I can't allow other processes to write to the image > * byte 2: I am writing to the image >=20 > Lock modes: >=20 > * shared writer (BDRV_O_RDWR | BDRV_O_SHARE_RW): Take shared lock on > byte 2. Test whether byte 1 is locked using an exclusive lock, and > fail if so. >=20 > * exclusive writer (BDRV_O_RDWR only): Take shared lock on byte 2. Test= > whether byte 1 is locked using an exclusive lock, and fail if so. The= n > take shared lock on byte 1. I suppose this is racy, but we can > probably tolerate that. >=20 > * reader that can tolerate writers (BDRV_O_SHARE_RW only): Don't do any= thing >=20 > * reader that can't tolerate writers (neither bit is set): Take shared > lock on byte 1. Test whether byte 2 is locked, and fail if so. >=20 > The complication is in the transactional reopen. To make the reopen > logic managable, and allow better reuse, the code is internally s/managable/manageable/ > organized with a table from old mode to the new one. >=20 > Signed-off-by: Fam Zheng > --- > block/raw-posix.c | 710 ++++++++++++++++++++++++++++++++++++++++++++++= ++++---- > 1 file changed, 660 insertions(+), 50 deletions(-) >=20 > +typedef enum { > + /* Read only and accept other writers. */ > + RAW_L_READ_SHARE_RW, > + /* Read only and try to forbid other writers. */ > + RAW_L_READ, > + /* Read write and accept other writers. */ > + RAW_L_WRITE_SHARE_RW, > + /* Read write and try to forbit other writers. */ s/forbit/forbid/ > =20 > +static int raw_lock_fd(int fd, BDRVRawLockMode mode, Error **errp) > +{ > + int ret; > + assert(fd >=3D 0); > + /* Locking byte 1 avoids interfereing with virtlockd. */ s/interfereing/interfering/ > +/** > + * Transactionally moving between possible locking states is tricky an= d must be > + * done carefully. That is mostly because downgrading an exclusive loc= k to > + * shared or unlocked is not guaranteed to be revertable. As a result,= in such s/revertable/revertible/ > + * cases we have to defer the downgraing to "commit", given that no re= vert will s/downgraing/downgrading/ > + * happen after that point, and that downgrading a lock should never f= ail. > + * > + * On the other hand, upgrading a lock (e.g. from unlocked or shared t= o > + * exclusive lock) must happen in "prepare" because it may fail. > + * > + * Manage the operation matrix with this state transition table to mak= e > + * fulfulling above conditions easier. s/fulfulling/fulfilling/ > @@ -560,61 +1177,24 @@ static int raw_reopen_prepare(BDRVReopenState *s= tate, > =20 > raw_parse_flags(state->flags, &rs->open_flags); > =20 > - rs->fd =3D -1; > - > - int fcntl_flags =3D O_APPEND | O_NONBLOCK; > -#ifdef O_NOATIME > - fcntl_flags |=3D O_NOATIME; > -#endif > - > -#ifdef O_ASYNC > - /* Not all operating systems have O_ASYNC, and those that don't > - * will not let us track the state into rs->open_flags (typically > - * you achieve the same effect with an ioctl, for example I_SETSIG= > - * on Solaris). But we do not use O_ASYNC, so that's fine. > - */ > - assert((s->open_flags & O_ASYNC) =3D=3D 0); > -#endif It looks like you are doing some code motion (refactoring into a helper function) mixed in with everything else; it might be worth splitting that into a separate commit for ease of review. --=20 Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org --BNOvgfjBVj3i6JBBMClPptQ1OLBlQJkvI Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBCAAGBQJYF79IAAoJEKeha0olJ0NqETcH/ix2MgGorUKwZ49PT6CLi4Y0 v9OPPJfeJ6AIPApZl82HldN2qRDs2sEElkaH/HkR3emn8i/huJeXEcsU+h+YmDjP ERSu3V9dNt34YK2dL5IzTJCtb0DPes2TuZu/A5WMjGe9501rB2/cVM2H6vOzQNSS OGxC5uYyEVUci77xdmSRAcAmrRg9Gyddptqa/LzUqRZPRAGQbIzFXl1fPRfzNg8+ HwBvb746a0r1XW/gyX6oLW5RrBLw40sf7uU/n+oXoetV6x7za4i+eqhxvd/LEPQh o+Pxc/UzCjafy9gniKr2m4OHDIKlkBFAUzaBRewNocnXeohbgAr6B9EMaBLA74c= =Owks -----END PGP SIGNATURE----- --BNOvgfjBVj3i6JBBMClPptQ1OLBlQJkvI--