All of lore.kernel.org
 help / color / mirror / Atom feed
From: Fam Zheng <famz@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: qemu-devel@nongnu.org, Kevin Wolf <kwolf@redhat.com>,
	rjones@redhat.com, qemu-block@nongnu.org,
	Max Reitz <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 13/14] raw-posix: Implement image locking
Date: Tue, 1 Nov 2016 10:06:46 +0800	[thread overview]
Message-ID: <20161101020646.GA19068@lemon> (raw)
In-Reply-To: <43438fac-7c1c-443f-28fc-f49fd33fdfe0@redhat.com>

On Mon, 10/31 17:01, Eric Blake wrote:
> On 10/31/2016 10:38 AM, Fam Zheng wrote:
> > This implements open flag sensible image locking for local file
> > and host device protocol.
> > 
> > virtlockd in libvirt locks the first byte, so we start looking at the
> > file bytes from 1.
> 
> What happens if we try to use a raw file with less than 3 bytes? There's
> not much to be locked in that case (especially if we round down to
> sector sizes - the file is effectively empty) - but it's probably a
> corner case you have to be aware of.

Yes.  Not sure if it is complete (and always true) but patch 14 covers a 0 byte
test case and it seems the lock just works the same a a large file.

So I look at the kernel implementation of fcntl locks, to see if it limits lock
range to file size, but I don't see any.

I also manually tested with "touch /var/tmp/zerofile && qemu-io
/var/tmp/zerofile" and "lslocks", it indeed report the locked bytes tough the
file is 0 byte.

As another example, /dev/null is also lockable by this series, that's why I have
to add a patch to change it to null-co://.

So, I think where a zero byte file cannot be locked, if any, is a corner case.

> 
> > 
> > Quoting what was proposed by Kevin Wolf <kwolf@redhat.com>, there are
> > four locking modes by combining two bits (BDRV_O_RDWR and
> > BDRV_O_SHARE_RW), and implemented by taking two locks:
> > 
> > Lock bytes:
> > 
> > * byte 1: I can't allow other processes to write to the image
> > * byte 2: I am writing to the image
> > 
> > Lock modes:
> > 
> > * shared writer (BDRV_O_RDWR | BDRV_O_SHARE_RW): Take shared lock on
> >   byte 2. Test whether byte 1 is locked using an exclusive lock, and
> >   fail if so.
> > 
> > * exclusive writer (BDRV_O_RDWR only): Take shared lock on byte 2. Test
> >   whether byte 1 is locked using an exclusive lock, and fail if so. Then
> >   take shared lock on byte 1. I suppose this is racy, but we can
> >   probably tolerate that.
> > 
> > * reader that can tolerate writers (BDRV_O_SHARE_RW only): Don't do anything
> > 
> > * reader that can't tolerate writers (neither bit is set): Take shared
> >   lock on byte 1. Test whether byte 2 is locked, and fail if so.
> > 
> > The complication is in the transactional reopen.  To make the reopen
> > logic managable, and allow better reuse, the code is internally
> 
> s/managable/manageable/
> 
> > organized with a table from old mode to the new one.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  block/raw-posix.c | 710 ++++++++++++++++++++++++++++++++++++++++++++++++++----
> >  1 file changed, 660 insertions(+), 50 deletions(-)
> > 
> 
> > +typedef enum {
> > +    /* Read only and accept other writers. */
> > +    RAW_L_READ_SHARE_RW,
> > +    /* Read only and try to forbid other writers. */
> > +    RAW_L_READ,
> > +    /* Read write and accept other writers. */
> > +    RAW_L_WRITE_SHARE_RW,
> > +    /* Read write and try to forbit other writers. */
> 
> s/forbit/forbid/
> 
> 
> >  
> > +static int raw_lock_fd(int fd, BDRVRawLockMode mode, Error **errp)
> > +{
> > +    int ret;
> > +    assert(fd >= 0);
> > +    /* Locking byte 1 avoids interfereing with virtlockd. */
> 
> s/interfereing/interfering/
> 
> 
> > +/**
> > + * Transactionally moving between possible locking states is tricky and must be
> > + * done carefully. That is mostly because downgrading an exclusive lock to
> > + * shared or unlocked is not guaranteed to be revertable. As a result, in such
> 
> s/revertable/revertible/
> 
> > + * cases we have to defer the downgraing to "commit", given that no revert will
> 
> s/downgraing/downgrading/
> 
> > + * happen after that point, and that downgrading a lock should never fail.
> > + *
> > + * On the other hand, upgrading a lock (e.g. from unlocked or shared to
> > + * exclusive lock) must happen in "prepare" because it may fail.
> > + *
> > + * Manage the operation matrix with this state transition table to make
> > + * fulfulling above conditions easier.
> 
> s/fulfulling/fulfilling/

Thanks, will fix these typos and misspells.

> 
> 
> > @@ -560,61 +1177,24 @@ static int raw_reopen_prepare(BDRVReopenState *state,
> >  
> >      raw_parse_flags(state->flags, &rs->open_flags);
> >  
> > -    rs->fd = -1;
> > -
> > -    int fcntl_flags = O_APPEND | O_NONBLOCK;
> > -#ifdef O_NOATIME
> > -    fcntl_flags |= O_NOATIME;
> > -#endif
> > -
> > -#ifdef O_ASYNC
> > -    /* Not all operating systems have O_ASYNC, and those that don't
> > -     * will not let us track the state into rs->open_flags (typically
> > -     * you achieve the same effect with an ioctl, for example I_SETSIG
> > -     * on Solaris). But we do not use O_ASYNC, so that's fine.
> > -     */
> > -    assert((s->open_flags & O_ASYNC) == 0);
> > -#endif
> 
> It looks like you are doing some code motion (refactoring into a helper
> function) mixed in with everything else; it might be worth splitting
> that into a separate commit for ease of review.

Yes, good idea.

Fam

  parent reply	other threads:[~2016-11-01  2:06 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-31 15:38 [Qemu-devel] [PATCH 00/14] block: Image locking series Fam Zheng
2016-10-31 15:38 ` [Qemu-devel] [PATCH 01/14] osdep: Add qemu_lock_fd and qemu_unlock_fd Fam Zheng
2016-12-02  0:30   ` Max Reitz
2016-12-08  6:53     ` Fam Zheng
2016-10-31 15:38 ` [Qemu-devel] [PATCH 02/14] block: Define BDRV_O_SHARE_RW Fam Zheng
2016-10-31 15:38 ` [Qemu-devel] [PATCH 03/14] qemu-io: Set "share-rw" flag together with read-only Fam Zheng
2016-10-31 15:38 ` [Qemu-devel] [PATCH 04/14] qemu-img: Set "share-rw" flag in read-only commands Fam Zheng
2016-12-02  0:52   ` Max Reitz
2016-12-08  7:19     ` Fam Zheng
2016-10-31 15:38 ` [Qemu-devel] [PATCH 05/14] block: Set "share-rw" flag in drive-backup when sync=none Fam Zheng
2016-12-02  1:01   ` Max Reitz
2016-10-31 15:38 ` [Qemu-devel] [PATCH 06/14] block: Set "share-rw" flag for incoming migration Fam Zheng
2016-12-02  1:22   ` Max Reitz
2016-10-31 15:38 ` [Qemu-devel] [PATCH 07/14] iotests: 055: Don't attach the drive to vm for drive-backup Fam Zheng
2016-10-31 15:38 ` [Qemu-devel] [PATCH 08/14] iotests: 030: Read-only open image for getting map Fam Zheng
2016-10-31 15:38 ` [Qemu-devel] [PATCH 09/14] iotests: 087: Don't attch test image twice Fam Zheng
2016-10-31 15:38 ` [Qemu-devel] [PATCH 10/14] iotests: 085: Avoid image locking conflict Fam Zheng
2016-10-31 15:38 ` [Qemu-devel] [PATCH 11/14] iotests: 091: Quit QEMU before checking image Fam Zheng
2016-10-31 15:38 ` [Qemu-devel] [PATCH 12/14] tests: Use null-co:// instead of /dev/null as the dummy image Fam Zheng
2016-10-31 15:38 ` [Qemu-devel] [PATCH 13/14] raw-posix: Implement image locking Fam Zheng
2016-10-31 22:01   ` Eric Blake
2016-10-31 22:39     ` Richard W.M. Jones
2016-11-01  2:06     ` Fam Zheng [this message]
2016-12-02  2:58   ` Max Reitz
2017-01-18 10:48     ` Fam Zheng
2017-01-18 13:02       ` Max Reitz
2017-01-18 13:19         ` Fam Zheng
2016-12-02 16:13   ` Max Reitz
2016-10-31 15:38 ` [Qemu-devel] [PATCH 14/14] tests: Add test-image-lock Fam Zheng
2016-12-02 16:30   ` Max Reitz
2016-12-09  7:39     ` Fam Zheng
2016-12-02  3:10 ` [Qemu-devel] [PATCH 00/14] block: Image locking series Max Reitz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161101020646.GA19068@lemon \
    --to=famz@redhat.com \
    --cc=eblake@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rjones@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.