From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, kwolf@redhat.com, mreitz@redhat.com,
	eblake@redhat.com, vsementsov@virtuozzo.com, den@openvz.org,
	berrange@redhat.com, armbru@redhat.com
Subject: [PATCH v2] docs: document file-posix locking protocol
Date: Sat,  3 Jul 2021 16:50:33 +0300
Message-ID: <20210703135033.835344-1-vsementsov@virtuozzo.com>

Let's document how the file-posix driver uses file locks, so that
external programs can "communicate" with QEMU in this way.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---

v2: improve some descriptions
    add examples
    add notice about old bad POSIX file locks

 docs/system/qemu-block-drivers.rst.inc | 186 +++++++++++++++++++++++++
 1 file changed, 186 insertions(+)

diff --git a/docs/system/qemu-block-drivers.rst.inc b/docs/system/qemu-block-drivers.rst.inc
index 16225710eb..74fb71600d 100644
--- a/docs/system/qemu-block-drivers.rst.inc
+++ b/docs/system/qemu-block-drivers.rst.inc
@@ -909,3 +909,189 @@ some additional tasks, hooking io requests.
   .. option:: prealloc-size
 
     How much to preallocate (in bytes), default 128M.
+
+Image locking protocol
+~~~~~~~~~~~~~~~~~~~~~~
+
+QEMU holds only rd locks (F_RDLCK) and never wr locks. Instead, the GETLK fcntl
+is used with F_WRLCK to check permissions as described below.
+A QEMU process may rd-lock the following bytes of the image, with the
+corresponding meaning:
+
+Permission bytes. If a permission byte is rd-locked, it means that some process
+uses the corresponding permission on that file.
+
+Byte    Operation
+100     read
+          Lock holder can read
+101     write
+          Lock holder can write
+102     write-unchanged
+          Lock holder can write the same data back if it is sure that this
+          write doesn't break concurrent readers. This is mostly used
+          internally in QEMU, and external programs should not rely on it.
+103     resize
+          Lock holder can resize the file. The "write" permission is also
+          required for resizing, so lock byte 103 only if you also lock
+          byte 101.
+104     graph-mod
+          Undefined. QEMU may sometimes lock this byte, but external programs
+          should not. QEMU will stop locking this byte in the future.
+
+Unshare bytes. If an unshare byte is rd-locked, it means that some process
+does not allow others to use the corresponding permission on that file.
+
+Byte    Operation
+200     read
+          Lock holder doesn't allow other processes to read.
+201     write
+          Lock holder doesn't allow other processes to write. This still
+          allows others to do write-unchanged operations. External programs
+          should not rely on that distinction.
+202     write-unchanged
+          Lock holder doesn't allow other processes to do write-unchanged
+          operations.
+203     resize
+          Lock holder doesn't allow other processes to resize the file.
+204     graph-mod
+          Undefined. QEMU may sometimes lock this byte, but external programs
+          should not. QEMU will stop locking this byte in the future.
+
+Handling the permissions works as follows. Assume we want to open the file to
+do some operations and at the same time want to disallow some operations to
+other processes. So, we want to lock some of the bytes described above. We
+operate as follows:
+
+1. rd-lock all needed bytes, both "permission" bytes and "unshare" bytes.
+
+2. For each "unshare" byte we rd-locked, do a GETLK that "tries" to wr-lock the
+corresponding "permission" byte. This checks whether any other process
+uses the permission we want to unshare. If one does, we fail.
+
+3. For each "permission" byte we rd-locked, do a GETLK that "tries" to wr-lock
+the corresponding "unshare" byte. This checks whether any other process
+unshares the permission we want to have. If one does, we fail.
+
+Important notice: QEMU may fall back to POSIX file locks, but only if OFD locks
+are unavailable. Other programs should behave similarly: use POSIX file locks
+only if OFD locks are unavailable and if you are OK with the drawbacks of POSIX
+file locks (for example, they are lost on close() of any file descriptor
+for that file).
+
+Image locking examples
+~~~~~~~~~~~~~~~~~~~~~~
+
+Read-only, allow others to write
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+So, we want to read and don't care what other users do with the image. We only
+need to lock byte 100. Operation is as follows:
+
+1. rd-lock byte 100
+
+.. code-block:: c
+
+    struct flock fl = {
+        .l_whence = SEEK_SET,
+        .l_start  = 100,
+        .l_len    = 1,
+        .l_type   = F_RDLCK,
+    };
+    int ret = fcntl(fd, F_OFD_SETLK, &fl);
+    if (ret == -1) {
+        /* Error: the lock was not taken, see errno */
+    }
+
+2. Probe a wr-lock on byte 200 with GETLK, to check that nobody forbids our
+   read access
+
+.. code-block:: c
+
+    struct flock fl = {
+        .l_whence = SEEK_SET,
+        .l_start  = 200,
+        .l_len    = 1,
+        .l_type   = F_WRLCK,
+    };
+    int ret = fcntl(fd, F_OFD_GETLK, &fl);
+    if (ret != -1 && fl.l_type == F_UNLCK) {
+        /*
+         * Nobody holds a lock on byte 200, so nobody forbids reading.
+         * We now have the read-only access that we want.
+         */
+    } else {
+        /* Error, or read access is blocked by someone. We don't have access */
+    }
+
+3. Now we can read the data.
+
+4. When finished, release the lock:
+
+.. code-block:: c
+
+    struct flock fl = {
+        .l_whence = SEEK_SET,
+        .l_start  = 100,
+        .l_len    = 1,
+        .l_type   = F_UNLCK,
+    };
+    int ret = fcntl(fd, F_OFD_SETLK, &fl);
+
+RW, allow others to read only
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+We want to read and write, and don't want others to modify the image.
+So, let's lock bytes 100, 101, 201. Operation is as follows:
+
+1. rd-lock bytes 100 (read), 101 (write), 201 (don't allow others to write)
+
+.. code-block:: c
+
+    static const off_t bytes[] = { 100, 101, 201 };
+
+    for (size_t i = 0; i < sizeof(bytes) / sizeof(bytes[0]); i++) {
+        struct flock fl = {
+            .l_whence = SEEK_SET,
+            .l_start  = bytes[i],
+            .l_len    = 1,
+            .l_type   = F_RDLCK,
+        };
+        int ret = fcntl(fd, F_OFD_SETLK, &fl);
+        if (ret == -1) {
+            /* Error: the lock was not taken, see errno */
+        }
+    }
+
+2. Probe a wr-lock with GETLK on bytes 200 (to check that nobody forbids our
+   read access), 201 (nobody forbids our write access) and 101 (there are
+   currently no writers)
+
+.. code-block:: c
+
+    static const off_t probes[] = { 200, 201, 101 };
+
+    for (size_t i = 0; i < sizeof(probes) / sizeof(probes[0]); i++) {
+        struct flock fl = {
+            .l_whence = SEEK_SET,
+            .l_start  = probes[i],
+            .l_len    = 1,
+            .l_type   = F_WRLCK,
+        };
+        int ret = fcntl(fd, F_OFD_GETLK, &fl);
+        if (ret != -1 && fl.l_type == F_UNLCK) {
+            /* No other open file description holds a lock on this byte. */
+        } else {
+            /*
+             * Error, or the access we want is blocked by someone.
+             * We don't have access.
+             */
+        }
+    }
+
+3. Now we can read and write.
+
+4. When finished, release the locks:
+
+.. code-block:: c
+
+    static const off_t bytes[] = { 100, 101, 201 };
+
+    for (size_t i = 0; i < sizeof(bytes) / sizeof(bytes[0]); i++) {
+        struct flock fl = {
+            .l_whence = SEEK_SET,
+            .l_start  = bytes[i],
+            .l_len    = 1,
+            .l_type   = F_UNLCK,
+        };
+        fcntl(fd, F_OFD_SETLK, &fl);
+    }
-- 
2.29.2
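
As a supplement to the two examples in the patch, the three-step procedure from the "Image locking protocol" section can be generalized to arbitrary permission sets. The following is a minimal sketch under stated assumptions, not QEMU's implementation: the helper names (`lock_byte`, `byte_is_locked`, `image_lock`, `image_unlock`) and the `PERM_*` bit names are hypothetical. Only the byte offsets 100..104 and 200..204 come from the documented tables. It assumes Linux with OFD locks available and does not implement the POSIX lock fallback.

```c
#define _GNU_SOURCE             /* exposes F_OFD_* in glibc's <fcntl.h> */
#include <fcntl.h>
#include <unistd.h>

/*
 * Assumption: Linux generic command values from <asm-generic/fcntl.h>,
 * used only if the headers were pulled in without _GNU_SOURCE.
 */
#ifndef F_OFD_SETLK
#define F_OFD_GETLK 36
#define F_OFD_SETLK 37
#endif

#define PERM_BASE    100        /* "permission" bytes 100..104 */
#define UNSHARE_BASE 200        /* "unshare" bytes 200..204 */
#define NUM_PERMS    5

/* Hypothetical bit names: bit i maps to bytes PERM_BASE + i / UNSHARE_BASE + i */
enum {
    PERM_READ            = 1u << 0,
    PERM_WRITE           = 1u << 1,
    PERM_WRITE_UNCHANGED = 1u << 2,
    PERM_RESIZE          = 1u << 3,
};

/* Take (F_RDLCK) or drop (F_UNLCK) an OFD lock on one byte. */
static int lock_byte(int fd, off_t byte, short type)
{
    struct flock fl = {
        .l_whence = SEEK_SET, .l_start = byte, .l_len = 1, .l_type = type,
    };
    return fcntl(fd, F_OFD_SETLK, &fl);
}

/* 1 if another open file description locks the byte, 0 if not, -1 on error. */
static int byte_is_locked(int fd, off_t byte)
{
    struct flock fl = {
        .l_whence = SEEK_SET, .l_start = byte, .l_len = 1, .l_type = F_WRLCK,
    };
    if (fcntl(fd, F_OFD_GETLK, &fl) == -1) {
        return -1;
    }
    return fl.l_type != F_UNLCK;
}

/* Drop every protocol byte this open file description may hold. */
static void image_unlock(int fd)
{
    for (int i = 0; i < NUM_PERMS; i++) {
        lock_byte(fd, PERM_BASE + i, F_UNLCK);
        lock_byte(fd, UNSHARE_BASE + i, F_UNLCK);
    }
}

/*
 * perm: permissions we want to use; unshare: permissions we deny to others.
 * Returns 0 on success, -1 on failure (all our locks are released again).
 */
static int image_lock(int fd, unsigned perm, unsigned unshare)
{
    /* Step 1: rd-lock all needed permission and unshare bytes. */
    for (int i = 0; i < NUM_PERMS; i++) {
        if ((perm & (1u << i)) && lock_byte(fd, PERM_BASE + i, F_RDLCK) == -1) {
            goto fail;
        }
        if ((unshare & (1u << i)) &&
            lock_byte(fd, UNSHARE_BASE + i, F_RDLCK) == -1) {
            goto fail;
        }
    }
    /* Steps 2 and 3: probe the opposite byte for locks held by others. */
    for (int i = 0; i < NUM_PERMS; i++) {
        if ((unshare & (1u << i)) && byte_is_locked(fd, PERM_BASE + i) != 0) {
            goto fail;
        }
        if ((perm & (1u << i)) && byte_is_locked(fd, UNSHARE_BASE + i) != 0) {
            goto fail;
        }
    }
    return 0;
fail:
    image_unlock(fd);
    return -1;
}
```

With these helpers, the "RW, allow others to read only" example corresponds to `image_lock(fd, PERM_READ | PERM_WRITE, PERM_WRITE)`. The probes rely on OFD semantics: GETLK ignores locks held through the same open file description, so our own rd-locks on bytes 101 and 201 do not make the check fail.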
