From: Randy Dunlap <rdunlap@infradead.org>
To: Adrian Catangiu <acatan@amazon.com>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
qemu-devel@nongnu.org, kvm@vger.kernel.org,
linux-s390@vger.kernel.org
Cc: Jason@zx2c4.com, dgunigun@redhat.com, mst@redhat.com,
ghammer@redhat.com, vijaysun@ca.ibm.com, 0x7f454c46@gmail.com,
mhocko@kernel.org, oridgar@gmail.com, avagin@gmail.com,
pavel@ucw.cz, ptikhomirov@virtuozzo.com, corbet@lwn.net,
mpe@ellerman.id.au, rafael@kernel.org, ebiggers@kernel.org,
borntraeger@de.ibm.com, sblbir@amazon.com, bonzini@gnu.org,
arnd@arndb.de, jannh@google.com, raduweis@amazon.com,
asmehra@redhat.com, graf@amazon.com, rppt@kernel.org,
luto@kernel.org, gil@azul.com, colmmacc@amazon.com,
tytso@mit.edu, gregkh@linuxfoundation.org, areber@redhat.com,
ebiederm@xmission.com, ovzxemul@gmail.com, w@1wt.eu,
dwmw@amazon.co.uk
Subject: Re: [PATCH v5 1/2] drivers/misc: sysgenid: add system generation id driver
Date: Tue, 2 Feb 2021 14:58:02 -0800
Message-ID: <5290f6f5-396f-aa47-3b74-8d50c2434a04@infradead.org>
In-Reply-To: <1612200294-17561-2-git-send-email-acatan@amazon.com>
Hi--
On 2/1/21 9:24 AM, Adrian Catangiu wrote:
> - Background and problem
>
> The System Generation ID feature is required in virtualized or
> containerized environments by applications that work with local copies
> or caches of world-unique data such as random values, uuids,
> monotonically increasing counters, etc.
... if those applications want to comply with <some MS spec>.
> Such applications can be negatively affected by VM or container
> snapshotting when the VM or container is either cloned or returned to
> an earlier point in time.
> Signed-off-by: Adrian Catangiu <acatan@amazon.com>
> ---
> Documentation/misc-devices/sysgenid.rst | 236 ++++++++++++++++
> Documentation/userspace-api/ioctl/ioctl-number.rst | 1 +
> MAINTAINERS | 8 +
> drivers/misc/Kconfig | 16 ++
> drivers/misc/Makefile | 1 +
> drivers/misc/sysgenid.c | 307 +++++++++++++++++++++
> include/uapi/linux/sysgenid.h | 17 ++
> 7 files changed, 586 insertions(+)
> create mode 100644 Documentation/misc-devices/sysgenid.rst
> create mode 100644 drivers/misc/sysgenid.c
> create mode 100644 include/uapi/linux/sysgenid.h
>
> diff --git a/Documentation/misc-devices/sysgenid.rst b/Documentation/misc-devices/sysgenid.rst
> new file mode 100644
> index 0000000..4337ca0
> --- /dev/null
> +++ b/Documentation/misc-devices/sysgenid.rst
> @@ -0,0 +1,236 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +========
> +SYSGENID
> +========
> +
> +The System Generation ID feature is required in virtualized or
> +containerized environments by applications that work with local copies
> +or caches of world-unique data such as random values, UUIDs,
> +monotonically increasing counters, etc.
> +Such applications can be negatively affected by VM or container
> +snapshotting when the VM or container is either cloned or returned to
> +an earlier point in time.
> +
> +The System Generation ID is a simple concept meant to alleviate the
> +issue by providing a monotonically increasing counter that changes
> +each time the VM or container is restored from a snapshot.
> +The driver for it lives at ``drivers/misc/sysgenid.c``.
> +
> +The ``sysgenid`` driver exposes a monotonic incremental System
> +Generation u32 counter via a char-dev FS interface accessible through
s/FS/filesystem/
> +``/dev/sysgenid`` that provides sync and async SysGen counter update
> +notifications. It also provides SysGen counter retrieval and
> +confirmation mechanisms.
> +
> +The counter starts from zero when the driver is initialized and
> +monotonically increments every time the system generation changes.
> +
> +The ``sysgenid`` driver exports the ``void sysgenid_bump_generation()``
> +symbol which can be used by backend drivers to drive system generation
> +changes based on hardware events.
> +System generation changes can also be driven by userspace software
> +through a dedicated driver ioctl.
> +
> +Userspace applications or libraries can (a)synchronously consume the
> +system generation counter through the provided FS interface, to make
s/FS/filesystem/
> +any necessary internal adjustments following a system generation update.
> +
> +Driver FS interface:
> +
> +``open()``:
> + When the device is opened, a copy of the current Sys-Gen-Id (counter)
> + is associated with the open file descriptor. The driver now tracks
> + this file as an independent *watcher*. The driver tracks how many
> + watchers are aware of the latest Sys-Gen-Id counter and how many of
> + them are *outdated*; outdated being those that have lived through
> + a Sys-Gen-Id change but not yet confirmed the new generation counter.
> +
> +``read()``:
> + Read is meant to provide the *new* system generation counter when a
> + generation change takes place. The read operation blocks until the
> + associated counter is no longer up to date, at which point the new
> + counter is provided/returned.
> + Nonblocking ``read()`` uses ``EAGAIN`` to signal that there is no
> + *new* counter value available. The generation counter is considered
> + *new* for each open file descriptor that hasn't confirmed the new
> + value following a generation change. Therefore, once a generation
> + change takes place, all ``read()`` calls will immediately return the
> + new generation counter and will continue to do so until the
> + new value is confirmed back to the driver through ``write()``.
> + Partial reads are not allowed - read buffer needs to be at least
> + 32 bits in size.
> +
> +``write()``:
> + Write is used to confirm the up-to-date Sys Gen counter back to the
> + driver.
> + Following a VM generation change, all existing watchers are marked
> + as *outdated*. Each file descriptor will maintain the *outdated*
> + status until a ``write()`` confirms the up-to-date counter back to
> + the driver.
> + Partial writes are not allowed - write buffer should be exactly
> + 32 bits in size.
> +
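To make the open()/read()/write() rules above concrete, here is a small
userspace model of the per-watcher state. It is only a sketch of the
semantics as the text describes them, not the driver code. The names
struct watcher, watcher_open(), watcher_outdated(), watcher_read() and
watcher_write() are made up for illustration, and rejecting a stale
confirmation with -EINVAL is an assumption, since the text does not say
what happens in that case.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one open file descriptor ("watcher"). */
struct watcher {
	uint32_t acked_gen;	/* last generation this watcher confirmed */
};

/* open(): the watcher starts out with a copy of the current counter. */
static void watcher_open(struct watcher *w, uint32_t current_gen)
{
	w->acked_gen = current_gen;
}

/* A watcher is outdated until it confirms the current counter. */
static bool watcher_outdated(const struct watcher *w, uint32_t current_gen)
{
	return w->acked_gen != current_gen;
}

/*
 * Nonblocking read(): returns 0 and stores the new counter if the
 * watcher is outdated, -EAGAIN if there is nothing new to report.
 * Reading does not confirm anything, so it keeps succeeding until
 * watcher_write() is called.
 */
static int watcher_read(const struct watcher *w, uint32_t current_gen,
			uint32_t *out)
{
	if (!watcher_outdated(w, current_gen))
		return -EAGAIN;
	*out = current_gen;
	return 0;
}

/*
 * write(): confirms a counter value back.
 * Assumption: a value other than the current counter is rejected
 * with -EINVAL.
 */
static int watcher_write(struct watcher *w, uint32_t current_gen,
			 uint32_t confirmed)
{
	if (confirmed != current_gen)
		return -EINVAL;
	w->acked_gen = confirmed;
	return 0;
}
```

In this model a generation change is just a new current_gen value:
every watcher whose acked_gen differs becomes outdated at once, which
matches "all existing watchers are marked as outdated".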
> +``poll()``:
> + Poll is implemented to allow polling for generation counter updates.
> + Such updates result in ``EPOLLIN`` polling status until the new
> + up-to-date counter is confirmed back to the driver through a
> + ``write()``.
> +
> +``ioctl()``:
> + The driver also adds support for waiting on open file descriptors
> + that haven't acknowledged a generation counter update, as well as a
> + mechanism for userspace to *force* a generation update:
> +
> + - SYSGENID_WAIT_WATCHERS: blocks until there are no more *outdated*
> + watchers, or if a ``timeout`` argument is provided, until the
> + timeout expires.
> + If the current caller is *outdated* or a generation change happens
> + while waiting (thus making current caller *outdated*), the ioctl
> + returns ``-EINTR`` to signal the user to handle event and retry.
> + - SYSGENID_FORCE_GEN_UPDATE: forces a generation counter increment.
> + It takes a ``minimum-generation`` argument which represents the
> + minimum value the generation counter will be incremented to. For
will be set to. For
It's not so much an increment as it is a "set to this value or higher".
> + example if current generation is ``5`` and ``SYSGENID_FORCE_GEN_UPDATE(8)``
> + is called, the generation counter will increment to ``8``.
> + This IOCTL can only be used by processes with CAP_CHECKPOINT_RESTORE
> + or CAP_SYS_ADMIN capabilities.
> +
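To illustrate the "set to this value or higher" reading of
SYSGENID_FORCE_GEN_UPDATE, here is a tiny model of the resulting
counter value. It is a sketch under an assumption the text does not
confirm: that a minimum-generation at or below the current value still
bumps the counter by one. forced_generation() is a hypothetical name,
and u32 wraparound is ignored.

```c
#include <stdint.h>

/*
 * Hypothetical model of SYSGENID_FORCE_GEN_UPDATE(min_gen): the
 * counter always moves forward and ends up at min_gen or higher.
 * Assumption: min_gen <= current still results in current + 1.
 */
static uint32_t forced_generation(uint32_t current, uint32_t min_gen)
{
	uint32_t next = current + 1;

	return min_gen > next ? min_gen : next;
}
```

With the numbers from the quoted example, forced_generation(5, 8)
yields 8.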
> +``mmap()``:
> + The driver supports ``PROT_READ, MAP_SHARED`` mmaps of a single page
> + in size. The first 4 bytes of the mapped page will contain an
> + up-to-date u32 copy of the system generation counter.
> + The mapped memory can be used as a low-latency generation counter
> + probe mechanism in critical sections - see examples.
> +
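The low-latency probe that the mmap() text describes might look like
the following in userspace. This is a sketch only: sysgen_begin() and
sysgen_retry() are hypothetical helper names, mapped is assumed to
point at the start of a PROT_READ, MAP_SHARED mapping of the device,
and memory-ordering concerns are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Snapshot the counter from the first 4 bytes of the mapped page. */
static uint32_t sysgen_begin(const volatile uint32_t *mapped)
{
	return *mapped;
}

/*
 * Returns true if the generation changed since the snapshot, meaning
 * the work done in between must be thrown away and redone.
 */
static bool sysgen_retry(const volatile uint32_t *mapped, uint32_t snapshot)
{
	return *mapped != snapshot;
}
```

A critical section would then be wrapped in a loop of the form
"do { gen = sysgen_begin(map); ...; } while (sysgen_retry(map, gen));",
similar in spirit to a sequence-counter read loop, with no system call
on the fast path.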
> +``close()``:
> + Removes the file descriptor as a system generation counter *watcher*.
> +
> +Example application workflows
> +-----------------------------
> +
[snip]
--
~Randy