QEMU-Devel Archive on lore.kernel.org
From: "Catangiu, Adrian Costin" <acatan@amazon.com>
To: "linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"virtualization@lists.linux-foundation.org"
	<virtualization@lists.linux-foundation.org>
Cc: "Graf (AWS), Alexander" <graf@amazon.de>,
	"MacCarthaigh, Colm" <colmmacc@amazon.com>,
	"Woodhouse, David" <dwmw@amazon.co.uk>,
	"bonzini@gnu.org" <bonzini@gnu.org>,
	"Singh, Balbir" <sblbir@amazon.com>,
	"Weiss, Radu" <raduweis@amazon.com>,
	"oridgar@gmail.com" <oridgar@gmail.com>,
	 "ghammer@redhat.com" <ghammer@redhat.com>,
	"corbet@lwn.net" <corbet@lwn.net>,
	 "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"mst@redhat.com" <mst@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Jann Horn <jannh@google.com>, Michal Hocko <mhocko@kernel.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Pavel Machek <pavel@ucw.cz>
Subject: Re: [PATCH] drivers/virt: vmgenid: add vm generation id driver
Date: Fri, 16 Oct 2020 15:00:12 +0000
Message-ID: <A8D0AD0B-CA9F-4667-B50B-9C1805BBE2C2@amazon.com> (raw)
In-Reply-To: <788878CE-2578-4991-A5A6-669DCABAC2F2@amazon.com>

Sorry, I forgot to add a few people interested in this and the KVM ML to CC. Added them.

On 16/10/2020, 17:33, "Catangiu, Adrian Costin" <acatan@amazon.com> wrote:

    - Background
    
    The VM Generation ID is a feature defined by Microsoft (paper:
    http://go.microsoft.com/fwlink/?LinkId=260709) and supported by
    multiple hypervisor vendors.
    
    The feature is required in virtualized environments by apps that work
    with local copies/caches of world-unique data such as random values,
    uuids, monotonically increasing counters, etc.
    Such apps can be negatively affected by VM snapshotting when the VM
    is either cloned or returned to an earlier point in time.
    
    The VM Generation ID is a simple concept meant to alleviate the issue
    by providing a unique ID that changes each time the VM is restored
    from a snapshot. The hw provided UUID value can be used to
    differentiate between VMs or different generations of the same VM.
    
    - Problem
    
    The VM Generation ID is exposed through an ACPI device by multiple
    hypervisor vendors, but neither the vendors nor upstream Linux
    provide a default driver for it, leaving users to fend for themselves.
    
    Furthermore, simply finding out about a VM generation change is only
    the starting point of a process to renew internal states of possibly
    multiple applications across the system. This process could benefit
    from a driver that provides an interface through which orchestration
    can be easily done.
    
    - Solution
    
    This patch is a driver which exposes the Virtual Machine Generation ID
    via a char-dev FS interface that provides ID update sync and async
    notification, retrieval and confirmation mechanisms:
    
    When the device is 'open()'ed a copy of the current vm UUID is
    associated with the file handle. 'read()' operations block until the
    associated UUID is no longer up to date - until HW vm gen id changes -
    at which point the new UUID is provided/returned. Nonblocking 'read()'
    uses EAGAIN to signal that there is no _new_ UUID available.
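
To illustrate the read() contract described above, here is a minimal userspace sketch of a helper that fetches one whole UUID. The name read_uuid() and the UUID_LEN constant are hypothetical and not part of the patch; the helper only assumes that a successful read returns exactly 16 bytes and that a nonblocking read with nothing new fails with EAGAIN (the same value as EWOULDBLOCK on Linux).

```c
#include <errno.h>
#include <unistd.h>

#define UUID_LEN 16	/* sizeof(uuid_t) in the driver; assumed here */

/*
 * Read one whole UUID from fd, retrying if a signal interrupts the call.
 * Returns 0 on success, -1 with errno set otherwise (for example EAGAIN
 * on a nonblocking fd with nothing new, EIO on an unexpected short read).
 */
int read_uuid(int fd, unsigned char uuid[UUID_LEN])
{
	ssize_t n;

	do {
		n = read(fd, uuid, UUID_LEN);
	} while (n < 0 && errno == EINTR);

	if (n < 0)
		return -1;
	if (n != UUID_LEN) {
		errno = EIO;
		return -1;
	}
	return 0;
}
```

A caller would pass the fd obtained from open() on the vmgenid device; the helper itself works on any readable fd, which keeps it easy to exercise without the driver.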
    
    'poll()' is implemented to allow polling for UUID updates. Such
    updates result in 'EPOLLIN' events.
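
For callers that only watch a single descriptor, plain poll(2) may be enough. The sketch below uses a hypothetical helper name, wait_for_pollin(), and assumes nothing about the driver beyond what is stated above: an unconfirmed generation change shows up as a readable event.

```c
#include <poll.h>

/*
 * Wait up to timeout_ms for fd to become readable.
 * Returns 1 if POLLIN is set, 0 if it is not (timeout, or only
 * error/hangup events were reported), -1 if poll() itself failed.
 */
int wait_for_pollin(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret < 0)
		return -1;
	return (ret > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

With the proposed driver, a return value of 1 would mean the watcher should read the new UUID, rebuild its state and confirm through write().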
    
    Subsequent read()s following a UUID update no longer block, but return
    the updated UUID. The application needs to acknowledge the UUID update
    by confirming it through a 'write()'.
    Only when the latest UUID is written back will the driver mark this
    "watcher" as up to date and clear its EPOLLIN status.
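
The per-handle rules above can be summed up in a small userspace model. This is only a sketch of the intended semantics, not driver code: the model_* names are made up for illustration, and blocking reads are left out (only the nonblocking case is modelled).

```c
#include <errno.h>
#include <string.h>

#define UUID_LEN 16

/* Hypothetical userspace model; names are illustrative, not the driver's. */
struct model_dev     { unsigned char uuid[UUID_LEN]; };
struct model_watcher { unsigned char acked[UUID_LEN]; };

/* open(): the watcher starts out in sync with the current generation. */
void model_open(const struct model_dev *dev, struct model_watcher *w)
{
	memcpy(w->acked, dev->uuid, UUID_LEN);
}

/* poll(): readable only while a change is unconfirmed. */
int model_readable(const struct model_dev *dev, const struct model_watcher *w)
{
	return memcmp(w->acked, dev->uuid, UUID_LEN) != 0;
}

/* Nonblocking read(): -EAGAIN while in sync, else the new UUID. */
int model_read(const struct model_dev *dev, const struct model_watcher *w,
	       unsigned char out[UUID_LEN])
{
	if (!model_readable(dev, w))
		return -EAGAIN;
	memcpy(out, dev->uuid, UUID_LEN);
	return UUID_LEN;
}

/* write(): only the current UUID is accepted as a confirmation. */
int model_write(const struct model_dev *dev, struct model_watcher *w,
		const unsigned char in[UUID_LEN])
{
	if (memcmp(in, dev->uuid, UUID_LEN) != 0)
		return -EINVAL;
	memcpy(w->acked, in, UUID_LEN);
	return UUID_LEN;
}
```

Note that reading alone never clears the readable state in this model; only a write of the current UUID does, which is the property that lets an orchestrator tell "has seen the change" apart from "has finished handling it".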
    
    'mmap()' support allows mapping a single read-only shared page which
    will always contain the latest UUID value at offset 0.
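
A rough sketch of how an application might probe the mapped page follows. The function name vmgen_changed() is hypothetical, and `mapped` is assumed to point at the page returned by mmap() on the device. The 16-byte copy is not atomic, so a real user would probably want to re-check after rebuilding its state.

```c
#include <stdbool.h>
#include <string.h>

#define UUID_LEN 16

/*
 * Compare the UUID at the start of the mapping with a cached copy.
 * On mismatch the cache is refreshed and true is returned, so the
 * caller knows to rebuild its state and then confirm via write().
 */
bool vmgen_changed(const volatile unsigned char *mapped,
		   unsigned char cached[UUID_LEN])
{
	unsigned char now[UUID_LEN];

	/* Byte-wise copy: memcpy() does not accept volatile pointers. */
	for (int i = 0; i < UUID_LEN; i++)
		now[i] = mapped[i];

	if (memcmp(now, cached, UUID_LEN) == 0)
		return false;

	memcpy(cached, now, UUID_LEN);
	return true;
}
```

The common case costs one 16-byte compare and no syscall, which is the point of offering the mapping next to read() and poll().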
    
    The driver also adds support for tracking the count of open file
    handles that haven't acknowledged a UUID update. This is exposed through
    two IOCTLs:
     * VMGENID_GET_OUTDATED_WATCHERS: immediately returns the number of
       _outdated_ watchers - number of file handles that were open during
       a VM generation change, and which have not yet confirmed the new
       Vm-Gen-Id.
     * VMGENID_WAIT_WATCHERS: blocks until there are no more _outdated_
       watchers, or if a 'timeout' argument is provided, until the timeout
       expires.
    
    This patch builds on top of Or Idgar <oridgar@gmail.com>'s proposal
    https://lkml.org/lkml/2018/3/1/498
    
    - Future improvements
    
    Ideally we would want the driver to register itself based on devices'
    _CID and not _HID, but unfortunately I couldn't find a way to do that.
    The problem is that ACPI device matching is done by
    '__acpi_match_device()' which exclusively looks at
    'acpi_hardware_id *hwid'.
    
    There is a path for platform devices to match on _CID when _HID is
    'PRP0001' - which is not the case for the Qemu vmgenid device.
    
    Guidance and help here would be greatly appreciated.
    
    Signed-off-by: Adrian Catangiu <acatan@amazon.com>
    ---
     Documentation/virt/vmgenid.rst | 211 +++++++++++++++++++++
     drivers/virt/Kconfig           |  13 ++
     drivers/virt/Makefile          |   1 +
     drivers/virt/vmgenid.c         | 419 +++++++++++++++++++++++++++++++++++++++++
     include/uapi/linux/vmgenid.h   |  22 +++
     5 files changed, 666 insertions(+)
     create mode 100644 Documentation/virt/vmgenid.rst
     create mode 100644 drivers/virt/vmgenid.c
     create mode 100644 include/uapi/linux/vmgenid.h
    
    diff --git a/Documentation/virt/vmgenid.rst b/Documentation/virt/vmgenid.rst
    new file mode 100644
    index 0000000..5224415
    --- /dev/null
    +++ b/Documentation/virt/vmgenid.rst
    @@ -0,0 +1,211 @@
    +.. SPDX-License-Identifier: GPL-2.0
    +
    +============
    +VMGENID
    +============
    +
    +The VM Generation ID is a feature defined by Microsoft (paper:
    +http://go.microsoft.com/fwlink/?LinkId=260709) and supported by
    +multiple hypervisor vendors.
    +
    +The feature is required in virtualized environments by apps that work
    +with local copies/caches of world-unique data such as random values,
    +uuids, monotonically increasing counters, etc.
    +Such apps can be negatively affected by VM snapshotting when the VM
    +is either cloned or returned to an earlier point in time.
    +
    +The VM Generation ID is a simple concept meant to alleviate the issue
    +by providing a unique ID that changes each time the VM is restored
    +from a snapshot. The hw provided UUID value can be used to
    +differentiate between VMs or different generations of the same VM.
    +
    +The VM Generation ID is exposed through an ACPI device by multiple
    +hypervisor vendors. The driver for it lives at
    +``drivers/virt/vmgenid.c``
    +
    +The driver exposes the Virtual Machine Generation ID via a char-dev FS
    +interface that provides ID update sync/async notification, retrieval
    +and confirmation mechanisms:
    +
    +``open()``:
    +  When the device is opened, a copy of the current vm UUID is
    +  associated with the file handle. The driver now tracks this file
    +  handle as an independent *watcher*. The driver tracks how many
    +  watchers are aware of the latest Vm-Gen-Id uuid and how many of
    +  them are *outdated*, outdated being those that have lived through
    +  a Vm-Gen-Id change but not yet confirmed the generation change event.
    +
    +``read()``:
    +  Read is meant to provide the *new* Vm-Gen-Id when a generation change
    +  takes place. The read operation blocks until the associated UUID is
    +  no longer up to date - until HW vm gen id changes - at which point
    +  the new UUID is provided/returned. Nonblocking ``read()``
    +  uses ``EAGAIN`` to signal that there is no *new* UUID available.
    +  The hw UUID is considered *new* for each open file handle that hasn't
    +  confirmed the new value, following a generation change. Therefore,
    +  once a generation change takes place, all ``read()`` calls will
    +  immediately return the new uuid and will continue to do so until the
    +  new value is confirmed back to the driver through ``write()``.
    +  Partial reads are not allowed - read buffer needs to be at least
    +  ``sizeof(uuid_t)`` in size.
    +
    +``write()``:
    +  Write is used to confirm the up-to-date Vm-Gen-Id back to the driver.
    +  Following a VM generation change, all existing watchers are marked
    +  as *outdated*. Each file handle will maintain the *outdated* status
    +  until a ``write()`` confirms the up-to-date UUID back to the driver.
    +  Partial writes are not allowed - write buffer should be exactly
    +  ``sizeof(uuid_t)`` in size.
    +
    +``poll()``:
    +  Poll is implemented to allow polling for UUID updates. Such
    +  updates result in ``EPOLLIN`` polling status until the new up-to-date
    +  UUID is confirmed back to the driver through a ``write()``.
    +
    +``ioctl()``:
    +  The driver also adds support for tracking the count of open file
    +  handles that haven't acknowledged a UUID update. This is exposed through
    +  two IOCTLs:
    +
    +  - VMGENID_GET_OUTDATED_WATCHERS: immediately returns the number of
    +    *outdated* watchers - number of file handles that were open during
    +    a VM generation change, and which have not yet confirmed the new
    +    Vm-Gen-Id.
    +  - VMGENID_WAIT_WATCHERS: blocks until there are no more *outdated*
    +    watchers, or if a ``timeout`` argument is provided, until the
    +    timeout expires.
    +
    +``mmap()``:
    +  The driver supports ``PROT_READ, MAP_SHARED`` mmaps of a single page
    +  in size. The first 16 bytes of the mapped page will contain an
    +  up-to-date copy of the VM generation UUID.
    +  The mapped memory can be used as a low-latency UUID probe mechanism
    +  in critical sections - see examples.
    +
    +``close()``:
    +  Removes the file handle as a Vm-Gen-Id watcher.
    +
    +Example application workflows
    +-----------------------------
    +
    +1) Watchdog thread simplified example::
    +
    +	void watchdog_thread_handler(int *thread_active)
    +	{
    +		uuid_t uuid;
    +		int fd = open("/dev/vmgenid", O_RDWR, S_IRUSR | S_IWUSR);
    +
    +		do {
    +			// read new UUID - blocks until VM generation changes
    +			read(fd, &uuid, sizeof(uuid));
    +
    +			// because of VM generation change, we need to rebuild world
    +			reseed_app_env();
    +
    +			// confirm we're done handling UUID update
    +			write(fd, &uuid, sizeof(uuid));
    +		} while (atomic_read(thread_active));
    +
    +		close(fd);
    +	}
    +
    +2) ASYNC simplified example::
    +
    +	void handle_io_on_vmgenfd(int vmgenfd)
    +	{
    +		uuid_t uuid;
    +
    +		// because of VM generation change, we need to rebuild world
    +		reseed_app_env();
    +
    +		// read new UUID - we need it to confirm we've handled update
    +		read(vmgenfd, &uuid, sizeof(uuid));
    +
    +		// confirm we're done handling UUID update
    +		write(vmgenfd, &uuid, sizeof(uuid));
    +	}
    +
    +	int main() {
    +		int epfd, vmgenfd;
    +		struct epoll_event ev, events[MAX_EPOLL_EVENTS_PER_RUN];
    +
    +		epfd = epoll_create(EPOLL_QUEUE_LEN);
    +
    +		vmgenfd = open("/dev/vmgenid", O_RDWR, S_IRUSR | S_IWUSR);
    +
    +		// register vmgenid for polling
    +		ev.events = EPOLLIN;
    +		ev.data.fd = vmgenfd;
    +		epoll_ctl(epfd, EPOLL_CTL_ADD, vmgenfd, &ev);
    +
    +		// register other parts of your app for polling
    +		// ...
    +
    +		while (1) {
    +			// wait for something to do...
    +			int nfds = epoll_wait(epfd, events,
    +				MAX_EPOLL_EVENTS_PER_RUN,
    +				EPOLL_RUN_TIMEOUT);
    +			if (nfds < 0) die("Error in epoll_wait!");
    +
    +			// for each ready fd
    +			for(int i = 0; i < nfds; i++) {
    +				int fd = events[i].data.fd;
    +
    +				if (fd == vmgenfd)
    +					handle_io_on_vmgenfd(vmgenfd);
    +				else
    +					handle_some_other_part_of_the_app(fd);
    +			}
    +		}
    +
    +		return 0;
    +	}
    +
    +3) Mapped memory polling simplified example::
    +
    +	/*
    +	 * app/library function that provides cached secrets
    +	 */
    +	char * safe_cached_secret(app_data_t *app)
    +	{
    +		char *secret;
    +		volatile uuid_t *const uuid_ptr = get_vmgenid_mapping(app);
    +	again:
    +		secret = __cached_secret(app);
    +
    +		if (unlikely(*uuid_ptr != app->cached_uuid)) {
    +			app->cached_uuid = *uuid_ptr;
    +
    +			// rebuild world then confirm the uuid update (thru write)
    +			rebuild_caches(app);
    +			ack_vmgenid_update(app);
    +
    +			goto again;
    +		}
    +
    +		return secret;
    +	}
    +
    +4) Orchestrator simplified example::
    +
    +	/*
    +	 * orchestrator - manages multiple apps and libraries used by a service
    +	 * and tries to make sure all sensitive components gracefully handle
    +	 * VM generation changes.
    +	 * Following function is called on detection of a VM generation change.
    +	 */
    +	void handle_vmgen_update(int vmgenfd, uuid_t new_uuid)
    +	{
    +		// pause until all components have handled event
    +		pause_service();
    +
    +		// confirm *this* watcher as up-to-date
    +		write(vmgenfd, &new_uuid, sizeof(uuid_t));
    +
    +		// wait for all *others*
    +		ioctl(vmgenfd, VMGENID_WAIT_WATCHERS, NULL);
    +
    +		// all apps on the system have rebuilt worlds
    +		resume_service();
    +	}
    diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
    index 363af2e..c80f1ce 100644
    --- a/drivers/virt/Kconfig
    +++ b/drivers/virt/Kconfig
    @@ -13,6 +13,19 @@ menuconfig VIRT_DRIVERS
     
     if VIRT_DRIVERS
     
    +config VMGENID
    +	tristate "Virtual Machine Generation ID driver"
    +	depends on ACPI
    +	default m
    +	help
    +	  This is a Virtual Machine Generation ID driver which provides
    +	  a virtual machine unique identifier. The provided UUID can be
    +	  watched through the FS interface exposed by this driver, and
    +	  thus can provide notifications for VM snapshot or cloning events.
    +	  This enables applications and libraries that store or cache
    +	  sensitive information, to know that they need to regenerate it
    +	  after process memory has been exposed to potential copying.
    +
     config FSL_HV_MANAGER
     	tristate "Freescale hypervisor management driver"
     	depends on FSL_SOC
    diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
    index fd33124..a1f8dcc 100644
    --- a/drivers/virt/Makefile
    +++ b/drivers/virt/Makefile
    @@ -4,4 +4,5 @@
     #
     
     obj-$(CONFIG_FSL_HV_MANAGER)	+= fsl_hypervisor.o
    +obj-$(CONFIG_VMGENID)		+= vmgenid.o
     obj-y				+= vboxguest/
    diff --git a/drivers/virt/vmgenid.c b/drivers/virt/vmgenid.c
    new file mode 100644
    index 0000000..d314c72
    --- /dev/null
    +++ b/drivers/virt/vmgenid.c
    @@ -0,0 +1,419 @@
    +// SPDX-License-Identifier: GPL-2.0
    +/*
    + * Virtual Machine Generation ID driver
    + *
    + * Copyright (C) 2018 Red Hat Inc, Copyright (C) 2020 Amazon.com Inc
    + * All rights reserved.
    + *	Authors:
    + *	  Adrian Catangiu <acatan@amazon.com>
    + *	  Or Idgar <oridgar@gmail.com>
    + *	  Gal Hammer <ghammer@redhat.com>
    + *
    + */
    +#include <linux/acpi.h>
    +#include <linux/cdev.h>
    +#include <linux/kernel.h>
    +#include <linux/mm.h>
    +#include <linux/module.h>
    +#include <linux/poll.h>
    +#include <linux/uuid.h>
    +#include <linux/vmgenid.h>
    +
    +#define DEV_NAME "vmgenid"
    +ACPI_MODULE_NAME(DEV_NAME);
    +
    +struct dev_data {
    +	struct cdev       cdev;
    +	dev_t             dev_id;
    +	unsigned long     map_buf;
    +
    +	void              *uuid_iomap;
    +	uuid_t            uuid;
    +	wait_queue_head_t read_wait;
    +
    +	atomic_t          watchers;
    +	atomic_t          outdated_watchers;
    +	wait_queue_head_t outdated_wait;
    +};
    +
    +struct file_data {
    +	struct dev_data  *dev_data;
    +	uuid_t           acked_uuid;
    +};
    +
    +static bool vmgenid_uuid_matches(struct dev_data *priv, uuid_t *uuid)
    +{
    +	return !memcmp(uuid, &priv->uuid, sizeof(uuid_t));
    +}
    +
    +static void vmgenid_put_outdated_watchers(struct dev_data *priv)
    +{
    +	if (atomic_dec_and_test(&priv->outdated_watchers))
    +		wake_up_interruptible(&priv->outdated_wait);
    +}
    +
    +static int vmgenid_open(struct inode *inode, struct file *file)
    +{
    +	struct dev_data *priv =
    +		container_of(inode->i_cdev, struct dev_data, cdev);
    +	struct file_data *file_data =
    +		kzalloc(sizeof(struct file_data), GFP_KERNEL);
    +
    +	if (!file_data)
    +		return -ENOMEM;
    +
    +	file_data->acked_uuid = priv->uuid;
    +	file_data->dev_data = priv;
    +
    +	file->private_data = file_data;
    +	atomic_inc(&priv->watchers);
    +
    +	return 0;
    +}
    +
    +static int vmgenid_close(struct inode *inode, struct file *file)
    +{
    +	struct file_data *file_data = (struct file_data *) file->private_data;
    +	struct dev_data *priv = file_data->dev_data;
    +
    +	if (!vmgenid_uuid_matches(priv, &file_data->acked_uuid))
    +		vmgenid_put_outdated_watchers(priv);
    +	atomic_dec(&priv->watchers);
    +	kfree(file->private_data);
    +
    +	return 0;
    +}
    +
    +static ssize_t
    +vmgenid_read(struct file *file, char __user *ubuf, size_t nbytes, loff_t *ppos)
    +{
    +	struct file_data *file_data =
    +		(struct file_data *) file->private_data;
    +	struct dev_data *priv = file_data->dev_data;
    +	ssize_t ret;
    +
    +	if (nbytes == 0)
    +		return 0;
    +	/* disallow partial UUID reads */
    +	if (nbytes < sizeof(uuid_t))
    +		return -EINVAL;
    +	nbytes = sizeof(uuid_t);
    +
    +	if (vmgenid_uuid_matches(priv, &file_data->acked_uuid)) {
    +		if (file->f_flags & O_NONBLOCK)
    +			return -EAGAIN;
    +		ret = wait_event_interruptible(
    +			priv->read_wait,
    +			!vmgenid_uuid_matches(priv, &file_data->acked_uuid)
    +		);
    +		if (ret)
    +			return ret;
    +	}
    +
    +	ret = copy_to_user(ubuf, &priv->uuid, nbytes);
    +	if (ret)
    +		return -EFAULT;
    +
    +	return nbytes;
    +}
    +
    +static ssize_t vmgenid_write(struct file *file, const char __user *ubuf,
    +				size_t count, loff_t *ppos)
    +{
    +	struct file_data *file_data = (struct file_data *) file->private_data;
    +	struct dev_data *priv = file_data->dev_data;
    +	uuid_t ack_uuid;
    +
    +	/* disallow partial UUID writes */
    +	if (count != sizeof(uuid_t))
    +		return -EINVAL;
    +	if (copy_from_user(&ack_uuid, ubuf, count))
    +		return -EFAULT;
    +	/* wrong UUID acknowledged */
    +	if (!vmgenid_uuid_matches(priv, &ack_uuid))
    +		return -EINVAL;
    +
    +	if (!vmgenid_uuid_matches(priv, &file_data->acked_uuid)) {
    +		/* update local view of UUID */
    +		file_data->acked_uuid = ack_uuid;
    +		vmgenid_put_outdated_watchers(priv);
    +	}
    +
    +	return (ssize_t)count;
    +}
    +
    +static __poll_t
    +vmgenid_poll(struct file *file, poll_table *wait)
    +{
    +	__poll_t mask = 0;
    +	struct file_data *file_data =
    +		(struct file_data *) file->private_data;
    +	struct dev_data *priv = file_data->dev_data;
    +
    +	if (!vmgenid_uuid_matches(priv, &file_data->acked_uuid))
    +		return EPOLLIN | EPOLLRDNORM;
    +
    +	poll_wait(file, &priv->read_wait, wait);
    +
    +	if (!vmgenid_uuid_matches(priv, &file_data->acked_uuid))
    +		mask = EPOLLIN | EPOLLRDNORM;
    +
    +	return mask;
    +}
    +
    +static long vmgenid_ioctl(struct file *file,
    +		unsigned int cmd, unsigned long arg)
    +{
    +	struct file_data *file_data =
    +		(struct file_data *) file->private_data;
    +	struct dev_data *priv = file_data->dev_data;
    +	struct timespec __user *timeout = (void *) arg;
    +	struct timespec kts;
    +	ktime_t until;
    +	int ret;
    +
    +	switch (cmd) {
    +	case VMGENID_GET_OUTDATED_WATCHERS:
    +		ret = atomic_read(&priv->outdated_watchers);
    +		break;
    +	case VMGENID_WAIT_WATCHERS:
    +		if (timeout) {
    +			ret = copy_from_user(&kts, timeout, sizeof(kts));
    +			if (ret)
    +				return -EFAULT;
    +			until = timespec_to_ktime(kts);
    +		} else {
    +			until = KTIME_MAX;
    +		}
    +
    +		ret = wait_event_interruptible_hrtimeout(
    +			priv->outdated_wait,
    +			!atomic_read(&priv->outdated_watchers),
    +			until
    +		);
    +		break;
    +	default:
    +		ret = -EINVAL;
    +		break;
    +	}
    +	return ret;
    +}
    +
    +static vm_fault_t vmgenid_vm_fault(struct vm_fault *vmf)
    +{
    +	struct page *page;
    +	struct file_data *file_data =
    +			(struct file_data *) vmf->vma->vm_private_data;
    +	struct dev_data *priv = file_data->dev_data;
    +
    +	if (priv->map_buf) {
    +		page = virt_to_page(priv->map_buf);
    +		get_page(page);
    +		vmf->page = page;
    +	}
    +
    +	return 0;
    +}
    +
    +static const struct vm_operations_struct vmgenid_vm_ops = {
    +	.fault = vmgenid_vm_fault,
    +};
    +
    +static int vmgenid_mmap(struct file *file, struct vm_area_struct *vma)
    +{
    +	if (vma->vm_pgoff != 0 || vma_pages(vma) > 1)
    +		return -EINVAL;
    +
    +	if ((vma->vm_flags & VM_WRITE) != 0)
    +		return -EPERM;
    +
    +	vma->vm_ops = &vmgenid_vm_ops;
    +	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
    +	vma->vm_private_data = file->private_data;
    +
    +	return 0;
    +}
    +
    +static const struct file_operations fops = {
    +	.owner          = THIS_MODULE,
    +	.mmap           = vmgenid_mmap,
    +	.open           = vmgenid_open,
    +	.release        = vmgenid_close,
    +	.read           = vmgenid_read,
    +	.write          = vmgenid_write,
    +	.poll           = vmgenid_poll,
    +	.compat_ioctl   = vmgenid_ioctl,
    +	.unlocked_ioctl = vmgenid_ioctl,
    +};
    +
    +static int vmgenid_acpi_map(struct dev_data *priv, acpi_handle handle)
    +{
    +	int i;
    +	phys_addr_t phys_addr;
    +	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
    +	acpi_status status;
    +	union acpi_object *pss;
    +	union acpi_object *element;
    +
    +	status = acpi_evaluate_object(handle, "ADDR", NULL, &buffer);
    +	if (ACPI_FAILURE(status)) {
    +		ACPI_EXCEPTION((AE_INFO, status, "Evaluating ADDR"));
    +		return -ENODEV;
    +	}
    +	pss = buffer.pointer;
    +	if (!pss || pss->type != ACPI_TYPE_PACKAGE || pss->package.count != 2)
    +		return -EINVAL;
    +
    +	phys_addr = 0;
    +	for (i = 0; i < pss->package.count; i++) {
    +		element = &(pss->package.elements[i]);
    +		if (element->type != ACPI_TYPE_INTEGER)
    +			return -EINVAL;
    +		phys_addr |= element->integer.value << i * 32;
    +	}
    +
    +	priv->uuid_iomap = acpi_os_map_memory(phys_addr, sizeof(uuid_t));
    +	if (!priv->uuid_iomap) {
    +		pr_err("Could not map memory at 0x%llx, size %u\n",
    +			   phys_addr,
    +			   (u32)sizeof(uuid_t));
    +		return -ENOMEM;
    +	}
    +
    +	memcpy_fromio(&priv->uuid, priv->uuid_iomap, sizeof(uuid_t));
    +	memcpy((void *) priv->map_buf, &priv->uuid, sizeof(uuid_t));
    +
    +	return 0;
    +}
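
For readers unfamiliar with the ADDR method: it returns a package of two 32-bit integers, low half first. The loop above folds them into one physical address; the same arithmetic in a standalone sketch (addr_from_package() is a hypothetical name):

```c
#include <stdint.h>

/*
 * Combine the two 32-bit halves of the ADDR package into one
 * 64-bit physical address: element 0 is the low half, element 1
 * the high half, mirroring the shift by i * 32 in the driver loop.
 */
uint64_t addr_from_package(const uint64_t elems[2])
{
	uint64_t addr = 0;

	for (int i = 0; i < 2; i++)
		addr |= elems[i] << (i * 32);
	return addr;
}
```

The elements are held in 64-bit integers, as ACPI integer values are in the driver, so the shift by 32 is well defined.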
    +
    +static int vmgenid_acpi_add(struct acpi_device *device)
    +{
    +	int ret;
    +	struct dev_data *priv;
    +
    +	if (!device)
    +		return -EINVAL;
    +
    +	priv = kzalloc(sizeof(struct dev_data), GFP_KERNEL);
    +	if (!priv)
    +		return -ENOMEM;
    +
    +	priv->map_buf = get_zeroed_page(GFP_KERNEL);
    +	if (!priv->map_buf) {
    +		ret = -ENOMEM;
    +		goto free;
    +	}
    +
    +	device->driver_data = priv;
    +
    +	init_waitqueue_head(&priv->read_wait);
    +	atomic_set(&priv->watchers, 0);
    +	atomic_set(&priv->outdated_watchers, 0);
    +	init_waitqueue_head(&priv->outdated_wait);
    +
    +	ret = vmgenid_acpi_map(priv, device->handle);
    +	if (ret < 0)
    +		goto err;
    +
    +	ret = alloc_chrdev_region(&priv->dev_id, 0, 1, DEV_NAME);
    +	if (ret < 0) {
    +		pr_err("alloc_chrdev_region() failed for vmgenid\n");
    +		goto err;
    +	}
    +
    +	cdev_init(&priv->cdev, &fops);
    +	cdev_add(&priv->cdev, priv->dev_id, 1);
    +
    +	return 0;
    +
    +err:
    +	if (priv->uuid_iomap)
    +		acpi_os_unmap_memory(priv->uuid_iomap, sizeof(uuid_t));
    +
    +	free_pages(priv->map_buf, 0);
    +
    +free:
    +	kfree(priv);
    +
    +	return ret;
    +}
    +
    +static int vmgenid_acpi_remove(struct acpi_device *device)
    +{
    +	struct dev_data *priv;
    +
    +	if (!device || !acpi_driver_data(device))
    +		return -EINVAL;
    +	priv = acpi_driver_data(device);
    +
    +	cdev_del(&priv->cdev);
    +	unregister_chrdev_region(priv->dev_id, 1);
    +	device->driver_data = NULL;
    +
    +	if (priv->uuid_iomap)
    +		acpi_os_unmap_memory(priv->uuid_iomap, sizeof(uuid_t));
    +	free_pages(priv->map_buf, 0);
    +	kfree(priv);
    +
    +	return 0;
    +}
    +
    +static void vmgenid_acpi_notify(struct acpi_device *device, u32 event)
    +{
    +	uuid_t old_uuid;
    +	struct dev_data *priv;
    +
    +	pr_debug("VMGENID notified, event %u\n", event);
    +
    +	if (!device || !acpi_driver_data(device)) {
    +		pr_err("VMGENID notify with NULL private data\n");
    +		return;
    +	}
    +	priv = acpi_driver_data(device);
    +
    +	/* update VM Generation UUID */
    +	old_uuid = priv->uuid;
    +	memcpy_fromio(&priv->uuid, priv->uuid_iomap, sizeof(uuid_t));
    +
    +	if (!vmgenid_uuid_matches(priv, &old_uuid)) {
    +		/* HW uuid updated */
    +		memcpy((void *) priv->map_buf, &priv->uuid, sizeof(uuid_t));
    +		atomic_set(&priv->outdated_watchers,
    +			 atomic_read(&priv->watchers));
    +		wake_up_interruptible(&priv->read_wait);
    +	}
    +}
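
The device-wide bookkeeping done by the notify handler can be modelled in a few lines of plain C. This is a simplified, single-threaded sketch with made-up model_* names; the driver uses atomics and wait queues instead.

```c
#include <string.h>

#define UUID_LEN 16

/* Hypothetical model of the device-wide bookkeeping; not driver code. */
struct model_state {
	unsigned char uuid[UUID_LEN];
	int watchers;           /* open file handles */
	int outdated_watchers;  /* handles that have not confirmed yet */
};

/* Notification: on a real change, every open handle becomes outdated. */
void model_notify(struct model_state *s, const unsigned char hw[UUID_LEN])
{
	if (memcmp(s->uuid, hw, UUID_LEN) == 0)
		return;	/* notification without a UUID change */
	memcpy(s->uuid, hw, UUID_LEN);
	s->outdated_watchers = s->watchers;
}

/* One watcher confirms; returns 1 when the count reaches zero. */
int model_confirm(struct model_state *s)
{
	return --s->outdated_watchers == 0;
}
```

The return value of model_confirm() corresponds to the point where the driver wakes anyone blocked in VMGENID_WAIT_WATCHERS.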
    +
    +static const struct acpi_device_id vmgenid_ids[] = {
    +	{"QEMUVGID", 0},
    +	{"", 0},
    +};
    +
    +static struct acpi_driver acpi_vmgenid_driver = {
    +	.name = "vm_generation_id",
    +	.ids = vmgenid_ids,
    +	.owner = THIS_MODULE,
    +	.ops = {
    +		.add = vmgenid_acpi_add,
    +		.remove = vmgenid_acpi_remove,
    +		.notify = vmgenid_acpi_notify,
    +	}
    +};
    +
    +static int __init vmgenid_init(void)
    +{
    +	return acpi_bus_register_driver(&acpi_vmgenid_driver);
    +}
    +
    +static void __exit vmgenid_exit(void)
    +{
    +	acpi_bus_unregister_driver(&acpi_vmgenid_driver);
    +}
    +
    +module_init(vmgenid_init);
    +module_exit(vmgenid_exit);
    +
    +MODULE_AUTHOR("Adrian Catangiu");
    +MODULE_DESCRIPTION("Virtual Machine Generation ID");
    +MODULE_LICENSE("GPL");
    +MODULE_VERSION("0.1");
    diff --git a/include/uapi/linux/vmgenid.h b/include/uapi/linux/vmgenid.h
    new file mode 100644
    index 0000000..f7fca7b
    --- /dev/null
    +++ b/include/uapi/linux/vmgenid.h
    @@ -0,0 +1,22 @@
    +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
    +/*
    + * Copyright (c) 2020, Amazon.com Inc.
    + *
    + * This program is free software; you can redistribute it and/or
    + * modify it under the terms of the GNU General Public License
    + * as published by the Free Software Foundation; either version
    + * 2 of the License, or (at your option) any later version.
    + */
    +
    +#ifndef _UAPI_LINUX_VMGENID_H
    +#define _UAPI_LINUX_VMGENID_H
    +
    +#include <linux/ioctl.h>
    +#include <linux/time.h>
    +
    +#define VMGENID_IOCTL 0x2d
    +#define VMGENID_GET_OUTDATED_WATCHERS _IO(VMGENID_IOCTL, 1)
    +#define VMGENID_WAIT_WATCHERS         _IOW(VMGENID_IOCTL, 2, struct timespec)
    +
    +#endif /* _UAPI_LINUX_VMGENID_H */
    +
    -- 
    2.7.4
    
    
    




Amazon Development Center (Romania) S.R.L. registered office: 27A Sf. Lazar Street, UBC5, floor 2, Iasi, Iasi County, 700045, Romania. Registered in Romania. Registration number J22/2621/2005.
