* [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines
@ 2018-12-10 17:31 Dr. David Alan Gilbert (git)
  2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 1/7] virtio: Add shared memory capability Dr. David Alan Gilbert (git)
                   ` (11 more replies)
  0 siblings, 12 replies; 26+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2018-12-10 17:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: vgoyal, miklos, stefanha, sweil, swhiteho

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Hi,
  This is the first RFC for the QEMU side of 'virtio-fs',
a new mechanism for mounting host directories into the guest
in a fast, consistent and secure manner.  Our primary use
case is Kata Containers, but it should be usable in other scenarios
as well.

There are corresponding patches being posted to Linux kernel,
libfuse and kata lists.

For a fuller design description, and benchmark numbers, please see
Vivek's posting of the kernel set here:

https://marc.info/?l=linux-kernel&m=154446243024251&w=2

We've got a small website with instructions on how to use it, here:

https://virtio-fs.gitlab.io/

and all the code is available on gitlab at:

https://gitlab.com/virtio-fs

QEMU's changes
--------------

The QEMU changes are pretty small.

There's a new vhost-user device, which is used to carry a stream of
FUSE messages to an external daemon that actually performs
all the file IO.  The FUSE daemon is an external process in order to
achieve better isolation for security and resource control (e.g. number
of file descriptors) and also because it's cleaner than trying to
integrate libfuse into QEMU.
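
As a rough illustration of the QEMU side only (the daemon invocation is
described on the virtio-fs website above; the memory-backend/-numa options
are the usual vhost-user requirement that guest RAM be shareable with the
daemon, not something specific to this series):

  qemu -m 4G \
       -object memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on \
       -numa node,memdev=mem \
       -chardev socket,path=/tmp/vhost-fs.sock,id=chr0 \
       -device vhost-user-fs-pci,tag=myfs,chardev=chr0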

This device has an extra BAR that contains (up to) 3 regions:

 a) a DAX mapping range ('the cache') - into which QEMU mmaps
    files on behalf of the external daemon; those files are
    then directly mapped by the guest in a way similar to a
    DAX-backed file system.  One advantage of this is that multiple
    guests all accessing the same files should be sharing
    those pages of host cache.

 b) An experimental set of mappings for use by a metadata versioning
    daemon;  this mapping is shared between multiple guests and
    the daemon, but only contains a set of version counters that
    allow a guest to quickly tell if its metadata is stale.

 c) An experimental mapping of a 'journal'; this is a block of RAM
    shared between QEMU and its fuse daemon, placed in the BAR after
    the version table (see patch 7/7).
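
Taken together (patches 3, 6 and 7), the BAR is laid out roughly as
below; each region is advertised to the guest by its own shared memory
capability and the overall BAR size is rounded up to a power of two:

  [0, cache-size)                          DAX cache      (VIRTIO_FS_PCI_SHMCAP_ID_CACHE)
  [cache-size, +mdvt size)                 version table  (VIRTIO_FS_PCI_SHMCAP_ID_VERTAB)
  [cache-size + mdvt size, +journal size)  journal        (VIRTIO_FS_PCI_SHMCAP_ID_JOURNAL)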

TODO
----

This is the first RFC; we know we have a bunch of things to clear up:

  a) The virtio device specification is still in flux and is expected
     to change

  b) We'd like to find ways of reducing the map/unmap latency for DAX

  c) The metadata versioning scheme needs to settle out.

  d) mmap'ing host files has some interesting side effects; for example
     if the file gets truncated by the host and then the guest accesses
     the mapping, KVM can fail the guest hard.

Dr. David Alan Gilbert (6):
  virtio: Add shared memory capability
  virtio-fs: Add cache BAR
  virtio-fs: Add vhost-user slave commands for mapping
  virtio-fs: Fill in slave commands for mapping
  virtio-fs: Allow mapping of meta data version table
  virtio-fs: Allow mapping of journal

Stefan Hajnoczi (1):
  virtio: add vhost-user-fs-pci device

 configure                                   |  10 +
 contrib/libvhost-user/libvhost-user.h       |   3 +
 docs/interop/vhost-user.txt                 |  35 ++
 hw/virtio/Makefile.objs                     |   1 +
 hw/virtio/vhost-user-fs.c                   | 517 ++++++++++++++++++++
 hw/virtio/vhost-user.c                      |  16 +
 hw/virtio/virtio-pci.c                      | 115 +++++
 hw/virtio/virtio-pci.h                      |  19 +
 include/hw/pci/pci.h                        |   1 +
 include/hw/virtio/vhost-user-fs.h           |  79 +++
 include/standard-headers/linux/virtio_fs.h  |  48 ++
 include/standard-headers/linux/virtio_ids.h |   1 +
 include/standard-headers/linux/virtio_pci.h |   9 +
 13 files changed, 854 insertions(+)
 create mode 100644 hw/virtio/vhost-user-fs.c
 create mode 100644 include/hw/virtio/vhost-user-fs.h
 create mode 100644 include/standard-headers/linux/virtio_fs.h

-- 
2.19.2

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Qemu-devel] [RFC PATCH 1/7] virtio: Add shared memory capability
  2018-12-10 17:31 [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines Dr. David Alan Gilbert (git)
@ 2018-12-10 17:31 ` Dr. David Alan Gilbert (git)
  2018-12-10 21:03   ` Eric Blake
  2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 2/7] virtio: add vhost-user-fs-pci device Dr. David Alan Gilbert (git)
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 26+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2018-12-10 17:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: vgoyal, miklos, stefanha, sweil, swhiteho

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Define a new capability type 'VIRTIO_PCI_CAP_SHARED_MEMORY_CFG'
and the data structure 'virtio_pci_shm_cap' to go with it.
They allow defining shared memory regions whose sizes and offsets
do not fit in 32 bits.
Multiple instances of the capability are allowed and distinguished
by a device-specific 'id'.
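
As an illustration, a consumer rebuilds the 64-bit values by combining the
low 32 bits from the basic capability with the new '_hi' fields; a sketch
only (names follow the structure added below, fields are little-endian as
written by QEMU):

    /* sketch: length is recombined the same way from cap.length/length_hi */
    static uint64_t shm_cap_offset(const struct virtio_pci_shm_cap *shm)
    {
        return ((uint64_t)le32_to_cpu(shm->offset_hi) << 32) |
               le32_to_cpu(shm->cap.offset);
    }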

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 hw/virtio/virtio-pci.c                      | 20 ++++++++++++++++++++
 include/standard-headers/linux/virtio_pci.h |  9 +++++++++
 2 files changed, 29 insertions(+)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index a954799267..1e737531b5 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1163,6 +1163,26 @@ static int virtio_pci_add_mem_cap(VirtIOPCIProxy *proxy,
     return offset;
 }
 
+static int virtio_pci_add_shm_cap(VirtIOPCIProxy *proxy,
+                                   uint8_t bar,
+                                   uint64_t offset, uint64_t length,
+                                   uint8_t id)
+{
+    struct virtio_pci_shm_cap cap = {
+        .cap.cap_len = sizeof cap,
+        .cap.cfg_type = VIRTIO_PCI_CAP_SHARED_MEMORY_CFG,
+    };
+    uint32_t mask32 = ~0;
+
+    cap.cap.bar = bar;
+    cap.cap.length = cpu_to_le32(length & mask32);
+    cap.length_hi = cpu_to_le32((length >> 32) & mask32);
+    cap.cap.offset = cpu_to_le32(offset & mask32);
+    cap.offset_hi = cpu_to_le32((offset >> 32) & mask32);
+    cap.id = id;
+    return virtio_pci_add_mem_cap(proxy, &cap.cap);
+}
+
 static uint64_t virtio_pci_common_read(void *opaque, hwaddr addr,
                                        unsigned size)
 {
diff --git a/include/standard-headers/linux/virtio_pci.h b/include/standard-headers/linux/virtio_pci.h
index 9262acd130..745d7a1942 100644
--- a/include/standard-headers/linux/virtio_pci.h
+++ b/include/standard-headers/linux/virtio_pci.h
@@ -113,6 +113,8 @@
 #define VIRTIO_PCI_CAP_DEVICE_CFG	4
 /* PCI configuration access */
 #define VIRTIO_PCI_CAP_PCI_CFG		5
+/* Additional shared memory capability */
+#define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
 
 /* This is the PCI capability header: */
 struct virtio_pci_cap {
@@ -163,6 +165,13 @@ struct virtio_pci_cfg_cap {
 	uint8_t pci_cfg_data[4]; /* Data for BAR access. */
 };
 
+struct virtio_pci_shm_cap {
+	struct virtio_pci_cap cap;
+	uint32_t offset_hi;             /* Most sig 32 bits of offset */
+	uint32_t length_hi;             /* Most sig 32 bits of length */
+	uint8_t  id;                    /* To distinguish shm chunks */
+};
+
 /* Macro versions of offsets for the Old Timers! */
 #define VIRTIO_PCI_CAP_VNDR		0
 #define VIRTIO_PCI_CAP_NEXT		1
-- 
2.19.2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [RFC PATCH 2/7] virtio: add vhost-user-fs-pci device
  2018-12-10 17:31 [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines Dr. David Alan Gilbert (git)
  2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 1/7] virtio: Add shared memory capability Dr. David Alan Gilbert (git)
@ 2018-12-10 17:31 ` Dr. David Alan Gilbert (git)
  2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 3/7] virtio-fs: Add cache BAR Dr. David Alan Gilbert (git)
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2018-12-10 17:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: vgoyal, miklos, stefanha, sweil, swhiteho

From: Stefan Hajnoczi <stefanha@redhat.com>

The virtio-fs virtio device provides shared file system access.  The
actual file server is implemented in an external vhost-user-fs device
backend process.

Launch QEMU like this:

  qemu -chardev socket,path=/tmp/vhost-fs.sock,id=chr0
       -device vhost-user-fs-pci,tag=myfs,chardev=chr0

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 configure                                   |  10 +
 hw/virtio/Makefile.objs                     |   1 +
 hw/virtio/vhost-user-fs.c                   | 297 ++++++++++++++++++++
 hw/virtio/virtio-pci.c                      |  52 ++++
 hw/virtio/virtio-pci.h                      |  18 ++
 include/hw/pci/pci.h                        |   1 +
 include/hw/virtio/vhost-user-fs.h           |  45 +++
 include/standard-headers/linux/virtio_fs.h  |  41 +++
 include/standard-headers/linux/virtio_ids.h |   1 +
 9 files changed, 466 insertions(+)
 create mode 100644 hw/virtio/vhost-user-fs.c
 create mode 100644 include/hw/virtio/vhost-user-fs.h
 create mode 100644 include/standard-headers/linux/virtio_fs.h

diff --git a/configure b/configure
index 0a3c6a72c3..140b89d8f0 100755
--- a/configure
+++ b/configure
@@ -371,6 +371,7 @@ vhost_crypto="no"
 vhost_scsi="no"
 vhost_vsock="no"
 vhost_user=""
+vhost_user_fs="no"
 kvm="no"
 hax="no"
 hvf="no"
@@ -878,6 +879,7 @@ Linux)
   vhost_crypto="yes"
   vhost_scsi="yes"
   vhost_vsock="yes"
+  vhost_user_fs="yes"
   QEMU_INCLUDES="-I\$(SRC_PATH)/linux-headers -I$PWD/linux-headers $QEMU_INCLUDES"
   supported_os="yes"
   libudev="yes"
@@ -1272,6 +1274,10 @@ for opt do
   ;;
   --enable-vhost-vsock) vhost_vsock="yes"
   ;;
+  --disable-vhost-user-fs) vhost_user_fs="no"
+  ;;
+  --enable-vhost-user-fs) vhost_user_fs="yes"
+  ;;
   --disable-opengl) opengl="no"
   ;;
   --enable-opengl) opengl="yes"
@@ -6054,6 +6060,7 @@ echo "vhost-crypto support $vhost_crypto"
 echo "vhost-scsi support $vhost_scsi"
 echo "vhost-vsock support $vhost_vsock"
 echo "vhost-user support $vhost_user"
+echo "vhost-user-fs support $vhost_user_fs"
 echo "Trace backends    $trace_backends"
 if have_backend "simple"; then
 echo "Trace output file $trace_file-<pid>"
@@ -6524,6 +6531,9 @@ fi
 if test "$vhost_user" = "yes" ; then
   echo "CONFIG_VHOST_USER=y" >> $config_host_mak
 fi
+if test "$vhost_user_fs" = "yes" ; then
+  echo "CONFIG_VHOST_USER_FS=y" >> $config_host_mak
+fi
 if test "$blobs" = "yes" ; then
   echo "INSTALL_BLOBS=yes" >> $config_host_mak
 fi
diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 1b2799cfd8..6783932231 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -11,6 +11,7 @@ obj-$(call land,$(CONFIG_VIRTIO_CRYPTO),$(CONFIG_VIRTIO_PCI)) += virtio-crypto-p
 
 obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o
 obj-$(CONFIG_VHOST_VSOCK) += vhost-vsock.o
+obj-$(CONFIG_VHOST_USER_FS) += vhost-user-fs.o
 endif
 
 common-obj-$(call lnot,$(call land,$(CONFIG_VIRTIO),$(CONFIG_LINUX))) += vhost-stub.o
diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
new file mode 100644
index 0000000000..bc21beeac3
--- /dev/null
+++ b/hw/virtio/vhost-user-fs.c
@@ -0,0 +1,297 @@
+/*
+ * Vhost-user filesystem virtio device
+ *
+ * Copyright 2018 Red Hat, Inc.
+ *
+ * Authors:
+ *  Stefan Hajnoczi <stefanha@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.  See the COPYING file in the
+ * top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include <sys/ioctl.h>
+#include "standard-headers/linux/virtio_fs.h"
+#include "qapi/error.h"
+#include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
+#include "qemu/error-report.h"
+#include "hw/virtio/vhost-user-fs.h"
+#include "monitor/monitor.h"
+
+static void vuf_get_config(VirtIODevice *vdev, uint8_t *config)
+{
+    VHostUserFS *fs = VHOST_USER_FS(vdev);
+    struct virtio_fs_config fscfg = {};
+
+    /* strncpy(3) is okay, the field is not NUL-terminated at max size */
+    strncpy((char *)fscfg.tag, fs->conf.tag, sizeof(fscfg.tag));
+
+    virtio_stl_p(vdev, &fscfg.num_queues, fs->conf.num_queues);
+
+    memcpy(config, &fscfg, sizeof(fscfg));
+}
+
+static void vuf_start(VirtIODevice *vdev)
+{
+    VHostUserFS *fs = VHOST_USER_FS(vdev);
+    BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
+    VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+    int ret;
+    int i;
+
+    if (!k->set_guest_notifiers) {
+        error_report("binding does not support guest notifiers");
+        return;
+    }
+
+    ret = vhost_dev_enable_notifiers(&fs->vhost_dev, vdev);
+    if (ret < 0) {
+        error_report("Error enabling host notifiers: %d", -ret);
+        return;
+    }
+
+    ret = k->set_guest_notifiers(qbus->parent, fs->vhost_dev.nvqs, true);
+    if (ret < 0) {
+        error_report("Error binding guest notifier: %d", -ret);
+        goto err_host_notifiers;
+    }
+
+    fs->vhost_dev.acked_features = vdev->guest_features;
+    ret = vhost_dev_start(&fs->vhost_dev, vdev);
+    if (ret < 0) {
+        error_report("Error starting vhost: %d", -ret);
+        goto err_guest_notifiers;
+    }
+
+    /* guest_notifier_mask/pending not used yet, so just unmask
+     * everything here.  virtio-pci will do the right thing by
+     * enabling/disabling irqfd.
+     */
+    for (i = 0; i < fs->vhost_dev.nvqs; i++) {
+        vhost_virtqueue_mask(&fs->vhost_dev, vdev, i, false);
+    }
+
+    return;
+
+err_guest_notifiers:
+    k->set_guest_notifiers(qbus->parent, fs->vhost_dev.nvqs, false);
+err_host_notifiers:
+    vhost_dev_disable_notifiers(&fs->vhost_dev, vdev);
+}
+
+static void vuf_stop(VirtIODevice *vdev)
+{
+    VHostUserFS *fs = VHOST_USER_FS(vdev);
+    BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
+    VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+    int ret;
+
+    if (!k->set_guest_notifiers) {
+        return;
+    }
+
+    vhost_dev_stop(&fs->vhost_dev, vdev);
+
+    ret = k->set_guest_notifiers(qbus->parent, fs->vhost_dev.nvqs, false);
+    if (ret < 0) {
+        error_report("vhost guest notifier cleanup failed: %d", ret);
+        return;
+    }
+
+    vhost_dev_disable_notifiers(&fs->vhost_dev, vdev);
+}
+
+static void vuf_set_status(VirtIODevice *vdev, uint8_t status)
+{
+    VHostUserFS *fs = VHOST_USER_FS(vdev);
+    bool should_start = status & VIRTIO_CONFIG_S_DRIVER_OK;
+
+    if (!vdev->vm_running) {
+        should_start = false;
+    }
+
+    if (fs->vhost_dev.started == should_start) {
+        return;
+    }
+
+    if (should_start) {
+        vuf_start(vdev);
+    } else {
+        vuf_stop(vdev);
+    }
+}
+
+static uint64_t vuf_get_features(VirtIODevice *vdev,
+                                      uint64_t requested_features,
+                                      Error **errp)
+{
+    /* No feature bits used yet */
+    return requested_features;
+}
+
+static void vuf_handle_output(VirtIODevice *vdev, VirtQueue *vq)
+{
+    /* Do nothing */
+}
+
+static void vuf_guest_notifier_mask(VirtIODevice *vdev, int idx,
+                                            bool mask)
+{
+    VHostUserFS *fs = VHOST_USER_FS(vdev);
+
+    vhost_virtqueue_mask(&fs->vhost_dev, vdev, idx, mask);
+}
+
+static bool vuf_guest_notifier_pending(VirtIODevice *vdev, int idx)
+{
+    VHostUserFS *fs = VHOST_USER_FS(vdev);
+
+    return vhost_virtqueue_pending(&fs->vhost_dev, idx);
+}
+
+static void vuf_device_realize(DeviceState *dev, Error **errp)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    VHostUserFS *fs = VHOST_USER_FS(dev);
+    unsigned int i;
+    size_t len;
+    int ret;
+
+    if (!fs->conf.chardev.chr) {
+        error_setg(errp, "missing chardev");
+        return;
+    }
+
+    if (!fs->conf.tag) {
+        error_setg(errp, "missing tag property");
+        return;
+    }
+    len = strlen(fs->conf.tag);
+    if (len == 0) {
+        error_setg(errp, "tag property cannot be empty");
+        return;
+    }
+    if (len > sizeof_field(struct virtio_fs_config, tag)) {
+        error_setg(errp, "tag property must be %zu bytes or less",
+                   sizeof_field(struct virtio_fs_config, tag));
+        return;
+    }
+
+    if (fs->conf.num_queues == 0) {
+        error_setg(errp, "num-queues property must be larger than 0");
+        return;
+    }
+
+    if (!is_power_of_2(fs->conf.queue_size)) {
+        error_setg(errp, "queue-size property must be a power of 2");
+        return;
+    }
+
+    if (fs->conf.queue_size > VIRTQUEUE_MAX_SIZE) {
+        error_setg(errp, "queue-size property must be %u or smaller",
+                   VIRTQUEUE_MAX_SIZE);
+        return;
+    }
+
+    fs->vhost_user = vhost_user_init();
+    if (!fs->vhost_user) {
+        error_setg(errp, "failed to initialize vhost-user");
+        return;
+    }
+    fs->vhost_user->chr = &fs->conf.chardev;
+
+    virtio_init(vdev, "vhost-user-fs", VIRTIO_ID_FS,
+                sizeof(struct virtio_fs_config));
+
+    /* Notifications queue */
+    virtio_add_queue(vdev, fs->conf.queue_size, vuf_handle_output);
+
+    /* Hiprio queue */
+    virtio_add_queue(vdev, fs->conf.queue_size, vuf_handle_output);
+
+    /* Request queues */
+    for (i = 0; i < fs->conf.num_queues; i++) {
+        virtio_add_queue(vdev, fs->conf.queue_size, vuf_handle_output);
+    }
+
+    fs->vhost_dev.nvqs = 2 + fs->conf.num_queues;
+    fs->vhost_dev.vqs = g_new0(struct vhost_virtqueue, fs->vhost_dev.nvqs);
+    ret = vhost_dev_init(&fs->vhost_dev, fs->vhost_user,
+                         VHOST_BACKEND_TYPE_USER, 0);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "vhost_dev_init failed");
+        goto err_virtio;
+    }
+
+    return;
+
+err_virtio:
+    vhost_user_cleanup(fs->vhost_user);
+    g_free(fs->vhost_user);
+    virtio_cleanup(vdev);
+    g_free(fs->vhost_dev.vqs);
+    return;
+}
+
+static void vuf_device_unrealize(DeviceState *dev, Error **errp)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    VHostUserFS *fs = VHOST_USER_FS(dev);
+
+    /* This will stop vhost backend if appropriate. */
+    vuf_set_status(vdev, 0);
+
+    vhost_dev_cleanup(&fs->vhost_dev);
+
+    if (fs->vhost_user) {
+        vhost_user_cleanup(fs->vhost_user);
+        g_free(fs->vhost_user);
+        fs->vhost_user = NULL;
+    }
+
+    virtio_cleanup(vdev);
+    g_free(fs->vhost_dev.vqs);
+    fs->vhost_dev.vqs = NULL;
+}
+
+static Property vuf_properties[] = {
+    DEFINE_PROP_CHR("chardev", VHostUserFS, conf.chardev),
+    DEFINE_PROP_STRING("tag", VHostUserFS, conf.tag),
+    DEFINE_PROP_UINT16("num-queues", VHostUserFS, conf.num_queues, 1),
+    DEFINE_PROP_UINT16("queue-size", VHostUserFS, conf.queue_size, 128),
+    DEFINE_PROP_STRING("vhostfd", VHostUserFS, conf.vhostfd),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void vuf_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
+
+    dc->props = vuf_properties;
+    set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
+    vdc->realize = vuf_device_realize;
+    vdc->unrealize = vuf_device_unrealize;
+    vdc->get_features = vuf_get_features;
+    vdc->get_config = vuf_get_config;
+    vdc->set_status = vuf_set_status;
+    vdc->guest_notifier_mask = vuf_guest_notifier_mask;
+    vdc->guest_notifier_pending = vuf_guest_notifier_pending;
+}
+
+static const TypeInfo vuf_info = {
+    .name = TYPE_VHOST_USER_FS,
+    .parent = TYPE_VIRTIO_DEVICE,
+    .instance_size = sizeof(VHostUserFS),
+    .class_init = vuf_class_init,
+};
+
+static void vuf_register_types(void)
+{
+    type_register_static(&vuf_info);
+}
+
+type_init(vuf_register_types)
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 1e737531b5..d744f93655 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -2665,6 +2665,55 @@ static const TypeInfo virtio_host_pci_info = {
 };
 #endif
 
+/* vhost-user-fs-pci */
+
+#ifdef CONFIG_VHOST_USER_FS
+static Property vhost_user_fs_pci_properties[] = {
+    /* TODO multiqueue */
+    DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 4),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void vhost_user_fs_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
+{
+    VHostUserFSPCI *dev = VHOST_USER_FS_PCI(vpci_dev);
+    DeviceState *vdev = DEVICE(&dev->vdev);
+
+    qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
+    object_property_set_bool(OBJECT(vdev), true, "realized", errp);
+}
+
+static void vhost_user_fs_pci_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
+    PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
+    k->realize = vhost_user_fs_pci_realize;
+    set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
+    dc->props = vhost_user_fs_pci_properties;
+    pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
+    pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_FS;
+    pcidev_k->revision = 0x00;
+    pcidev_k->class_id = PCI_CLASS_STORAGE_OTHER;
+}
+
+static void vhost_user_fs_pci_instance_init(Object *obj)
+{
+    VHostUserFSPCI *dev = VHOST_USER_FS_PCI(obj);
+
+    virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
+                                TYPE_VHOST_USER_FS);
+}
+
+static const TypeInfo vhost_user_fs_pci_info = {
+    .name          = TYPE_VHOST_USER_FS_PCI,
+    .parent        = TYPE_VIRTIO_PCI,
+    .instance_size = sizeof(VHostUserFSPCI),
+    .instance_init = vhost_user_fs_pci_instance_init,
+    .class_init    = vhost_user_fs_pci_class_init,
+};
+#endif
+
 /* virtio-pci-bus */
 
 static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
@@ -2743,6 +2792,9 @@ static void virtio_pci_register_types(void)
 #ifdef CONFIG_VHOST_VSOCK
     type_register_static(&vhost_vsock_pci_info);
 #endif
+#ifdef CONFIG_VHOST_USER_FS
+    type_register_static(&vhost_user_fs_pci_info);
+#endif
 }
 
 type_init(virtio_pci_register_types)
diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index 813082b0d7..a635dc564c 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -40,6 +40,9 @@
 #ifdef CONFIG_VHOST_VSOCK
 #include "hw/virtio/vhost-vsock.h"
 #endif
+#ifdef CONFIG_VHOST_USER_FS
+#include "hw/virtio/vhost-user-fs.h"
+#endif
 
 typedef struct VirtIOPCIProxy VirtIOPCIProxy;
 typedef struct VirtIOBlkPCI VirtIOBlkPCI;
@@ -57,6 +60,7 @@ typedef struct VirtIOInputHostPCI VirtIOInputHostPCI;
 typedef struct VirtIOGPUPCI VirtIOGPUPCI;
 typedef struct VHostVSockPCI VHostVSockPCI;
 typedef struct VirtIOCryptoPCI VirtIOCryptoPCI;
+typedef struct VHostUserFSPCI VHostUserFSPCI;
 
 /* virtio-pci-bus */
 
@@ -414,6 +418,20 @@ struct VirtIOCryptoPCI {
     VirtIOCrypto vdev;
 };
 
+#ifdef CONFIG_VHOST_USER_FS
+/*
+ * vhost-user-fs-pci: This extends VirtioPCIProxy.
+ */
+#define TYPE_VHOST_USER_FS_PCI "vhost-user-fs-pci"
+#define VHOST_USER_FS_PCI(obj) \
+        OBJECT_CHECK(VHostUserFSPCI, (obj), TYPE_VHOST_USER_FS_PCI)
+
+struct VHostUserFSPCI {
+    VirtIOPCIProxy parent_obj;
+    VHostUserFS vdev;
+};
+#endif
+
 /* Virtio ABI version, if we increment this, we break the guest driver. */
 #define VIRTIO_PCI_ABI_VERSION          0
 
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index e6514bba23..2adc3bf45f 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -85,6 +85,7 @@ extern bool pci_available;
 #define PCI_DEVICE_ID_VIRTIO_RNG         0x1005
 #define PCI_DEVICE_ID_VIRTIO_9P          0x1009
 #define PCI_DEVICE_ID_VIRTIO_VSOCK       0x1012
+#define PCI_DEVICE_ID_VIRTIO_FS          0x1019
 
 #define PCI_VENDOR_ID_REDHAT             0x1b36
 #define PCI_DEVICE_ID_REDHAT_BRIDGE      0x0001
diff --git a/include/hw/virtio/vhost-user-fs.h b/include/hw/virtio/vhost-user-fs.h
new file mode 100644
index 0000000000..29629acc54
--- /dev/null
+++ b/include/hw/virtio/vhost-user-fs.h
@@ -0,0 +1,45 @@
+/*
+ * Vhost-user filesystem virtio device
+ *
+ * Copyright 2018 Red Hat, Inc.
+ *
+ * Authors:
+ *  Stefan Hajnoczi <stefanha@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.  See the COPYING file in the
+ * top-level directory.
+ */
+
+#ifndef _QEMU_VHOST_USER_FS_H
+#define _QEMU_VHOST_USER_FS_H
+
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/vhost-user.h"
+#include "chardev/char-fe.h"
+
+#define TYPE_VHOST_USER_FS "vhost-user-fs-device"
+#define VHOST_USER_FS(obj) \
+        OBJECT_CHECK(VHostUserFS, (obj), TYPE_VHOST_USER_FS)
+
+typedef struct {
+    CharBackend chardev;
+    char *tag;
+    uint16_t num_queues;
+    uint16_t queue_size;
+    char *vhostfd;
+} VHostUserFSConf;
+
+typedef struct {
+    /*< private >*/
+    VirtIODevice parent;
+    VHostUserFSConf conf;
+    struct vhost_virtqueue *vhost_vqs;
+    struct vhost_dev vhost_dev;
+    VhostUserState *vhost_user;
+
+    /*< public >*/
+} VHostUserFS;
+
+#endif /* _QEMU_VHOST_USER_FS_H */
diff --git a/include/standard-headers/linux/virtio_fs.h b/include/standard-headers/linux/virtio_fs.h
new file mode 100644
index 0000000000..4f811a0b70
--- /dev/null
+++ b/include/standard-headers/linux/virtio_fs.h
@@ -0,0 +1,41 @@
+#ifndef _LINUX_VIRTIO_FS_H
+#define _LINUX_VIRTIO_FS_H
+/* This header is BSD licensed so anyone can use the definitions to implement
+ * compatible drivers/servers.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of IBM nor the names of its contributors
+ *    may be used to endorse or promote products derived from this software
+ *    without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE. */
+#include "standard-headers/linux/types.h"
+#include "standard-headers/linux/virtio_ids.h"
+#include "standard-headers/linux/virtio_config.h"
+#include "standard-headers/linux/virtio_types.h"
+
+struct virtio_fs_config {
+	/* Filesystem name (UTF-8, not NUL-terminated, padded with NULs) */
+	uint8_t tag[36];
+
+	/* Number of request queues */
+	uint32_t num_queues;
+} QEMU_PACKED;
+
+#endif /* _LINUX_VIRTIO_FS_H */
diff --git a/include/standard-headers/linux/virtio_ids.h b/include/standard-headers/linux/virtio_ids.h
index 6d5c3b2d4f..884b0e2734 100644
--- a/include/standard-headers/linux/virtio_ids.h
+++ b/include/standard-headers/linux/virtio_ids.h
@@ -43,5 +43,6 @@
 #define VIRTIO_ID_INPUT        18 /* virtio input */
 #define VIRTIO_ID_VSOCK        19 /* virtio vsock transport */
 #define VIRTIO_ID_CRYPTO       20 /* virtio crypto */
+#define VIRTIO_ID_FS           26 /* virtio filesystem */
 
 #endif /* _LINUX_VIRTIO_IDS_H */
-- 
2.19.2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [RFC PATCH 3/7] virtio-fs: Add cache BAR
  2018-12-10 17:31 [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines Dr. David Alan Gilbert (git)
  2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 1/7] virtio: Add shared memory capability Dr. David Alan Gilbert (git)
  2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 2/7] virtio: add vhost-user-fs-pci device Dr. David Alan Gilbert (git)
@ 2018-12-10 17:31 ` Dr. David Alan Gilbert (git)
  2018-12-10 21:10   ` Eric Blake
  2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 4/7] virtio-fs: Add vhost-user slave commands for mapping Dr. David Alan Gilbert (git)
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 26+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2018-12-10 17:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: vgoyal, miklos, stefanha, sweil, swhiteho

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Add a cache BAR into which files will be directly mapped.
The size can be set with the cache-size= property, e.g.
   -device vhost-user-fs-pci,chardev=char0,tag=myfs,cache-size=16G

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 hw/virtio/vhost-user-fs.c                  | 16 ++++++++++++++++
 hw/virtio/virtio-pci.c                     | 19 +++++++++++++++++++
 hw/virtio/virtio-pci.h                     |  1 +
 include/hw/virtio/vhost-user-fs.h          |  2 ++
 include/standard-headers/linux/virtio_fs.h |  5 +++++
 5 files changed, 43 insertions(+)

diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
index bc21beeac3..14ee922661 100644
--- a/hw/virtio/vhost-user-fs.c
+++ b/hw/virtio/vhost-user-fs.c
@@ -195,6 +195,21 @@ static void vuf_device_realize(DeviceState *dev, Error **errp)
                    VIRTQUEUE_MAX_SIZE);
         return;
     }
+    if (!is_power_of_2(fs->conf.cache_size) ||
+        fs->conf.cache_size < sysconf(_SC_PAGESIZE)) {
+        error_setg(errp, "cache-size property must be a power of 2 "
+                         "no smaller than the page size");
+        return;
+    }
+    /* We need a region with some host memory, 'ram' is the easiest */
+    memory_region_init_ram_nomigrate(&fs->cache, OBJECT(vdev),
+                       "virtio-fs-cache",
+                       fs->conf.cache_size, NULL);
+    /* But we don't actually want anyone reading/writing the raw
+     * area with no cache data.
+     */
+    mprotect(memory_region_get_ram_ptr(&fs->cache), fs->conf.cache_size,
+             PROT_NONE);
 
     fs->vhost_user = vhost_user_init();
     if (!fs->vhost_user) {
@@ -263,6 +278,7 @@ static Property vuf_properties[] = {
     DEFINE_PROP_UINT16("num-queues", VHostUserFS, conf.num_queues, 1),
     DEFINE_PROP_UINT16("queue-size", VHostUserFS, conf.queue_size, 128),
     DEFINE_PROP_STRING("vhostfd", VHostUserFS, conf.vhostfd),
+    DEFINE_PROP_SIZE("cache-size", VHostUserFS, conf.cache_size, 1ull << 30),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index d744f93655..e819a29fb1 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -18,6 +18,7 @@
 #include "qemu/osdep.h"
 
 #include "standard-headers/linux/virtio_pci.h"
+#include "standard-headers/linux/virtio_fs.h"
 #include "hw/virtio/virtio.h"
 #include "hw/virtio/virtio-blk.h"
 #include "hw/virtio/virtio-net.h"
@@ -2678,9 +2679,27 @@ static void vhost_user_fs_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
 {
     VHostUserFSPCI *dev = VHOST_USER_FS_PCI(vpci_dev);
     DeviceState *vdev = DEVICE(&dev->vdev);
+    uint64_t cachesize;
 
     qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
     object_property_set_bool(OBJECT(vdev), true, "realized", errp);
+    cachesize = memory_region_size(&dev->vdev.cache);
+
+    /* The bar starts with the data/DAX cache
+     * Others will be added later.
+     */
+    memory_region_init(&dev->cachebar, OBJECT(vpci_dev),
+                       "vhost-fs-pci-cachebar", cachesize);
+    memory_region_add_subregion(&dev->cachebar, 0, &dev->vdev.cache);
+    virtio_pci_add_shm_cap(vpci_dev, VIRTIO_FS_PCI_CACHE_BAR, 0, cachesize,
+                           VIRTIO_FS_PCI_SHMCAP_ID_CACHE);
+
+    /* After 'realized' so the memory region exists */
+    pci_register_bar(&vpci_dev->pci_dev, VIRTIO_FS_PCI_CACHE_BAR,
+                     PCI_BASE_ADDRESS_SPACE_MEMORY |
+                     PCI_BASE_ADDRESS_MEM_PREFETCH |
+                     PCI_BASE_ADDRESS_MEM_TYPE_64,
+                     &dev->cachebar);
 }
 
 static void vhost_user_fs_pci_class_init(ObjectClass *klass, void *data)
diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index a635dc564c..53b87f245c 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -429,6 +429,7 @@ struct VirtIOCryptoPCI {
 struct VHostUserFSPCI {
     VirtIOPCIProxy parent_obj;
     VHostUserFS vdev;
+    MemoryRegion cachebar;
 };
 #endif
 
diff --git a/include/hw/virtio/vhost-user-fs.h b/include/hw/virtio/vhost-user-fs.h
index 29629acc54..be153e1c7a 100644
--- a/include/hw/virtio/vhost-user-fs.h
+++ b/include/hw/virtio/vhost-user-fs.h
@@ -29,6 +29,7 @@ typedef struct {
     uint16_t num_queues;
     uint16_t queue_size;
     char *vhostfd;
+    size_t cache_size;
 } VHostUserFSConf;
 
 typedef struct {
@@ -40,6 +41,7 @@ typedef struct {
     VhostUserState *vhost_user;
 
     /*< public >*/
+    MemoryRegion cache;
 } VHostUserFS;
 
 #endif /* _QEMU_VHOST_USER_FS_H */
diff --git a/include/standard-headers/linux/virtio_fs.h b/include/standard-headers/linux/virtio_fs.h
index 4f811a0b70..b5f137ca79 100644
--- a/include/standard-headers/linux/virtio_fs.h
+++ b/include/standard-headers/linux/virtio_fs.h
@@ -38,4 +38,9 @@ struct virtio_fs_config {
 	uint32_t num_queues;
 } QEMU_PACKED;
 
+#define VIRTIO_FS_PCI_CACHE_BAR 2
+
+/* For the id field in virtio_pci_shm_cap */
+#define VIRTIO_FS_PCI_SHMCAP_ID_CACHE 0
+
 #endif /* _LINUX_VIRTIO_FS_H */
-- 
2.19.2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [RFC PATCH 4/7] virtio-fs: Add vhost-user slave commands for mapping
  2018-12-10 17:31 [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines Dr. David Alan Gilbert (git)
                   ` (2 preceding siblings ...)
  2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 3/7] virtio-fs: Add cache BAR Dr. David Alan Gilbert (git)
@ 2018-12-10 17:31 ` Dr. David Alan Gilbert (git)
  2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 5/7] virtio-fs: Fill in " Dr. David Alan Gilbert (git)
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2018-12-10 17:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: vgoyal, miklos, stefanha, sweil, swhiteho

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

The daemon may request that fds be mapped into the virtio-fs cache
that is visible to the guest.
These mappings are triggered by commands sent over the slave fd
from the daemon.
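
For example, a daemon that wants a single writable 2MB window at the start
of the cache could fill the message along these lines (a sketch; the
framing of the message and the fd on the slave channel is omitted, and
unused entries are left at len == 0 so they are skipped):

    VhostUserFSSlaveMsg sm = { 0 };

    sm.fd_offset[0] = 0;                 /* offset within the file */
    sm.c_offset[0]  = 0;                 /* offset within the cache BAR */
    sm.len[0]       = 2 * 1024 * 1024;
    sm.flags[0]     = VHOST_USER_FS_FLAG_MAP_R | VHOST_USER_FS_FLAG_MAP_W;
    /* ...send VHOST_USER_SLAVE_FS_MAP with 'sm' plus the file descriptor */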

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 contrib/libvhost-user/libvhost-user.h |  3 +++
 docs/interop/vhost-user.txt           | 35 +++++++++++++++++++++++++++
 hw/virtio/vhost-user-fs.c             | 20 +++++++++++++++
 hw/virtio/vhost-user.c                | 16 ++++++++++++
 include/hw/virtio/vhost-user-fs.h     | 24 ++++++++++++++++++
 5 files changed, 98 insertions(+)

diff --git a/contrib/libvhost-user/libvhost-user.h b/contrib/libvhost-user/libvhost-user.h
index 4aa55b4d2d..6cff0ff189 100644
--- a/contrib/libvhost-user/libvhost-user.h
+++ b/contrib/libvhost-user/libvhost-user.h
@@ -99,6 +99,9 @@ typedef enum VhostUserSlaveRequest {
     VHOST_USER_SLAVE_IOTLB_MSG = 1,
     VHOST_USER_SLAVE_CONFIG_CHANGE_MSG = 2,
     VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG = 3,
+    VHOST_USER_SLAVE_FS_MAP = 4,
+    VHOST_USER_SLAVE_FS_UNMAP = 5,
+    VHOST_USER_SLAVE_FS_SYNC = 6,
     VHOST_USER_SLAVE_MAX
 }  VhostUserSlaveRequest;
 
diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
index c2194711d9..29cdd74523 100644
--- a/docs/interop/vhost-user.txt
+++ b/docs/interop/vhost-user.txt
@@ -815,6 +815,41 @@ Slave message types
       This request should be sent only when VHOST_USER_PROTOCOL_F_HOST_NOTIFIER
       protocol feature has been successfully negotiated.
 
+ * VHOST_USER_SLAVE_FS_MAP
+
+      Id: 4
+      Equivalent ioctl: N/A
+      Slave payload: fd + n * (offset + address + len)
+      Master payload: N/A
+
+      Requests that QEMU mmap the given fd into the virtio-fs cache;
+      multiple chunks can be mapped in one command.
+      A reply is generated indicating whether mapping succeeded.
+
+ * VHOST_USER_SLAVE_FS_UNMAP
+
+      Id: 5
+      Equivalent ioctl: N/A
+      Slave payload: n * (address + len)
+      Master payload: N/A
+
+      Requests that QEMU unmap the given range in the virtio-fs cache;
+      multiple chunks can be unmapped in one command.
+      A reply is generated indicating whether unmapping succeeded.
+
+ * VHOST_USER_SLAVE_FS_SYNC
+
+      Id: 6
+      Equivalent ioctl: N/A
+      Slave payload: n * (address + len)
+      Master payload: N/A
+
+      Requests that QEMU cause any changes to the virtio-fs cache to
+      be synchronised with the backing files.  Multiple chunks can be synced
+      in one command.
+      A reply is generated indicating whether syncing succeeded.
+      [Semantic details TBD]
+
 VHOST_USER_PROTOCOL_F_REPLY_ACK:
 -------------------------------
 The original vhost-user specification only demands replies for certain
diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
index 14ee922661..da70d9cd2c 100644
--- a/hw/virtio/vhost-user-fs.c
+++ b/hw/virtio/vhost-user-fs.c
@@ -21,6 +21,26 @@
 #include "hw/virtio/vhost-user-fs.h"
 #include "monitor/monitor.h"
 
+int vhost_user_fs_slave_map(struct vhost_dev *dev, VhostUserFSSlaveMsg *sm,
+                            int fd)
+{
+    /* TODO */
+    return -1;
+}
+
+int vhost_user_fs_slave_unmap(struct vhost_dev *dev, VhostUserFSSlaveMsg *sm)
+{
+    /* TODO */
+    return -1;
+}
+
+int vhost_user_fs_slave_sync(struct vhost_dev *dev, VhostUserFSSlaveMsg *sm)
+{
+    /* TODO */
+    return -1;
+}
+
+
 static void vuf_get_config(VirtIODevice *vdev, uint8_t *config)
 {
     VHostUserFS *fs = VHOST_USER_FS(vdev);
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index e09bed0e4a..beb028c7e2 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -12,6 +12,7 @@
 #include "qapi/error.h"
 #include "hw/virtio/vhost.h"
 #include "hw/virtio/vhost-user.h"
+#include "hw/virtio/vhost-user-fs.h"
 #include "hw/virtio/vhost-backend.h"
 #include "hw/virtio/virtio.h"
 #include "hw/virtio/virtio-net.h"
@@ -97,6 +98,9 @@ typedef enum VhostUserSlaveRequest {
     VHOST_USER_SLAVE_IOTLB_MSG = 1,
     VHOST_USER_SLAVE_CONFIG_CHANGE_MSG = 2,
     VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG = 3,
+    VHOST_USER_SLAVE_FS_MAP = 4,
+    VHOST_USER_SLAVE_FS_UNMAP = 5,
+    VHOST_USER_SLAVE_FS_SYNC = 6,
     VHOST_USER_SLAVE_MAX
 }  VhostUserSlaveRequest;
 
@@ -169,6 +173,7 @@ typedef union {
         VhostUserConfig config;
         VhostUserCryptoSession session;
         VhostUserVringArea area;
+        VhostUserFSSlaveMsg fs;
 } VhostUserPayload;
 
 typedef struct VhostUserMsg {
@@ -1010,6 +1015,17 @@ static void slave_read(void *opaque)
         ret = vhost_user_slave_handle_vring_host_notifier(dev, &payload.area,
                                                           fd[0]);
         break;
+#ifdef CONFIG_VHOST_USER_FS
+    case VHOST_USER_SLAVE_FS_MAP:
+        ret = vhost_user_fs_slave_map(dev, &payload.fs, fd[0]);
+        break;
+    case VHOST_USER_SLAVE_FS_UNMAP:
+        ret = vhost_user_fs_slave_unmap(dev, &payload.fs);
+        break;
+    case VHOST_USER_SLAVE_FS_SYNC:
+        ret = vhost_user_fs_slave_sync(dev, &payload.fs);
+        break;
+#endif
     default:
         error_report("Received unexpected msg type.");
         ret = -EINVAL;
diff --git a/include/hw/virtio/vhost-user-fs.h b/include/hw/virtio/vhost-user-fs.h
index be153e1c7a..9989bcd9e7 100644
--- a/include/hw/virtio/vhost-user-fs.h
+++ b/include/hw/virtio/vhost-user-fs.h
@@ -23,6 +23,24 @@
 #define VHOST_USER_FS(obj) \
         OBJECT_CHECK(VHostUserFS, (obj), TYPE_VHOST_USER_FS)
 
+/* Structures carried over the slave channel back to QEMU */
+#define VHOST_USER_FS_SLAVE_ENTRIES 8
+
+/* For the flags field of VhostUserFSSlaveMsg */
+#define VHOST_USER_FS_FLAG_MAP_R (1ull << 0)
+#define VHOST_USER_FS_FLAG_MAP_W (1ull << 1)
+
+typedef struct {
+    /* Offsets within the file being mapped */
+    uint64_t fd_offset[VHOST_USER_FS_SLAVE_ENTRIES];
+    /* Offsets within the cache */
+    uint64_t c_offset[VHOST_USER_FS_SLAVE_ENTRIES];
+    /* Lengths of sections */
+    uint64_t len[VHOST_USER_FS_SLAVE_ENTRIES];
+    /* Flags, from VHOST_USER_FS_FLAG_* */
+    uint64_t flags[VHOST_USER_FS_SLAVE_ENTRIES];
+} VhostUserFSSlaveMsg;
+
 typedef struct {
     CharBackend chardev;
     char *tag;
@@ -44,4 +62,10 @@ typedef struct {
     MemoryRegion cache;
 } VHostUserFS;
 
+/* Callbacks from the vhost-user code for slave commands */
+int vhost_user_fs_slave_map(struct vhost_dev *dev, VhostUserFSSlaveMsg *sm,
+                            int fd);
+int vhost_user_fs_slave_unmap(struct vhost_dev *dev, VhostUserFSSlaveMsg *sm);
+int vhost_user_fs_slave_sync(struct vhost_dev *dev, VhostUserFSSlaveMsg *sm);
+
 #endif /* _QEMU_VHOST_USER_FS_H */
-- 
2.19.2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [RFC PATCH 5/7] virtio-fs: Fill in slave commands for mapping
  2018-12-10 17:31 [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines Dr. David Alan Gilbert (git)
                   ` (3 preceding siblings ...)
  2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 4/7] virtio-fs: Add vhost-user slave commands for mapping Dr. David Alan Gilbert (git)
@ 2018-12-10 17:31 ` Dr. David Alan Gilbert (git)
  2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 6/7] virtio-fs: Allow mapping of meta data version table Dr. David Alan Gilbert (git)
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2018-12-10 17:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: vgoyal, miklos, stefanha, sweil, swhiteho

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Fill in definitions for map, unmap and sync commands.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 hw/virtio/vhost-user-fs.c | 129 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 123 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
index da70d9cd2c..bbb15477e5 100644
--- a/hw/virtio/vhost-user-fs.c
+++ b/hw/virtio/vhost-user-fs.c
@@ -24,20 +24,137 @@
 int vhost_user_fs_slave_map(struct vhost_dev *dev, VhostUserFSSlaveMsg *sm,
                             int fd)
 {
-    /* TODO */
-    return -1;
+    VHostUserFS *fs = VHOST_USER_FS(dev->vdev);
+    size_t cache_size = fs->conf.cache_size;
+    void *cache_host = memory_region_get_ram_ptr(&fs->cache);
+
+    unsigned int i;
+    int res = 0;
+
+    if (fd < 0) {
+        fprintf(stderr, "%s: Bad fd for map\n", __func__);
+        return -1;
+    }
+
+    for (i = 0; i < VHOST_USER_FS_SLAVE_ENTRIES; i++) {
+        if (sm->len[i] == 0) {
+            continue;
+        }
+
+        if ((sm->c_offset[i] + sm->len[i]) < sm->len[i] ||
+            (sm->c_offset[i] + sm->len[i]) > cache_size) {
+            fprintf(stderr, "%s: Bad offset/len for map [%d] %"
+                            PRIx64 "+%" PRIx64 "\n", __func__,
+                            i, sm->c_offset[i], sm->len[i]);
+            res = -1;
+            break;
+        }
+
+        if (mmap(cache_host + sm->c_offset[i], sm->len[i],
+                 ((sm->flags[i] & VHOST_USER_FS_FLAG_MAP_R) ? PROT_READ : 0) |
+                 ((sm->flags[i] & VHOST_USER_FS_FLAG_MAP_W) ? PROT_WRITE : 0),
+                 MAP_SHARED | MAP_FIXED,
+                 fd, sm->fd_offset[i]) != (cache_host + sm->c_offset[i])) {
+            fprintf(stderr, "%s: map failed err %d [%d] %"
+                            PRIx64 "+%" PRIx64 " from %" PRIx64 "\n", __func__,
+                            errno, i, sm->c_offset[i], sm->len[i],
+                            sm->fd_offset[i]);
+            res = -1;
+            break;
+        }
+    }
+
+    if (res) {
+        /* Something went wrong, unmap them all */
+        vhost_user_fs_slave_unmap(dev, sm);
+    }
+    return res;
 }
 
 int vhost_user_fs_slave_unmap(struct vhost_dev *dev, VhostUserFSSlaveMsg *sm)
 {
-    /* TODO */
-    return -1;
+    VHostUserFS *fs = VHOST_USER_FS(dev->vdev);
+    size_t cache_size = fs->conf.cache_size;
+    void *cache_host = memory_region_get_ram_ptr(&fs->cache);
+
+    unsigned int i;
+    int res = 0;
+
+    /* Note even if one unmap fails we try the rest, since the effect
+     * is to clean up as much as possible.
+     */
+    for (i = 0; i < VHOST_USER_FS_SLAVE_ENTRIES; i++) {
+        void *ptr;
+        if (sm->len[i] == 0) {
+            continue;
+        }
+
+        if (sm->len[i] == ~(uint64_t)0) {
+            /* Special case meaning the whole arena */
+            sm->len[i] = cache_size;
+        }
+
+        if ((sm->c_offset[i] + sm->len[i]) < sm->len[i] ||
+            (sm->c_offset[i] + sm->len[i]) > cache_size) {
+            fprintf(stderr, "%s: Bad offset/len for unmap [%d] %"
+                            PRIx64 "+%" PRIx64 "\n", __func__,
+                            i, sm->c_offset[i], sm->len[i]);
+            res = -1;
+            continue;
+        }
+
+        ptr = mmap(cache_host + sm->c_offset[i], sm->len[i],
+                PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
+        if (ptr != (cache_host + sm->c_offset[i])) {
+            fprintf(stderr, "%s: mmap failed (%s) [%d] %"
+                            PRIx64 "+%" PRIx64 " from %" PRIx64 " res: %p\n",
+                            __func__,
+                            strerror(errno),
+                            i, sm->c_offset[i], sm->len[i],
+                            sm->fd_offset[i], ptr);
+            res = -1;
+        }
+    }
+
+    return res;
 }
 
 int vhost_user_fs_slave_sync(struct vhost_dev *dev, VhostUserFSSlaveMsg *sm)
 {
-    /* TODO */
-    return -1;
+    VHostUserFS *fs = VHOST_USER_FS(dev->vdev);
+    size_t cache_size = fs->conf.cache_size;
+    void *cache_host = memory_region_get_ram_ptr(&fs->cache);
+
+    unsigned int i;
+    int res = 0;
+
+    /* Note even if one sync fails we try the rest */
+    for (i = 0; i < VHOST_USER_FS_SLAVE_ENTRIES; i++) {
+        if (sm->len[i] == 0) {
+            continue;
+        }
+
+        if ((sm->c_offset[i] + sm->len[i]) < sm->len[i] ||
+            (sm->c_offset[i] + sm->len[i]) > cache_size) {
+            fprintf(stderr, "%s: Bad offset/len for sync [%d] %"
+                            PRIx64 "+%" PRIx64 "\n", __func__,
+                            i, sm->c_offset[i], sm->len[i]);
+            res = -1;
+            continue;
+        }
+
+        if (msync(cache_host + sm->c_offset[i], sm->len[i],
+                  MS_SYNC /* ?? */)) {
+            fprintf(stderr, "%s: msync failed (%s) [%d] %"
+                            PRIx64 "+%" PRIx64 " from %" PRIx64 "\n", __func__,
+                            strerror(errno),
+                            i, sm->c_offset[i], sm->len[i],
+                            sm->fd_offset[i]);
+            res = -1;
+        }
+    }
+
+    return res;
 }
 
 
-- 
2.19.2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [RFC PATCH 6/7] virtio-fs: Allow mapping of meta data version table
  2018-12-10 17:31 [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines Dr. David Alan Gilbert (git)
                   ` (4 preceding siblings ...)
  2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 5/7] virtio-fs: Fill in " Dr. David Alan Gilbert (git)
@ 2018-12-10 17:31 ` Dr. David Alan Gilbert (git)
  2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 7/7] virtio-fs: Allow mapping of journal Dr. David Alan Gilbert (git)
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2018-12-10 17:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: vgoyal, miklos, stefanha, sweil, swhiteho

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

The 'meta data version table' is a block of shared memory mapped between
multiple QEMUs and fuse daemons, so that they can be informed
of metadata updates.  It's typically a shmfs file, and
it's specified as:

   -device vhost-user-fs-pci,chardev=char0,tag=myfs,cache-size=1G,versiontable=/dev/shm/mdvt1

It gets mapped into the PCI bar after the data cache; it's read only.
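
The intended use (the scheme is still settling; see the TODO in the cover
letter) is that the guest compares counters in this table against values it
cached earlier.  Assuming, purely for illustration, that the table is an
array of little-endian 64-bit counters, a guest-side check could look like:

    /* sketch only: the layout of the version table is not final */
    static bool mdvt_entry_is_stale(const volatile uint64_t *table,
                                    size_t slot, uint64_t cached_version)
    {
        return le64_to_cpu(table[slot]) != cached_version;
    }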

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 hw/virtio/vhost-user-fs.c                  | 36 ++++++++++++++++++++++
 hw/virtio/virtio-pci.c                     | 16 ++++++++--
 include/hw/virtio/vhost-user-fs.h          |  4 +++
 include/standard-headers/linux/virtio_fs.h |  1 +
 4 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
index bbb15477e5..a39ecd3a16 100644
--- a/hw/virtio/vhost-user-fs.c
+++ b/hw/virtio/vhost-user-fs.c
@@ -296,6 +296,7 @@ static void vuf_device_realize(DeviceState *dev, Error **errp)
     unsigned int i;
     size_t len;
     int ret;
+    int mdvtfd = -1;
 
     if (!fs->conf.chardev.chr) {
         error_setg(errp, "missing chardev");
@@ -338,6 +339,31 @@ static void vuf_device_realize(DeviceState *dev, Error **errp)
                          "no smaller than the page size");
         return;
     }
+
+    if (fs->conf.mdvtpath) {
+        struct stat statbuf;
+
+        mdvtfd = open(fs->conf.mdvtpath, O_RDWR);
+        if (mdvtfd < 0) {
+            error_setg_errno(errp, errno,
+                             "Failed to open meta-data version table '%s'",
+                             fs->conf.mdvtpath);
+
+            return;
+        }
+        if (fstat(mdvtfd, &statbuf) == -1) {
+            error_setg_errno(errp, errno,
+                             "Failed to stat meta-data version table '%s'",
+                             fs->conf.mdvtpath);
+            close(mdvtfd);
+            return;
+        }
+
+        fs->mdvt_size = statbuf.st_size;
+    }
+    fprintf(stderr, "%s: cachesize=%zd mdvt_size=%zd\n", __func__,
+            fs->conf.cache_size, fs->mdvt_size);
+
     /* We need a region with some host memory, 'ram' is the easiest */
     memory_region_init_ram_nomigrate(&fs->cache, OBJECT(vdev),
                        "virtio-fs-cache",
@@ -348,6 +374,15 @@ static void vuf_device_realize(DeviceState *dev, Error **errp)
     mprotect(memory_region_get_ram_ptr(&fs->cache), fs->conf.cache_size,
              PROT_NONE);
 
+
+    if (mdvtfd != -1) {
+        memory_region_init_ram_from_fd(&fs->mdvt, OBJECT(vdev),
+                       "virtio-fs-mdvt",
+                       fs->mdvt_size, true, mdvtfd, NULL);
+        /* The version table is read-only by the guest */
+        memory_region_set_readonly(&fs->mdvt, true);
+    }
+
     fs->vhost_user = vhost_user_init();
     if (!fs->vhost_user) {
         error_setg(errp, "failed to initialize vhost-user");
@@ -416,6 +451,7 @@ static Property vuf_properties[] = {
     DEFINE_PROP_UINT16("queue-size", VHostUserFS, conf.queue_size, 128),
     DEFINE_PROP_STRING("vhostfd", VHostUserFS, conf.vhostfd),
     DEFINE_PROP_SIZE("cache-size", VHostUserFS, conf.cache_size, 1ull << 30),
+    DEFINE_PROP_STRING("versiontable", VHostUserFS, conf.mdvtpath),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index e819a29fb1..d8785b78bf 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -2680,20 +2680,32 @@ static void vhost_user_fs_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
     VHostUserFSPCI *dev = VHOST_USER_FS_PCI(vpci_dev);
     DeviceState *vdev = DEVICE(&dev->vdev);
     uint64_t cachesize;
+    uint64_t totalsize;
 
     qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
     object_property_set_bool(OBJECT(vdev), true, "realized", errp);
     cachesize = memory_region_size(&dev->vdev.cache);
 
+    /* PCIe bar needs to be a power of 2 */
+    totalsize = pow2ceil(cachesize + dev->vdev.mdvt_size);
+
     /* The bar starts with the data/DAX cache
-     * Others will be added later.
+     * followed by the metadata cache.
      */
     memory_region_init(&dev->cachebar, OBJECT(vpci_dev),
-                       "vhost-fs-pci-cachebar", cachesize);
+                       "vhost-fs-pci-cachebar", totalsize);
     memory_region_add_subregion(&dev->cachebar, 0, &dev->vdev.cache);
     virtio_pci_add_shm_cap(vpci_dev, VIRTIO_FS_PCI_CACHE_BAR, 0, cachesize,
                            VIRTIO_FS_PCI_SHMCAP_ID_CACHE);
 
+    if (dev->vdev.mdvt_size) {
+        memory_region_add_subregion(&dev->cachebar, cachesize,
+                                    &dev->vdev.mdvt);
+        virtio_pci_add_shm_cap(vpci_dev, VIRTIO_FS_PCI_CACHE_BAR,
+                               cachesize, dev->vdev.mdvt_size,
+                               VIRTIO_FS_PCI_SHMCAP_ID_VERTAB);
+    }
+
     /* After 'realized' so the memory region exists */
     pci_register_bar(&vpci_dev->pci_dev, VIRTIO_FS_PCI_CACHE_BAR,
                      PCI_BASE_ADDRESS_SPACE_MEMORY |
diff --git a/include/hw/virtio/vhost-user-fs.h b/include/hw/virtio/vhost-user-fs.h
index 9989bcd9e7..281ae0a52d 100644
--- a/include/hw/virtio/vhost-user-fs.h
+++ b/include/hw/virtio/vhost-user-fs.h
@@ -48,6 +48,7 @@ typedef struct {
     uint16_t queue_size;
     char *vhostfd;
     size_t cache_size;
+    char *mdvtpath;
 } VHostUserFSConf;
 
 typedef struct {
@@ -60,6 +61,9 @@ typedef struct {
 
     /*< public >*/
     MemoryRegion cache;
+    /* Metadata version table */
+    size_t mdvt_size;
+    MemoryRegion mdvt;
 } VHostUserFS;
 
 /* Callbacks from the vhost-user code for slave commands */
diff --git a/include/standard-headers/linux/virtio_fs.h b/include/standard-headers/linux/virtio_fs.h
index b5f137ca79..77fa651073 100644
--- a/include/standard-headers/linux/virtio_fs.h
+++ b/include/standard-headers/linux/virtio_fs.h
@@ -42,5 +42,6 @@ struct virtio_fs_config {
 
 /* For the id field in virtio_pci_shm_cap */
 #define VIRTIO_FS_PCI_SHMCAP_ID_CACHE 0
+#define VIRTIO_FS_PCI_SHMCAP_ID_VERTAB 1
 
 #endif /* _LINUX_VIRTIO_FS_H */
-- 
2.19.2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [RFC PATCH 7/7] virtio-fs: Allow mapping of journal
  2018-12-10 17:31 [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines Dr. David Alan Gilbert (git)
                   ` (5 preceding siblings ...)
  2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 6/7] virtio-fs: Allow mapping of meta data version table Dr. David Alan Gilbert (git)
@ 2018-12-10 17:31 ` Dr. David Alan Gilbert (git)
  2018-12-10 21:12   ` Eric Blake
  2018-12-10 20:26 ` [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines no-reply
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 26+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2018-12-10 17:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: vgoyal, miklos, stefanha, sweil, swhiteho

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

The 'journal' is a shared block of RAM between QEMU and its
fuse daemon.  It's typically a shmfs file and it's specified as:

-device
vhost-user-fs-pci,chardev=char0,tag=myfs,cache-size=1G,versiontable=/dev/shm/mdvt1,journal=/dev/shm/journal1

It gets mapped into the PCI bar after the version table.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 hw/virtio/vhost-user-fs.c                  | 35 ++++++++++++++++++++--
 hw/virtio/virtio-pci.c                     | 14 ++++++++-
 include/hw/virtio/vhost-user-fs.h          |  4 +++
 include/standard-headers/linux/virtio_fs.h |  1 +
 4 files changed, 51 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
index a39ecd3a16..b263f43c60 100644
--- a/hw/virtio/vhost-user-fs.c
+++ b/hw/virtio/vhost-user-fs.c
@@ -297,6 +297,7 @@ static void vuf_device_realize(DeviceState *dev, Error **errp)
     size_t len;
     int ret;
     int mdvtfd = -1;
+    int journalfd = -1;
 
     if (!fs->conf.chardev.chr) {
         error_setg(errp, "missing chardev");
@@ -361,8 +362,31 @@ static void vuf_device_realize(DeviceState *dev, Error **errp)
 
         fs->mdvt_size = statbuf.st_size;
     }
-    fprintf(stderr, "%s: cachesize=%zd mdvt_size=%zd\n", __func__,
-            fs->conf.cache_size, fs->mdvt_size);
+    if (fs->conf.journalpath) {
+        struct stat statbuf;
+
+        journalfd = open(fs->conf.journalpath, O_RDWR);
+        if (journalfd < 0) {
+            error_setg_errno(errp, errno,
+                             "Failed to open journal '%s'",
+                             fs->conf.journalpath);
+
+            close(mdvtfd);
+            return;
+        }
+        if (fstat(journalfd, &statbuf) == -1) {
+            error_setg_errno(errp, errno,
+                             "Failed to stat journal '%s'",
+                             fs->conf.journalpath);
+            close(mdvtfd);
+            close(journalfd);
+            return;
+        }
+
+        fs->journal_size = statbuf.st_size;
+    }
+    fprintf(stderr, "%s: cachesize=%zd mdvt_size=%zd journal_size=%zd\n",
+            __func__, fs->conf.cache_size, fs->mdvt_size, fs->journal_size);
 
     /* We need a region with some host memory, 'ram' is the easiest */
     memory_region_init_ram_nomigrate(&fs->cache, OBJECT(vdev),
@@ -383,6 +407,12 @@ static void vuf_device_realize(DeviceState *dev, Error **errp)
         memory_region_set_readonly(&fs->mdvt, true);
     }
 
+    if (journalfd >= 0) {
+        memory_region_init_ram_from_fd(&fs->journal, OBJECT(vdev),
+                       "virtio-fs-journal",
+                       fs->journal_size, true, journalfd, NULL);
+    }
+
     fs->vhost_user = vhost_user_init();
     if (!fs->vhost_user) {
         error_setg(errp, "failed to initialize vhost-user");
@@ -452,6 +482,7 @@ static Property vuf_properties[] = {
     DEFINE_PROP_STRING("vhostfd", VHostUserFS, conf.vhostfd),
     DEFINE_PROP_SIZE("cache-size", VHostUserFS, conf.cache_size, 1ull << 30),
     DEFINE_PROP_STRING("versiontable", VHostUserFS, conf.mdvtpath),
+    DEFINE_PROP_STRING("journal", VHostUserFS, conf.journalpath),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index d8785b78bf..a46dd5a784 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -2687,10 +2687,12 @@ static void vhost_user_fs_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
     cachesize = memory_region_size(&dev->vdev.cache);
 
     /* PCIe bar needs to be a power of 2 */
-    totalsize = pow2ceil(cachesize + dev->vdev.mdvt_size);
+    totalsize = pow2ceil(cachesize + dev->vdev.mdvt_size +
+                         dev->vdev.journal_size);
 
     /* The bar starts with the data/DAX cache
      * followed by the metadata cache.
+     * followed by the journal
      */
     memory_region_init(&dev->cachebar, OBJECT(vpci_dev),
                        "vhost-fs-pci-cachebar", totalsize);
@@ -2706,6 +2708,16 @@ static void vhost_user_fs_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
                                VIRTIO_FS_PCI_SHMCAP_ID_VERTAB);
     }
 
+    if (dev->vdev.journal_size) {
+        memory_region_add_subregion(&dev->cachebar,
+                                    cachesize + dev->vdev.mdvt_size,
+                                    &dev->vdev.journal);
+        virtio_pci_add_shm_cap(vpci_dev, VIRTIO_FS_PCI_CACHE_BAR,
+                               cachesize + dev->vdev.mdvt_size,
+                               dev->vdev.journal_size,
+                               VIRTIO_FS_PCI_SHMCAP_ID_JOURNAL);
+    }
+
     /* After 'realized' so the memory region exists */
     pci_register_bar(&vpci_dev->pci_dev, VIRTIO_FS_PCI_CACHE_BAR,
                      PCI_BASE_ADDRESS_SPACE_MEMORY |
diff --git a/include/hw/virtio/vhost-user-fs.h b/include/hw/virtio/vhost-user-fs.h
index 281ae0a52d..6d9f74a543 100644
--- a/include/hw/virtio/vhost-user-fs.h
+++ b/include/hw/virtio/vhost-user-fs.h
@@ -49,6 +49,7 @@ typedef struct {
     char *vhostfd;
     size_t cache_size;
     char *mdvtpath;
+    char *journalpath;
 } VHostUserFSConf;
 
 typedef struct {
@@ -64,6 +65,9 @@ typedef struct {
     /* Metadata version table */
     size_t mdvt_size;
     MemoryRegion mdvt;
+    /* Journal */
+    size_t journal_size;
+    MemoryRegion journal;
 } VHostUserFS;
 
 /* Callbacks from the vhost-user code for slave commands */
diff --git a/include/standard-headers/linux/virtio_fs.h b/include/standard-headers/linux/virtio_fs.h
index 77fa651073..0242f2a06e 100644
--- a/include/standard-headers/linux/virtio_fs.h
+++ b/include/standard-headers/linux/virtio_fs.h
@@ -43,5 +43,6 @@ struct virtio_fs_config {
 /* For the id field in virtio_pci_shm_cap */
 #define VIRTIO_FS_PCI_SHMCAP_ID_CACHE 0
 #define VIRTIO_FS_PCI_SHMCAP_ID_VERTAB 1
+#define VIRTIO_FS_PCI_SHMCAP_ID_JOURNAL 2
 
 #endif /* _LINUX_VIRTIO_FS_H */
-- 
2.19.2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3
  2018-12-10 17:31 [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3 Dr. David Alan Gilbert (git)
                   ` (6 preceding siblings ...)
  2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 7/7] virtio-fs: Allow mapping of journal Dr. David Alan Gilbert (git)
@ 2018-12-10 20:26 ` no-reply
  2018-12-11 12:53 ` Stefan Hajnoczi
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: no-reply @ 2018-12-10 20:26 UTC (permalink / raw)
  To: dgilbert; +Cc: famz, qemu-devel, sweil, swhiteho, stefanha, vgoyal, miklos

Patchew URL: https://patchew.org/QEMU/20181210173151.16629-1-dgilbert@redhat.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
time make docker-test-mingw@fedora SHOW_ENV=1 J=8
=== TEST SCRIPT END ===

  CC      ui/gtk.o
  CC      chardev/char.o
  CC      chardev/char-console.o
/tmp/qemu-test/src/hw/virtio/virtio-pci.c:1167:12: error: 'virtio_pci_add_shm_cap' defined but not used [-Werror=unused-function]
 static int virtio_pci_add_shm_cap(VirtIOPCIProxy *proxy,
            ^~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors


The full log is available at
http://patchew.org/logs/20181210173151.16629-1-dgilbert@redhat.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 1/7] virtio: Add shared memory capability
  2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 1/7] virtio: Add shared memory capability Dr. David Alan Gilbert (git)
@ 2018-12-10 21:03   ` Eric Blake
  2018-12-11 10:24     ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Blake @ 2018-12-10 21:03 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git), qemu-devel
  Cc: sweil, swhiteho, stefanha, vgoyal, miklos

On 12/10/18 11:31 AM, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Define a new capability type 'VIRTIO_PCI_CAP_SHARED_MEMORY_CFG'
> and the data structure 'virtio_pci_shm_cap' to go with it.
> They allow defining shared memory regions with sizes and offsets
> of 2^32 and more.
> Multiple instances of the capability are allowed and distinguished
> by a device-specific 'id'.
> 
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>   hw/virtio/virtio-pci.c                      | 20 ++++++++++++++++++++
>   include/standard-headers/linux/virtio_pci.h |  9 +++++++++
>   2 files changed, 29 insertions(+)
> 

> +++ b/include/standard-headers/linux/virtio_pci.h
> @@ -113,6 +113,8 @@
>   #define VIRTIO_PCI_CAP_DEVICE_CFG	4
>   /* PCI configuration access */
>   #define VIRTIO_PCI_CAP_PCI_CFG		5
> +/* Additional shared memory capability */
> +#define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
>   
>   /* This is the PCI capability header: */
>   struct virtio_pci_cap {
> @@ -163,6 +165,13 @@ struct virtio_pci_cfg_cap {
>   	uint8_t pci_cfg_data[4]; /* Data for BAR access. */
>   };
>   
> +struct virtio_pci_shm_cap {
> +	struct virtio_pci_cap cap;
> +	uint32_t offset_hi;             /* Most sig 32 bits of offset */
> +	uint32_t length_hi;             /* Most sig 32 bits of length */
> +        uint8_t  id;                    /* To distinguish shm chunks */

TAB damage.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
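
(Editorial sketch to go with the capability layout quoted above: a
driver that finds one of these capabilities recombines the split 32-bit
halves into full 64-bit values roughly as below.  The field names mirror
struct virtio_pci_shm_cap and the embedded struct virtio_pci_cap;
conversion of the little-endian on-the-wire fields is elided for
brevity.)

#include <stdint.h>

/* Flattened view of the fields a driver reads from the capability:
 * 'offset' and 'length' come from the embedded virtio_pci_cap, the
 * '_hi' halves and 'id' from virtio_pci_shm_cap itself. */
struct shm_region_cap {
    uint32_t offset;     /* low 32 bits of the region offset in the BAR */
    uint32_t length;     /* low 32 bits of the region length            */
    uint32_t offset_hi;  /* most significant 32 bits of the offset      */
    uint32_t length_hi;  /* most significant 32 bits of the length      */
    uint8_t  id;         /* which region: cache, version table, journal */
};

uint64_t shm_cap_offset(const struct shm_region_cap *c)
{
    return ((uint64_t)c->offset_hi << 32) | c->offset;
}

uint64_t shm_cap_length(const struct shm_region_cap *c)
{
    return ((uint64_t)c->length_hi << 32) | c->length;
}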

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 3/7] virtio-fs: Add cache BAR
  2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 3/7] virtio-fs: Add cache BAR Dr. David Alan Gilbert (git)
@ 2018-12-10 21:10   ` Eric Blake
  2018-12-11 10:25     ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Blake @ 2018-12-10 21:10 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git), qemu-devel
  Cc: sweil, swhiteho, stefanha, vgoyal, miklos

On 12/10/18 11:31 AM, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Add a cache BAR into which files will be directly mapped.
> The size cacn be set with the cache-size= property, e.g.

s/cacn/can/

>     -device vhost-user-fs-pci,chardev=char0,tag=myfs,cache-size=16G
> 
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 7/7] virtio-fs: Allow mapping of journal
  2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 7/7] virtio-fs: Allow mapping of journal Dr. David Alan Gilbert (git)
@ 2018-12-10 21:12   ` Eric Blake
  2018-12-11 10:34     ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Blake @ 2018-12-10 21:12 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git), qemu-devel
  Cc: sweil, swhiteho, stefanha, vgoyal, miklos

On 12/10/18 11:31 AM, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> The 'journal' is a shared block of RAM between QEMU and it's

s/it's/its/ (here, you want possessive)

> fuse daemon.  It's typically a shmfs file and it's specified as:

whereas here, both uses of "it's" are correct as contractions for "it 
is" (although I might use just "is" instead of "it's" for that last 
instance).

> 
> -device
> vhost-user-fs-pci,chardev=char0,tag=myfs,cache-size=1G,versiontable=/dev/shm/mdvt1,journal=/dev/shm/journal1
> 
> It gets mapped into the PCI bar after the version table.
> 
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 1/7] virtio: Add shared memory capability
  2018-12-10 21:03   ` Eric Blake
@ 2018-12-11 10:24     ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 26+ messages in thread
From: Dr. David Alan Gilbert @ 2018-12-11 10:24 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, sweil, swhiteho, stefanha, vgoyal, miklos

* Eric Blake (eblake@redhat.com) wrote:
> On 12/10/18 11:31 AM, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > Define a new capability type 'VIRTIO_PCI_CAP_SHARED_MEMORY_CFG'
> > and the data structure 'virtio_pci_shm_cap' to go with it.
> > They allow defining shared memory regions with sizes and offsets
> > of 2^32 and more.
> > Multiple instances of the capability are allowed and distinguished
> > by a device-specific 'id'.
> > 
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> >   hw/virtio/virtio-pci.c                      | 20 ++++++++++++++++++++
> >   include/standard-headers/linux/virtio_pci.h |  9 +++++++++
> >   2 files changed, 29 insertions(+)
> > 
> 
> > +++ b/include/standard-headers/linux/virtio_pci.h
> > @@ -113,6 +113,8 @@
> >   #define VIRTIO_PCI_CAP_DEVICE_CFG	4
> >   /* PCI configuration access */
> >   #define VIRTIO_PCI_CAP_PCI_CFG		5
> > +/* Additional shared memory capability */
> > +#define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
> >   /* This is the PCI capability header: */
> >   struct virtio_pci_cap {
> > @@ -163,6 +165,13 @@ struct virtio_pci_cfg_cap {
> >   	uint8_t pci_cfg_data[4]; /* Data for BAR access. */
> >   };
> > +struct virtio_pci_shm_cap {
> > +	struct virtio_pci_cap cap;
> > +	uint32_t offset_hi;             /* Most sig 32 bits of offset */
> > +	uint32_t length_hi;             /* Most sig 32 bits of length */
> > +        uint8_t  id;                    /* To distinguish shm chunks */
> 
> TAB damage.

Thanks, fixed.

Dave

> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 3/7] virtio-fs: Add cache BAR
  2018-12-10 21:10   ` Eric Blake
@ 2018-12-11 10:25     ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 26+ messages in thread
From: Dr. David Alan Gilbert @ 2018-12-11 10:25 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, sweil, swhiteho, stefanha, vgoyal, miklos

* Eric Blake (eblake@redhat.com) wrote:
> On 12/10/18 11:31 AM, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > Add a cache BAR into which files will be directly mapped.
> > The size cacn be set with the cache-size= property, e.g.
> 
> s/cacn/can/

Thanks, fixed.

> >     -device vhost-user-fs-pci,chardev=char0,tag=myfs,cache-size=16G
> > 
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 7/7] virtio-fs: Allow mapping of journal
  2018-12-10 21:12   ` Eric Blake
@ 2018-12-11 10:34     ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 26+ messages in thread
From: Dr. David Alan Gilbert @ 2018-12-11 10:34 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, sweil, swhiteho, stefanha, vgoyal, miklos

* Eric Blake (eblake@redhat.com) wrote:
> On 12/10/18 11:31 AM, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > The 'journal' is a shared block of RAM between QEMU and it's
> 
> s/it's/its/ (here, you want possessive)

Fixed.

Dave

> > fuse daemon.  It's typically a shmfs file and it's specified as:
> 
> whereas here, both uses of "it's" are correct as contractions for "it is"
> (although I might use just "is" instead of "it's" for that last instance).
> 
> > 
> > -device
> > vhost-user-fs-pci,chardev=char0,tag=myfs,cache-size=1G,versiontable=/dev/shm/mdvt1,journal=/dev/shm/journal1
> > 
> > It gets mapped into the PCI bar after the version table.
> > 
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> 
> 
> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3
  2018-12-10 17:31 [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3 Dr. David Alan Gilbert (git)
                   ` (7 preceding siblings ...)
  2018-12-10 20:26 ` [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3 no-reply
@ 2018-12-11 12:53 ` Stefan Hajnoczi
  2018-12-12 12:30 ` Daniel P. Berrangé
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Stefan Hajnoczi @ 2018-12-11 12:53 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git); +Cc: qemu-devel, vgoyal, miklos, sweil, swhiteho

[-- Attachment #1: Type: text/plain, Size: 1125 bytes --]

On Mon, Dec 10, 2018 at 05:31:44PM +0000, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Hi,
>   This is the first RFC for the QEMU side of 'virtio-fs';
> a new mechanism for mounting host directories into the guest
> in a fast, consistent and secure manner.  Our primary use
> case is kata containers, but it should be usable in other scenarios
> as well.
> 
> There are corresponding patches being posted to Linux kernel,
> libfuse and kata lists.
> 
> For a fuller design description, and benchmark numbers, please see
> Vivek's posting of the kernel set here:
> 
> https://marc.info/?l=linux-kernel&m=154446243024251&w=2
> 
> We've got a small website with instructions on how to use it, here:
> 
> https://virtio-fs.gitlab.io/
> 
> and all the code is available on gitlab at:
> 
> https://gitlab.com/virtio-fs

A draft specification for the virtio-fs device is available here:

https://stefanha.github.io/virtio/virtio-fs.html#x1-38800010 (HTML)

https://github.com/stefanha/virtio/commit/e1cac3777ef03bc9c5c8ee91bcc6ba478272e6b6

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3
  2018-12-10 17:31 [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3 Dr. David Alan Gilbert (git)
                   ` (8 preceding siblings ...)
  2018-12-11 12:53 ` Stefan Hajnoczi
@ 2018-12-12 12:30 ` Daniel P. Berrangé
  2018-12-12 13:52   ` Dr. David Alan Gilbert
  2018-12-22  9:27 ` jiangyiwen
  2019-04-04 13:24 ` Greg Kurz
  11 siblings, 1 reply; 26+ messages in thread
From: Daniel P. Berrangé @ 2018-12-12 12:30 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git)
  Cc: qemu-devel, sweil, swhiteho, stefanha, vgoyal, miklos

On Mon, Dec 10, 2018 at 05:31:44PM +0000, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Hi,
>   This is the first RFC for the QEMU side of 'virtio-fs';
> a new mechanism for mounting host directories into the guest
> in a fast, consistent and secure manner.  Our primary use
> case is kata containers, but it should be usable in other scenarios
> as well.
> 
> There are corresponding patches being posted to Linux kernel,
> libfuse and kata lists.
> 
> For a fuller design description, and benchmark numbers, please see
> Vivek's posting of the kernel set here:
> 
> https://marc.info/?l=linux-kernel&m=154446243024251&w=2
> 
> We've got a small website with instructions on how to use it, here:
> 
> https://virtio-fs.gitlab.io/
> 
> and all the code is available on gitlab at:
> 
> https://gitlab.com/virtio-fs
> 
> QEMU's changes
> --------------
> 
> The QEMU changes are pretty small; 
> 
> There's a new vhost-user device, which is used to carry a stream of
> FUSE messages to an external daemon that actually performs
> all the file IO.  The FUSE daemon is an external process in order to
> achieve better isolation for security and resource control (e.g. number
> of file descriptors) and also because it's cleaner than trying to
> integrate libfuse into QEMU.

Overall I like the virtio-fs architecture more than the virtio-vsock+NFS
approach, as virtio-fs feels simpler and closer to virtio-9p with the
latter's proxy backends.

I never really liked the idea of having to mess around with the host
NFS server to expose filesystems to guests, as that's a systemwide
service.  The ability to have an isolated virtio-fs backend process
per filesystem share per guest is simpler from a mgmt pov.

One thing I would like to see though is a general purpose, production
quality backend impl that is shipped by the QEMU project.  It is fine
if projects like Kata want to write a custom impl tailored to their
specific needs, but I think QEMU should have something as standard that
isn't just demoware. 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3
  2018-12-12 12:30 ` Daniel P. Berrangé
@ 2018-12-12 13:52   ` Dr. David Alan Gilbert
  2018-12-12 13:58     ` Daniel P. Berrangé
  0 siblings, 1 reply; 26+ messages in thread
From: Dr. David Alan Gilbert @ 2018-12-12 13:52 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: qemu-devel, sweil, swhiteho, stefanha, vgoyal, miklos

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Mon, Dec 10, 2018 at 05:31:44PM +0000, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > Hi,
> >   This is the first RFC for the QEMU side of 'virtio-fs';
> > a new mechanism for mounting host directories into the guest
> > in a fast, consistent and secure manner.  Our primary use
> > case is kata containers, but it should be usable in other scenarios
> > as well.
> > 
> > There are corresponding patches being posted to Linux kernel,
> > libfuse and kata lists.
> > 
> > For a fuller design description, and benchmark numbers, please see
> > Vivek's posting of the kernel set here:
> > 
> > https://marc.info/?l=linux-kernel&m=154446243024251&w=2
> > 
> > We've got a small website with instructions on how to use it, here:
> > 
> > https://virtio-fs.gitlab.io/
> > 
> > and all the code is available on gitlab at:
> > 
> > https://gitlab.com/virtio-fs
> > 
> > QEMU's changes
> > --------------
> > 
> > The QEMU changes are pretty small; 
> > 
> > There's a new vhost-user device, which is used to carry a stream of
> > FUSE messages to an external daemon that actually performs
> > all the file IO.  The FUSE daemon is an external process in order to
> > achieve better isolation for security and resource control (e.g. number
> > of file descriptors) and also because it's cleaner than trying to
> > integrate libfuse into QEMU.
> 
> Overall I like the virtio-fs architecture more than the virtio-vsock+NFS
> approach, as virtio-fs feels simpler and closer to virtio-9p with the
> latter's proxy backends.
> 
> I never really liked the idea of having to mess around with the host
> NFS server to exposed filesystems to guests, as that's systemwide
> service.  The ability to have an isolated virtio-fs backend process
> per filesystem share per guest is simpler from a mgmt pov.
> 
> One think I would like to see though is a general purpose, production
> quality backend impl that is shipped by the QEMU project.  It is fine
> if projects like Kata want to write a custom impl tailored to their
> specific needs, but I think QEMU should have something as standard that
> isn't just demoware. 

Our patches sent to libfuse may provide that - after we tidy them up a
bit more; but it is the result of adding the fuse example code to qemu's
contrib vhost-user example code.    Given that this is the intersection
of so many projects I'm not sure I care which project distributes a
working implementation.

Dave

> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3
  2018-12-12 13:52   ` Dr. David Alan Gilbert
@ 2018-12-12 13:58     ` Daniel P. Berrangé
  2018-12-12 14:49       ` Stefan Hajnoczi
  0 siblings, 1 reply; 26+ messages in thread
From: Daniel P. Berrangé @ 2018-12-12 13:58 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, sweil, swhiteho, stefanha, vgoyal, miklos

On Wed, Dec 12, 2018 at 01:52:03PM +0000, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > On Mon, Dec 10, 2018 at 05:31:44PM +0000, Dr. David Alan Gilbert (git) wrote:
> > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > 
> > > Hi,
> > >   This is the first RFC for the QEMU side of 'virtio-fs';
> > > a new mechanism for mounting host directories into the guest
> > > in a fast, consistent and secure manner.  Our primary use
> > > case is kata containers, but it should be usable in other scenarios
> > > as well.
> > > 
> > > There are corresponding patches being posted to Linux kernel,
> > > libfuse and kata lists.
> > > 
> > > For a fuller design description, and benchmark numbers, please see
> > > Vivek's posting of the kernel set here:
> > > 
> > > https://marc.info/?l=linux-kernel&m=154446243024251&w=2
> > > 
> > > We've got a small website with instructions on how to use it, here:
> > > 
> > > https://virtio-fs.gitlab.io/
> > > 
> > > and all the code is available on gitlab at:
> > > 
> > > https://gitlab.com/virtio-fs
> > > 
> > > QEMU's changes
> > > --------------
> > > 
> > > The QEMU changes are pretty small; 
> > > 
> > > There's a new vhost-user device, which is used to carry a stream of
> > > FUSE messages to an external daemon that actually performs
> > > all the file IO.  The FUSE daemon is an external process in order to
> > > achieve better isolation for security and resource control (e.g. number
> > > of file descriptors) and also because it's cleaner than trying to
> > > integrate libfuse into QEMU.
> > 
> > Overall I like the virtio-fs architecture more than the virtio-vsock+NFS
> > approach, as virtio-fs feels simpler and closer to virtio-9p with the
> > latter's proxy backends.
> > 
> > I never really liked the idea of having to mess around with the host
> > NFS server to exposed filesystems to guests, as that's systemwide
> > service.  The ability to have an isolated virtio-fs backend process
> > per filesystem share per guest is simpler from a mgmt pov.
> > 
> > One think I would like to see though is a general purpose, production
> > quality backend impl that is shipped by the QEMU project.  It is fine
> > if projects like Kata want to write a custom impl tailored to their
> > specific needs, but I think QEMU should have something as standard that
> > isn't just demoware. 
> 
> Our patches sent to libfuse may provide that - after we tidy them up a
> bit more; but it is the result of adding the fuse example code to qemu's
> contrib vhost-user example code.    Given that this is the intersection
> of so many projects I'm not sure I care which project distributes a
> working implementation.

Right, but that's my point - the stuff in QEMU's contrib/ directories is
just demoware - not something we actually support as QEMU maintainers,
nor expect users to run in production. Likewise for stuff in libfuse
example/ directory AFAIK.

IMHO we need something whose support status is on a par with what you'd
get if we had the impl in-process for the main QEMU system emulator.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3
  2018-12-12 13:58     ` Daniel P. Berrangé
@ 2018-12-12 14:49       ` Stefan Hajnoczi
  0 siblings, 0 replies; 26+ messages in thread
From: Stefan Hajnoczi @ 2018-12-12 14:49 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Dr. David Alan Gilbert, qemu-devel, sweil, swhiteho, vgoyal, miklos

[-- Attachment #1: Type: text/plain, Size: 753 bytes --]

On Wed, Dec 12, 2018 at 01:58:25PM +0000, Daniel P. Berrangé wrote:
> On Wed, Dec 12, 2018 at 01:52:03PM +0000, Dr. David Alan Gilbert wrote:
> IMHO we need something whose support status is on a par with what you'd
> get if we had the impl in-process for the main QEMU system emulator.

I agree.  Now that virtio-fs has been released we're working on todo
items that will make the libfuse code production-quality, including
security auditing and jailing of the process.

Once we're confident that this is a production-quality file server it's
a matter of moving it out of example/ or contrib/.  We might find that
the scope of a production-quality file server exceeds libfuse's example/
anyway and need to move it to a new home.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3
  2018-12-10 17:31 [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3 Dr. David Alan Gilbert (git)
                   ` (9 preceding siblings ...)
  2018-12-12 12:30 ` Daniel P. Berrangé
@ 2018-12-22  9:27 ` jiangyiwen
  2018-12-26 19:08   ` Vivek Goyal
  2019-04-04 13:24 ` Greg Kurz
  11 siblings, 1 reply; 26+ messages in thread
From: jiangyiwen @ 2018-12-22  9:27 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git), qemu-devel
  Cc: sweil, swhiteho, stefanha, vgoyal, miklos

On 2018/12/11 1:31, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Hi,
>   This is the first RFC for the QEMU side of 'virtio-fs';
> a new mechanism for mounting host directories into the guest
> in a fast, consistent and secure manner.  Our primary use
> case is kata containers, but it should be usable in other scenarios
> as well.
> 
> There are corresponding patches being posted to Linux kernel,
> libfuse and kata lists.
> 
> For a fuller design description, and benchmark numbers, please see
> Vivek's posting of the kernel set here:
> 
> https://marc.info/?l=linux-kernel&m=154446243024251&w=2
> 
> We've got a small website with instructions on how to use it, here:
> 
> https://virtio-fs.gitlab.io/
> 
> and all the code is available on gitlab at:
> 
> https://gitlab.com/virtio-fs
> 
> QEMU's changes
> --------------
> 
> The QEMU changes are pretty small; 
> 
> There's a new vhost-user device, which is used to carry a stream of
> FUSE messages to an external daemon that actually performs
> all the file IO.  The FUSE daemon is an external process in order to
> achieve better isolation for security and resource control (e.g. number
> of file descriptors) and also because it's cleaner than trying to
> integrate libfuse into QEMU.
> 
> This device has an extra BAR that contains (up to) 3 regions:
> 
>  a) a DAX mapping range ('the cache') - into which QEMU mmap's
>     files on behalf of the external daemon; those files are
>     then directly mapped by the guest in a way similar to a DAX
>     backed file system;  one advantage of this is that multiple
>     guests all accessing the same files should all be sharing
>     those pages of host cache.
> 
>  b) An experimental set of mappings for use by a metadata versioning
>     daemon;  this mapping is shared between multiple guests and
>     the daemon, but only contains a set of version counters that
>     allow a guest to quickly tell if its metadata is stale.
> 
> TODO
> ----
> 
> This is the first RFC, we know we have a bunch of things to clear up:
> 
>   a) The virtio device specificiation is still in flux and is expected
>      to change
> 
>   b) We'd like to find ways of reducing the map/unmap latency for DAX
> 
>   c) The metadata versioning scheme needs to settle out.
> 
>   d) mmap'ing host files has some interesting side effects; for example
>      if the file gets truncated by the host and then the guest accesses
>      the mapping, KVM can fail the guest hard.
> 
> Dr. David Alan Gilbert (6):
>   virtio: Add shared memory capability
>   virtio-fs: Add cache BAR
>   virtio-fs: Add vhost-user slave commands for mapping
>   virtio-fs: Fill in slave commands for mapping
>   virtio-fs: Allow mapping of meta data version table
>   virtio-fs: Allow mapping of journal
> 
> Stefan Hajnoczi (1):
>   virtio: add vhost-user-fs-pci device
> 
>  configure                                   |  10 +
>  contrib/libvhost-user/libvhost-user.h       |   3 +
>  docs/interop/vhost-user.txt                 |  35 ++
>  hw/virtio/Makefile.objs                     |   1 +
>  hw/virtio/vhost-user-fs.c                   | 517 ++++++++++++++++++++
>  hw/virtio/vhost-user.c                      |  16 +
>  hw/virtio/virtio-pci.c                      | 115 +++++
>  hw/virtio/virtio-pci.h                      |  19 +
>  include/hw/pci/pci.h                        |   1 +
>  include/hw/virtio/vhost-user-fs.h           |  79 +++
>  include/standard-headers/linux/virtio_fs.h  |  48 ++
>  include/standard-headers/linux/virtio_ids.h |   1 +
>  include/standard-headers/linux/virtio_pci.h |   9 +
>  13 files changed, 854 insertions(+)
>  create mode 100644 hw/virtio/vhost-user-fs.c
>  create mode 100644 include/hw/virtio/vhost-user-fs.h
>  create mode 100644 include/standard-headers/linux/virtio_fs.h
> 

Hi Dave,

I encountered a problem after running QEMU with virtio-fs:

I find I can only mount virtio-fs using the following commands:
mount -t virtio_fs /dev/null /mnt/virtio_fs/ -o tag=myfs,rootmode=040000,user_id=0,group_id=0
or mount -t virtio_fs /dev/null /mnt/virtio_fs/ -o tag=myfs,rootmode=040000,user_id=0,group_id=0,dax

Then I want to know: how can I use "cache=always" or "cache=none", or even "cache=auto" and "cache=writeback"?

Thanks,
Yiwen.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3
  2018-12-22  9:27 ` jiangyiwen
@ 2018-12-26 19:08   ` Vivek Goyal
  2019-01-08  6:08     ` jiangyiwen
  0 siblings, 1 reply; 26+ messages in thread
From: Vivek Goyal @ 2018-12-26 19:08 UTC (permalink / raw)
  To: jiangyiwen
  Cc: Dr. David Alan Gilbert (git),
	qemu-devel, sweil, swhiteho, stefanha, miklos

On Sat, Dec 22, 2018 at 05:27:28PM +0800, jiangyiwen wrote:
> On 2018/12/11 1:31, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > Hi,
> >   This is the first RFC for the QEMU side of 'virtio-fs';
> > a new mechanism for mounting host directories into the guest
> > in a fast, consistent and secure manner.  Our primary use
> > case is kata containers, but it should be usable in other scenarios
> > as well.
> > 
> > There are corresponding patches being posted to Linux kernel,
> > libfuse and kata lists.
> > 
> > For a fuller design description, and benchmark numbers, please see
> > Vivek's posting of the kernel set here:
> > 
> > https://marc.info/?l=linux-kernel&m=154446243024251&w=2
> > 
> > We've got a small website with instructions on how to use it, here:
> > 
> > https://virtio-fs.gitlab.io/
> > 
> > and all the code is available on gitlab at:
> > 
> > https://gitlab.com/virtio-fs
> > 
> > QEMU's changes
> > --------------
> > 
> > The QEMU changes are pretty small; 
> > 
> > There's a new vhost-user device, which is used to carry a stream of
> > FUSE messages to an external daemon that actually performs
> > all the file IO.  The FUSE daemon is an external process in order to
> > achieve better isolation for security and resource control (e.g. number
> > of file descriptors) and also because it's cleaner than trying to
> > integrate libfuse into QEMU.
> > 
> > This device has an extra BAR that contains (up to) 3 regions:
> > 
> >  a) a DAX mapping range ('the cache') - into which QEMU mmap's
> >     files on behalf of the external daemon; those files are
> >     then directly mapped by the guest in a way similar to a DAX
> >     backed file system;  one advantage of this is that multiple
> >     guests all accessing the same files should all be sharing
> >     those pages of host cache.
> > 
> >  b) An experimental set of mappings for use by a metadata versioning
> >     daemon;  this mapping is shared between multiple guests and
> >     the daemon, but only contains a set of version counters that
> >     allow a guest to quickly tell if its metadata is stale.
> > 
> > TODO
> > ----
> > 
> > This is the first RFC, we know we have a bunch of things to clear up:
> > 
> >   a) The virtio device specificiation is still in flux and is expected
> >      to change
> > 
> >   b) We'd like to find ways of reducing the map/unmap latency for DAX
> > 
> >   c) The metadata versioning scheme needs to settle out.
> > 
> >   d) mmap'ing host files has some interesting side effects; for example
> >      if the file gets truncated by the host and then the guest accesses
> >      the mapping, KVM can fail the guest hard.
> > 
> > Dr. David Alan Gilbert (6):
> >   virtio: Add shared memory capability
> >   virtio-fs: Add cache BAR
> >   virtio-fs: Add vhost-user slave commands for mapping
> >   virtio-fs: Fill in slave commands for mapping
> >   virtio-fs: Allow mapping of meta data version table
> >   virtio-fs: Allow mapping of journal
> > 
> > Stefan Hajnoczi (1):
> >   virtio: add vhost-user-fs-pci device
> > 
> >  configure                                   |  10 +
> >  contrib/libvhost-user/libvhost-user.h       |   3 +
> >  docs/interop/vhost-user.txt                 |  35 ++
> >  hw/virtio/Makefile.objs                     |   1 +
> >  hw/virtio/vhost-user-fs.c                   | 517 ++++++++++++++++++++
> >  hw/virtio/vhost-user.c                      |  16 +
> >  hw/virtio/virtio-pci.c                      | 115 +++++
> >  hw/virtio/virtio-pci.h                      |  19 +
> >  include/hw/pci/pci.h                        |   1 +
> >  include/hw/virtio/vhost-user-fs.h           |  79 +++
> >  include/standard-headers/linux/virtio_fs.h  |  48 ++
> >  include/standard-headers/linux/virtio_ids.h |   1 +
> >  include/standard-headers/linux/virtio_pci.h |   9 +
> >  13 files changed, 854 insertions(+)
> >  create mode 100644 hw/virtio/vhost-user-fs.c
> >  create mode 100644 include/hw/virtio/vhost-user-fs.h
> >  create mode 100644 include/standard-headers/linux/virtio_fs.h
> > 
> 
> Hi Dave,
> 
> I encounter a problem after running qemu with virtio-fs,
> 
> I find I only can mount virtio-fs using the following command:
> mount -t virtio_fs /dev/null /mnt/virtio_fs/ -o tag=myfs,rootmode=040000,user_id=0,group_id=0
> or mount -t virtio_fs /dev/null /mnt/virtio_fs/ -o tag=myfs,rootmode=040000,user_id=0,group_id=0,dax
> 
> Then, I want to know how to use "cache=always" or "cache=none", even "cache=auto", "cache=writeback"?
> 
> Thanks,
> Yiwen.

Hi Yiwen,

As of now, the cache options are libfuse daemon options. So when starting
the daemon, specify "-o cache=none" or "-o cache=always" etc. One cannot
specify a caching option at virtio-fs mount time.
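
(Editorial illustration of the split; the daemon invocation below is
schematic - the binary name and the elided options are placeholders, and
only the cache= values are taken from this thread:)

  # host: the caching mode is chosen when the FUSE daemon is started
  <virtio-fs fuse daemon> ... -o cache=none     # or -o cache=always

  # guest: no cache= option here, just the virtio-fs mount itself
  mount -t virtio_fs /dev/null /mnt/virtio_fs/ -o tag=myfs,rootmode=040000,user_id=0,group_id=0,dax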

Thanks
Vivek

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3
  2018-12-26 19:08   ` Vivek Goyal
@ 2019-01-08  6:08     ` jiangyiwen
  0 siblings, 0 replies; 26+ messages in thread
From: jiangyiwen @ 2019-01-08  6:08 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Dr. David Alan Gilbert (git),
	qemu-devel, sweil, swhiteho, stefanha, miklos

On 2018/12/27 3:08, Vivek Goyal wrote:
> On Sat, Dec 22, 2018 at 05:27:28PM +0800, jiangyiwen wrote:
>> On 2018/12/11 1:31, Dr. David Alan Gilbert (git) wrote:
>>> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>>>
>>> Hi,
>>>   This is the first RFC for the QEMU side of 'virtio-fs';
>>> a new mechanism for mounting host directories into the guest
>>> in a fast, consistent and secure manner.  Our primary use
>>> case is kata containers, but it should be usable in other scenarios
>>> as well.
>>>
>>> There are corresponding patches being posted to Linux kernel,
>>> libfuse and kata lists.
>>>
>>> For a fuller design description, and benchmark numbers, please see
>>> Vivek's posting of the kernel set here:
>>>
>>> https://marc.info/?l=linux-kernel&m=154446243024251&w=2
>>>
>>> We've got a small website with instructions on how to use it, here:
>>>
>>> https://virtio-fs.gitlab.io/
>>>
>>> and all the code is available on gitlab at:
>>>
>>> https://gitlab.com/virtio-fs
>>>
>>> QEMU's changes
>>> --------------
>>>
>>> The QEMU changes are pretty small; 
>>>
>>> There's a new vhost-user device, which is used to carry a stream of
>>> FUSE messages to an external daemon that actually performs
>>> all the file IO.  The FUSE daemon is an external process in order to
>>> achieve better isolation for security and resource control (e.g. number
>>> of file descriptors) and also because it's cleaner than trying to
>>> integrate libfuse into QEMU.
>>>
>>> This device has an extra BAR that contains (up to) 3 regions:
>>>
>>>  a) a DAX mapping range ('the cache') - into which QEMU mmap's
>>>     files on behalf of the external daemon; those files are
>>>     then directly mapped by the guest in a way similar to a DAX
>>>     backed file system;  one advantage of this is that multiple
>>>     guests all accessing the same files should all be sharing
>>>     those pages of host cache.
>>>
>>>  b) An experimental set of mappings for use by a metadata versioning
>>>     daemon;  this mapping is shared between multiple guests and
>>>     the daemon, but only contains a set of version counters that
>>>     allow a guest to quickly tell if its metadata is stale.
>>>
>>> TODO
>>> ----
>>>
>>> This is the first RFC, we know we have a bunch of things to clear up:
>>>
>>>   a) The virtio device specificiation is still in flux and is expected
>>>      to change
>>>
>>>   b) We'd like to find ways of reducing the map/unmap latency for DAX
>>>
>>>   c) The metadata versioning scheme needs to settle out.
>>>
>>>   d) mmap'ing host files has some interesting side effects; for example
>>>      if the file gets truncated by the host and then the guest accesses
>>>      the mapping, KVM can fail the guest hard.
>>>
>>> Dr. David Alan Gilbert (6):
>>>   virtio: Add shared memory capability
>>>   virtio-fs: Add cache BAR
>>>   virtio-fs: Add vhost-user slave commands for mapping
>>>   virtio-fs: Fill in slave commands for mapping
>>>   virtio-fs: Allow mapping of meta data version table
>>>   virtio-fs: Allow mapping of journal
>>>
>>> Stefan Hajnoczi (1):
>>>   virtio: add vhost-user-fs-pci device
>>>
>>>  configure                                   |  10 +
>>>  contrib/libvhost-user/libvhost-user.h       |   3 +
>>>  docs/interop/vhost-user.txt                 |  35 ++
>>>  hw/virtio/Makefile.objs                     |   1 +
>>>  hw/virtio/vhost-user-fs.c                   | 517 ++++++++++++++++++++
>>>  hw/virtio/vhost-user.c                      |  16 +
>>>  hw/virtio/virtio-pci.c                      | 115 +++++
>>>  hw/virtio/virtio-pci.h                      |  19 +
>>>  include/hw/pci/pci.h                        |   1 +
>>>  include/hw/virtio/vhost-user-fs.h           |  79 +++
>>>  include/standard-headers/linux/virtio_fs.h  |  48 ++
>>>  include/standard-headers/linux/virtio_ids.h |   1 +
>>>  include/standard-headers/linux/virtio_pci.h |   9 +
>>>  13 files changed, 854 insertions(+)
>>>  create mode 100644 hw/virtio/vhost-user-fs.c
>>>  create mode 100644 include/hw/virtio/vhost-user-fs.h
>>>  create mode 100644 include/standard-headers/linux/virtio_fs.h
>>>
>>
>> Hi Dave,
>>
>> I encounter a problem after running qemu with virtio-fs,
>>
>> I find I only can mount virtio-fs using the following command:
>> mount -t virtio_fs /dev/null /mnt/virtio_fs/ -o tag=myfs,rootmode=040000,user_id=0,group_id=0
>> or mount -t virtio_fs /dev/null /mnt/virtio_fs/ -o tag=myfs,rootmode=040000,user_id=0,group_id=0,dax
>>
>> Then, I want to know how to use "cache=always" or "cache=none", even "cache=auto", "cache=writeback"?
>>
>> Thanks,
>> Yiwen.
> 
> Hi Yiwen,
> 
> As of now, cache options are libfuse daemon options. So while starting
> daemon, specify "-o cache=none" or "-o cache=always" etc. One can not
> specify caching option at virtio-fs mount time.
> 
> Thanks
> Vivek
> 
> .
> 

Ok, I get it, thanks.

Yiwen.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3
  2018-12-10 17:31 [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3 Dr. David Alan Gilbert (git)
                   ` (10 preceding siblings ...)
  2018-12-22  9:27 ` jiangyiwen
@ 2019-04-04 13:24 ` Greg Kurz
  2019-04-05  8:59     ` Dr. David Alan Gilbert
  11 siblings, 1 reply; 26+ messages in thread
From: Greg Kurz @ 2019-04-04 13:24 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git)
  Cc: qemu-devel, sweil, swhiteho, stefanha, vgoyal, miklos

On Mon, 10 Dec 2018 17:31:44 +0000
"Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:

> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Hi,
>   This is the first RFC for the QEMU side of 'virtio-fs';
> a new mechanism for mounting host directories into the guest
> in a fast, consistent and secure manner.  Our primary use
> case is kata containers, but it should be usable in other scenarios
> as well.
> 
> There are corresponding patches being posted to Linux kernel,
> libfuse and kata lists.
> 
> For a fuller design description, and benchmark numbers, please see
> Vivek's posting of the kernel set here:
> 
> https://marc.info/?l=linux-kernel&m=154446243024251&w=2
> 
> We've got a small website with instructions on how to use it, here:
> 
> https://virtio-fs.gitlab.io/
> 
> and all the code is available on gitlab at:
> 
> https://gitlab.com/virtio-fs
> 

Hi !

This looks like a very promising replacement for virtio-9p, at
least with better chances of reaching a production quality level.

Not sure I'll have enough time to step in, but please Cc me on
future posts. As virtio-9p maintainer, I'll be happy to help if
I can. Also I'll be happy to get rid of the fsdev proxy backend
at some point (which I already wanted to replace with a vhost
user based solution :-) ).

Cheers,

--
Greg

> QEMU's changes
> --------------
> 
> The QEMU changes are pretty small; 
> 
> There's a new vhost-user device, which is used to carry a stream of
> FUSE messages to an external daemon that actually performs
> all the file IO.  The FUSE daemon is an external process in order to
> achieve better isolation for security and resource control (e.g. number
> of file descriptors) and also because it's cleaner than trying to
> integrate libfuse into QEMU.
> 
> This device has an extra BAR that contains (up to) 3 regions:
> 
>  a) a DAX mapping range ('the cache') - into which QEMU mmap's
>     files on behalf of the external daemon; those files are
>     then directly mapped by the guest in a way similar to a DAX
>     backed file system;  one advantage of this is that multiple
>     guests all accessing the same files should all be sharing
>     those pages of host cache.
> 
>  b) An experimental set of mappings for use by a metadata versioning
>     daemon;  this mapping is shared between multiple guests and
>     the daemon, but only contains a set of version counters that
>     allow a guest to quickly tell if its metadata is stale.
> 
> TODO
> ----
> 
> This is the first RFC, we know we have a bunch of things to clear up:
> 
>   a) The virtio device specificiation is still in flux and is expected
>      to change
> 
>   b) We'd like to find ways of reducing the map/unmap latency for DAX
> 
>   c) The metadata versioning scheme needs to settle out.
> 
>   d) mmap'ing host files has some interesting side effects; for example
>      if the file gets truncated by the host and then the guest accesses
>      the mapping, KVM can fail the guest hard.
> 
> Dr. David Alan Gilbert (6):
>   virtio: Add shared memory capability
>   virtio-fs: Add cache BAR
>   virtio-fs: Add vhost-user slave commands for mapping
>   virtio-fs: Fill in slave commands for mapping
>   virtio-fs: Allow mapping of meta data version table
>   virtio-fs: Allow mapping of journal
> 
> Stefan Hajnoczi (1):
>   virtio: add vhost-user-fs-pci device
> 
>  configure                                   |  10 +
>  contrib/libvhost-user/libvhost-user.h       |   3 +
>  docs/interop/vhost-user.txt                 |  35 ++
>  hw/virtio/Makefile.objs                     |   1 +
>  hw/virtio/vhost-user-fs.c                   | 517 ++++++++++++++++++++
>  hw/virtio/vhost-user.c                      |  16 +
>  hw/virtio/virtio-pci.c                      | 115 +++++
>  hw/virtio/virtio-pci.h                      |  19 +
>  include/hw/pci/pci.h                        |   1 +
>  include/hw/virtio/vhost-user-fs.h           |  79 +++
>  include/standard-headers/linux/virtio_fs.h  |  48 ++
>  include/standard-headers/linux/virtio_ids.h |   1 +
>  include/standard-headers/linux/virtio_pci.h |   9 +
>  13 files changed, 854 insertions(+)
>  create mode 100644 hw/virtio/vhost-user-fs.c
>  create mode 100644 include/hw/virtio/vhost-user-fs.h
>  create mode 100644 include/standard-headers/linux/virtio_fs.h
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3
@ 2019-04-05  8:59     ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 26+ messages in thread
From: Dr. David Alan Gilbert @ 2019-04-05  8:59 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-devel, sweil, swhiteho, stefanha, vgoyal, miklos

* Greg Kurz (groug@kaod.org) wrote:
> On Mon, 10 Dec 2018 17:31:44 +0000
> "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> 
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > Hi,
> >   This is the first RFC for the QEMU side of 'virtio-fs';
> > a new mechanism for mounting host directories into the guest
> > in a fast, consistent and secure manner.  Our primary use
> > case is kata containers, but it should be usable in other scenarios
> > as well.
> > 
> > There are corresponding patches being posted to Linux kernel,
> > libfuse and kata lists.
> > 
> > For a fuller design description, and benchmark numbers, please see
> > Vivek's posting of the kernel set here:
> > 
> > https://marc.info/?l=linux-kernel&m=154446243024251&w=2
> > 
> > We've got a small website with instructions on how to use it, here:
> > 
> > https://virtio-fs.gitlab.io/
> > 
> > and all the code is available on gitlab at:
> > 
> > https://gitlab.com/virtio-fs
> > 
> 
> Hi !
> 
> This looks like a very promising replacement for virtio-9p, at
> least with better chances of reaching a production quality level.
> 
> Not sure I'll have enough time to step in, but please Cc me on
> future posts. As virtio-9p maintainer, I'll be happy to help if
> I can. Also I'll be happy to get rid of the fsdev proxy backend
> at some point (which I already wanted to replace with a vhost
> user based solution :-) ).

Thanks! We'll try and remember to keep you in the loop.
If there are any gotchas that you tripped over in 9p that we should
watch out for then please give us a prod.

Dave

> Cheers,
> 
> --
> Greg
> 
> > QEMU's changes
> > --------------
> > 
> > The QEMU changes are pretty small; 
> > 
> > There's a new vhost-user device, which is used to carry a stream of
> > FUSE messages to an external daemon that actually performs
> > all the file IO.  The FUSE daemon is an external process in order to
> > achieve better isolation for security and resource control (e.g. number
> > of file descriptors) and also because it's cleaner than trying to
> > integrate libfuse into QEMU.
> > 
> > This device has an extra BAR that contains (up to) 3 regions:
> > 
> >  a) a DAX mapping range ('the cache') - into which QEMU mmap's
> >     files on behalf of the external daemon; those files are
> >     then directly mapped by the guest in a way similar to a DAX
> >     backed file system;  one advantage of this is that multiple
> >     guests all accessing the same files should all be sharing
> >     those pages of host cache.
> > 
> >  b) An experimental set of mappings for use by a metadata versioning
> >     daemon;  this mapping is shared between multiple guests and
> >     the daemon, but only contains a set of version counters that
> >     allow a guest to quickly tell if its metadata is stale.
> > 
> > TODO
> > ----
> > 
> > This is the first RFC, we know we have a bunch of things to clear up:
> > 
> >   a) The virtio device specificiation is still in flux and is expected
> >      to change
> > 
> >   b) We'd like to find ways of reducing the map/unmap latency for DAX
> > 
> >   c) The metadata versioning scheme needs to settle out.
> > 
> >   d) mmap'ing host files has some interesting side effects; for example
> >      if the file gets truncated by the host and then the guest accesses
> >      the mapping, KVM can fail the guest hard.
> > 
> > Dr. David Alan Gilbert (6):
> >   virtio: Add shared memory capability
> >   virtio-fs: Add cache BAR
> >   virtio-fs: Add vhost-user slave commands for mapping
> >   virtio-fs: Fill in slave commands for mapping
> >   virtio-fs: Allow mapping of meta data version table
> >   virtio-fs: Allow mapping of journal
> > 
> > Stefan Hajnoczi (1):
> >   virtio: add vhost-user-fs-pci device
> > 
> >  configure                                   |  10 +
> >  contrib/libvhost-user/libvhost-user.h       |   3 +
> >  docs/interop/vhost-user.txt                 |  35 ++
> >  hw/virtio/Makefile.objs                     |   1 +
> >  hw/virtio/vhost-user-fs.c                   | 517 ++++++++++++++++++++
> >  hw/virtio/vhost-user.c                      |  16 +
> >  hw/virtio/virtio-pci.c                      | 115 +++++
> >  hw/virtio/virtio-pci.h                      |  19 +
> >  include/hw/pci/pci.h                        |   1 +
> >  include/hw/virtio/vhost-user-fs.h           |  79 +++
> >  include/standard-headers/linux/virtio_fs.h  |  48 ++
> >  include/standard-headers/linux/virtio_ids.h |   1 +
> >  include/standard-headers/linux/virtio_pci.h |   9 +
> >  13 files changed, 854 insertions(+)
> >  create mode 100644 hw/virtio/vhost-user-fs.c
> >  create mode 100644 include/hw/virtio/vhost-user-fs.h
> >  create mode 100644 include/standard-headers/linux/virtio_fs.h
> > 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3
@ 2019-04-05  8:59     ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 26+ messages in thread
From: Dr. David Alan Gilbert @ 2019-04-05  8:59 UTC (permalink / raw)
  To: Greg Kurz; +Cc: sweil, miklos, qemu-devel, stefanha, swhiteho, vgoyal

* Greg Kurz (groug@kaod.org) wrote:
> On Mon, 10 Dec 2018 17:31:44 +0000
> "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> 
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > Hi,
> >   This is the first RFC for the QEMU side of 'virtio-fs';
> > a new mechanism for mounting host directories into the guest
> > in a fast, consistent and secure manner.  Our primary use
> > case is kata containers, but it should be usable in other scenarios
> > as well.
> > 
> > There are corresponding patches being posted to Linux kernel,
> > libfuse and kata lists.
> > 
> > For a fuller design description, and benchmark numbers, please see
> > Vivek's posting of the kernel set here:
> > 
> > https://marc.info/?l=linux-kernel&m=154446243024251&w=2
> > 
> > We've got a small website with instructions on how to use it, here:
> > 
> > https://virtio-fs.gitlab.io/
> > 
> > and all the code is available on gitlab at:
> > 
> > https://gitlab.com/virtio-fs
> > 
> 
> Hi !
> 
> This looks like a very promising replacement for virtio-9p, at
> least with better chances of reaching a production quality level.
> 
> Not sure I'll have enough time to step in, but please Cc me on
> future posts. As virtio-9p maintainer, I'll be happy to help if
> I can. Also I'll be happy to get rid of the fsdev proxy backend
> at some point (which I already wanted to replace with a vhost
> user based solution :-) ).

Thanks! We'll try and remember to keep you in the loop.
If there are any gotchas that you tripped over in 9p that we should
watch out for then please give us a prod.

Dave


Dave

> Cheers,
> 
> --
> Greg
> 
> > QEMU's changes
> > --------------
> > 
> > The QEMU changes are pretty small.
> > 
> > There's a new vhost-user device, which is used to carry a stream of
> > FUSE messages to an external daemon that actually performs
> > all the file IO.  The FUSE daemon is an external process in order to
> > achieve better isolation for security and resource control (e.g. number
> > of file descriptors) and also because it's cleaner than trying to
> > integrate libfuse into QEMU.
> > 
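To make that concrete, here's a rough usage sketch (the socket path, tag,
directory and sizes are made-up examples, the exact daemon and QEMU option
names may differ from what's in this RFC, and early guest kernels may need
extra mount options):

  # host: run the FUSE daemon, serving /srv/shared over a vhost-user socket
  virtiofsd -o vhost_user_socket=/tmp/vhostqemu -o source=/srv/shared

  # host: attach the device; vhost-user needs shareable guest RAM
  qemu-system-x86_64 ... \
      -object memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on \
      -numa node,memdev=mem \
      -chardev socket,id=char0,path=/tmp/vhostqemu \
      -device vhost-user-fs-pci,chardev=char0,tag=myfs,cache-size=2G

  # guest: mount the shared directory by its tag
  mount -t virtiofs myfs /mnt
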
> > This device has an extra BAR that contains (up to) 3 regions:
> > 
> >  a) a DAX mapping range ('the cache') - into which QEMU mmaps
> >     files on behalf of the external daemon; those files are
> >     then directly mapped by the guest in a way similar to a
> >     DAX-backed file system.  One advantage of this is that
> >     multiple guests all accessing the same files should share
> >     those pages of host cache (a rough sketch of the mapping
> >     step follows below).
> > 
> >  b) An experimental set of mappings for use by a metadata versioning
> >     daemon; this mapping is shared between multiple guests and
> >     the daemon, but only contains a set of version counters that
> >     allow a guest to quickly tell if its metadata is stale.
> > 
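To make (a) a bit more concrete, here's a minimal sketch of the mapping
step, assuming the daemon has already passed a file descriptor to QEMU
over the vhost-user socket and QEMU has the cache window mapped in its
own address space at cache_host; all the names here are illustrative,
not the RFC code:

  #include <errno.h>
  #include <stdint.h>
  #include <sys/mman.h>

  /* Map [fd_offset, fd_offset + len) of the daemon's file into the
   * cache window at c_offset, so the guest sees those host page-cache
   * pages directly through the BAR. */
  static int map_into_cache(void *cache_host, int fd,
                            uint64_t c_offset, uint64_t fd_offset,
                            uint64_t len)
  {
      void *p = mmap((uint8_t *)cache_host + c_offset, len,
                     PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_FIXED, fd, fd_offset);
      if (p == MAP_FAILED) {
          return -errno;   /* reported back to the daemon as an error */
      }
      return 0;
  }

Unmapping would then need to punch the range out again (for instance by
replacing it with an anonymous PROT_NONE mapping) so stale file pages
aren't left visible to the guest.
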
> > TODO
> > ----
> > 
> > This is the first RFC; we know we have a bunch of things to clear up:
> > 
> >   a) The virtio device specification is still in flux and is expected
> >      to change
> > 
> >   b) We'd like to find ways of reducing the map/unmap latency for DAX
> > 
> >   c) The metadata versioning scheme needs to settle out.
> > 
> >   d) mmap'ing host files has some interesting side effects; for example
> >      if the file gets truncated by the host and then the guest accesses
> >      the mapping, KVM can fail the guest hard.
> > 
> > Dr. David Alan Gilbert (6):
> >   virtio: Add shared memory capability
> >   virtio-fs: Add cache BAR
> >   virtio-fs: Add vhost-user slave commands for mapping
> >   virtio-fs: Fill in slave commands for mapping
> >   virtio-fs: Allow mapping of meta data version table
> >   virtio-fs: Allow mapping of journal
> > 
> > Stefan Hajnoczi (1):
> >   virtio: add vhost-user-fs-pci device
> > 
> >  configure                                   |  10 +
> >  contrib/libvhost-user/libvhost-user.h       |   3 +
> >  docs/interop/vhost-user.txt                 |  35 ++
> >  hw/virtio/Makefile.objs                     |   1 +
> >  hw/virtio/vhost-user-fs.c                   | 517 ++++++++++++++++++++
> >  hw/virtio/vhost-user.c                      |  16 +
> >  hw/virtio/virtio-pci.c                      | 115 +++++
> >  hw/virtio/virtio-pci.h                      |  19 +
> >  include/hw/pci/pci.h                        |   1 +
> >  include/hw/virtio/vhost-user-fs.h           |  79 +++
> >  include/standard-headers/linux/virtio_fs.h  |  48 ++
> >  include/standard-headers/linux/virtio_ids.h |   1 +
> >  include/standard-headers/linux/virtio_pci.h |   9 +
> >  13 files changed, 854 insertions(+)
> >  create mode 100644 hw/virtio/vhost-user-fs.c
> >  create mode 100644 include/hw/virtio/vhost-user-fs.h
> >  create mode 100644 include/standard-headers/linux/virtio_fs.h
> > 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



end of thread, other threads:[~2019-04-05  9:06 UTC | newest]

Thread overview: 26+ messages
2018-12-10 17:31 [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3 Dr. David Alan Gilbert (git)
2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 1/7] virtio: Add shared memory capability Dr. David Alan Gilbert (git)
2018-12-10 21:03   ` Eric Blake
2018-12-11 10:24     ` Dr. David Alan Gilbert
2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 2/7] virtio: add vhost-user-fs-pci device Dr. David Alan Gilbert (git)
2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 3/7] virtio-fs: Add cache BAR Dr. David Alan Gilbert (git)
2018-12-10 21:10   ` Eric Blake
2018-12-11 10:25     ` Dr. David Alan Gilbert
2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 4/7] virtio-fs: Add vhost-user slave commands for mapping Dr. David Alan Gilbert (git)
2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 5/7] virtio-fs: Fill in " Dr. David Alan Gilbert (git)
2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 6/7] virtio-fs: Allow mapping of meta data version table Dr. David Alan Gilbert (git)
2018-12-10 17:31 ` [Qemu-devel] [RFC PATCH 7/7] virtio-fs: Allow mapping of journal Dr. David Alan Gilbert (git)
2018-12-10 21:12   ` Eric Blake
2018-12-11 10:34     ` Dr. David Alan Gilbert
2018-12-10 20:26 ` [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines3 no-reply
2018-12-11 12:53 ` Stefan Hajnoczi
2018-12-12 12:30 ` Daniel P. Berrangé
2018-12-12 13:52   ` Dr. David Alan Gilbert
2018-12-12 13:58     ` Daniel P. Berrangé
2018-12-12 14:49       ` Stefan Hajnoczi
2018-12-22  9:27 ` jiangyiwen
2018-12-26 19:08   ` Vivek Goyal
2019-01-08  6:08     ` jiangyiwen
2019-04-04 13:24 ` Greg Kurz
2019-04-05  8:59   ` Dr. David Alan Gilbert
2019-04-05  8:59     ` Dr. David Alan Gilbert
