* [PATCH v2 0/3] KVM noncoherent DMA registration and VFIO pseudo device
@ 2013-10-30 17:02 Alex Williamson
2013-10-30 17:02 ` [PATCH v2 1/3] kvm: Add VFIO device Alex Williamson
` (4 more replies)
0 siblings, 5 replies; 7+ messages in thread
From: Alex Williamson @ 2013-10-30 17:02 UTC (permalink / raw)
To: pbonzini, kvm, gleb; +Cc: aik, linux-kernel
v2:
- Drop KVM device changes
- Add kfree(dev) in destroy function
- Add atomic_set(,0) in kvm_arch init
As described in v1 and discussion, this adds a VFIO pseudo device to
KVM through which userspace can register VFIO groups. The initial use
of this interface is to enable emulation of coherency instructions on
x86 when a VFIO group is attached. Future patches will allow KVM to
query the coherency state of the IOMMU domain used by the group to
avoid unnecessarily enabling this emulation. Thanks,
Alex
---
Alex Williamson (3):
kvm: Add VFIO device
kvm/x86: Convert iommu_flags to iommu_noncoherent
kvm: Create non-coherent DMA registeration
Documentation/virtual/kvm/devices/vfio.txt | 22 ++
arch/ia64/include/asm/kvm_host.h | 2
arch/x86/include/asm/kvm_host.h | 4
arch/x86/kvm/Kconfig | 1
arch/x86/kvm/Makefile | 2
arch/x86/kvm/vmx.c | 3
arch/x86/kvm/x86.c | 22 ++
include/linux/kvm_host.h | 23 ++
include/uapi/linux/kvm.h | 4
virt/kvm/Kconfig | 3
virt/kvm/iommu.c | 22 +-
virt/kvm/kvm_main.c | 5 +
virt/kvm/vfio.c | 264 ++++++++++++++++++++++++++++
13 files changed, 359 insertions(+), 18 deletions(-)
create mode 100644 Documentation/virtual/kvm/devices/vfio.txt
create mode 100644 virt/kvm/vfio.c
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 1/3] kvm: Add VFIO device
2013-10-30 17:02 [PATCH v2 0/3] KVM noncoherent DMA registration and VFIO pseudo device Alex Williamson
@ 2013-10-30 17:02 ` Alex Williamson
2013-10-30 17:02 ` [PATCH v2 2/3] kvm/x86: Convert iommu_flags to iommu_noncoherent Alex Williamson
` (3 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Alex Williamson @ 2013-10-30 17:02 UTC (permalink / raw)
To: pbonzini, kvm, gleb; +Cc: aik, linux-kernel
So far we've succeeded at making KVM and VFIO mostly unaware of each
other, but areas are cropping up where a connection beyond eventfds
and irqfds needs to be made. This patch introduces a KVM-VFIO device
that is meant to be a gateway for such interaction. The user creates
the device and can add and remove VFIO groups to it via file
descriptors. When a group is added, KVM verifies the group is valid
and gets a reference to it via the VFIO external user interface.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
Documentation/virtual/kvm/devices/vfio.txt | 22 +++
arch/x86/kvm/Kconfig | 1
arch/x86/kvm/Makefile | 2
include/linux/kvm_host.h | 1
include/uapi/linux/kvm.h | 4 +
virt/kvm/Kconfig | 3
virt/kvm/kvm_main.c | 5 +
virt/kvm/vfio.c | 220 ++++++++++++++++++++++++++++
8 files changed, 257 insertions(+), 1 deletion(-)
create mode 100644 Documentation/virtual/kvm/devices/vfio.txt
create mode 100644 virt/kvm/vfio.c
diff --git a/Documentation/virtual/kvm/devices/vfio.txt b/Documentation/virtual/kvm/devices/vfio.txt
new file mode 100644
index 0000000..ef51740
--- /dev/null
+++ b/Documentation/virtual/kvm/devices/vfio.txt
@@ -0,0 +1,22 @@
+VFIO virtual device
+===================
+
+Device types supported:
+ KVM_DEV_TYPE_VFIO
+
+Only one VFIO instance may be created per VM. The created device
+tracks VFIO groups in use by the VM and features of those groups
+important to the correctness and acceleration of the VM. As groups
+are enabled and disabled for use by the VM, KVM should be updated
+about their presence. When registered with KVM, a reference to the
+VFIO-group is held by KVM.
+
+Groups:
+ KVM_DEV_VFIO_GROUP
+
+KVM_DEV_VFIO_GROUP attributes:
+ KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking
+ KVM_DEV_VFIO_GROUP_DEL: Remove a VFIO group from VFIO-KVM device tracking
+
+For each, kvm_device_attr.addr points to an int32_t file descriptor
+for the VFIO group.
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index a47a3e5..b89c5db 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -38,6 +38,7 @@ config KVM
select PERF_EVENTS
select HAVE_KVM_MSI
select HAVE_KVM_CPU_RELAX_INTERCEPT
+ select KVM_VFIO
---help---
Support hosting fully virtualized guest machines using hardware
virtualization extensions. You will need a fairly recent
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index bf4fb04..25d22b2 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -9,7 +9,7 @@ KVM := ../../../virt/kvm
kvm-y += $(KVM)/kvm_main.o $(KVM)/ioapic.o \
$(KVM)/coalesced_mmio.o $(KVM)/irq_comm.o \
- $(KVM)/eventfd.o $(KVM)/irqchip.o
+ $(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o
kvm-$(CONFIG_KVM_DEVICE_ASSIGNMENT) += $(KVM)/assigned-dev.o $(KVM)/iommu.o
kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 0fbbc7a..279002b 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1066,6 +1066,7 @@ struct kvm_device *kvm_device_from_filp(struct file *filp);
extern struct kvm_device_ops kvm_mpic_ops;
extern struct kvm_device_ops kvm_xics_ops;
+extern struct kvm_device_ops kvm_vfio_ops;
#ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 99c2533..7c1a349 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -843,6 +843,10 @@ struct kvm_device_attr {
#define KVM_DEV_TYPE_FSL_MPIC_20 1
#define KVM_DEV_TYPE_FSL_MPIC_42 2
#define KVM_DEV_TYPE_XICS 3
+#define KVM_DEV_TYPE_VFIO 4
+#define KVM_DEV_VFIO_GROUP 1
+#define KVM_DEV_VFIO_GROUP_ADD 1
+#define KVM_DEV_VFIO_GROUP_DEL 2
/*
* ioctls for VM fds
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 779262f..fbe1a48 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -27,3 +27,6 @@ config HAVE_KVM_MSI
config HAVE_KVM_CPU_RELAX_INTERCEPT
bool
+
+config KVM_VFIO
+ bool
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index a9dd682..1053fe6 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2271,6 +2271,11 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
ops = &kvm_xics_ops;
break;
#endif
+#ifdef CONFIG_KVM_VFIO
+ case KVM_DEV_TYPE_VFIO:
+ ops = &kvm_vfio_ops;
+ break;
+#endif
default:
return -ENODEV;
}
diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
new file mode 100644
index 0000000..597c258
--- /dev/null
+++ b/virt/kvm/vfio.c
@@ -0,0 +1,220 @@
+/*
+ * VFIO-KVM bridge pseudo device
+ *
+ * Copyright (C) 2013 Red Hat, Inc. All rights reserved.
+ * Author: Alex Williamson <alex.williamson@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/errno.h>
+#include <linux/file.h>
+#include <linux/kvm_host.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <linux/vfio.h>
+
+struct kvm_vfio_group {
+ struct list_head node;
+ struct vfio_group *vfio_group;
+};
+
+struct kvm_vfio {
+ struct list_head group_list;
+ struct mutex lock;
+};
+
+static struct vfio_group *kvm_vfio_group_get_external_user(struct file *filep)
+{
+ struct vfio_group *vfio_group;
+ struct vfio_group *(*fn)(struct file *);
+
+ fn = symbol_get(vfio_group_get_external_user);
+ if (!fn)
+ return ERR_PTR(-EINVAL);
+
+ vfio_group = fn(filep);
+
+ symbol_put(vfio_group_get_external_user);
+
+ return vfio_group;
+}
+
+static void kvm_vfio_group_put_external_user(struct vfio_group *vfio_group)
+{
+ void (*fn)(struct vfio_group *);
+
+ fn = symbol_get(vfio_group_put_external_user);
+ if (!fn)
+ return;
+
+ fn(vfio_group);
+
+ symbol_put(vfio_group_put_external_user);
+}
+
+static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg)
+{
+ struct kvm_vfio *kv = dev->private;
+ struct vfio_group *vfio_group;
+ struct kvm_vfio_group *kvg;
+ void __user *argp = (void __user *)arg;
+ struct fd f;
+ int32_t fd;
+ int ret;
+
+ switch (attr) {
+ case KVM_DEV_VFIO_GROUP_ADD:
+ if (get_user(fd, (int32_t __user *)argp))
+ return -EFAULT;
+
+ f = fdget(fd);
+ if (!f.file)
+ return -EBADF;
+
+ vfio_group = kvm_vfio_group_get_external_user(f.file);
+ fdput(f);
+
+ if (IS_ERR(vfio_group))
+ return PTR_ERR(vfio_group);
+
+ mutex_lock(&kv->lock);
+
+ list_for_each_entry(kvg, &kv->group_list, node) {
+ if (kvg->vfio_group == vfio_group) {
+ mutex_unlock(&kv->lock);
+ kvm_vfio_group_put_external_user(vfio_group);
+ return -EEXIST;
+ }
+ }
+
+ kvg = kzalloc(sizeof(*kvg), GFP_KERNEL);
+ if (!kvg) {
+ mutex_unlock(&kv->lock);
+ kvm_vfio_group_put_external_user(vfio_group);
+ return -ENOMEM;
+ }
+
+ list_add_tail(&kvg->node, &kv->group_list);
+ kvg->vfio_group = vfio_group;
+
+ mutex_unlock(&kv->lock);
+
+ return 0;
+
+ case KVM_DEV_VFIO_GROUP_DEL:
+ if (get_user(fd, (int32_t __user *)argp))
+ return -EFAULT;
+
+ f = fdget(fd);
+ if (!f.file)
+ return -EBADF;
+
+ vfio_group = kvm_vfio_group_get_external_user(f.file);
+ fdput(f);
+
+ if (IS_ERR(vfio_group))
+ return PTR_ERR(vfio_group);
+
+ ret = -ENOENT;
+
+ mutex_lock(&kv->lock);
+
+ list_for_each_entry(kvg, &kv->group_list, node) {
+ if (kvg->vfio_group != vfio_group)
+ continue;
+
+ list_del(&kvg->node);
+ kvm_vfio_group_put_external_user(kvg->vfio_group);
+ kfree(kvg);
+ ret = 0;
+ break;
+ }
+
+ mutex_unlock(&kv->lock);
+
+ kvm_vfio_group_put_external_user(vfio_group);
+
+ return ret;
+ }
+
+ return -ENXIO;
+}
+
+static int kvm_vfio_set_attr(struct kvm_device *dev,
+ struct kvm_device_attr *attr)
+{
+ switch (attr->group) {
+ case KVM_DEV_VFIO_GROUP:
+ return kvm_vfio_set_group(dev, attr->attr, attr->addr);
+ }
+
+ return -ENXIO;
+}
+
+static int kvm_vfio_has_attr(struct kvm_device *dev,
+ struct kvm_device_attr *attr)
+{
+ switch (attr->group) {
+ case KVM_DEV_VFIO_GROUP:
+ switch (attr->attr) {
+ case KVM_DEV_VFIO_GROUP_ADD:
+ case KVM_DEV_VFIO_GROUP_DEL:
+ return 0;
+ }
+
+ break;
+ }
+
+ return -ENXIO;
+}
+
+static void kvm_vfio_destroy(struct kvm_device *dev)
+{
+ struct kvm_vfio *kv = dev->private;
+ struct kvm_vfio_group *kvg, *tmp;
+
+ list_for_each_entry_safe(kvg, tmp, &kv->group_list, node) {
+ kvm_vfio_group_put_external_user(kvg->vfio_group);
+ list_del(&kvg->node);
+ kfree(kvg);
+ }
+
+ kfree(kv);
+ kfree(dev); /* alloc by kvm_ioctl_create_device, free by .destroy */
+}
+
+static int kvm_vfio_create(struct kvm_device *dev, u32 type)
+{
+ struct kvm_device *tmp;
+ struct kvm_vfio *kv;
+
+ /* Only one VFIO "device" per VM */
+ list_for_each_entry(tmp, &dev->kvm->devices, vm_node)
+ if (tmp->ops == &kvm_vfio_ops)
+ return -EBUSY;
+
+ kv = kzalloc(sizeof(*kv), GFP_KERNEL);
+ if (!kv)
+ return -ENOMEM;
+
+ INIT_LIST_HEAD(&kv->group_list);
+ mutex_init(&kv->lock);
+
+ dev->private = kv;
+
+ return 0;
+}
+
+struct kvm_device_ops kvm_vfio_ops = {
+ .name = "kvm-vfio",
+ .create = kvm_vfio_create,
+ .destroy = kvm_vfio_destroy,
+ .set_attr = kvm_vfio_set_attr,
+ .has_attr = kvm_vfio_has_attr,
+};
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v2 2/3] kvm/x86: Convert iommu_flags to iommu_noncoherent
2013-10-30 17:02 [PATCH v2 0/3] KVM noncoherent DMA registration and VFIO pseudo device Alex Williamson
2013-10-30 17:02 ` [PATCH v2 1/3] kvm: Add VFIO device Alex Williamson
@ 2013-10-30 17:02 ` Alex Williamson
2013-10-30 17:02 ` [PATCH v2 3/3] kvm: Create non-coherent DMA registeration Alex Williamson
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Alex Williamson @ 2013-10-30 17:02 UTC (permalink / raw)
To: pbonzini, kvm, gleb; +Cc: aik, linux-kernel
Default to operating in coherent mode. This simplifies the logic when
we switch to a model of registering and unregistering noncoherent I/O
with KVM.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
arch/ia64/include/asm/kvm_host.h | 2 +-
arch/x86/include/asm/kvm_host.h | 2 +-
arch/x86/kvm/vmx.c | 2 +-
arch/x86/kvm/x86.c | 2 +-
include/linux/kvm_host.h | 3 ---
virt/kvm/iommu.c | 16 ++++++++--------
6 files changed, 12 insertions(+), 15 deletions(-)
diff --git a/arch/ia64/include/asm/kvm_host.h b/arch/ia64/include/asm/kvm_host.h
index 989dd3f..1933b7a 100644
--- a/arch/ia64/include/asm/kvm_host.h
+++ b/arch/ia64/include/asm/kvm_host.h
@@ -480,7 +480,7 @@ struct kvm_arch {
struct list_head assigned_dev_head;
struct iommu_domain *iommu_domain;
- int iommu_flags;
+ bool iommu_noncoherent;
unsigned long irq_sources_bitmap;
unsigned long irq_states[KVM_IOAPIC_NUM_PINS];
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c76ff74..1b6b5f9 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -557,7 +557,7 @@ struct kvm_arch {
struct list_head assigned_dev_head;
struct iommu_domain *iommu_domain;
- int iommu_flags;
+ bool iommu_noncoherent;
struct kvm_pic *vpic;
struct kvm_ioapic *vioapic;
struct kvm_pit *vpit;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 2b2fce1..5d0e3ab 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -7411,7 +7411,7 @@ static u64 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
if (is_mmio)
ret = MTRR_TYPE_UNCACHABLE << VMX_EPT_MT_EPTE_SHIFT;
else if (vcpu->kvm->arch.iommu_domain &&
- !(vcpu->kvm->arch.iommu_flags & KVM_IOMMU_CACHE_COHERENCY))
+ vcpu->kvm->arch.iommu_noncoherent)
ret = kvm_get_guest_memory_type(vcpu, gfn) <<
VMX_EPT_MT_EPTE_SHIFT;
else
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e5ca72a..b1231b0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2716,7 +2716,7 @@ static void wbinvd_ipi(void *garbage)
static bool need_emulate_wbinvd(struct kvm_vcpu *vcpu)
{
return vcpu->kvm->arch.iommu_domain &&
- !(vcpu->kvm->arch.iommu_flags & KVM_IOMMU_CACHE_COHERENCY);
+ vcpu->kvm->arch.iommu_noncoherent;
}
void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 279002b..46ce3d1 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -747,9 +747,6 @@ void kvm_unregister_irq_ack_notifier(struct kvm *kvm,
int kvm_request_irq_source_id(struct kvm *kvm);
void kvm_free_irq_source_id(struct kvm *kvm, int irq_source_id);
-/* For vcpu->arch.iommu_flags */
-#define KVM_IOMMU_CACHE_COHERENCY 0x1
-
#ifdef CONFIG_KVM_DEVICE_ASSIGNMENT
int kvm_iommu_map_pages(struct kvm *kvm, struct kvm_memory_slot *slot);
void kvm_iommu_unmap_pages(struct kvm *kvm, struct kvm_memory_slot *slot);
diff --git a/virt/kvm/iommu.c b/virt/kvm/iommu.c
index 72a130b..9cde444 100644
--- a/virt/kvm/iommu.c
+++ b/virt/kvm/iommu.c
@@ -79,7 +79,7 @@ int kvm_iommu_map_pages(struct kvm *kvm, struct kvm_memory_slot *slot)
flags = IOMMU_READ;
if (!(slot->flags & KVM_MEM_READONLY))
flags |= IOMMU_WRITE;
- if (kvm->arch.iommu_flags & KVM_IOMMU_CACHE_COHERENCY)
+ if (!kvm->arch.iommu_noncoherent)
flags |= IOMMU_CACHE;
@@ -158,7 +158,8 @@ int kvm_assign_device(struct kvm *kvm,
{
struct pci_dev *pdev = NULL;
struct iommu_domain *domain = kvm->arch.iommu_domain;
- int r, last_flags;
+ int r;
+ bool noncoherent;
/* check if iommu exists and in use */
if (!domain)
@@ -174,15 +175,13 @@ int kvm_assign_device(struct kvm *kvm,
return r;
}
- last_flags = kvm->arch.iommu_flags;
- if (iommu_domain_has_cap(kvm->arch.iommu_domain,
- IOMMU_CAP_CACHE_COHERENCY))
- kvm->arch.iommu_flags |= KVM_IOMMU_CACHE_COHERENCY;
+ noncoherent = !iommu_domain_has_cap(kvm->arch.iommu_domain,
+ IOMMU_CAP_CACHE_COHERENCY);
/* Check if need to update IOMMU page table for guest memory */
- if ((last_flags ^ kvm->arch.iommu_flags) ==
- KVM_IOMMU_CACHE_COHERENCY) {
+ if (noncoherent != kvm->arch.iommu_noncoherent) {
kvm_iommu_unmap_memslots(kvm);
+ kvm->arch.iommu_noncoherent = noncoherent;
r = kvm_iommu_map_memslots(kvm);
if (r)
goto out_unmap;
@@ -350,6 +349,7 @@ int kvm_iommu_unmap_guest(struct kvm *kvm)
mutex_lock(&kvm->slots_lock);
kvm_iommu_unmap_memslots(kvm);
kvm->arch.iommu_domain = NULL;
+ kvm->arch.iommu_noncoherent = false;
mutex_unlock(&kvm->slots_lock);
iommu_domain_free(domain);
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v2 3/3] kvm: Create non-coherent DMA registeration
2013-10-30 17:02 [PATCH v2 0/3] KVM noncoherent DMA registration and VFIO pseudo device Alex Williamson
2013-10-30 17:02 ` [PATCH v2 1/3] kvm: Add VFIO device Alex Williamson
2013-10-30 17:02 ` [PATCH v2 2/3] kvm/x86: Convert iommu_flags to iommu_noncoherent Alex Williamson
@ 2013-10-30 17:02 ` Alex Williamson
2013-10-30 18:03 ` [PATCH v2 0/3] KVM noncoherent DMA registration and VFIO pseudo device Paolo Bonzini
2013-11-05 8:05 ` [RFC PATCH v2] KVM: PPC: vfio kvm device: support spapr tce Alexey Kardashevskiy
4 siblings, 0 replies; 7+ messages in thread
From: Alex Williamson @ 2013-10-30 17:02 UTC (permalink / raw)
To: pbonzini, kvm, gleb; +Cc: aik, linux-kernel
We currently use some ad-hoc arch variables tied to legacy KVM device
assignment to manage emulation of instructions that depend on whether
non-coherent DMA is present. Create an interface for this, adapting
legacy KVM device assignment and adding VFIO via the KVM-VFIO device.
For now we assume that non-coherent DMA is possible any time we have a
VFIO group. Eventually an interface can be developed as part of the
VFIO external user interface to query the coherency of a group.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
arch/x86/include/asm/kvm_host.h | 2 ++
arch/x86/kvm/vmx.c | 3 +--
arch/x86/kvm/x86.c | 22 ++++++++++++++++++--
include/linux/kvm_host.h | 19 +++++++++++++++++
virt/kvm/iommu.c | 6 +++++
virt/kvm/vfio.c | 44 +++++++++++++++++++++++++++++++++++++++
6 files changed, 92 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1b6b5f9..50c1e9c1 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -558,6 +558,8 @@ struct kvm_arch {
struct list_head assigned_dev_head;
struct iommu_domain *iommu_domain;
bool iommu_noncoherent;
+#define __KVM_HAVE_ARCH_NONCOHERENT_DMA
+ atomic_t noncoherent_dma_count;
struct kvm_pic *vpic;
struct kvm_ioapic *vioapic;
struct kvm_pit *vpit;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 5d0e3ab..f7fe08a 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -7410,8 +7410,7 @@ static u64 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
*/
if (is_mmio)
ret = MTRR_TYPE_UNCACHABLE << VMX_EPT_MT_EPTE_SHIFT;
- else if (vcpu->kvm->arch.iommu_domain &&
- vcpu->kvm->arch.iommu_noncoherent)
+ else if (kvm_arch_has_noncoherent_dma(vcpu->kvm))
ret = kvm_get_guest_memory_type(vcpu, gfn) <<
VMX_EPT_MT_EPTE_SHIFT;
else
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b1231b0..92e1722 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2715,8 +2715,7 @@ static void wbinvd_ipi(void *garbage)
static bool need_emulate_wbinvd(struct kvm_vcpu *vcpu)
{
- return vcpu->kvm->arch.iommu_domain &&
- vcpu->kvm->arch.iommu_noncoherent;
+ return kvm_arch_has_noncoherent_dma(vcpu->kvm);
}
void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
@@ -6981,6 +6980,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages);
INIT_LIST_HEAD(&kvm->arch.assigned_dev_head);
+ atomic_set(&kvm->arch.noncoherent_dma_count, 0);
/* Reserve bit 0 of irq_sources_bitmap for userspace irq source */
set_bit(KVM_USERSPACE_IRQ_SOURCE_ID, &kvm->arch.irq_sources_bitmap);
@@ -7420,6 +7420,24 @@ bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu)
kvm_x86_ops->interrupt_allowed(vcpu);
}
+void kvm_arch_register_noncoherent_dma(struct kvm *kvm)
+{
+ atomic_inc(&kvm->arch.noncoherent_dma_count);
+}
+EXPORT_SYMBOL_GPL(kvm_arch_register_noncoherent_dma);
+
+void kvm_arch_unregister_noncoherent_dma(struct kvm *kvm)
+{
+ atomic_dec(&kvm->arch.noncoherent_dma_count);
+}
+EXPORT_SYMBOL_GPL(kvm_arch_unregister_noncoherent_dma);
+
+bool kvm_arch_has_noncoherent_dma(struct kvm *kvm)
+{
+ return atomic_read(&kvm->arch.noncoherent_dma_count);
+}
+EXPORT_SYMBOL_GPL(kvm_arch_has_noncoherent_dma);
+
EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit);
EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_inj_virq);
EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_page_fault);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 46ce3d1..a225349 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -671,6 +671,25 @@ static inline void kvm_arch_free_vm(struct kvm *kvm)
}
#endif
+#ifdef __KVM_HAVE_ARCH_NONCOHERENT_DMA
+void kvm_arch_register_noncoherent_dma(struct kvm *kvm);
+void kvm_arch_unregister_noncoherent_dma(struct kvm *kvm);
+bool kvm_arch_has_noncoherent_dma(struct kvm *kvm);
+#else
+static inline void kvm_arch_register_noncoherent_dma(struct kvm *kvm)
+{
+}
+
+static inline void kvm_arch_unregister_noncoherent_dma(struct kvm *kvm)
+{
+}
+
+static inline bool kvm_arch_has_noncoherent_dma(struct kvm *kvm)
+{
+ return false;
+}
+#endif
+
static inline wait_queue_head_t *kvm_arch_vcpu_wq(struct kvm_vcpu *vcpu)
{
#ifdef __KVM_HAVE_ARCH_WQP
diff --git a/virt/kvm/iommu.c b/virt/kvm/iommu.c
index 9cde444..0a54456 100644
--- a/virt/kvm/iommu.c
+++ b/virt/kvm/iommu.c
@@ -140,6 +140,9 @@ static int kvm_iommu_map_memslots(struct kvm *kvm)
struct kvm_memslots *slots;
struct kvm_memory_slot *memslot;
+ if (kvm->arch.iommu_noncoherent)
+ kvm_arch_register_noncoherent_dma(kvm);
+
idx = srcu_read_lock(&kvm->srcu);
slots = kvm_memslots(kvm);
@@ -335,6 +338,9 @@ static int kvm_iommu_unmap_memslots(struct kvm *kvm)
srcu_read_unlock(&kvm->srcu, idx);
+ if (kvm->arch.iommu_noncoherent)
+ kvm_arch_unregister_noncoherent_dma(kvm);
+
return 0;
}
diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
index 597c258..ca4260e 100644
--- a/virt/kvm/vfio.c
+++ b/virt/kvm/vfio.c
@@ -27,6 +27,7 @@ struct kvm_vfio_group {
struct kvm_vfio {
struct list_head group_list;
struct mutex lock;
+ bool noncoherent;
};
static struct vfio_group *kvm_vfio_group_get_external_user(struct file *filep)
@@ -58,6 +59,43 @@ static void kvm_vfio_group_put_external_user(struct vfio_group *vfio_group)
symbol_put(vfio_group_put_external_user);
}
+/*
+ * Groups can use the same or different IOMMU domains. If the same then
+ * adding a new group may change the coherency of groups we've previously
+ * been told about. We don't want to care about any of that so we retest
+ * each group and bail as soon as we find one that's noncoherent. This
+ * means we only ever [un]register_noncoherent_dma once for the whole device.
+ */
+static void kvm_vfio_update_coherency(struct kvm_device *dev)
+{
+ struct kvm_vfio *kv = dev->private;
+ bool noncoherent = false;
+ struct kvm_vfio_group *kvg;
+
+ mutex_lock(&kv->lock);
+
+ list_for_each_entry(kvg, &kv->group_list, node) {
+ /*
+ * TODO: We need an interface to check the coherency of
+ * the IOMMU domain this group is using. For now, assume
+ * it's always noncoherent.
+ */
+ noncoherent = true;
+ break;
+ }
+
+ if (noncoherent != kv->noncoherent) {
+ kv->noncoherent = noncoherent;
+
+ if (kv->noncoherent)
+ kvm_arch_register_noncoherent_dma(dev->kvm);
+ else
+ kvm_arch_unregister_noncoherent_dma(dev->kvm);
+ }
+
+ mutex_unlock(&kv->lock);
+}
+
static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg)
{
struct kvm_vfio *kv = dev->private;
@@ -105,6 +143,8 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg)
mutex_unlock(&kv->lock);
+ kvm_vfio_update_coherency(dev);
+
return 0;
case KVM_DEV_VFIO_GROUP_DEL:
@@ -140,6 +180,8 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg)
kvm_vfio_group_put_external_user(vfio_group);
+ kvm_vfio_update_coherency(dev);
+
return ret;
}
@@ -185,6 +227,8 @@ static void kvm_vfio_destroy(struct kvm_device *dev)
kfree(kvg);
}
+ kvm_vfio_update_coherency(dev);
+
kfree(kv);
kfree(dev); /* alloc by kvm_ioctl_create_device, free by .destroy */
}
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v2 0/3] KVM noncoherent DMA registration and VFIO pseudo device
2013-10-30 17:02 [PATCH v2 0/3] KVM noncoherent DMA registration and VFIO pseudo device Alex Williamson
` (2 preceding siblings ...)
2013-10-30 17:02 ` [PATCH v2 3/3] kvm: Create non-coherent DMA registeration Alex Williamson
@ 2013-10-30 18:03 ` Paolo Bonzini
2013-11-05 8:05 ` [RFC PATCH v2] KVM: PPC: vfio kvm device: support spapr tce Alexey Kardashevskiy
4 siblings, 0 replies; 7+ messages in thread
From: Paolo Bonzini @ 2013-10-30 18:03 UTC (permalink / raw)
To: Alex Williamson; +Cc: kvm, gleb, aik, linux-kernel
Il 30/10/2013 18:02, Alex Williamson ha scritto:
> v2:
> - Drop KVM device changes
> - Add kfree(dev) in destroy function
> - Add atomic_set(,0) in kvm_arch init
>
> As described in v1 and discussion, this adds a VFIO pseudo device to
> KVM through which userspace can register VFIO groups. The initial use
> of this interface is to enable emulation of coherency instructions on
> x86 when a VFIO group is attached. Future patches will allow KVM to
> query the coherency state of the IOMMU domain used by the group to
> avoid unnecessarily enabling this emulation. Thanks,
>
> Alex
>
> ---
>
> Alex Williamson (3):
> kvm: Add VFIO device
> kvm/x86: Convert iommu_flags to iommu_noncoherent
> kvm: Create non-coherent DMA registeration
>
>
> Documentation/virtual/kvm/devices/vfio.txt | 22 ++
> arch/ia64/include/asm/kvm_host.h | 2
> arch/x86/include/asm/kvm_host.h | 4
> arch/x86/kvm/Kconfig | 1
> arch/x86/kvm/Makefile | 2
> arch/x86/kvm/vmx.c | 3
> arch/x86/kvm/x86.c | 22 ++
> include/linux/kvm_host.h | 23 ++
> include/uapi/linux/kvm.h | 4
> virt/kvm/Kconfig | 3
> virt/kvm/iommu.c | 22 +-
> virt/kvm/kvm_main.c | 5 +
> virt/kvm/vfio.c | 264 ++++++++++++++++++++++++++++
> 13 files changed, 359 insertions(+), 18 deletions(-)
> create mode 100644 Documentation/virtual/kvm/devices/vfio.txt
> create mode 100644 virt/kvm/vfio.c
>
Applied to kvm/queue, thanks.
Paolo
^ permalink raw reply [flat|nested] 7+ messages in thread
* [RFC PATCH v2] KVM: PPC: vfio kvm device: support spapr tce
2013-10-30 17:02 [PATCH v2 0/3] KVM noncoherent DMA registration and VFIO pseudo device Alex Williamson
` (3 preceding siblings ...)
2013-10-30 18:03 ` [PATCH v2 0/3] KVM noncoherent DMA registration and VFIO pseudo device Paolo Bonzini
@ 2013-11-05 8:05 ` Alexey Kardashevskiy
2013-11-05 21:32 ` Alex Williamson
4 siblings, 1 reply; 7+ messages in thread
From: Alexey Kardashevskiy @ 2013-11-05 8:05 UTC (permalink / raw)
To: Alex Williamson
Cc: Alexey Kardashevskiy, linuxppc-dev, Benjamin Herrenschmidt,
Paul Mackerras, Alexander Graf, Gleb Natapov, Paolo Bonzini,
kvm-ppc, kvm, linux-kernel
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
Changes:
v2:
* it does not try to introduce a realmode search function.
Instead, liobn-to-iommu-group lookup is done by VFIO KVM device
in virtual mode and the result (iommu_group pointer) is cached
in kvm_arch so the realmode handlers do not use VFIO KVM device for that.
And the iommu groups get released on KVM termination.
I tried this, seems viable.
Did not I miss anything? Thanks.
---
arch/powerpc/include/asm/kvm_host.h | 3 ++
arch/powerpc/kvm/Kconfig | 1 +
arch/powerpc/kvm/Makefile | 3 ++
include/linux/vfio.h | 3 ++
include/uapi/linux/kvm.h | 1 +
virt/kvm/vfio.c | 74 +++++++++++++++++++++++++++++++++++++
6 files changed, 85 insertions(+)
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 48dbe8b..e1163d7 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -293,6 +293,9 @@ struct kvm_arch {
#ifdef CONFIG_KVM_XICS
struct kvmppc_xics *xics;
#endif
+#ifdef CONFIG_KVM_VFIO
+ struct kvm_vfio *vfio;
+#endif
};
/*
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 61b3535..d1b7f64 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -60,6 +60,7 @@ config KVM_BOOK3S_64
select KVM_BOOK3S_64_HANDLER
select KVM
select SPAPR_TCE_IOMMU
+ select KVM_VFIO
---help---
Support running unmodified book3s_64 and book3s_32 guest kernels
in virtual machines on book3s_64 host processors.
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index 6646c95..2438d2e 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -87,6 +87,9 @@ kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HV) := \
kvm-book3s_64-objs-$(CONFIG_KVM_XICS) += \
book3s_xics.o
+kvm-book3s_64-objs-$(CONFIG_KVM_VFIO) += \
+ $(KVM)/vfio.o \
+
kvm-book3s_64-module-objs := \
$(KVM)/kvm_main.o \
$(KVM)/eventfd.o \
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 24579a0..681e19b 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -97,4 +97,7 @@ extern struct vfio_group *vfio_group_get_external_user(struct file *filep);
extern void vfio_group_put_external_user(struct vfio_group *group);
extern int vfio_external_user_iommu_id(struct vfio_group *group);
+extern struct iommu_group *vfio_find_group_by_liobn(struct kvm *kvm,
+ unsigned long liobn);
+
#endif /* VFIO_H */
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 7c1a349..a74ad16 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -847,6 +847,7 @@ struct kvm_device_attr {
#define KVM_DEV_VFIO_GROUP 1
#define KVM_DEV_VFIO_GROUP_ADD 1
#define KVM_DEV_VFIO_GROUP_DEL 2
+#define KVM_DEV_VFIO_SPAPR_TCE_LIOBN 2
/*
* ioctls for VM fds
diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
index ca4260e..f9271d5 100644
--- a/virt/kvm/vfio.c
+++ b/virt/kvm/vfio.c
@@ -22,6 +22,9 @@
struct kvm_vfio_group {
struct list_head node;
struct vfio_group *vfio_group;
+#ifdef CONFIG_SPAPR_TCE_IOMMU
+ uint64_t liobn;
+#endif
};
struct kvm_vfio {
@@ -188,12 +191,76 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg)
return -ENXIO;
}
+#ifdef CONFIG_SPAPR_TCE_IOMMU
+static int kvm_vfio_set_spapr_tce_liobn(struct kvm_device *dev,
+ long attr, u64 arg)
+{
+ struct kvm_vfio *kv = dev->private;
+ struct vfio_group *vfio_group;
+ struct kvm_vfio_group *kvg;
+ void __user *argp = (void __user *)arg;
+ struct fd f;
+ int32_t fd;
+ uint64_t liobn = attr;
+
+ if (get_user(fd, (int32_t __user *)argp))
+ return -EFAULT;
+
+ f = fdget(fd);
+ if (!f.file)
+ return -EBADF;
+
+ vfio_group = kvm_vfio_group_get_external_user(f.file);
+ fdput(f);
+
+ list_for_each_entry(kvg, &kv->group_list, node) {
+ if (kvg->vfio_group == vfio_group) {
+ WARN_ON(kvg->liobn);
+ kvg->liobn = liobn;
+ kvm_vfio_group_put_external_user(vfio_group);
+ return 0;
+ }
+ }
+
+ kvm_vfio_group_put_external_user(vfio_group);
+
+ return -ENXIO;
+}
+
+struct iommu_group *vfio_find_group_by_liobn(struct kvm *kvm,
+ unsigned long liobn)
+{
+ struct kvm_vfio_group *kvg;
+
+ if (!kvm->arch.vfio)
+ return NULL;
+
+ list_for_each_entry(kvg, &kvm->arch.vfio->group_list, node) {
+ if (kvg->liobn == liobn) {
+ int group_id = vfio_external_user_iommu_id(
+ kvg->vfio_group);
+ struct iommu_group *grp =
+ iommu_group_get_by_id(group_id);
+ return grp;
+ }
+ }
+
+ return NULL;
+}
+EXPORT_SYMBOL_GPL(vfio_find_group_by_liobn);
+#endif
+
static int kvm_vfio_set_attr(struct kvm_device *dev,
struct kvm_device_attr *attr)
{
switch (attr->group) {
case KVM_DEV_VFIO_GROUP:
return kvm_vfio_set_group(dev, attr->attr, attr->addr);
+#ifdef CONFIG_SPAPR_TCE_IOMMU
+ case KVM_DEV_VFIO_SPAPR_TCE_LIOBN:
+ return kvm_vfio_set_spapr_tce_liobn(dev, attr->attr,
+ attr->addr);
+#endif
}
return -ENXIO;
@@ -211,6 +278,10 @@ static int kvm_vfio_has_attr(struct kvm_device *dev,
}
break;
+#ifdef CONFIG_SPAPR_TCE_IOMMU
+ case KVM_DEV_VFIO_SPAPR_TCE_LIOBN:
+ return 0;
+#endif
}
return -ENXIO;
@@ -251,6 +322,9 @@ static int kvm_vfio_create(struct kvm_device *dev, u32 type)
mutex_init(&kv->lock);
dev->private = kv;
+#ifdef CONFIG_SPAPR_TCE_IOMMU
+ dev->kvm->arch.vfio = kv;
+#endif
return 0;
}
--
1.8.4.rc4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [RFC PATCH v2] KVM: PPC: vfio kvm device: support spapr tce
2013-11-05 8:05 ` [RFC PATCH v2] KVM: PPC: vfio kvm device: support spapr tce Alexey Kardashevskiy
@ 2013-11-05 21:32 ` Alex Williamson
0 siblings, 0 replies; 7+ messages in thread
From: Alex Williamson @ 2013-11-05 21:32 UTC (permalink / raw)
To: Alexey Kardashevskiy
Cc: linuxppc-dev, Benjamin Herrenschmidt, Paul Mackerras,
Alexander Graf, Gleb Natapov, Paolo Bonzini, kvm-ppc, kvm,
linux-kernel
On Tue, 2013-11-05 at 19:05 +1100, Alexey Kardashevskiy wrote:
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
>
> Changes:
> v2:
> * it does not try to introduce a realmode search function.
> Instead, liobn-to-iommu-group lookup is done by VFIO KVM device
> in virtual mode and the result (iommu_group pointer) is cached
> in kvm_arch so the realmode handlers do not use VFIO KVM device for that.
> And the iommu groups get released on KVM termination.
>
> I tried this, seems viable.
>
> Did not I miss anything? Thanks.
A commit message ;)
> ---
> arch/powerpc/include/asm/kvm_host.h | 3 ++
> arch/powerpc/kvm/Kconfig | 1 +
> arch/powerpc/kvm/Makefile | 3 ++
> include/linux/vfio.h | 3 ++
> include/uapi/linux/kvm.h | 1 +
> virt/kvm/vfio.c | 74 +++++++++++++++++++++++++++++++++++++
> 6 files changed, 85 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
> index 48dbe8b..e1163d7 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -293,6 +293,9 @@ struct kvm_arch {
> #ifdef CONFIG_KVM_XICS
> struct kvmppc_xics *xics;
> #endif
> +#ifdef CONFIG_KVM_VFIO
> + struct kvm_vfio *vfio;
> +#endif
> };
>
> /*
> diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
> index 61b3535..d1b7f64 100644
> --- a/arch/powerpc/kvm/Kconfig
> +++ b/arch/powerpc/kvm/Kconfig
> @@ -60,6 +60,7 @@ config KVM_BOOK3S_64
> select KVM_BOOK3S_64_HANDLER
> select KVM
> select SPAPR_TCE_IOMMU
> + select KVM_VFIO
> ---help---
> Support running unmodified book3s_64 and book3s_32 guest kernels
> in virtual machines on book3s_64 host processors.
> diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
> index 6646c95..2438d2e 100644
> --- a/arch/powerpc/kvm/Makefile
> +++ b/arch/powerpc/kvm/Makefile
> @@ -87,6 +87,9 @@ kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HV) := \
> kvm-book3s_64-objs-$(CONFIG_KVM_XICS) += \
> book3s_xics.o
>
> +kvm-book3s_64-objs-$(CONFIG_KVM_VFIO) += \
> + $(KVM)/vfio.o \
> +
> kvm-book3s_64-module-objs := \
> $(KVM)/kvm_main.o \
> $(KVM)/eventfd.o \
> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> index 24579a0..681e19b 100644
> --- a/include/linux/vfio.h
> +++ b/include/linux/vfio.h
> @@ -97,4 +97,7 @@ extern struct vfio_group *vfio_group_get_external_user(struct file *filep);
> extern void vfio_group_put_external_user(struct vfio_group *group);
> extern int vfio_external_user_iommu_id(struct vfio_group *group);
>
> +extern struct iommu_group *vfio_find_group_by_liobn(struct kvm *kvm,
> + unsigned long liobn);
> +
Nope, this doesn't go in vfio.h, it's a function provided by kvm. It
should be named as such too, kvm_vfio_... It also depends on both
CONFIG_KVM_VFIO and CONFIG_SPAPR_TCE_IOMMU and needs stub version
otherwise. Is just _liobn specific enough or does it need a spapr_tce
thrown in to avoid confusion with embedded ppc folks?
> #endif /* VFIO_H */
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 7c1a349..a74ad16 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -847,6 +847,7 @@ struct kvm_device_attr {
> #define KVM_DEV_VFIO_GROUP 1
> #define KVM_DEV_VFIO_GROUP_ADD 1
> #define KVM_DEV_VFIO_GROUP_DEL 2
> +#define KVM_DEV_VFIO_SPAPR_TCE_LIOBN 2
I wonder if it would be better architecturally if this was an attribute
rather than a new group, ex:
#define KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE_LIOBN 3
It's a mouthful, but we are setting an attribute of a VFIO group, so it
makes sense. kvm_device_attr.addr would then need to point to a struct
containing both the fd and liobn.
Whatever we come up with need a documentation addition in
Documentation/virtual/kvm/devices/vfio.txt.
>
> /*
> * ioctls for VM fds
> diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
> index ca4260e..f9271d5 100644
> --- a/virt/kvm/vfio.c
> +++ b/virt/kvm/vfio.c
> @@ -22,6 +22,9 @@
> struct kvm_vfio_group {
> struct list_head node;
> struct vfio_group *vfio_group;
> +#ifdef CONFIG_SPAPR_TCE_IOMMU
> + uint64_t liobn;
Why is liobn an unsigned long in the exported function but a uint64_t
here?
> +#endif
> };
>
> struct kvm_vfio {
> @@ -188,12 +191,76 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, u64 arg)
> return -ENXIO;
> }
>
> +#ifdef CONFIG_SPAPR_TCE_IOMMU
> +static int kvm_vfio_set_spapr_tce_liobn(struct kvm_device *dev,
> + long attr, u64 arg)
> +{
> + struct kvm_vfio *kv = dev->private;
> + struct vfio_group *vfio_group;
> + struct kvm_vfio_group *kvg;
> + void __user *argp = (void __user *)arg;
> + struct fd f;
> + int32_t fd;
> + uint64_t liobn = attr;
> +
> + if (get_user(fd, (int32_t __user *)argp))
> + return -EFAULT;
> +
> + f = fdget(fd);
> + if (!f.file)
> + return -EBADF;
> +
> + vfio_group = kvm_vfio_group_get_external_user(f.file);
> + fdput(f);
Not sure why you dropped this from the example of kvm_vfio_set_group:
if (IS_ERR(vfio_group))
return PTR_ERR(vfio_group);
You're also ignoring kv->lock.
> +
> + list_for_each_entry(kvg, &kv->group_list, node) {
> + if (kvg->vfio_group == vfio_group) {
> + WARN_ON(kvg->liobn);
Userspace should not be able to trigger a WARN this easily. Return
EBUSY if it's an error, otherwise let it go.
> + kvg->liobn = liobn;
> + kvm_vfio_group_put_external_user(vfio_group);
> + return 0;
> + }
> + }
> +
> + kvm_vfio_group_put_external_user(vfio_group);
> +
> + return -ENXIO;
> +}
> +
> +struct iommu_group *vfio_find_group_by_liobn(struct kvm *kvm,
> + unsigned long liobn)
As mentioned, kvm_vfio_...
> +{
> + struct kvm_vfio_group *kvg;
> +
> + if (!kvm->arch.vfio)
> + return NULL;
If this is already a slow path you can avoid stashing this pointer and
search kvm->devices for the matching ops struct.
kv->lock...
> +
> + list_for_each_entry(kvg, &kvm->arch.vfio->group_list, node) {
> + if (kvg->liobn == liobn) {
> + int group_id = vfio_external_user_iommu_id(
> + kvg->vfio_group);
> + struct iommu_group *grp =
> + iommu_group_get_by_id(group_id);
nit, ugly line wrapping. Where's the iommu_group_put() done?
> + return grp;
So you've now got an liobn to iommu_group mapping cached away somewhere,
what happens when a group is removed? Would it be invalid for userspace
to re-use the liobn? Do we need a way to invalidate your cached entry
and perhaps do an iommu_group_put()?
This version is actually plausible, so big improvement from v1! Thanks,
Alex
> + }
> + }
> +
> + return NULL;
> +}
> +EXPORT_SYMBOL_GPL(vfio_find_group_by_liobn);
> +#endif
> +
> static int kvm_vfio_set_attr(struct kvm_device *dev,
> struct kvm_device_attr *attr)
> {
> switch (attr->group) {
> case KVM_DEV_VFIO_GROUP:
> return kvm_vfio_set_group(dev, attr->attr, attr->addr);
> +#ifdef CONFIG_SPAPR_TCE_IOMMU
> + case KVM_DEV_VFIO_SPAPR_TCE_LIOBN:
> + return kvm_vfio_set_spapr_tce_liobn(dev, attr->attr,
> + attr->addr);
> +#endif
> }
>
> return -ENXIO;
> @@ -211,6 +278,10 @@ static int kvm_vfio_has_attr(struct kvm_device *dev,
> }
>
> break;
> +#ifdef CONFIG_SPAPR_TCE_IOMMU
> + case KVM_DEV_VFIO_SPAPR_TCE_LIOBN:
> + return 0;
> +#endif
> }
>
> return -ENXIO;
> @@ -251,6 +322,9 @@ static int kvm_vfio_create(struct kvm_device *dev, u32 type)
> mutex_init(&kv->lock);
>
> dev->private = kv;
> +#ifdef CONFIG_SPAPR_TCE_IOMMU
> + dev->kvm->arch.vfio = kv;
> +#endif
>
> return 0;
> }
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2013-11-05 21:32 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-30 17:02 [PATCH v2 0/3] KVM noncoherent DMA registration and VFIO pseudo device Alex Williamson
2013-10-30 17:02 ` [PATCH v2 1/3] kvm: Add VFIO device Alex Williamson
2013-10-30 17:02 ` [PATCH v2 2/3] kvm/x86: Convert iommu_flags to iommu_noncoherent Alex Williamson
2013-10-30 17:02 ` [PATCH v2 3/3] kvm: Create non-coherent DMA registeration Alex Williamson
2013-10-30 18:03 ` [PATCH v2 0/3] KVM noncoherent DMA registration and VFIO pseudo device Paolo Bonzini
2013-11-05 8:05 ` [RFC PATCH v2] KVM: PPC: vfio kvm device: support spapr tce Alexey Kardashevskiy
2013-11-05 21:32 ` Alex Williamson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).