* [PATCH 0/3] Prepare for in-kernel VFIO DMA operations acceleration
From: Alexey Kardashevskiy @ 2014-06-05  7:25 UTC
  To: linuxppc-dev
  Cc: Alexey Kardashevskiy, Benjamin Herrenschmidt, Paul Mackerras,
	Gleb Natapov, Paolo Bonzini, Alexander Graf, kvm, linux-kernel,
	kvm-ppc

This series reserves two capability numbers and implements
KVM_CREATE_SPAPR_TCE_64, an extended version of the KVM_CREATE_SPAPR_TCE
ioctl.

Please advise how to proceed with these patches: I suspect the first two
should go via Paolo's tree while the last one goes via Alex Graf's tree
(correct?).

Thanks!

Alexey Kardashevskiy (3):
  PPC: KVM: Reserve KVM_CAP_SPAPR_TCE_VFIO capability number
  PPC: KVM: Reserve KVM_CAP_SPAPR_TCE_64 capability number
  PPC: KVM: Add support for 64bit TCE windows

 Documentation/virtual/kvm/api.txt   | 46 +++++++++++++++++++++++++++++++++++++
 arch/powerpc/include/asm/kvm_host.h |  4 +++-
 arch/powerpc/include/asm/kvm_ppc.h  |  2 +-
 arch/powerpc/include/uapi/asm/kvm.h |  9 ++++++++
 arch/powerpc/kvm/book3s_64_vio.c    |  4 +++-
 arch/powerpc/kvm/powerpc.c          | 24 ++++++++++++++++++-
 include/uapi/linux/kvm.h            |  4 ++++
 7 files changed, 89 insertions(+), 4 deletions(-)

-- 
2.0.0


* [PATCH 1/3] PPC: KVM: Reserve KVM_CAP_SPAPR_TCE_VFIO capability number
From: Alexey Kardashevskiy @ 2014-06-05  7:25 UTC
  To: linuxppc-dev
  Cc: Alexey Kardashevskiy, Benjamin Herrenschmidt, Paul Mackerras,
	Gleb Natapov, Paolo Bonzini, Alexander Graf, kvm, linux-kernel,
	kvm-ppc

This adds a capability number for in-kernel support for VFIO on
the sPAPR platform.

The capability tells userspace whether the in-kernel H_PUT_TCE handlers
can process VFIO-targeted requests. If they cannot, userspace must not
allocate a TCE table in the host kernel via the KVM_CREATE_SPAPR_TCE
ioctl: the kernel would then consume TCE requests itself instead of
passing them to userspace, which is where they need to be handled in
that case.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 include/uapi/linux/kvm.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index a8f4ee5..944cd21 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -743,6 +743,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_IOAPIC_POLARITY_IGNORED 97
 #define KVM_CAP_ENABLE_CAP_VM 98
 #define KVM_CAP_S390_IRQCHIP 99
+#define KVM_CAP_SPAPR_TCE_VFIO 100
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.0.0
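
A minimal sketch of how userspace could probe the new capability once
this header change lands. KVM_CHECK_EXTENSION is the standard KVM probing
ioctl; the program around it is illustrative only:

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR);

	if (kvm < 0)
		return 1;
	/* KVM_CHECK_EXTENSION returns a positive value when supported */
	if (ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_SPAPR_TCE_VFIO) > 0)
		printf("in-kernel VFIO TCE handling is available\n");
	return 0;
}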


* [PATCH 2/3] PPC: KVM: Reserve KVM_CAP_SPAPR_TCE_64 capability number
From: Alexey Kardashevskiy @ 2014-06-05  7:25 UTC
  To: linuxppc-dev
  Cc: Alexey Kardashevskiy, Benjamin Herrenschmidt, Paul Mackerras,
	Gleb Natapov, Paolo Bonzini, Alexander Graf, kvm, linux-kernel,
	kvm-ppc

This adds a capability number for 64-bit TCE table support.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 include/uapi/linux/kvm.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 944cd21..e6972bf 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -744,6 +744,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_ENABLE_CAP_VM 98
 #define KVM_CAP_S390_IRQCHIP 99
 #define KVM_CAP_SPAPR_TCE_VFIO 100
+#define KVM_CAP_SPAPR_TCE_64 101
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.0.0


* [PATCH 3/3] PPC: KVM: Add support for 64bit TCE windows
From: Alexey Kardashevskiy @ 2014-06-05  7:25 UTC
  To: linuxppc-dev
  Cc: Alexey Kardashevskiy, Benjamin Herrenschmidt, Paul Mackerras,
	Gleb Natapov, Paolo Bonzini, Alexander Graf, kvm, linux-kernel,
	kvm-ppc

The existing KVM_CREATE_SPAPR_TCE ioctl only supports 32-bit windows,
which is not enough for directly mapped windows as the guest can have
more than 4GB of memory.

This adds the KVM_CREATE_SPAPR_TCE_64 ioctl and advertises it
via the KVM_CAP_SPAPR_TCE_64 capability.

Since 64-bit windows are meant to support Dynamic DMA windows (DDW),
this also adds @bus_offset and @page_shift, which are required by DDW.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 Documentation/virtual/kvm/api.txt   | 46 +++++++++++++++++++++++++++++++++++++
 arch/powerpc/include/asm/kvm_host.h |  4 +++-
 arch/powerpc/include/asm/kvm_ppc.h  |  2 +-
 arch/powerpc/include/uapi/asm/kvm.h |  9 ++++++++
 arch/powerpc/kvm/book3s_64_vio.c    |  4 +++-
 arch/powerpc/kvm/powerpc.c          | 24 ++++++++++++++++++-
 include/uapi/linux/kvm.h            |  2 ++
 7 files changed, 87 insertions(+), 4 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index b4f5365..8a2a2da 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2484,6 +2484,52 @@ calls by the guest for that service will be passed to userspace to be
 handled.
 
 
+4.87 KVM_CREATE_SPAPR_TCE_64
+
+Capability: KVM_CAP_SPAPR_TCE_64
+Architectures: powerpc
+Type: vm ioctl
+Parameters: struct kvm_create_spapr_tce_64 (in)
+Returns: file descriptor for manipulating the created TCE table
+
+This is an extension of KVM_CAP_SPAPR_TCE, which only supports 32-bit
+windows.
+
+This creates a virtual TCE (translation control entry) table, which
+is an IOMMU for PAPR-style virtual I/O.  It is used to translate
+logical addresses used in virtual I/O into guest physical addresses,
+and provides a scatter/gather capability for PAPR virtual I/O.
+
+/* for KVM_CAP_SPAPR_TCE_64 */
+struct kvm_create_spapr_tce_64 {
+	__u64 liobn;
+	__u64 window_size;
+	__u64 bus_offset;
+	__u32 page_shift;
+	__u32 flags;
+};
+
+The liobn field gives the logical IO bus number for which to create a
+TCE table. The window_size field specifies the size of the DMA window
+which this TCE table will translate - the table will contain one 64
+bit TCE entry for every IOMMU page. The bus_offset field tells where
+this window is mapped on the IO bus. The page_shift field specifies
+the size of the pages in this window: 4K, 64K, 16MB, etc. The flags
+field is not used at the moment but provides room for extensions.
+
+When the guest issues an H_PUT_TCE/H_PUT_TCE_INDIRECT/H_STUFF_TCE hcall
+on a liobn for which a TCE table has been created using this ioctl(),
+the kernel will handle it in real or virtual mode, updating the TCE table.
+If liobn has not been registered with this ioctl, H_PUT_TCE/etc calls
+will cause a vm exit and must be handled by userspace.
+
+The return value is a file descriptor which can be passed to mmap(2)
+to map the created TCE table into userspace.  This lets userspace read
+the entries written by kernel-handled H_PUT_TCE calls, and also lets
+userspace update the TCE table directly which is useful in some
+circumstances.
+
+
 5. The kvm_run structure
 ------------------------
 
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 1eaea2d..260a810 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -179,7 +179,9 @@ struct kvmppc_spapr_tce_table {
 	struct list_head list;
 	struct kvm *kvm;
 	u64 liobn;
-	u32 window_size;
+	u64 window_size;
+	u64 bus_offset;
+	u32 page_shift;
 	struct page *pages[0];
 };
 
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 4096f16..b472fd3 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -126,7 +126,7 @@ extern void kvmppc_map_vrma(struct kvm_vcpu *vcpu,
 extern int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu);
 
 extern long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
-				struct kvm_create_spapr_tce *args);
+				struct kvm_create_spapr_tce_64 *args);
 extern long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
 			     unsigned long ioba, unsigned long tce);
 extern long kvmppc_h_get_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index a6665be..0ada7b4 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -333,6 +333,15 @@ struct kvm_create_spapr_tce {
 	__u32 window_size;
 };
 
+/* for KVM_CAP_SPAPR_TCE_64 */
+struct kvm_create_spapr_tce_64 {
+	__u64 liobn;
+	__u64 window_size;
+	__u64 bus_offset;
+	__u32 page_shift;
+	__u32 flags;
+};
+
 /* for KVM_ALLOCATE_RMA */
 struct kvm_allocate_rma {
 	__u64 rma_size;
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index 54cf9bc..230fa5f 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -98,7 +98,7 @@ static const struct file_operations kvm_spapr_tce_fops = {
 };
 
 long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
-				   struct kvm_create_spapr_tce *args)
+				   struct kvm_create_spapr_tce_64 *args)
 {
 	struct kvmppc_spapr_tce_table *stt = NULL;
 	long npages;
@@ -120,6 +120,8 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
 
 	stt->liobn = args->liobn;
 	stt->window_size = args->window_size;
+	stt->bus_offset = args->bus_offset;
+	stt->page_shift = args->page_shift;
 	stt->kvm = kvm;
 
 	for (i = 0; i < npages; i++) {
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 3cf541a..3b78b8d 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -33,6 +33,7 @@
 #include <asm/tlbflush.h>
 #include <asm/cputhreads.h>
 #include <asm/irqflags.h>
+#include <asm/iommu.h>
 #include "timing.h"
 #include "irq.h"
 #include "../mm/mmu_decl.h"
@@ -373,6 +374,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 
 #ifdef CONFIG_PPC_BOOK3S_64
 	case KVM_CAP_SPAPR_TCE:
+	case KVM_CAP_SPAPR_TCE_64:
 	case KVM_CAP_PPC_ALLOC_HTAB:
 	case KVM_CAP_PPC_RTAS:
 #ifdef CONFIG_KVM_XICS
@@ -1077,13 +1079,33 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		break;
 	}
 #ifdef CONFIG_PPC_BOOK3S_64
+	case KVM_CREATE_SPAPR_TCE_64: {
+		struct kvm_create_spapr_tce_64 create_tce_64;
+
+		r = -EFAULT;
+		if (copy_from_user(&create_tce_64, argp, sizeof(create_tce_64)))
+			goto out;
+		if (create_tce_64.flags) {
+			r = -EINVAL;
+			goto out;
+		}
+		r = kvm_vm_ioctl_create_spapr_tce(kvm, &create_tce_64);
+		goto out;
+	}
 	case KVM_CREATE_SPAPR_TCE: {
 		struct kvm_create_spapr_tce create_tce;
+		struct kvm_create_spapr_tce_64 create_tce_64;
 
 		r = -EFAULT;
 		if (copy_from_user(&create_tce, argp, sizeof(create_tce)))
 			goto out;
-		r = kvm_vm_ioctl_create_spapr_tce(kvm, &create_tce);
+
+		create_tce_64.liobn = create_tce.liobn;
+		create_tce_64.window_size = create_tce.window_size;
+		create_tce_64.bus_offset = 0;
+		create_tce_64.page_shift = IOMMU_PAGE_SHIFT_4K;
+		create_tce_64.flags = 0;
+		r = kvm_vm_ioctl_create_spapr_tce(kvm, &create_tce_64);
 		goto out;
 	}
 	case KVM_PPC_GET_SMMU_INFO: {
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index e6972bf..c435cbb 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1018,6 +1018,8 @@ struct kvm_s390_ucas_mapping {
 /* Available with KVM_CAP_PPC_ALLOC_HTAB */
 #define KVM_PPC_ALLOCATE_HTAB	  _IOWR(KVMIO, 0xa7, __u32)
 #define KVM_CREATE_SPAPR_TCE	  _IOW(KVMIO,  0xa8, struct kvm_create_spapr_tce)
+#define KVM_CREATE_SPAPR_TCE_64	  _IOW(KVMIO,  0xa8, \
+				       struct kvm_create_spapr_tce_64)
 /* Available with KVM_CAP_RMA */
 #define KVM_ALLOCATE_RMA	  _IOR(KVMIO,  0xa9, struct kvm_allocate_rma)
 /* Available with KVM_CAP_PPC_HTAB_FD */
-- 
2.0.0
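
To make the api.txt text above concrete, here is a sketch of the
userspace flow it describes: create a 64-bit window on a VM fd, then
mmap the returned fd to reach the TCE entries. Only the ioctl, the
struct and the one-TCE-per-IOMMU-page rule come from the patch; the
LIOBN, window size, bus offset and page shift values are made-up
assumptions:

#include <stdint.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/kvm.h>

/* vm_fd is a VM file descriptor obtained earlier via KVM_CREATE_VM */
static uint64_t *map_tce_table(int vm_fd)
{
	struct kvm_create_spapr_tce_64 args = {
		.liobn		= 0x80000001,	/* assumed LIOBN */
		.window_size	= 64ULL << 30,	/* 64GB DMA window */
		.bus_offset	= 1ULL << 59,	/* assumed bus address */
		.page_shift	= 24,		/* 16MB IOMMU pages */
		.flags		= 0,		/* must be zero */
	};
	size_t len;
	void *tbl;
	int tce_fd = ioctl(vm_fd, KVM_CREATE_SPAPR_TCE_64, &args);

	if (tce_fd < 0)
		return NULL;
	/* one 64-bit TCE entry per IOMMU page, as documented above */
	len = (args.window_size >> args.page_shift) * sizeof(uint64_t);
	tbl = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, tce_fd, 0);
	return tbl == MAP_FAILED ? NULL : (uint64_t *)tbl;
}

The returned pointer lets userspace read the entries written by
kernel-handled H_PUT_TCE calls, exactly as the documentation above
describes.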


* Re: [PATCH 3/3] PPC: KVM: Add support for 64bit TCE windows
From: Benjamin Herrenschmidt @ 2014-06-05  7:38 UTC
  To: Alexey Kardashevskiy
  Cc: linuxppc-dev, Paul Mackerras, Gleb Natapov, Paolo Bonzini,
	Alexander Graf, kvm, linux-kernel, kvm-ppc

On Thu, 2014-06-05 at 17:25 +1000, Alexey Kardashevskiy wrote:
> +This creates a virtual TCE (translation control entry) table, which
> +is an IOMMU for PAPR-style virtual I/O.  It is used to translate
> +logical addresses used in virtual I/O into guest physical addresses,
> +and provides a scatter/gather capability for PAPR virtual I/O.
> +
> +/* for KVM_CAP_SPAPR_TCE_64 */
> +struct kvm_create_spapr_tce_64 {
> +       __u64 liobn;
> +       __u64 window_size;
> +       __u64 bus_offset;
> +       __u32 page_shift;
> +       __u32 flags;
> +};
> +
> +The liobn field gives the logical IO bus number for which to create a
> +TCE table. The window_size field specifies the size of the DMA window
> +which this TCE table will translate - the table will contain one 64
> +bit TCE entry for every IOMMU page. The bus_offset field tells where
> +this window is mapped on the IO bus. 

Hrm, the bus_offset cannot be set arbitrarily; it has some pretty strong
HW limits depending on the type of bridge & architecture version...

Do you plan to have that knowledge in qemu? Or do you have some other
mechanism to query it? (I might be missing a piece of the puzzle here).

Also one thing I've been pondering ...

We'll end up wasting a ton of memory with those TCE tables. If you have
3 PEs mapped into a guest, it will try to create 3 DDWs mapping the
entire guest memory, and so 3 TCE tables large enough for that ... and
which will contain exactly the same entries!

We really want to look into extending PAPR to allow the creation of
table "aliases" so that the guest can essentially create one table and
associate it with multiple PEs. We might still decide to do multiple
copies for NUMA reasons, but no more than one per node for example... at
least we can have the policy in qemu/kvm.

Also, do you currently require allocating a single physically contiguous
table, or do you support TCE trees in your implementation?

Cheers,
Ben.


* Re: [PATCH 3/3] PPC: KVM: Add support for 64bit TCE windows
From: Alexey Kardashevskiy @ 2014-06-05  9:26 UTC
  To: Benjamin Herrenschmidt
  Cc: linuxppc-dev, Paul Mackerras, Gleb Natapov, Paolo Bonzini,
	Alexander Graf, kvm, linux-kernel, kvm-ppc

On 06/05/2014 05:38 PM, Benjamin Herrenschmidt wrote:
> On Thu, 2014-06-05 at 17:25 +1000, Alexey Kardashevskiy wrote:
>> +This creates a virtual TCE (translation control entry) table, which
>> +is an IOMMU for PAPR-style virtual I/O.  It is used to translate
>> +logical addresses used in virtual I/O into guest physical addresses,
>> +and provides a scatter/gather capability for PAPR virtual I/O.
>> +
>> +/* for KVM_CAP_SPAPR_TCE_64 */
>> +struct kvm_create_spapr_tce_64 {
>> +       __u64 liobn;
>> +       __u64 window_size;
>> +       __u64 bus_offset;
>> +       __u32 page_shift;
>> +       __u32 flags;
>> +};
>> +
>> +The liobn field gives the logical IO bus number for which to create a
>> +TCE table. The window_size field specifies the size of the DMA window
>> +which this TCE table will translate - the table will contain one 64
>> +bit TCE entry for every IOMMU page. The bus_offset field tells where
>> +this window is mapped on the IO bus. 
> 
> Hrm, the bus_offset cannot be set arbitrarily; it has some pretty strong
> HW limits depending on the type of bridge & architecture version...
> 
> Do you plan to have that knowledge in qemu? Or do you have some other
> mechanism to query it? (I might be missing a piece of the puzzle here).


Yes. QEMU will have this knowledge as it has to implement
ibm,create-pe-dma-window and return this address to the guest. There will
be a container API to receive it from the powernv code via a funky ppc_md
callback.

There are 2 steps:
1. query + create window
2. enable in-kernel KVM acceleration for it.

Everything will work without step 2 and, frankly speaking, we do not need
it that much for DDW, but it does not cost much.

By having bus_offset in the ioctl, which is only used for step 2, I reduce
the dependence on powernv.


> Also one thing I've been pondering ...
> 
> We'll end up wasting a ton of memory with those TCE tables. If you have
> 3 PEs mapped into a guest, it will try to create 3 DDWs mapping the
> entire guest memory, and so 3 TCE tables large enough for that ... and
> which will contain exactly the same entries!

This is in the plan too, do not rush :)


> We really want to look into extending PAPR to allow the creation of
> table "aliases" so that the guest can essentially create one table and
> associate it with multiple PEs. We might still decide to do multiple
> copies for NUMA reasons, but no more than one per node for example... at
> least we can have the policy in qemu/kvm.
> 
> Also, do you currently require allocating a single physically contiguous
> table, or do you support TCE trees in your implementation?


No trees yet. For a 64GB window with 16MB pages we need
(64<<30)/(16<<20)*8 = 32K for the TCE table. Do we really need trees?


-- 
Alexey

* Re: [PATCH 3/3] PPC: KVM: Add support for 64bit TCE windows
From: Benjamin Herrenschmidt @ 2014-06-05 10:27 UTC
  To: Alexey Kardashevskiy
  Cc: linuxppc-dev, Paul Mackerras, Gleb Natapov, Paolo Bonzini,
	Alexander Graf, kvm, linux-kernel, kvm-ppc

On Thu, 2014-06-05 at 19:26 +1000, Alexey Kardashevskiy wrote:
> 
> No trees yet. For a 64GB window with 16MB pages we need
> (64<<30)/(16<<20)*8 = 32K for the TCE table. Do we really need trees?

The above assumes hugetlbfs-backed guests. Those are the least of my
worries indeed. But we need to deal with 4k and 64k guests.

Cheers,
Ben
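
A back-of-envelope check of the sizes being debated, assuming one
8-byte TCE per IOMMU page as the proposed api.txt text states (the
program itself is illustration only):

#include <stdio.h>

int main(void)
{
	unsigned long long window = 64ULL << 30;	/* 64GB window */
	unsigned int shift[] = { 12, 16, 24 };		/* 4K, 64K, 16MB */
	int i;

	/* table size = (window / page size) entries * 8 bytes each */
	for (i = 0; i < 3; i++)
		printf("page_shift %2u -> %9llu KB of TCEs\n",
		       shift[i], ((window >> shift[i]) * 8) >> 10);
	return 0;
}

For 16MB pages this gives the 32K figure quoted above; for a 4k guest
the same window needs 128MB of pinned table per PE, which is the
concern here.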



* Re: [PATCH 3/3] PPC: KVM: Add support for 64bit TCE windows
  2014-06-05 10:27         ` Benjamin Herrenschmidt
@ 2014-06-05 11:56           ` Alexander Graf
  -1 siblings, 0 replies; 50+ messages in thread
From: Alexander Graf @ 2014-06-05 11:56 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Alexey Kardashevskiy
  Cc: linuxppc-dev, Paul Mackerras, Gleb Natapov, Paolo Bonzini, kvm,
	linux-kernel, kvm-ppc


On 05.06.14 12:27, Benjamin Herrenschmidt wrote:
> On Thu, 2014-06-05 at 19:26 +1000, Alexey Kardashevskiy wrote:
>> No trees yet. For a 64GB window we need a (64<<30)/(16<<20)*8 = 32K TCE table.
>> Do we really need trees?
> The above assumes hugetlbfs-backed guests. These are the least of my worries,
> indeed. But we need to deal with 4k and 64k guests.

What if we ask user space to give us a pointer to user space allocated 
memory along with the TCE registration? We would still ask user space to 
only use the returned fd for TCE modifications, but would have some 
nicely swappable memory we can store the TCE entries in.
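
A rough sketch of what that could look like at the ioctl level -- the
structure and field names below are invented for illustration and are
not an existing uAPI:

/* Invented for illustration only: user space allocates the buffer that
 * backs the TCE entries and passes it in with the registration, so the
 * backing memory stays swappable instead of pinned in the kernel.
 */
struct kvm_create_spapr_tce_user {
        __u64 liobn;            /* logical IO bus number of the window */
        __u64 window_size;      /* size of the DMA window in bytes */
        __u64 tce_array_uva;    /* user VA of the buffer backing the TCEs */
};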

In fact, the code as is today can allocate an arbitrary amount of pinned 
kernel memory from within user space without any checks.


Alex


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/3] Prepare for in-kernel VFIO DMA operations acceleration
  2014-06-05  7:25 ` Alexey Kardashevskiy
@ 2014-06-05 11:57   ` Alexander Graf
  -1 siblings, 0 replies; 50+ messages in thread
From: Alexander Graf @ 2014-06-05 11:57 UTC (permalink / raw)
  To: Alexey Kardashevskiy, linuxppc-dev
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Gleb Natapov,
	Paolo Bonzini, kvm, linux-kernel, kvm-ppc


On 05.06.14 09:25, Alexey Kardashevskiy wrote:
> This reserves 2 capability numbers.
>
> This implements an extended version of KVM_CREATE_SPAPR_TCE_64 ioctl.
>
> Please advise how to proceed with these patches as I suspect that
> first two should go via Paolo's tree while the last one via Alex Graf's tree
> (correct?).

They would just go via my tree, but only be actually allocated (read: 
mergeable to qemu) when they hit Paolo's tree.

In fact, I don't think it makes sense to split them off at all.


Alex


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 3/3] PPC: KVM: Add support for 64bit TCE windows
  2014-06-05 11:56           ` Alexander Graf
@ 2014-06-05 12:30             ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 50+ messages in thread
From: Benjamin Herrenschmidt @ 2014-06-05 12:30 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Alexey Kardashevskiy, linuxppc-dev, Paul Mackerras, Gleb Natapov,
	Paolo Bonzini, kvm, linux-kernel, kvm-ppc

On Thu, 2014-06-05 at 13:56 +0200, Alexander Graf wrote:
> What if we ask user space to give us a pointer to user space allocated 
> memory along with the TCE registration? We would still ask user space to 
> only use the returned fd for TCE modifications, but would have some 
> nicely swappable memory we can store the TCE entries in.

That isn't going to work terribly well for VFIO :-) But yes, for
emulated devices, we could improve things a bit, including for
the 32-bit TCE tables.

For emulated devices, the real mode path could walk the page tables and fall back
to virtual mode & get_user if the page isn't present, thus operating
directly on qemu memory TCE tables instead of the current pinned stuff.

That has a cost in performance, but since it's really only
used for emulated devices and PAPR VIOs, it might not be a huge issue.

But for VFIO we don't have much choice: we need to create something the
HW can access.
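
As a sketch of that fallback shape (simplified, with an invented lookup
helper; H_SUCCESS and H_TOO_HARD are the real hypercall return codes,
and returning H_TOO_HARD from the real-mode handler is what punts the
hcall to the virtual mode path):

/* Simplified sketch, not the actual handler: handle only the easy case
 * in real mode and return H_TOO_HARD otherwise, so the hypercall is
 * redone in virtual mode where get_user() can fault the page in.
 */
static long h_put_tce_rm_sketch(u64 liobn, u64 ioba, u64 tce)
{
        u64 *entry = realmode_lookup_tce_entry(liobn, ioba); /* invented */

        if (!entry)
                return H_TOO_HARD;      /* retry in virtual mode */

        *entry = tce;
        return H_SUCCESS;
}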

> In fact, the code as is today can allocate an arbitrary amount of pinned 
> kernel memory from within user space without any checks.

Right. We should at least account it in the locked limit.
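
Something along the lines of the existing pinned-page accounting, say
(a sketch of the pattern only, not the eventual patch):

/* Charge 'npages' pinned pages against RLIMIT_MEMLOCK before
 * allocating the table; CAP_IPC_LOCK bypasses the limit as usual.
 */
static long account_tce_pages(long npages)
{
        long ret = 0, locked, lock_limit;

        down_write(&current->mm->mmap_sem);
        locked = current->mm->locked_vm + npages;
        lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
        if (locked > lock_limit && !capable(CAP_IPC_LOCK))
                ret = -ENOMEM;
        else
                current->mm->locked_vm += npages;
        up_write(&current->mm->mmap_sem);

        return ret;
}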

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 3/3] PPC: KVM: Add support for 64bit TCE windows
  2014-06-05 12:30             ` Benjamin Herrenschmidt
@ 2014-06-05 12:32               ` Alexander Graf
  -1 siblings, 0 replies; 50+ messages in thread
From: Alexander Graf @ 2014-06-05 12:32 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alexey Kardashevskiy, linuxppc-dev, Paul Mackerras, Gleb Natapov,
	Paolo Bonzini, kvm, linux-kernel, kvm-ppc


On 05.06.14 14:30, Benjamin Herrenschmidt wrote:
> On Thu, 2014-06-05 at 13:56 +0200, Alexander Graf wrote:
>> What if we ask user space to give us a pointer to user space allocated
>> memory along with the TCE registration? We would still ask user space to
>> only use the returned fd for TCE modifications, but would have some
>> nicely swappable memory we can store the TCE entries in.
> That isn't going to work terribly well for VFIO :-) But yes, for
> emulated devices, we could improve things a bit, including for
> the 32-bit TCE tables.
>
> For emulated devices, the real mode path could walk the page tables and fall back
> to virtual mode & get_user if the page isn't present, thus operating
> directly on qemu memory TCE tables instead of the current pinned stuff.
>
> That has a cost in performance, but since it's really only
> used for emulated devices and PAPR VIOs, it might not be a huge issue.
>
> But for VFIO we don't have much choice: we need to create something the
> HW can access.

But we need to create separate tables for VFIO anyway, because these
TCE tables contain virtual addresses, no?


Alex

>
>> In fact, the code as is today can allocate an arbitrary amount of pinned
>> kernel memory from within user space without any checks.
> Right. We should at least account it in the locked limit.
>
> Cheers,
> Ben.
>
>


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 3/3] PPC: KVM: Add support for 64bit TCE windows
  2014-06-05 12:30             ` Benjamin Herrenschmidt
@ 2014-06-05 13:04               ` Alexey Kardashevskiy
  -1 siblings, 0 replies; 50+ messages in thread
From: Alexey Kardashevskiy @ 2014-06-05 13:04 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Alexander Graf
  Cc: linuxppc-dev, Paul Mackerras, Gleb Natapov, Paolo Bonzini, kvm,
	linux-kernel, kvm-ppc

On 06/05/2014 10:30 PM, Benjamin Herrenschmidt wrote:
> On Thu, 2014-06-05 at 13:56 +0200, Alexander Graf wrote:
>> What if we ask user space to give us a pointer to user space allocated 
>> memory along with the TCE registration? We would still ask user space to 
>> only use the returned fd for TCE modifications, but would have some 
>> nicely swappable memory we can store the TCE entries in.
> 
> That isn't going to work terribly well for VFIO :-) But yes, for
> emulated devices, we could improve things a bit, including for
> the 32-bit TCE tables.
> 
> For emulated devices, the real mode path could walk the page tables and fall back
> to virtual mode & get_user if the page isn't present, thus operating
> directly on qemu memory TCE tables instead of the current pinned stuff.
>
> That has a cost in performance, but since it's really only
> used for emulated devices and PAPR VIOs, it might not be a huge issue.
>
> But for VFIO we don't have much choice: we need to create something the
> HW can access.

You are confusing things here.

There are 2 tables:
1. the guest-visible TCE table, which is what is allocated for VIO or emulated PCI;
2. the real HW DMA window; one exists already for DMA32 and one I will
allocate for a huge window.

I have just #2 for VFIO now, but we will need both in order to implement
H_GET_TCE correctly, and this is the table I will allocate with this new ioctl.
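
Roughly, as a data-structure sketch (illustration only, names invented):

/* The two tables described above.  The guest-visible table is what
 * H_PUT_TCE writes and what H_GET_TCE must read back; the HW table is
 * the real DMA window the IOMMU walks, and is the only one strictly
 * needed for VFIO today.
 */
struct tce_window_pair {
        u64 *guest_view;        /* guest-visible TCEs (VIO / emulated PCI) */
        u64 *hw_table;          /* real HW DMA window entries */
        unsigned long entries;  /* number of TCE entries in the window */
};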


>> In fact, the code as is today can allocate an arbitrary amount of pinned 
>> kernel memory from within user space without any checks.
> 
> Right. We should at least account it in the locked limit.

Yup. And (probably) this thing will keep a counter of how many windows were
created per KVM instance to avoid having multiple copies of the same table.


-- 
Alexey

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/3] Prepare for in-kernel VFIO DMA operations acceleration
  2014-06-05 11:57   ` Alexander Graf
@ 2014-06-06  0:20     ` Alexey Kardashevskiy
  -1 siblings, 0 replies; 50+ messages in thread
From: Alexey Kardashevskiy @ 2014-06-06  0:20 UTC (permalink / raw)
  To: Alexander Graf, linuxppc-dev
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Gleb Natapov,
	Paolo Bonzini, kvm, linux-kernel, kvm-ppc

On 06/05/2014 09:57 PM, Alexander Graf wrote:
> 
> On 05.06.14 09:25, Alexey Kardashevskiy wrote:
>> This reserves 2 capability numbers.
>>
>> This implements an extended version of KVM_CREATE_SPAPR_TCE_64 ioctl.
>>
>> Please advise how to proceed with these patches as I suspect that
>> first two should go via Paolo's tree while the last one via Alex Graf's tree
>> (correct?).
> 
> They would just go via my tree, but only be actually allocated (read:
> mergeable to qemu) when they hit Paolo's tree.
> 
> In fact, I don't think it makes sense to split them off at all.


So? Are these patches going anywhere? Thanks.


-- 
Alexey

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/3] Prepare for in-kernel VFIO DMA operations acceleration
  2014-06-06  0:20     ` Alexey Kardashevskiy
@ 2014-06-25 21:12       ` Alexander Graf
  -1 siblings, 0 replies; 50+ messages in thread
From: Alexander Graf @ 2014-06-25 21:12 UTC (permalink / raw)
  To: Alexey Kardashevskiy, linuxppc-dev
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Gleb Natapov,
	Paolo Bonzini, kvm, linux-kernel, kvm-ppc


On 06.06.14 02:20, Alexey Kardashevskiy wrote:
> On 06/05/2014 09:57 PM, Alexander Graf wrote:
>> On 05.06.14 09:25, Alexey Kardashevskiy wrote:
>>> This reserves 2 capability numbers.
>>>
>>> This implements an extended version of KVM_CREATE_SPAPR_TCE_64 ioctl.
>>>
>>> Please advise how to proceed with these patches as I suspect that
>>> first two should go via Paolo's tree while the last one via Alex Graf's tree
>>> (correct?).
>> They would just go via my tree, but only be actually allocated (read:
>> mergeable to qemu) when they hit Paolo's tree.
>>
>> In fact, I don't think it makes sense to split them off at all.
>
> So? Are these patches going anywhere? Thanks.

So? Are you going to address the comments?


Alex


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/3] Prepare for in-kernel VFIO DMA operations acceleration
  2014-06-25 21:12       ` Alexander Graf
@ 2014-06-25 23:59         ` Alexey Kardashevskiy
  -1 siblings, 0 replies; 50+ messages in thread
From: Alexey Kardashevskiy @ 2014-06-25 23:59 UTC (permalink / raw)
  To: Alexander Graf, linuxppc-dev
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Gleb Natapov,
	Paolo Bonzini, kvm, linux-kernel, kvm-ppc

On 06/26/2014 07:12 AM, Alexander Graf wrote:
> 
> On 06.06.14 02:20, Alexey Kardashevskiy wrote:
>> On 06/05/2014 09:57 PM, Alexander Graf wrote:
>>> On 05.06.14 09:25, Alexey Kardashevskiy wrote:
>>>> This reserves 2 capability numbers.
>>>>
>>>> This implements an extended version of KVM_CREATE_SPAPR_TCE_64 ioctl.
>>>>
>>>> Please advise how to proceed with these patches as I suspect that
>>>> first two should go via Paolo's tree while the last one via Alex Graf's
>>>> tree
>>>> (correct?).
>>> They would just go via my tree, but only be actually allocated (read:
>>> mergeable to qemu) when they hit Paolo's tree.
>>>
>>> In fact, I don't think it makes sense to split them off at all.
>>
>> So? Are these patches going anywhere? Thanks.
> 
> So? Are you going to address the comments?

Sorry, I cannot find anything here to fix. Ben asked some questions, I
answered, and there were no objections. What am I missing this time?...


-- 
Alexey

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/3] Prepare for in-kernel VFIO DMA operations acceleration
  2014-06-25 23:59         ` Alexey Kardashevskiy
@ 2014-06-26 10:37           ` Alexander Graf
  -1 siblings, 0 replies; 50+ messages in thread
From: Alexander Graf @ 2014-06-26 10:37 UTC (permalink / raw)
  To: Alexey Kardashevskiy, linuxppc-dev
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Gleb Natapov,
	Paolo Bonzini, kvm, linux-kernel, kvm-ppc


On 26.06.14 01:59, Alexey Kardashevskiy wrote:
> On 06/26/2014 07:12 AM, Alexander Graf wrote:
>> On 06.06.14 02:20, Alexey Kardashevskiy wrote:
>>> On 06/05/2014 09:57 PM, Alexander Graf wrote:
>>>> On 05.06.14 09:25, Alexey Kardashevskiy wrote:
>>>>> This reserves 2 capability numbers.
>>>>>
>>>>> This implements an extended version of KVM_CREATE_SPAPR_TCE_64 ioctl.
>>>>>
>>>>> Please advise how to proceed with these patches as I suspect that
>>>>> first two should go via Paolo's tree while the last one via Alex Graf's
>>>>> tree
>>>>> (correct?).
>>>> They would just go via my tree, but only be actually allocated (read:
>>>> mergeable to qemu) when they hit Paolo's tree.
>>>>
>>>> In fact, I don't think it makes sense to split them off at all.
>>> So? Are these patches going anywhere? Thanks.
>> So? Are you going to address the comments?
> Sorry, I cannot find anything here to fix. Ben asked some questions, I
> answered, and there were no objections. What am I missing this time?...

> >> In fact, the code as is today can allocate an arbitrary amount of pinned
> >> kernel memory from within user space without any checks.
> >
> > Right. We should at least account it in the locked limit.
>
> Yup. And (probably) this thing will keep a counter of how many windows were
> created per KVM instance to avoid having multiple copies of the same table.


Alex


^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2014-06-26 10:37 UTC | newest]

Thread overview: 16 messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-05  7:25 [PATCH 0/3] Prepare for in-kernel VFIO DMA operations acceleration Alexey Kardashevskiy
2014-06-05  7:25 ` [PATCH 1/3] PPC: KVM: Reserve KVM_CAP_SPAPR_TCE_VFIO capability number Alexey Kardashevskiy
2014-06-05  7:25 ` [PATCH 2/3] PPC: KVM: Reserve KVM_CAP_SPAPR_TCE_64 " Alexey Kardashevskiy
2014-06-05  7:25 ` [PATCH 3/3] PPC: KVM: Add support for 64bit TCE windows Alexey Kardashevskiy
2014-06-05  7:38   ` Benjamin Herrenschmidt
2014-06-05  9:26     ` Alexey Kardashevskiy
2014-06-05 10:27       ` Benjamin Herrenschmidt
2014-06-05 11:56         ` Alexander Graf
2014-06-05 12:30           ` Benjamin Herrenschmidt
2014-06-05 12:32             ` Alexander Graf
2014-06-05 13:04             ` Alexey Kardashevskiy
2014-06-05 11:57 ` [PATCH 0/3] Prepare for in-kernel VFIO DMA operations acceleration Alexander Graf
2014-06-06  0:20   ` Alexey Kardashevskiy
2014-06-25 21:12     ` Alexander Graf
2014-06-25 23:59       ` Alexey Kardashevskiy
2014-06-26 10:37         ` Alexander Graf
