* [PATCH 0/5] powerpc: implement reset/shutdown hcalls
@ 2013-07-15 11:23 ` Bharat Bhushan
  0 siblings, 0 replies; 103+ messages in thread
From: Bharat Bhushan @ 2013-07-15 11:11 UTC (permalink / raw)
  To: kvm, kvm-ppc, agraf, scottwood, stuart.yoder; +Cc: Bharat Bhushan

From: Bharat Bhushan <bharat.bhushan@freescale.com>

This patchset implements the ePAPR hcall exit interface to userspace.
It also adds the reset and shutdown hcalls.

Bharat Bhushan (5):
  powerpc: define ePAPR hcall exit interface
  booke: exit to guest userspace for unimplemented hcalls in kvm
  booke: define reset and shutdown hcalls
  powerpc: Resolve KVM_HC_FEATURES compilation dependency
  powerpc: using reset hcall when kvm,has-reset

 Documentation/virtual/kvm/api.txt        |   20 ++++++++++++
 Documentation/virtual/kvm/hypercalls.txt |   16 ++++++++++
 arch/powerpc/include/asm/kvm_para.h      |    1 +
 arch/powerpc/kernel/epapr_paravirt.c     |   12 +++++++
 arch/powerpc/kvm/booke.c                 |   47 +++++++++++++++++++++++++----
 arch/powerpc/kvm/powerpc.c               |    1 +
 include/uapi/linux/kvm.h                 |    8 +++++
 include/uapi/linux/kvm_para.h            |    3 +-
 8 files changed, 100 insertions(+), 8 deletions(-)


* [PATCH 1/5] powerpc: define ePAPR hcall exit interface
  2013-07-15 11:23 ` Bharat Bhushan
@ 2013-07-15 11:23   ` Bharat Bhushan
  -1 siblings, 0 replies; 103+ messages in thread
From: Bharat Bhushan @ 2013-07-15 11:11 UTC (permalink / raw)
  To: kvm, kvm-ppc, agraf, scottwood, stuart.yoder
  Cc: Bharat Bhushan, Bharat Bhushan

This patch defines the ePAPR hcall exit interface to guest user space.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
---
 Documentation/virtual/kvm/api.txt |   20 ++++++++++++++++++++
 include/uapi/linux/kvm.h          |    7 +++++++
 2 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 66dd2aa..054f2f4 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2597,6 +2597,26 @@ The possible hypercalls are defined in the Power Architecture Platform
 Requirements (PAPR) document available from www.power.org (free
 developer registration required to access it).
 
+		/* KVM_EXIT_EPAPR_HCALL */
+		struct {
+			__u64 nr;
+			__u64 ret;
+			__u64 args[8];
+		} epapr_hcall;
+
+This is used on PowerPC platforms that support ePAPR hcalls.
+It occurs when a guest does a hypercall (as defined
+in the ePAPR 1.1) and the hcall is not handled by the kernel.
+
+The 'nr' field contains the hypercall number (from the guest R11),
+and 'args' contains the arguments (from the guest R3 - R10).
+Userspace should put the return code in 'ret' and any extra returned
+values in args[].  If the VM is not in 64-bit mode KVM zeros
+the upper half of each field in the struct.
+
+As per the ePAPR hcall ABI, the return value is returned to
+the guest in R3 and output return values in R4 - R10.
+
 		/* KVM_EXIT_S390_TSCH */
 		struct {
 			__u16 subchannel_id;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index acccd08..01ee50e 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -171,6 +171,7 @@ struct kvm_pit_config {
 #define KVM_EXIT_WATCHDOG         21
 #define KVM_EXIT_S390_TSCH        22
 #define KVM_EXIT_EPR              23
+#define KVM_EXIT_EPAPR_HCALL      24
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -288,6 +289,12 @@ struct kvm_run {
 			__u64 ret;
 			__u64 args[9];
 		} papr_hcall;
+		/* KVM_EXIT_EPAPR_HCALL */
+		struct {
+			__u64 nr;
+			__u64 ret;
+			__u64 args[8];
+		} epapr_hcall;
 		/* KVM_EXIT_S390_TSCH */
 		struct {
 			__u16 subchannel_id;
-- 
1.7.0.4
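
The exit record documented in this patch can be modeled outside the kernel.
The sketch below is only an illustration under stated assumptions: the struct
is a local mirror of the proposed epapr_hcall member, and fill_epapr_hcall()
and its model register file are hypothetical names, not kernel or QEMU code.
It shows the register mapping (R11 to 'nr', R3 - R10 to 'args') and the
zeroing of the upper halves when the guest is not in 64-bit mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local mirror of the epapr_hcall member proposed for struct kvm_run. */
struct epapr_hcall_exit {
	uint64_t nr;      /* hypercall number, from guest R11 */
	uint64_t ret;     /* return code, written by userspace */
	uint64_t args[8]; /* arguments, from guest R3 - R10 */
};

/* Hypothetical helper: capture an hcall from a model register file. */
static void fill_epapr_hcall(struct epapr_hcall_exit *h,
			     const uint64_t gpr[32], bool guest_is_64bit)
{
	/* Outside 64-bit mode only the low 32 bits of each register count. */
	const uint64_t mask = guest_is_64bit ? UINT64_MAX : UINT32_MAX;
	int i;

	assert(h && gpr);
	h->nr = gpr[11] & mask;
	h->ret = 0;
	for (i = 0; i < 8; i++)
		h->args[i] = gpr[3 + i] & mask;
}
```

Userspace would then inspect 'nr', act on the request, and store the return
code in 'ret' before re-entering the guest.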




* [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
  2013-07-15 11:23 ` Bharat Bhushan
@ 2013-07-15 11:23   ` Bharat Bhushan
  -1 siblings, 0 replies; 103+ messages in thread
From: Bharat Bhushan @ 2013-07-15 11:11 UTC (permalink / raw)
  To: kvm, kvm-ppc, agraf, scottwood, stuart.yoder
  Cc: Bharat Bhushan, Bharat Bhushan

Exit to guest user space if kvm does not implement the hcall.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
---
 arch/powerpc/kvm/booke.c   |   47 +++++++++++++++++++++++++++++++++++++------
 arch/powerpc/kvm/powerpc.c |    1 +
 include/uapi/linux/kvm.h   |    1 +
 3 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 17722d8..c8b41b4 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		break;
 
 #ifdef CONFIG_KVM_BOOKE_HV
-	case BOOKE_INTERRUPT_HV_SYSCALL:
+	case BOOKE_INTERRUPT_HV_SYSCALL: {
+		int i;
 		if (!(vcpu->arch.shared->msr & MSR_PR)) {
-			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
+			r = kvmppc_kvm_pv(vcpu);
+			if (r != EV_UNIMPLEMENTED) {
+				/* hcall handled in kvm: return to guest */
+				kvmppc_set_gpr(vcpu, 3, r);
+				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
+				r = RESUME_GUEST;
+				break;
+			}
+			/* Exit to userspace for unimplemented hcalls in kvm */
+			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
+			run->epapr_hcall.ret = 0;
+			for (i = 0; i < 8; i++)
+				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
+			vcpu->arch.hcall_needed = 1;
+			kvmppc_account_exit(vcpu, SYSCALL_EXITS);
+			r = RESUME_HOST;
 		} else {
 			/*
 			 * hcall from guest userspace -- send privileged
@@ -1016,22 +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 			kvmppc_core_queue_program(vcpu, ESR_PPR);
 		}
 
-		r = RESUME_GUEST;
+		run->exit_reason = KVM_EXIT_EPAPR_HCALL;
 		break;
+	}
 #else
-	case BOOKE_INTERRUPT_SYSCALL:
+	case BOOKE_INTERRUPT_SYSCALL: {
+		int i;
+		r = RESUME_GUEST;
 		if (!(vcpu->arch.shared->msr & MSR_PR) &&
 		    (((u32)kvmppc_get_gpr(vcpu, 0)) == KVM_SC_MAGIC_R0)) {
 			/* KVM PV hypercalls */
-			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
-			r = RESUME_GUEST;
+			r = kvmppc_kvm_pv(vcpu);
+			if (r != EV_UNIMPLEMENTED) {
+				/* hcall handled in kvm: return to guest */
+				kvmppc_set_gpr(vcpu, 3, r);
+				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
+				r = RESUME_GUEST;
+				break;
+			}
+			/* Exit to userspace for unimplemented hcalls in kvm */
+			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
+			run->epapr_hcall.ret = 0;
+			for (i = 0; i < 8; i++)
+				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
+			vcpu->arch.hcall_needed = 1;
+			run->exit_reason = KVM_EXIT_EPAPR_HCALL;
+			r = RESUME_HOST;
 		} else {
 			/* Guest syscalls */
 			kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_SYSCALL);
 		}
 		kvmppc_account_exit(vcpu, SYSCALL_EXITS);
-		r = RESUME_GUEST;
 		break;
+	}
 #endif
 
 	case BOOKE_INTERRUPT_DTLB_MISS: {
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 4e05f8c..6c6199d 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -307,6 +307,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_PPC_BOOKE_SREGS:
 	case KVM_CAP_PPC_BOOKE_WATCHDOG:
 	case KVM_CAP_PPC_EPR:
+	case KVM_CAP_EXIT_EPAPR_HCALL:
 #else
 	case KVM_CAP_PPC_SEGSTATE:
 	case KVM_CAP_PPC_HIOR:
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 01ee50e..b5266e5 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -674,6 +674,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_PPC_RTAS 91
 #define KVM_CAP_IRQ_XICS 92
 #define KVM_CAP_ARM_EL1_32BIT 93
+#define KVM_CAP_EXIT_EPAPR_HCALL 94
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
1.7.0.4
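
The decision this patch describes (return to the guest when kvm handles the
hcall, exit to userspace when it does not) can be sketched as a standalone
model. Everything here is an assumption made for illustration: the model_*
types and functions are hypothetical stand-ins for the kernel structures, the
in-kernel handler knows a single made-up hcall number, and the numeric value
of EV_UNIMPLEMENTED is assumed rather than taken from this thread.

```c
#include <assert.h>
#include <stdint.h>

enum resume { RESUME_GUEST, RESUME_HOST }; /* stand-ins for kernel values */

#define EV_UNIMPLEMENTED 12 /* assumed ePAPR error code */
#define EXIT_EPAPR_HCALL 24 /* KVM_EXIT_EPAPR_HCALL as proposed in patch 1 */

struct model_run {
	uint32_t exit_reason;
	struct {
		uint64_t nr, ret, args[8];
	} epapr_hcall;
};

struct model_vcpu {
	uint64_t gpr[32];
	int hcall_needed;
};

/* Hypothetical in-kernel handler: only hcall number 4 is implemented. */
static unsigned long model_kvm_pv(struct model_vcpu *vcpu)
{
	return vcpu->gpr[11] == 4 ? 0 : EV_UNIMPLEMENTED;
}

static enum resume model_handle_hcall(struct model_run *run,
				      struct model_vcpu *vcpu)
{
	unsigned long r = model_kvm_pv(vcpu);
	int i;

	if (r != EV_UNIMPLEMENTED) {
		/* Handled in the kernel: result goes back in R3. */
		vcpu->gpr[3] = r;
		return RESUME_GUEST;
	}

	/* Not handled: hand number and arguments to userspace. */
	run->epapr_hcall.nr = vcpu->gpr[11];
	run->epapr_hcall.ret = 0;
	for (i = 0; i < 8; i++)
		run->epapr_hcall.args[i] = vcpu->gpr[3 + i];
	vcpu->hcall_needed = 1;
	run->exit_reason = EXIT_EPAPR_HCALL;
	return RESUME_HOST;
}
```

In this sketch the exit reason is written only on the path that returns
RESUME_HOST, so paths that resume the guest leave the run structure untouched.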




* [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-15 11:23 ` Bharat Bhushan
@ 2013-07-15 11:23   ` Bharat Bhushan
  -1 siblings, 0 replies; 103+ messages in thread
From: Bharat Bhushan @ 2013-07-15 11:11 UTC (permalink / raw)
  To: kvm, kvm-ppc, agraf, scottwood, stuart.yoder
  Cc: Bharat Bhushan, Bharat Bhushan

KVM_HC_VM_RESET: Requests that the virtual machine be reset.
KVM_HC_VM_SHUTDOWN: Requests that the virtual machine be powered-off/halted.

These hcalls are handled by guest userspace.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
---
 Documentation/virtual/kvm/hypercalls.txt |   16 ++++++++++++++++
 include/uapi/linux/kvm_para.h            |    3 ++-
 2 files changed, 18 insertions(+), 1 deletions(-)

diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
index ea113b5..58acdc1 100644
--- a/Documentation/virtual/kvm/hypercalls.txt
+++ b/Documentation/virtual/kvm/hypercalls.txt
@@ -64,3 +64,19 @@ Purpose: To enable communication between the hypervisor and guest there is a
 shared page that contains parts of supervisor visible register state.
 The guest can map this shared page to access its supervisor register through
 memory using this hypercall.
+
+5. KVM_HC_VM_RESET
+------------------------
+Architecture: PPC
+Status: active
+Purpose:  Requests that the virtual machine be reset.  The hcall takes no
+arguments. If successful the hcall does not return. If an error occurs it
+returns EV_INTERNAL.
+
+6. KVM_HC_VM_SHUTDOWN
+------------------------
+Architecture: PPC
+Status: active
+Purpose: Requests that the virtual machine be powered-off/halted.
+The hcall takes no arguments. If successful the hcall does not return.
+If an error occurs it returns EV_INTERNAL.
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index cea2c5c..218882d 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -19,7 +19,8 @@
 #define KVM_HC_MMU_OP			2
 #define KVM_HC_FEATURES			3
 #define KVM_HC_PPC_MAP_MAGIC_PAGE	4
-
+#define KVM_HC_VM_RESET			5
+#define KVM_HC_VM_SHUTDOWN		6
 /*
  * hypercalls use architecture specific
  */
-- 
1.7.0.4
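
Since these hcalls are handled by guest userspace, a VMM would need a small
dispatcher for them. The following is a minimal sketch, not taken from any
existing VMM. The hcall numbers come from this patch; the token layout (vendor
ID in the bits above bit 15, KVM vendor ID 42), the EV_UNIMPLEMENTED value, and
the handle_epapr_hcall() and vm_action names are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define KVM_HC_VM_RESET    5 /* from this patch */
#define KVM_HC_VM_SHUTDOWN 6 /* from this patch */

#define EV_UNIMPLEMENTED   12 /* assumed ePAPR error code */
#define EV_KVM_VENDOR_ID   42 /* assumed KVM vendor ID */

/* Assumed token layout: vendor ID shifted left by 16, hcall number below. */
#define HCALL_TOKEN(vendor, num) (((uint64_t)(vendor) << 16) | (num))

enum vm_action { VM_CONTINUE, VM_RESET, VM_SHUTDOWN };

/*
 * Hypothetical userspace dispatcher: decide what to do with the VM for a
 * given hcall token and fill in the code to return to the guest.
 */
static enum vm_action handle_epapr_hcall(uint64_t nr, uint64_t *ret)
{
	assert(ret);
	switch (nr) {
	case HCALL_TOKEN(EV_KVM_VENDOR_ID, KVM_HC_VM_RESET):
		*ret = 0; /* not observed by the guest if the reset succeeds */
		return VM_RESET;
	case HCALL_TOKEN(EV_KVM_VENDOR_ID, KVM_HC_VM_SHUTDOWN):
		*ret = 0; /* not observed by the guest if shutdown succeeds */
		return VM_SHUTDOWN;
	default:
		*ret = EV_UNIMPLEMENTED;
		return VM_CONTINUE;
	}
}
```

If the reset or shutdown itself fails, the documentation above says the guest
should see EV_INTERNAL; the caller would overwrite 'ret' in that case.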




* [PATCH 4/5] powerpc: Resolve KVM_HC_FEATURES compilation dependency
  2013-07-15 11:23 ` Bharat Bhushan
@ 2013-07-15 11:23   ` Bharat Bhushan
  -1 siblings, 0 replies; 103+ messages in thread
From: Bharat Bhushan @ 2013-07-15 11:11 UTC (permalink / raw)
  To: kvm, kvm-ppc, agraf, scottwood, stuart.yoder
  Cc: Bharat Bhushan, Bharat Bhushan

arch/powerpc/include/asm/kvm_para.h depends on uapi/linux/kvm_para.h
for the definition of KVM_HC_FEATURES, so include that header explicitly.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
---
 arch/powerpc/include/asm/kvm_para.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_para.h b/arch/powerpc/include/asm/kvm_para.h
index 2b11965..8bdfd22 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -20,6 +20,7 @@
 #define __POWERPC_KVM_PARA_H__
 
 #include <uapi/asm/kvm_para.h>
+#include <uapi/linux/kvm_para.h>
 
 #ifdef CONFIG_KVM_GUEST
 
-- 
1.7.0.4


* [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
  2013-07-15 11:23 ` Bharat Bhushan
@ 2013-07-15 11:23   ` Bharat Bhushan
  -1 siblings, 0 replies; 103+ messages in thread
From: Bharat Bhushan @ 2013-07-15 11:11 UTC (permalink / raw)
  To: kvm, kvm-ppc, agraf, scottwood, stuart.yoder
  Cc: Bharat Bhushan, Bharat Bhushan

Detect the availability of the reset hcall by looking for the kvm,has-reset
property on the /hypervisor node in the device tree passed to the VM, and
patch the reset mechanism to use the reset hcall when the property is present.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
---
 arch/powerpc/kernel/epapr_paravirt.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/epapr_paravirt.c b/arch/powerpc/kernel/epapr_paravirt.c
index d44a571..651d701 100644
--- a/arch/powerpc/kernel/epapr_paravirt.c
+++ b/arch/powerpc/kernel/epapr_paravirt.c
@@ -22,6 +22,8 @@
 #include <asm/cacheflush.h>
 #include <asm/code-patching.h>
 #include <asm/machdep.h>
+#include <asm/kvm_para.h>
+#include <asm/kvm_host.h>
 
 #if !defined(CONFIG_64BIT) || defined(CONFIG_PPC_BOOK3E_64)
 extern void epapr_ev_idle(void);
@@ -30,6 +32,14 @@ extern u32 epapr_ev_idle_start[];
 
 bool epapr_paravirt_enabled;
 
+void epapr_hypercall_reset(char *cmd)
+{
+	long ret;
+	ret = kvm_hypercall0(KVM_HC_VM_RESET);
+	printk("error: system reset returned with error %ld\n", ret);
+	BUG();
+}
+
 static int __init epapr_paravirt_init(void)
 {
 	struct device_node *hyper_node;
@@ -58,6 +68,8 @@ static int __init epapr_paravirt_init(void)
 	if (of_get_property(hyper_node, "has-idle", NULL))
 		ppc_md.power_save = epapr_ev_idle;
 #endif
+	if (of_get_property(hyper_node, "kvm,has-reset", NULL))
+		ppc_md.restart = epapr_hypercall_reset;
 
 	epapr_paravirt_enabled = true;
 
-- 
1.7.0.4
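
The hook selection in this patch can be illustrated with a self-contained
model. This is a sketch under stated assumptions, not kernel code: the
model_* names, the string-array stand-in for the /hypervisor node, and the
counter that replaces the real hypercall are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the machine-dependent hook table (ppc_md in the kernel). */
struct model_machdep {
	void (*restart)(char *cmd);
};

static int reset_requests; /* counts calls instead of issuing an hcall */

static void model_hypercall_reset(char *cmd)
{
	(void)cmd;
	reset_requests++;
}

static void model_default_restart(char *cmd)
{
	(void)cmd;
}

/* Model of a device tree node as a flat list of property names. */
static bool node_has_property(const char *const *props, size_t n,
			      const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(props[i], name) == 0)
			return true;
	return false;
}

/* Install the hcall-based restart hook only when the property exists. */
static void model_paravirt_init(struct model_machdep *md,
				const char *const *props, size_t n)
{
	if (node_has_property(props, n, "kvm,has-reset"))
		md->restart = model_hypercall_reset;
}
```

Keeping the previous hook when the property is absent means a guest running
on a hypervisor without the reset hcall retains its platform restart path.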


* Re: [PATCH 1/5] powerpc: define ePAPR hcall exit interface
  2013-07-15 11:23   ` Bharat Bhushan
@ 2013-07-15 11:21     ` Alexander Graf
  -1 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-15 11:21 UTC (permalink / raw)
  To: Bharat Bhushan; +Cc: kvm, kvm-ppc, scottwood, stuart.yoder, Bharat Bhushan


On 15.07.2013, at 13:11, Bharat Bhushan wrote:

> This patch defines the ePAPR hcall exit interface to guest user space.

The subject line is misleading. This is a kvm patch. Same applies for most other patches.

> 
> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> ---
> Documentation/virtual/kvm/api.txt |   20 ++++++++++++++++++++
> include/uapi/linux/kvm.h          |    7 +++++++
> 2 files changed, 27 insertions(+), 0 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 66dd2aa..054f2f4 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -2597,6 +2597,26 @@ The possible hypercalls are defined in the Power Architecture Platform
> Requirements (PAPR) document available from www.power.org (free
> developer registration required to access it).
> 
> +		/* KVM_EXIT_EPAPR_HCALL */
> +		struct {
> +			__u64 nr;
> +			__u64 ret;
> +			__u64 args[8];
> +		} epapr_hcall;
> +
> +This is used on PowerPC platforms that support ePAPR hcalls.
> +It occurs when a guest does a hypercall (as defined
> +in the ePAPR 1.1) and the hcall is not handled by the kernel.
> +
> +The 'nr' field contains the hypercall number (from the guest R11),
> +and 'args' contains the arguments (from the guest R3 - R10).
> +Userspace should put the return code in 'ret' and any extra returned
> +values in args[].  If the VM is not in 64-bit mode KVM zeros
> +the upper half of each field in the struct.
> +
> +As per the ePAPR hcall ABI, the return value is returned to
> +the guest in R3 and output return values in R4 - R10.
> +
> 		/* KVM_EXIT_S390_TSCH */
> 		struct {
> 			__u16 subchannel_id;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index acccd08..01ee50e 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -171,6 +171,7 @@ struct kvm_pit_config {
> #define KVM_EXIT_WATCHDOG         21
> #define KVM_EXIT_S390_TSCH        22
> #define KVM_EXIT_EPR              23
> +#define KVM_EXIT_EPAPR_HCALL      24
> 
> /* For KVM_EXIT_INTERNAL_ERROR */
> /* Emulate instruction failed. */
> @@ -288,6 +289,12 @@ struct kvm_run {
> 			__u64 ret;
> 			__u64 args[9];
> 		} papr_hcall;
> +		/* KVM_EXIT_EPAPR_HCALL */
> +		struct {
> +			__u64 nr;
> +			__u64 ret;
> +			__u64 args[8];
> +		} epapr_hcall;

This should be at the end of the union.


Alex

> 		/* KVM_EXIT_S390_TSCH */
> 		struct {
> 			__u16 subchannel_id;
> -- 
> 1.7.0.4
> 
> 


* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-15 11:23   ` Bharat Bhushan
@ 2013-07-15 11:30     ` Gleb Natapov
  -1 siblings, 0 replies; 103+ messages in thread
From: Gleb Natapov @ 2013-07-15 11:30 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: kvm, kvm-ppc, agraf, scottwood, stuart.yoder, Bharat Bhushan

On Mon, Jul 15, 2013 at 04:41:17PM +0530, Bharat Bhushan wrote:
> KVM_HC_VM_RESET: Requests that the virtual machine be reset.
> KVM_HC_VM_SHUTDOWN: Requests that the virtual machine be powered-off/halted.
> 
> These hcalls are handled by guest userspace.
> 
> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> ---
>  Documentation/virtual/kvm/hypercalls.txt |   16 ++++++++++++++++
>  include/uapi/linux/kvm_para.h            |    3 ++-
>  2 files changed, 18 insertions(+), 1 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
> index ea113b5..58acdc1 100644
> --- a/Documentation/virtual/kvm/hypercalls.txt
> +++ b/Documentation/virtual/kvm/hypercalls.txt
> @@ -64,3 +64,19 @@ Purpose: To enable communication between the hypervisor and guest there is a
>  shared page that contains parts of supervisor visible register state.
>  The guest can map this shared page to access its supervisor register through
>  memory using this hypercall.
> +
> +5. KVM_HC_VM_RESET
> +------------------------
> +Architecture: PPC
> +Status: active
> +Purpose:  Requests that the virtual machine be reset.  The hcall takes no
> +arguments. If successful the hcall does not return. If an error occurs it
> +returns EV_INTERNAL.
> +
> +6. KVM_HC_VM_SHUTDOWN
> +------------------------
> +Architecture: PPC
> +Status: active
> +Purpose: Requests that the virtual machine be powered-off/halted.
> +The hcall takes no arguments. If successful the hcall does not return.
> +If an error occurs it returns EV_INTERNAL.
> diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
> index cea2c5c..218882d 100644
> --- a/include/uapi/linux/kvm_para.h
> +++ b/include/uapi/linux/kvm_para.h
> @@ -19,7 +19,8 @@
>  #define KVM_HC_MMU_OP			2
>  #define KVM_HC_FEATURES			3
>  #define KVM_HC_PPC_MAP_MAGIC_PAGE	4
> -
> +#define KVM_HC_VM_RESET			5
> +#define KVM_HC_VM_SHUTDOWN		6
There is not much sense in sharing hypercalls between architectures. There
is zero probability x86 will implement those, for instance (not sure
why PPC would want them either, instead of emulating devices that do
shutdown/reset).  So let's move them to arch headers.

--
			Gleb.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
  2013-07-15 11:23   ` Bharat Bhushan
@ 2013-07-15 11:31     ` Alexander Graf
  -1 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-15 11:31 UTC (permalink / raw)
  To: Bharat Bhushan; +Cc: kvm, kvm-ppc, scottwood, stuart.yoder, Bharat Bhushan


On 15.07.2013, at 13:11, Bharat Bhushan wrote:

> Exit to guest user space if kvm does not implement the hcall.
> 
> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> ---
> arch/powerpc/kvm/booke.c   |   47 +++++++++++++++++++++++++++++++++++++------
> arch/powerpc/kvm/powerpc.c |    1 +
> include/uapi/linux/kvm.h   |    1 +
> 3 files changed, 42 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 17722d8..c8b41b4 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
> 		break;
> 
> #ifdef CONFIG_KVM_BOOKE_HV
> -	case BOOKE_INTERRUPT_HV_SYSCALL:
> +	case BOOKE_INTERRUPT_HV_SYSCALL: {

This is getting large. Please extract hcall handling into its own function. Maybe you can merge the HV and non-HV case then too.

> +		int i;
> 		if (!(vcpu->arch.shared->msr & MSR_PR)) {
> -			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
> +			r = kvmppc_kvm_pv(vcpu);
> +			if (r != EV_UNIMPLEMENTED) {
> +				/* except unimplemented return to guest */
> +				kvmppc_set_gpr(vcpu, 3, r);
> +				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> +				r = RESUME_GUEST;
> +				break;
> +			}
> +			/* Exit to userspace for unimplemented hcalls in kvm */
> +			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
> +			run->epapr_hcall.ret = 0;
> +			for (i = 0; i < 8; i++)
> +				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
> +			vcpu->arch.hcall_needed = 1;
> +			kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> +			r = RESUME_HOST;
> 		} else {
> 			/*
> 			 * hcall from guest userspace -- send privileged
> @@ -1016,22 +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
> 			kvmppc_core_queue_program(vcpu, ESR_PPR);
> 		}
> 
> -		r = RESUME_GUEST;
> +		run->exit_reason = KVM_EXIT_EPAPR_HCALL;

This looks odd. Your exit reason only changes when you do the hcall exiting, right?

You also need to guard user space hcall exits with an ENABLE_CAP. Otherwise older user space will break, as it doesn't know about the exit type yet.


Alex

> 		break;
> +	}
> #else
> -	case BOOKE_INTERRUPT_SYSCALL:
> +	case BOOKE_INTERRUPT_SYSCALL: {
> +		int i;
> +		r = RESUME_GUEST;
> 		if (!(vcpu->arch.shared->msr & MSR_PR) &&
> 		    (((u32)kvmppc_get_gpr(vcpu, 0)) == KVM_SC_MAGIC_R0)) {
> 			/* KVM PV hypercalls */
> -			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
> -			r = RESUME_GUEST;
> +			r = kvmppc_kvm_pv(vcpu);
> +			if (r != EV_UNIMPLEMENTED) {
> +				/* except unimplemented return to guest */
> +				kvmppc_set_gpr(vcpu, 3, r);
> +				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> +				r = RESUME_GUEST;
> +				break;
> +			}
> +			/* Exit to userspace for unimplemented hcalls in kvm */
> +			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
> +			run->epapr_hcall.ret = 0;
> +			for (i = 0; i < 8; i++)
> +				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
> +			vcpu->arch.hcall_needed = 1;
> +			run->exit_reason = KVM_EXIT_EPAPR_HCALL;
> +			r = RESUME_HOST;
> 		} else {
> 			/* Guest syscalls */
> 			kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_SYSCALL);
> 		}
> 		kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> -		r = RESUME_GUEST;
> 		break;
> +	}
> #endif
> 
> 	case BOOKE_INTERRUPT_DTLB_MISS: {
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index 4e05f8c..6c6199d 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -307,6 +307,7 @@ int kvm_dev_ioctl_check_extension(long ext)
> 	case KVM_CAP_PPC_BOOKE_SREGS:
> 	case KVM_CAP_PPC_BOOKE_WATCHDOG:
> 	case KVM_CAP_PPC_EPR:
> +	case KVM_CAP_EXIT_EPAPR_HCALL:
> #else
> 	case KVM_CAP_PPC_SEGSTATE:
> 	case KVM_CAP_PPC_HIOR:
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 01ee50e..b5266e5 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -674,6 +674,7 @@ struct kvm_ppc_smmu_info {
> #define KVM_CAP_PPC_RTAS 91
> #define KVM_CAP_IRQ_XICS 92
> #define KVM_CAP_ARM_EL1_32BIT 93
> +#define KVM_CAP_EXIT_EPAPR_HCALL 94
> 
> #ifdef KVM_CAP_IRQ_ROUTING
> 
> -- 
> 1.7.0.4
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
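
For readers following the review: the two requests above (move the hcall handling into one function shared by the HV and non-HV cases, and only exit to user space once user space has opted in) could look roughly like the sketch below. It is a stand-alone mock, not the kernel code. struct mock_vcpu, struct mock_run, mock_kvm_pv(), the hcall_exit_enabled flag and the numeric constants are all assumptions made for illustration; only the register layout (R11 for the hcall number, R3..R10 for arguments) is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for kernel constants; values are assumptions. */
#define EV_UNIMPLEMENTED     12
#define RESUME_GUEST         0
#define RESUME_HOST          1
#define KVM_EXIT_EPAPR_HCALL 24

/* Hypothetical, simplified model of the relevant part of struct kvm_run. */
struct mock_run {
	uint32_t exit_reason;
	struct {
		uint64_t nr;
		uint64_t ret;
		uint64_t args[8];
	} epapr_hcall;
};

/* Hypothetical, simplified model of a vcpu. */
struct mock_vcpu {
	uint64_t gpr[32];
	bool user_mode;          /* models MSR_PR being set */
	bool hcall_exit_enabled; /* models an opt-in done via ENABLE_CAP */
	bool hcall_needed;
	bool program_irq_queued;
};

/* Stand-in for kvmppc_kvm_pv(): pretend only hcall 4 is handled in-kernel. */
static uint64_t mock_kvm_pv(struct mock_vcpu *vcpu)
{
	return vcpu->gpr[11] == 4 ? 0 : EV_UNIMPLEMENTED;
}

/* One helper that both the HV and non-HV syscall cases could call. */
static int handle_hcall(struct mock_run *run, struct mock_vcpu *vcpu)
{
	uint64_t r;
	int i;

	assert(run && vcpu);

	/* hcall from guest user space: queue a privileged program interrupt */
	if (vcpu->user_mode) {
		vcpu->program_irq_queued = true;
		return RESUME_GUEST;
	}

	r = mock_kvm_pv(vcpu);

	/*
	 * Handled in-kernel, or user space never opted in: return the
	 * result to the guest in R3, as before the patch.
	 */
	if (r != EV_UNIMPLEMENTED || !vcpu->hcall_exit_enabled) {
		vcpu->gpr[3] = r;
		return RESUME_GUEST;
	}

	/* Unimplemented and opted in: hand the hcall to user space. */
	run->epapr_hcall.nr = vcpu->gpr[11];
	run->epapr_hcall.ret = 0;
	for (i = 0; i < 8; i++)
		run->epapr_hcall.args[i] = vcpu->gpr[3 + i];
	vcpu->hcall_needed = true;
	run->exit_reason = KVM_EXIT_EPAPR_HCALL;
	return RESUME_HOST;
}
```

With this shape the exit reason is set only on the path that actually exits, and an older user space that never enables the capability keeps seeing EV_UNIMPLEMENTED returned to the guest.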


^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [PATCH 1/5] powerpc: define ePAPR hcall exit interface
  2013-07-15 11:21     ` Alexander Graf
  (?)
@ 2013-07-15 11:32     ` Bhushan Bharat-R65777
  -1 siblings, 0 replies; 103+ messages in thread
From: Bhushan Bharat-R65777 @ 2013-07-15 11:32 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm, kvm-ppc, Wood Scott-B07421, Yoder Stuart-B08248



> -----Original Message-----
> From: Alexander Graf [mailto:agraf@suse.de]
> Sent: Monday, July 15, 2013 4:51 PM
> To: Bhushan Bharat-R65777
> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
> Stuart-B08248; Bhushan Bharat-R65777
> Subject: Re: [PATCH 1/5] powerpc: define ePAPR hcall exit interface
> 
> 
> On 15.07.2013, at 13:11, Bharat Bhushan wrote:
> 
> > This patch defines the ePAPR hcall exit interface to guest user space.
> 
> The subject line is misleading. This is a kvm patch. Same applies for most other
> patches.

Ok, will make this "kvm: powerpc: define ePAPR hcall exit interface" 

> 
> >
> > Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> > ---
> > Documentation/virtual/kvm/api.txt |   20 ++++++++++++++++++++
> > include/uapi/linux/kvm.h          |    7 +++++++
> > 2 files changed, 27 insertions(+), 0 deletions(-)
> >
> > diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> > index 66dd2aa..054f2f4 100644
> > --- a/Documentation/virtual/kvm/api.txt
> > +++ b/Documentation/virtual/kvm/api.txt
> > @@ -2597,6 +2597,26 @@ The possible hypercalls are defined in the
> > Power Architecture Platform Requirements (PAPR) document available
> > from www.power.org (free developer registration required to access it).
> >
> > +		/* KVM_EXIT_EPAPR_HCALL */
> > +		struct {
> > +			__u64 nr;
> > +			__u64 ret;
> > +			__u64 args[8];
> > +		} epapr_hcall;
> > +
> > +This is used on PowerPC platforms that support ePAPR hcalls.
> > +It occurs when a guest does a hypercall (as defined in the ePAPR 1.1)
> > +and the hcall is not handled by the kernel.
> > +
> > +The 'nr' field contains the hypercall number (from the guest R11),
> > +and 'args' contains the arguments (from the guest R3 - R10).
> > +Userspace should put the return code in 'ret' and any extra returned
> > +values in args[].  If the VM is not in 64-bit mode KVM zeros the
> > +upper half of each field in the struct.
> > +
> > +As per the ePAPR hcall ABI, the return value is returned to the guest
> > +in R3 and output return values in R4 - R10.
> > +
> > 		/* KVM_EXIT_S390_TSCH */
> > 		struct {
> > 			__u16 subchannel_id;
> > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> > index acccd08..01ee50e 100644
> > --- a/include/uapi/linux/kvm.h
> > +++ b/include/uapi/linux/kvm.h
> > @@ -171,6 +171,7 @@ struct kvm_pit_config {
> > #define KVM_EXIT_WATCHDOG         21
> > #define KVM_EXIT_S390_TSCH        22
> > #define KVM_EXIT_EPR              23
> > +#define KVM_EXIT_EPAPR_HCALL      24
> >
> > /* For KVM_EXIT_INTERNAL_ERROR */
> > /* Emulate instruction failed. */
> > @@ -288,6 +289,12 @@ struct kvm_run {
> > 			__u64 ret;
> > 			__u64 args[9];
> > 		} papr_hcall;
> > +		/* KVM_EXIT_EPAPR_HCALL */
> > +		struct {
> > +			__u64 nr;
> > +			__u64 ret;
> > +			__u64 args[8];
> > +		} epapr_hcall;
> 
> This should be at the end of the union.

Ok.

-Bharat

> 
> 
> Alex
> 
> > 		/* KVM_EXIT_S390_TSCH */
> > 		struct {
> > 			__u16 subchannel_id;
> > --
> > 1.7.0.4
> >
> >
> 
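
As a reading aid for the api.txt text quoted above, here is a rough sketch of how a user space handler for this exit might consume 'nr' and 'args' and fill in 'ret'. It is not taken from any real VMM. struct epapr_hcall_exit mirrors the proposed layout, the hcall numbers 5 and 6 come from patch 3/5 of this series, and everything else (enum vm_action, the function names, comparing 'nr' against the raw number with no vendor token encoding, writing only args[0..6] back to R4 - R10) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Numbers proposed in patch 3/5; the EV_* values are assumptions. */
#define KVM_HC_VM_RESET    5
#define KVM_HC_VM_SHUTDOWN 6
#define EV_SUCCESS         0
#define EV_UNIMPLEMENTED   12

/* Mirrors the epapr_hcall member proposed for struct kvm_run. */
struct epapr_hcall_exit {
	uint64_t nr;      /* hypercall number, from guest R11 */
	uint64_t ret;     /* return code, goes back to guest R3 */
	uint64_t args[8]; /* arguments, from guest R3 - R10 */
};

/* Hypothetical: what the VMM should do once the exit is handled. */
enum vm_action { VM_CONTINUE, VM_RESET, VM_SHUTDOWN };

/* Decide on an action and fill in the return code for the guest. */
static enum vm_action handle_epapr_hcall(struct epapr_hcall_exit *hc)
{
	switch (hc->nr) {
	case KVM_HC_VM_RESET:
		hc->ret = EV_SUCCESS;
		return VM_RESET;
	case KVM_HC_VM_SHUTDOWN:
		hc->ret = EV_SUCCESS;
		return VM_SHUTDOWN;
	default:
		/* Unknown to user space as well: report it to the guest. */
		hc->ret = EV_UNIMPLEMENTED;
		return VM_CONTINUE;
	}
}

/*
 * Models the write-back described in the documentation: the return
 * value goes to R3 and output values to R4 - R10 (seven registers).
 */
static void write_back_results(uint64_t *gpr, const struct epapr_hcall_exit *hc)
{
	int i;

	assert(gpr && hc);
	gpr[3] = hc->ret;
	for (i = 0; i < 7; i++)
		gpr[4 + i] = hc->args[i];
}
```

In a real VMM the write-back would be done by KVM on the next KVM_RUN rather than by user space; it is modelled here only to make the register mapping concrete.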



^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
  2013-07-15 11:31     ` Alexander Graf
@ 2013-07-15 11:38       ` Bhushan Bharat-R65777
  -1 siblings, 0 replies; 103+ messages in thread
From: Bhushan Bharat-R65777 @ 2013-07-15 11:38 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm, kvm-ppc, Wood Scott-B07421, Yoder Stuart-B08248



> -----Original Message-----
> From: Alexander Graf [mailto:agraf@suse.de]
> Sent: Monday, July 15, 2013 5:02 PM
> To: Bhushan Bharat-R65777
> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
> Stuart-B08248; Bhushan Bharat-R65777
> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls
> in kvm
> 
> 
> On 15.07.2013, at 13:11, Bharat Bhushan wrote:
> 
> > Exit to guest user space if kvm does not implement the hcall.
> >
> > Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> > ---
> > arch/powerpc/kvm/booke.c   |   47 +++++++++++++++++++++++++++++++++++++------
> > arch/powerpc/kvm/powerpc.c |    1 +
> > include/uapi/linux/kvm.h   |    1 +
> > 3 files changed, 42 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> > index 17722d8..c8b41b4 100644
> > --- a/arch/powerpc/kvm/booke.c
> > +++ b/arch/powerpc/kvm/booke.c
> > @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
> > 		break;
> >
> > #ifdef CONFIG_KVM_BOOKE_HV
> > -	case BOOKE_INTERRUPT_HV_SYSCALL:
> > +	case BOOKE_INTERRUPT_HV_SYSCALL: {
> 
> This is getting large. Please extract hcall handling into its own function.
> Maybe you can merge the HV and non-HV case then too.
> 
> > +		int i;
> > 		if (!(vcpu->arch.shared->msr & MSR_PR)) {
> > -			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
> > +			r = kvmppc_kvm_pv(vcpu);
> > +			if (r != EV_UNIMPLEMENTED) {
> > +				/* except unimplemented return to guest */
> > +				kvmppc_set_gpr(vcpu, 3, r);
> > +				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> > +				r = RESUME_GUEST;
> > +				break;
> > +			}
> > +			/* Exit to userspace for unimplemented hcalls in kvm */
> > +			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
> > +			run->epapr_hcall.ret = 0;
> > +			for (i = 0; i < 8; i++)
> > +				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
> > +			vcpu->arch.hcall_needed = 1;
> > +			kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> > +			r = RESUME_HOST;
> > 		} else {
> > 			/*
> > 			 * hcall from guest userspace -- send privileged
> > @@ -1016,22 +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
> > 			kvmppc_core_queue_program(vcpu, ESR_PPR);
> > 		}
> >
> > -		r = RESUME_GUEST;
> > +		run->exit_reason = KVM_EXIT_EPAPR_HCALL;


Oops, my mistake. I intended this line to be kvmppc_account_exit(vcpu, SYSCALL_EXITS);

s/ run->exit_reason = KVM_EXIT_EPAPR_HCALL;/ kvmppc_account_exit(vcpu, SYSCALL_EXITS);

-Bharat

> 
> This looks odd. Your exit reason only changes when you do the hcall exiting,
> right?
> 
> You also need to guard user space hcall exits with an ENABLE_CAP. Otherwise
> older user space will break, as it doesn't know about the exit type yet.

So user space should also do the enable_cap?

-Bharat

> 
> 
> Alex
> 
> > 		break;
> > +	}
> > #else
> > -	case BOOKE_INTERRUPT_SYSCALL:
> > +	case BOOKE_INTERRUPT_SYSCALL: {
> > +		int i;
> > +		r = RESUME_GUEST;
> > 		if (!(vcpu->arch.shared->msr & MSR_PR) &&
> > 		    (((u32)kvmppc_get_gpr(vcpu, 0)) == KVM_SC_MAGIC_R0)) {
> > 			/* KVM PV hypercalls */
> > -			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
> > -			r = RESUME_GUEST;
> > +			r = kvmppc_kvm_pv(vcpu);
> > +			if (r != EV_UNIMPLEMENTED) {
> > +				/* except unimplemented return to guest */
> > +				kvmppc_set_gpr(vcpu, 3, r);
> > +				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> > +				r = RESUME_GUEST;
> > +				break;
> > +			}
> > +			/* Exit to userspace for unimplemented hcalls in kvm */
> > +			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
> > +			run->epapr_hcall.ret = 0;
> > +			for (i = 0; i < 8; i++)
> > +				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
> > +			vcpu->arch.hcall_needed = 1;
> > +			run->exit_reason = KVM_EXIT_EPAPR_HCALL;
> > +			r = RESUME_HOST;
> > 		} else {
> > 			/* Guest syscalls */
> > 			kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_SYSCALL);
> > 		}
> > 		kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> > -		r = RESUME_GUEST;
> > 		break;
> > +	}
> > #endif
> >
> > 	case BOOKE_INTERRUPT_DTLB_MISS: {
> > diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> > index 4e05f8c..6c6199d 100644
> > --- a/arch/powerpc/kvm/powerpc.c
> > +++ b/arch/powerpc/kvm/powerpc.c
> > @@ -307,6 +307,7 @@ int kvm_dev_ioctl_check_extension(long ext)
> > 	case KVM_CAP_PPC_BOOKE_SREGS:
> > 	case KVM_CAP_PPC_BOOKE_WATCHDOG:
> > 	case KVM_CAP_PPC_EPR:
> > +	case KVM_CAP_EXIT_EPAPR_HCALL:
> > #else
> > 	case KVM_CAP_PPC_SEGSTATE:
> > 	case KVM_CAP_PPC_HIOR:
> > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> > index 01ee50e..b5266e5 100644
> > --- a/include/uapi/linux/kvm.h
> > +++ b/include/uapi/linux/kvm.h
> > @@ -674,6 +674,7 @@ struct kvm_ppc_smmu_info {
> > #define KVM_CAP_PPC_RTAS 91
> > #define KVM_CAP_IRQ_XICS 92
> > #define KVM_CAP_ARM_EL1_32BIT 93
> > +#define KVM_CAP_EXIT_EPAPR_HCALL 94
> >
> > #ifdef KVM_CAP_IRQ_ROUTING
> >
> > --
> > 1.7.0.4
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-15 11:30     ` Gleb Natapov
@ 2013-07-15 11:44       ` Alexander Graf
  -1 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-15 11:44 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Bharat Bhushan, kvm, kvm-ppc, scottwood, stuart.yoder, Bharat Bhushan


On 15.07.2013, at 13:30, Gleb Natapov wrote:

> On Mon, Jul 15, 2013 at 04:41:17PM +0530, Bharat Bhushan wrote:
>> KVM_HC_VM_RESET: Requests that the virtual machine be reset.
>> KVM_HC_VM_SHUTDOWN: Requests that the virtual machine be powered-off/halted.
>> 
>> These hcalls are handled by guest userspace.
>> 
>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
>> ---
>> Documentation/virtual/kvm/hypercalls.txt |   16 ++++++++++++++++
>> include/uapi/linux/kvm_para.h            |    3 ++-
>> 2 files changed, 18 insertions(+), 1 deletions(-)
>> 
>> diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
>> index ea113b5..58acdc1 100644
>> --- a/Documentation/virtual/kvm/hypercalls.txt
>> +++ b/Documentation/virtual/kvm/hypercalls.txt
>> @@ -64,3 +64,19 @@ Purpose: To enable communication between the hypervisor and guest there is a
>> shared page that contains parts of supervisor visible register state.
>> The guest can map this shared page to access its supervisor register through
>> memory using this hypercall.
>> +
>> +5. KVM_HC_VM_RESET
>> +------------------------
>> +Architecture: PPC
>> +Status: active
>> +Purpose:  Requests that the virtual machine be reset.  The hcall takes no
>> +arguments. If successful the hcall does not return. If an error occurs it
>> +returns EV_INTERNAL.
>> +
>> +6. KVM_HC_VM_SHUTDOWN
>> +------------------------
>> +Architecture: PPC
>> +Status: active
>> +Purpose: Requests that the virtual machine be powered-off/halted.
>> +The hcall takes no arguments. If successful the hcall does not return.
>> +If an error occurs it returns EV_INTERNAL.
>> diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
>> index cea2c5c..218882d 100644
>> --- a/include/uapi/linux/kvm_para.h
>> +++ b/include/uapi/linux/kvm_para.h
>> @@ -19,7 +19,8 @@
>> #define KVM_HC_MMU_OP			2
>> #define KVM_HC_FEATURES			3
>> #define KVM_HC_PPC_MAP_MAGIC_PAGE	4
>> -
>> +#define KVM_HC_VM_RESET			5
>> +#define KVM_HC_VM_SHUTDOWN		6
> There is not much sense in sharing hypercalls between architectures. There
> is zero probability x86 will implement those, for instance (not sure
> why PPC would want them either, instead of emulating devices that do
> shutdown/reset

Implementing devices gets pretty tricky. Usually all of your devices sit on the SoC with a strictly defined layout. We can randomly shove some device in there, but there's a good chance we're overlapping with another device.

So having a separate namespace with hcalls makes things a lot easier. And the guest needs to learn how to access it either way.

> ).  So lets move them to arch headers.

Do we want to keep the numbering scheme interchangeable? Maybe there will be hcalls that can be shared between archs? If so, leaving them in the same header file might make sense.


Alex

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
  2013-07-15 11:38       ` Bhushan Bharat-R65777
@ 2013-07-15 11:46         ` Alexander Graf
  -1 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-15 11:46 UTC (permalink / raw)
  To: Bhushan Bharat-R65777
  Cc: kvm, kvm-ppc, Wood Scott-B07421, Yoder Stuart-B08248


On 15.07.2013, at 13:38, Bhushan Bharat-R65777 wrote:

> 
> 
>> -----Original Message-----
>> From: Alexander Graf [mailto:agraf@suse.de]
>> Sent: Monday, July 15, 2013 5:02 PM
>> To: Bhushan Bharat-R65777
>> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
>> Stuart-B08248; Bhushan Bharat-R65777
>> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls
>> in kvm
>> 
>> 
>> On 15.07.2013, at 13:11, Bharat Bhushan wrote:
>> 
>>> Exit to guest user space if kvm does not implement the hcall.
>>> 
>>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
>>> ---
>>> arch/powerpc/kvm/booke.c   |   47 +++++++++++++++++++++++++++++++++++++------
>>> arch/powerpc/kvm/powerpc.c |    1 +
>>> include/uapi/linux/kvm.h   |    1 +
>>> 3 files changed, 42 insertions(+), 7 deletions(-)
>>> 
>>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
>>> index 17722d8..c8b41b4 100644
>>> --- a/arch/powerpc/kvm/booke.c
>>> +++ b/arch/powerpc/kvm/booke.c
>>> @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>> 		break;
>>> 
>>> #ifdef CONFIG_KVM_BOOKE_HV
>>> -	case BOOKE_INTERRUPT_HV_SYSCALL:
>>> +	case BOOKE_INTERRUPT_HV_SYSCALL: {
>> 
>> This is getting large. Please extract hcall handling into its own function.
>> Maybe you can merge the HV and non-HV case then too.
>> 
>>> +		int i;
>>> 		if (!(vcpu->arch.shared->msr & MSR_PR)) {
>>> -			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
>>> +			r = kvmppc_kvm_pv(vcpu);
>>> +			if (r != EV_UNIMPLEMENTED) {
>>> +				/* except unimplemented return to guest */
>>> +				kvmppc_set_gpr(vcpu, 3, r);
>>> +				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
>>> +				r = RESUME_GUEST;
>>> +				break;
>>> +			}
>>> +			/* Exit to userspace for unimplemented hcalls in kvm */
>>> +			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
>>> +			run->epapr_hcall.ret = 0;
>>> +			for (i = 0; i < 8; i++)
>>> +				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
>>> +			vcpu->arch.hcall_needed = 1;
>>> +			kvmppc_account_exit(vcpu, SYSCALL_EXITS);
>>> +			r = RESUME_HOST;
>>> 		} else {
>>> 			/*
>>> 			 * hcall from guest userspace -- send privileged
>>> @@ -1016,22 +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>> 			kvmppc_core_queue_program(vcpu, ESR_PPR);
>>> 		}
>>> 
>>> -		r = RESUME_GUEST;
>>> +		run->exit_reason = KVM_EXIT_EPAPR_HCALL;
> 
> 
> Oops, what I have done, I wanted this to be kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> 
> s/ run->exit_reason = KVM_EXIT_EPAPR_HCALL;/ kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> 
> -Bharat
> 
>> 
>> This looks odd. Your exit reason only changes when you do the hcall exiting,
>> right?
>> 
>> You also need to guard user space hcall exits with an ENABLE_CAP. Otherwise
>> older user space will break, as it doesn't know about the exit type yet.
> 
> So the user space so make enable_cap also?

User space needs to call enable_cap on this cap, yes. Otherwise a guest can confuse user space with an hcall exit it can't handle.


Alex
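
[Editorial sketch, not part of the original mail] The capability guard requested above can be modelled in a few lines of plain C. This is only a sketch under stated assumptions: `struct vcpu_model`, `model_enable_hcall_cap()`, `model_unhandled_hcall()` and the `HC_UNIMPLEMENTED` placeholder are hypothetical names invented for illustration, not KVM's actual structures or constants. The point it shows is that an unimplemented hcall only exits to user space once user space has opted in; otherwise the guest simply gets an "unimplemented" result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum { EXIT_NONE = 0, EXIT_HCALL = 1 };
#define HC_UNIMPLEMENTED 12u	/* placeholder for an "unimplemented" status */

struct vcpu_model {
	bool hcall_exit_enabled;	/* set once user space enables the cap */
	uint32_t exit_reason;		/* what user space would observe */
	uint64_t gpr3;			/* guest-visible return register */
};

/* Models user space opting in to the new exit type. */
static void model_enable_hcall_cap(struct vcpu_model *v)
{
	v->hcall_exit_enabled = true;
}

/*
 * Handle an hcall the kernel itself does not implement.
 * Returns true if the vcpu has to exit to user space.
 */
static bool model_unhandled_hcall(struct vcpu_model *v)
{
	if (!v->hcall_exit_enabled) {
		/* Older user space: fail the hcall back into the guest. */
		v->gpr3 = HC_UNIMPLEMENTED;
		v->exit_reason = EXIT_NONE;
		return false;
	}
	v->exit_reason = EXIT_HCALL;
	return true;
}
```

With this shape, user space that never enables the capability never sees an exit reason it does not know about.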


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 4/5] powerpc: Resolve KVM_HC_FEATURES compilation dependeny
  2013-07-15 11:23   ` Bharat Bhushan
@ 2013-07-15 11:46     ` Alexander Graf
  -1 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-15 11:46 UTC (permalink / raw)
  To: Bharat Bhushan; +Cc: kvm, kvm-ppc, scottwood, stuart.yoder, Bharat Bhushan


On 15.07.2013, at 13:11, Bharat Bhushan wrote:

> arch/powerpc/include/asm/kvm_para.h have dependency on uapi/linux/kvm_para.h
> for KVM_HC_FEATURES
> 
> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> ---
> arch/powerpc/include/asm/kvm_para.h |    1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_para.h b/arch/powerpc/include/asm/kvm_para.h
> index 2b11965..8bdfd22 100644
> --- a/arch/powerpc/include/asm/kvm_para.h
> +++ b/arch/powerpc/include/asm/kvm_para.h
> @@ -20,6 +20,7 @@
> #define __POWERPC_KVM_PARA_H__
> 
> #include <uapi/asm/kvm_para.h>
> +#include <uapi/linux/kvm_para.h>

Shouldn't linux/kvm_para.h include asm/kvm_para.h usually?


Alex

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
  2013-07-15 11:23   ` Bharat Bhushan
@ 2013-07-15 11:50     ` Alexander Graf
  -1 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-15 11:50 UTC (permalink / raw)
  To: Bharat Bhushan; +Cc: kvm, kvm-ppc, scottwood, stuart.yoder, Bharat Bhushan


On 15.07.2013, at 13:11, Bharat Bhushan wrote:

> Detect the availability of the reset hcalls by looking at kvm,has-reset
> property on the /hypervisor node in the device tree passed to the VM and
> patches the reset mechanism to use reset hcall.
> 
> This patch uses the reser hcall when kvm,has-reset is there in

Your patch description is pretty broken :).

> 
> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> ---
> arch/powerpc/kernel/epapr_paravirt.c |   12 ++++++++++++
> 1 files changed, 12 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/epapr_paravirt.c b/arch/powerpc/kernel/epapr_paravirt.c
> index d44a571..651d701 100644
> --- a/arch/powerpc/kernel/epapr_paravirt.c
> +++ b/arch/powerpc/kernel/epapr_paravirt.c
> @@ -22,6 +22,8 @@
> #include <asm/cacheflush.h>
> #include <asm/code-patching.h>
> #include <asm/machdep.h>
> +#include <asm/kvm_para.h>
> +#include <asm/kvm_host.h>

Why would we need kvm_host.h? This is guest code.

> 
> #if !defined(CONFIG_64BIT) || defined(CONFIG_PPC_BOOK3E_64)
> extern void epapr_ev_idle(void);
> @@ -30,6 +32,14 @@ extern u32 epapr_ev_idle_start[];
> 
> bool epapr_paravirt_enabled;
> 
> +void epapr_hypercall_reset(char *cmd)
> +{
> +	long ret;
> +	ret = kvm_hypercall0(KVM_HC_VM_RESET);

Is this available without CONFIG_KVM_GUEST? kvm_hypercall() simply returns "unimplemented" for everything when that config option is not set.

> +	printk("error: system reset returned with error %ld\n", ret);

So we should fall back to the normal reset handler here.


Alex

> +	BUG();
> +}
> +
> static int __init epapr_paravirt_init(void)
> {
> 	struct device_node *hyper_node;
> @@ -58,6 +68,8 @@ static int __init epapr_paravirt_init(void)
> 	if (of_get_property(hyper_node, "has-idle", NULL))
> 		ppc_md.power_save = epapr_ev_idle;
> #endif
> +	if (of_get_property(hyper_node, "kvm,has-reset", NULL))
> +		ppc_md.restart = epapr_hypercall_reset;
> 
> 	epapr_paravirt_enabled = true;
> 
> -- 
> 1.7.0.4
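
[Editorial sketch, not part of the original mail] The fallback suggested in the review above (use the normal reset handler if the hcall returns) could look roughly like the following. It is a stand-alone model, not kernel code: `machine_restart` stands in for the `ppc_md.restart` hook from the patch, and `model_reset_hcall()`, `platform_restart()`, `saved_restart` and `install_hcall_restart()` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*restart_fn)(char *cmd);

/* Hypothetical model of the machine's restart hook. */
static restart_fn machine_restart;
static restart_fn saved_restart;	/* handler installed before ours */

/* Test doubles: an hcall that fails, and a platform reset that counts calls. */
static long model_reset_hcall_result = -1;
static int platform_reset_calls;

static long model_reset_hcall(void)
{
	/* A successful reset hcall would not return at all. */
	return model_reset_hcall_result;
}

static void platform_restart(char *cmd)
{
	(void)cmd;
	platform_reset_calls++;
}

static void hcall_restart(char *cmd)
{
	long ret = model_reset_hcall();

	/* Only reached on failure: fall back instead of calling BUG(). */
	(void)ret;
	if (saved_restart)
		saved_restart(cmd);
}

/* Install the hcall-based handler while remembering the previous one. */
static void install_hcall_restart(void)
{
	saved_restart = machine_restart;
	machine_restart = hcall_restart;
}
```

Remembering the previous handler at install time keeps the platform's own reset path reachable when the hypervisor reports an error.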

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-15 11:44       ` Alexander Graf
@ 2013-07-15 12:15         ` Gleb Natapov
  -1 siblings, 0 replies; 103+ messages in thread
From: Gleb Natapov @ 2013-07-15 12:15 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Bharat Bhushan, kvm, kvm-ppc, scottwood, stuart.yoder, Bharat Bhushan

On Mon, Jul 15, 2013 at 01:44:46PM +0200, Alexander Graf wrote:
> 
> On 15.07.2013, at 13:30, Gleb Natapov wrote:
> 
> > On Mon, Jul 15, 2013 at 04:41:17PM +0530, Bharat Bhushan wrote:
> >> KVM_HC_VM_RESET: Requests that the virtual machine be reset.
> >> KVM_HC_VM_SHUTDOWN: Requests that the virtual machine be powered-off/halted.
> >> 
> >> These hcalls are handled by guest userspace.
> >> 
> >> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> >> ---
> >> Documentation/virtual/kvm/hypercalls.txt |   16 ++++++++++++++++
> >> include/uapi/linux/kvm_para.h            |    3 ++-
> >> 2 files changed, 18 insertions(+), 1 deletions(-)
> >> 
> >> diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
> >> index ea113b5..58acdc1 100644
> >> --- a/Documentation/virtual/kvm/hypercalls.txt
> >> +++ b/Documentation/virtual/kvm/hypercalls.txt
> >> @@ -64,3 +64,19 @@ Purpose: To enable communication between the hypervisor and guest there is a
> >> shared page that contains parts of supervisor visible register state.
> >> The guest can map this shared page to access its supervisor register through
> >> memory using this hypercall.
> >> +
> >> +5. KVM_HC_VM_RESET
> >> +------------------------
> >> +Architecture: PPC
> >> +Status: active
> >> +Purpose:  Requests that the virtual machine be reset.  The hcall takes no
> >> +arguments. If successful the hcall does not return. If an error occurs it
> >> +returns EV_INTERNAL.
> >> +
> >> +6. KVM_HC_VM_SHUTDOWN
> >> +------------------------
> >> +Architecture: PPC
> >> +Status: active
> >> +Purpose: Requests that the virtual machine be powered-off/halted.
> >> +The hcall takes no arguments. If successful the hcall does not return.
> >> +If an error occurs it returns EV_INTERNAL.
> >> diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
> >> index cea2c5c..218882d 100644
> >> --- a/include/uapi/linux/kvm_para.h
> >> +++ b/include/uapi/linux/kvm_para.h
> >> @@ -19,7 +19,8 @@
> >> #define KVM_HC_MMU_OP			2
> >> #define KVM_HC_FEATURES			3
> >> #define KVM_HC_PPC_MAP_MAGIC_PAGE	4
> >> -
> >> +#define KVM_HC_VM_RESET			5
> >> +#define KVM_HC_VM_SHUTDOWN		6
> > There is no much sense to share hypercalls between architectures. There
> > is zero probability x86 will implement those for instance (not sure
> > why PPC will want them either instead of emulating devices that do
> > shutdown/reset
> 
> Implementing devices gets pretty tricky. Usually all of your devices sit on the SoC with a strictly defined layout. We can randomly shove some device in there, but there's a good chance we're overlapping with another device.
> 
I thought we had device trees to sort these things out.

> So having a separate namespace with hcalls makes things a lot easier. And the guest needs to learn how to access it either way.
> 
> > ).  So lets move them to arch headers.
> 
> Do we want to keep the numbering scheme interchangable? Maybe there will be hcalls that can get shared between archs? If so, leaving it in the same header file might make sense.
> 
hcalls will not be handled in shared code, so I do not see why we would
want an interchangeable numbering scheme. The hcall handlers of
different arches can call common code after intercepting the hcall and
retrieving its arguments from the arch vcpu state.

--
			Gleb.
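
[Editorial sketch, not part of the original mail] The structure described above, where each architecture owns its hcall numbers and only the resulting action is shared, can be sketched as below. Everything here is an illustrative assumption: "arch A" and "arch B" are made-up architectures, the numbers and register layouts are invented (arch A reads the hcall number from `gpr[11]` only because the quoted patch does so on booke), and `common_vm_request()` is a hypothetical common helper.

```c
#include <assert.h>
#include <stdint.h>

/* Architecture-neutral action reached from any arch-specific handler. */
enum vm_action { VM_ACT_NONE, VM_ACT_RESET, VM_ACT_SHUTDOWN };

static enum vm_action common_vm_request(enum vm_action act)
{
	return act;	/* a real implementation would notify user space */
}

/* Hypothetical per-arch numbers: the same value may mean different things. */
#define ARCH_A_HC_RESET		5u
#define ARCH_A_HC_SHUTDOWN	6u
#define ARCH_B_HC_SHUTDOWN	5u

struct arch_a_vcpu { uint64_t gpr[32]; };	/* hcall number in gpr[11] */
struct arch_b_vcpu { uint64_t nr_reg; };	/* hcall number in one register */

static enum vm_action arch_a_handle_hcall(const struct arch_a_vcpu *v)
{
	switch (v->gpr[11]) {
	case ARCH_A_HC_RESET:
		return common_vm_request(VM_ACT_RESET);
	case ARCH_A_HC_SHUTDOWN:
		return common_vm_request(VM_ACT_SHUTDOWN);
	default:
		return VM_ACT_NONE;
	}
}

static enum vm_action arch_b_handle_hcall(const struct arch_b_vcpu *v)
{
	switch (v->nr_reg) {
	case ARCH_B_HC_SHUTDOWN:
		return common_vm_request(VM_ACT_SHUTDOWN);
	default:
		return VM_ACT_NONE;
	}
}
```

Because decoding happens entirely in the arch handler, the value 5 can mean reset on one architecture and shutdown on another without any conflict in the common code.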

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-15 12:15         ` Gleb Natapov
@ 2013-07-15 12:21           ` Alexander Graf
  -1 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-15 12:21 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Bharat Bhushan, kvm, kvm-ppc, scottwood, stuart.yoder, Bharat Bhushan


On 15.07.2013, at 14:15, Gleb Natapov wrote:

> On Mon, Jul 15, 2013 at 01:44:46PM +0200, Alexander Graf wrote:
>> 
>> On 15.07.2013, at 13:30, Gleb Natapov wrote:
>> 
>>> On Mon, Jul 15, 2013 at 04:41:17PM +0530, Bharat Bhushan wrote:
>>>> KVM_HC_VM_RESET: Requests that the virtual machine be reset.
>>>> KVM_HC_VM_SHUTDOWN: Requests that the virtual machine be powered-off/halted.
>>>> 
>>>> These hcalls are handled by guest userspace.
>>>> 
>>>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
>>>> ---
>>>> Documentation/virtual/kvm/hypercalls.txt |   16 ++++++++++++++++
>>>> include/uapi/linux/kvm_para.h            |    3 ++-
>>>> 2 files changed, 18 insertions(+), 1 deletions(-)
>>>> 
>>>> diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
>>>> index ea113b5..58acdc1 100644
>>>> --- a/Documentation/virtual/kvm/hypercalls.txt
>>>> +++ b/Documentation/virtual/kvm/hypercalls.txt
>>>> @@ -64,3 +64,19 @@ Purpose: To enable communication between the hypervisor and guest there is a
>>>> shared page that contains parts of supervisor visible register state.
>>>> The guest can map this shared page to access its supervisor register through
>>>> memory using this hypercall.
>>>> +
>>>> +5. KVM_HC_VM_RESET
>>>> +------------------------
>>>> +Architecture: PPC
>>>> +Status: active
>>>> +Purpose:  Requests that the virtual machine be reset.  The hcall takes no
>>>> +arguments. If successful the hcall does not return. If an error occurs it
>>>> +returns EV_INTERNAL.
>>>> +
>>>> +6. KVM_HC_VM_SHUTDOWN
>>>> +------------------------
>>>> +Architecture: PPC
>>>> +Status: active
>>>> +Purpose: Requests that the virtual machine be powered-off/halted.
>>>> +The hcall takes no arguments. If successful the hcall does not return.
>>>> +If an error occurs it returns EV_INTERNAL.
>>>> diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
>>>> index cea2c5c..218882d 100644
>>>> --- a/include/uapi/linux/kvm_para.h
>>>> +++ b/include/uapi/linux/kvm_para.h
>>>> @@ -19,7 +19,8 @@
>>>> #define KVM_HC_MMU_OP			2
>>>> #define KVM_HC_FEATURES			3
>>>> #define KVM_HC_PPC_MAP_MAGIC_PAGE	4
>>>> -
>>>> +#define KVM_HC_VM_RESET			5
>>>> +#define KVM_HC_VM_SHUTDOWN		6
>>> There is no much sense to share hypercalls between architectures. There
>>> is zero probability x86 will implement those for instance (not sure
>>> why PPC will want them either instead of emulating devices that do
>>> shutdown/reset
>> 
>> Implementing devices gets pretty tricky. Usually all of your devices sit on the SoC with a strictly defined layout. We can randomly shove some device in there, but there's a good chance we're overlapping with another device.
>> 
> I thought we have device trees to sort these things out.

For Linux guests, yes :). For random proprietary guests, no.

> 
>> So having a separate namespace with hcalls makes things a lot easier. And the guest needs to learn how to access it either way.
>> 
>>> ).  So lets move them to arch headers.
>> 
>> Do we want to keep the numbering scheme interchangable? Maybe there will be hcalls that can get shared between archs? If so, leaving it in the same header file might make sense.
>> 
> hcalls will not be handled in shared code, so I do not see why would we
> want to have interchangable numbering scheme. hcalls handlers of
> different arches can call common code after intercepting hcall and
> retrieving arguments from an arch vcpu state.

Works for me, but then we should make hcall numbers 100% arch specific and have no global hc namespace anymore.


Alex

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-15 12:21           ` Alexander Graf
@ 2013-07-15 12:24             ` Gleb Natapov
  -1 siblings, 0 replies; 103+ messages in thread
From: Gleb Natapov @ 2013-07-15 12:24 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Bharat Bhushan, kvm, kvm-ppc, scottwood, stuart.yoder, Bharat Bhushan

On Mon, Jul 15, 2013 at 02:21:51PM +0200, Alexander Graf wrote:
> 
> On 15.07.2013, at 14:15, Gleb Natapov wrote:
> 
> > On Mon, Jul 15, 2013 at 01:44:46PM +0200, Alexander Graf wrote:
> >> 
> >> On 15.07.2013, at 13:30, Gleb Natapov wrote:
> >> 
> >>> On Mon, Jul 15, 2013 at 04:41:17PM +0530, Bharat Bhushan wrote:
> >>>> KVM_HC_VM_RESET: Requests that the virtual machine be reset.
> >>>> KVM_HC_VM_SHUTDOWN: Requests that the virtual machine be powered-off/halted.
> >>>> 
> >>>> These hcalls are handled by guest userspace.
> >>>> 
> >>>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> >>>> ---
> >>>> Documentation/virtual/kvm/hypercalls.txt |   16 ++++++++++++++++
> >>>> include/uapi/linux/kvm_para.h            |    3 ++-
> >>>> 2 files changed, 18 insertions(+), 1 deletions(-)
> >>>> 
> >>>> diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
> >>>> index ea113b5..58acdc1 100644
> >>>> --- a/Documentation/virtual/kvm/hypercalls.txt
> >>>> +++ b/Documentation/virtual/kvm/hypercalls.txt
> >>>> @@ -64,3 +64,19 @@ Purpose: To enable communication between the hypervisor and guest there is a
> >>>> shared page that contains parts of supervisor visible register state.
> >>>> The guest can map this shared page to access its supervisor register through
> >>>> memory using this hypercall.
> >>>> +
> >>>> +5. KVM_HC_VM_RESET
> >>>> +------------------------
> >>>> +Architecture: PPC
> >>>> +Status: active
> >>>> +Purpose:  Requests that the virtual machine be reset.  The hcall takes no
> >>>> +arguments. If successful the hcall does not return. If an error occurs it
> >>>> +returns EV_INTERNAL.
> >>>> +
> >>>> +6. KVM_HC_VM_SHUTDOWN
> >>>> +------------------------
> >>>> +Architecture: PPC
> >>>> +Status: active
> >>>> +Purpose: Requests that the virtual machine be powered-off/halted.
> >>>> +The hcall takes no arguments. If successful the hcall does not return.
> >>>> +If an error occurs it returns EV_INTERNAL.
> >>>> diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
> >>>> index cea2c5c..218882d 100644
> >>>> --- a/include/uapi/linux/kvm_para.h
> >>>> +++ b/include/uapi/linux/kvm_para.h
> >>>> @@ -19,7 +19,8 @@
> >>>> #define KVM_HC_MMU_OP			2
> >>>> #define KVM_HC_FEATURES			3
> >>>> #define KVM_HC_PPC_MAP_MAGIC_PAGE	4
> >>>> -
> >>>> +#define KVM_HC_VM_RESET			5
> >>>> +#define KVM_HC_VM_SHUTDOWN		6
> >>> There is no much sense to share hypercalls between architectures. There
> >>> is zero probability x86 will implement those for instance (not sure
> >>> why PPC will want them either instead of emulating devices that do
> >>> shutdown/reset
> >> 
> >> Implementing devices gets pretty tricky. Usually all of your devices sit on the SoC with a strictly defined layout. We can randomly shove some device in there, but there's a good chance we're overlapping with another device.
> >> 
> > I thought we have device trees to sort these things out.
> 
> For Linux guests, yes :). For proprietary random other guests, no.
> 
But those can't use hcalls either, no?

> > 
> >> So having a separate namespace with hcalls makes things a lot easier. And the guest needs to learn how to access it either way.
> >> 
> >>> ).  So lets move them to arch headers.
> >> 
> >> Do we want to keep the numbering scheme interchangable? Maybe there will be hcalls that can get shared between archs? If so, leaving it in the same header file might make sense.
> >> 
> > hcalls will not be handled in shared code, so I do not see why would we
> > want to have interchangable numbering scheme. hcalls handlers of
> > different arches can call common code after intercepting hcall and
> > retrieving arguments from an arch vcpu state.
> 
> Works for me, but then we should make hcall numbers 100% arch specific and have no global hc namespace anymore.
> 
Yes, of course. Move all of them to arch headers.
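
A rough sketch of how that split could look, assuming nothing beyond what is discussed in this thread: each arch keeps its own hcall numbers, pulls the number out of its own vcpu register state, and then calls shared code that never sees an hcall number. The names common_vm_request(), vm_state, the "foo" arch and its register layout are made up for illustration; only the PPC numbers 5 and 6 and the use of r11 come from the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Arch-neutral requests understood by the shared code (hypothetical names). */
enum vm_request { VM_REQ_RESET, VM_REQ_SHUTDOWN };

/* Toy VM state so the effect of a request is observable. */
struct vm_state {
	int resets;
	int shutdowns;
};

/* Shared code: knows nothing about hcall numbers or register layouts. */
int common_vm_request(struct vm_state *vm, enum vm_request req)
{
	assert(vm != NULL);
	switch (req) {
	case VM_REQ_RESET:
		vm->resets++;
		return 0;
	case VM_REQ_SHUTDOWN:
		vm->shutdowns++;
		return 0;
	}
	return -1;
}

/* "PPC arch header": the numbers from the patch, now private to the arch. */
#define PPC_HC_VM_RESET		5
#define PPC_HC_VM_SHUTDOWN	6

/* PPC handler: the hcall number arrives in r11, as in the booke.c hunk. */
int ppc_handle_hcall(struct vm_state *vm, const uint64_t gpr[32])
{
	switch (gpr[11]) {
	case PPC_HC_VM_RESET:
		return common_vm_request(vm, VM_REQ_RESET);
	case PPC_HC_VM_SHUTDOWN:
		return common_vm_request(vm, VM_REQ_SHUTDOWN);
	default:
		return -1;	/* not handled here */
	}
}

/* Made-up second arch: number 5 means something different here. */
#define FOO_HC_VM_SHUTDOWN	5

/* This imaginary arch passes the hcall number in regs[0]. */
int foo_handle_hcall(struct vm_state *vm, const uint64_t regs[4])
{
	if (regs[0] == FOO_HC_VM_SHUTDOWN)
		return common_vm_request(vm, VM_REQ_SHUTDOWN);
	return -1;
}
```

With this layout the value 5 is a reset on one arch and a shutdown on the other, and nothing breaks, because the number is only ever interpreted by the arch handler that received it.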

--
			Gleb.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-15 12:24             ` Gleb Natapov
@ 2013-07-15 12:26               ` Alexander Graf
  -1 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-15 12:26 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Bharat Bhushan, kvm, kvm-ppc, scottwood, stuart.yoder, Bharat Bhushan


On 15.07.2013, at 14:24, Gleb Natapov wrote:

> On Mon, Jul 15, 2013 at 02:21:51PM +0200, Alexander Graf wrote:
>> 
>> On 15.07.2013, at 14:15, Gleb Natapov wrote:
>> 
>>> On Mon, Jul 15, 2013 at 01:44:46PM +0200, Alexander Graf wrote:
>>>> 
>>>> On 15.07.2013, at 13:30, Gleb Natapov wrote:
>>>> 
>>>>> On Mon, Jul 15, 2013 at 04:41:17PM +0530, Bharat Bhushan wrote:
>>>>>> KVM_HC_VM_RESET: Requests that the virtual machine be reset.
>>>>>> KVM_HC_VM_SHUTDOWN: Requests that the virtual machine be powered-off/halted.
>>>>>> 
>>>>>> These hcalls are handled by guest userspace.
>>>>>> 
>>>>>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
>>>>>> ---
>>>>>> Documentation/virtual/kvm/hypercalls.txt |   16 ++++++++++++++++
>>>>>> include/uapi/linux/kvm_para.h            |    3 ++-
>>>>>> 2 files changed, 18 insertions(+), 1 deletions(-)
>>>>>> 
>>>>>> diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
>>>>>> index ea113b5..58acdc1 100644
>>>>>> --- a/Documentation/virtual/kvm/hypercalls.txt
>>>>>> +++ b/Documentation/virtual/kvm/hypercalls.txt
>>>>>> @@ -64,3 +64,19 @@ Purpose: To enable communication between the hypervisor and guest there is a
>>>>>> shared page that contains parts of supervisor visible register state.
>>>>>> The guest can map this shared page to access its supervisor register through
>>>>>> memory using this hypercall.
>>>>>> +
>>>>>> +5. KVM_HC_VM_RESET
>>>>>> +------------------------
>>>>>> +Architecture: PPC
>>>>>> +Status: active
>>>>>> +Purpose:  Requests that the virtual machine be reset.  The hcall takes no
>>>>>> +arguments. If successful the hcall does not return. If an error occurs it
>>>>>> +returns EV_INTERNAL.
>>>>>> +
>>>>>> +6. KVM_HC_VM_SHUTDOWN
>>>>>> +------------------------
>>>>>> +Architecture: PPC
>>>>>> +Status: active
>>>>>> +Purpose: Requests that the virtual machine be powered-off/halted.
>>>>>> +The hcall takes no arguments. If successful the hcall does not return.
>>>>>> +If an error occurs it returns EV_INTERNAL.
>>>>>> diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
>>>>>> index cea2c5c..218882d 100644
>>>>>> --- a/include/uapi/linux/kvm_para.h
>>>>>> +++ b/include/uapi/linux/kvm_para.h
>>>>>> @@ -19,7 +19,8 @@
>>>>>> #define KVM_HC_MMU_OP			2
>>>>>> #define KVM_HC_FEATURES			3
>>>>>> #define KVM_HC_PPC_MAP_MAGIC_PAGE	4
>>>>>> -
>>>>>> +#define KVM_HC_VM_RESET			5
>>>>>> +#define KVM_HC_VM_SHUTDOWN		6
>>>>> There is no much sense to share hypercalls between architectures. There
>>>>> is zero probability x86 will implement those for instance (not sure
>>>>> why PPC will want them either instead of emulating devices that do
>>>>> shutdown/reset
>>>> 
>>>> Implementing devices gets pretty tricky. Usually all of your devices sit on the SoC with a strictly defined layout. We can randomly shove some device in there, but there's a good chance we're overlapping with another device.
>>>> 
>>> I thought we have device trees to sort these things out.
>> 
>> For Linux guests, yes :). For proprietary random other guests, no.
>> 
> But those can't use hcalls too, no?

Why not? There are customers out there who are more than happy to add functionality, but not change functionality.


Alex

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-15 12:26               ` Alexander Graf
@ 2013-07-15 12:31                 ` Gleb Natapov
  -1 siblings, 0 replies; 103+ messages in thread
From: Gleb Natapov @ 2013-07-15 12:31 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Bharat Bhushan, kvm, kvm-ppc, scottwood, stuart.yoder, Bharat Bhushan

On Mon, Jul 15, 2013 at 02:26:38PM +0200, Alexander Graf wrote:
> 
> On 15.07.2013, at 14:24, Gleb Natapov wrote:
> 
> > On Mon, Jul 15, 2013 at 02:21:51PM +0200, Alexander Graf wrote:
> >> 
> >> On 15.07.2013, at 14:15, Gleb Natapov wrote:
> >> 
> >>> On Mon, Jul 15, 2013 at 01:44:46PM +0200, Alexander Graf wrote:
> >>>> 
> >>>> On 15.07.2013, at 13:30, Gleb Natapov wrote:
> >>>> 
> >>>>> On Mon, Jul 15, 2013 at 04:41:17PM +0530, Bharat Bhushan wrote:
> >>>>>> KVM_HC_VM_RESET: Requests that the virtual machine be reset.
> >>>>>> KVM_HC_VM_SHUTDOWN: Requests that the virtual machine be powered-off/halted.
> >>>>>> 
> >>>>>> These hcalls are handled by guest userspace.
> >>>>>> 
> >>>>>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> >>>>>> ---
> >>>>>> Documentation/virtual/kvm/hypercalls.txt |   16 ++++++++++++++++
> >>>>>> include/uapi/linux/kvm_para.h            |    3 ++-
> >>>>>> 2 files changed, 18 insertions(+), 1 deletions(-)
> >>>>>> 
> >>>>>> diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
> >>>>>> index ea113b5..58acdc1 100644
> >>>>>> --- a/Documentation/virtual/kvm/hypercalls.txt
> >>>>>> +++ b/Documentation/virtual/kvm/hypercalls.txt
> >>>>>> @@ -64,3 +64,19 @@ Purpose: To enable communication between the hypervisor and guest there is a
> >>>>>> shared page that contains parts of supervisor visible register state.
> >>>>>> The guest can map this shared page to access its supervisor register through
> >>>>>> memory using this hypercall.
> >>>>>> +
> >>>>>> +5. KVM_HC_VM_RESET
> >>>>>> +------------------------
> >>>>>> +Architecture: PPC
> >>>>>> +Status: active
> >>>>>> +Purpose:  Requests that the virtual machine be reset.  The hcall takes no
> >>>>>> +arguments. If successful the hcall does not return. If an error occurs it
> >>>>>> +returns EV_INTERNAL.
> >>>>>> +
> >>>>>> +6. KVM_HC_VM_SHUTDOWN
> >>>>>> +------------------------
> >>>>>> +Architecture: PPC
> >>>>>> +Status: active
> >>>>>> +Purpose: Requests that the virtual machine be powered-off/halted.
> >>>>>> +The hcall takes no arguments. If successful the hcall does not return.
> >>>>>> +If an error occurs it returns EV_INTERNAL.
> >>>>>> diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
> >>>>>> index cea2c5c..218882d 100644
> >>>>>> --- a/include/uapi/linux/kvm_para.h
> >>>>>> +++ b/include/uapi/linux/kvm_para.h
> >>>>>> @@ -19,7 +19,8 @@
> >>>>>> #define KVM_HC_MMU_OP			2
> >>>>>> #define KVM_HC_FEATURES			3
> >>>>>> #define KVM_HC_PPC_MAP_MAGIC_PAGE	4
> >>>>>> -
> >>>>>> +#define KVM_HC_VM_RESET			5
> >>>>>> +#define KVM_HC_VM_SHUTDOWN		6
> >>>>> There is no much sense to share hypercalls between architectures. There
> >>>>> is zero probability x86 will implement those for instance (not sure
> >>>>> why PPC will want them either instead of emulating devices that do
> >>>>> shutdown/reset
> >>>> 
> >>>> Implementing devices gets pretty tricky. Usually all of your devices sit on the SoC with a strictly defined layout. We can randomly shove some device in there, but there's a good chance we're overlapping with another device.
> >>>> 
> >>> I thought we have device trees to sort these things out.
> >> 
> >> For Linux guests, yes :). For proprietary random other guests, no.
> >> 
> > But those can't use hcalls too, no?
> 
> Why not? There are customers out there who are more than happy to add functionality, but not change functionality.
> 
Ah, so you are talking about proprietary guests that are actively
developed, where it is easier to use a separate address space than to make
them parse the device tree or agree upon a common device location. Oh well.

--
			Gleb.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
  2013-07-15 11:46         ` Alexander Graf
  (?)
@ 2013-07-15 14:50         ` Bhushan Bharat-R65777
  2013-07-15 14:56             ` Alexander Graf
  -1 siblings, 1 reply; 103+ messages in thread
From: Bhushan Bharat-R65777 @ 2013-07-15 14:50 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm, kvm-ppc, Wood Scott-B07421, Yoder Stuart-B08248



> -----Original Message-----
> From: Alexander Graf [mailto:agraf@suse.de]
> Sent: Monday, July 15, 2013 5:16 PM
> To: Bhushan Bharat-R65777
> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
> Stuart-B08248
> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls
> in kvm
> 
> 
> On 15.07.2013, at 13:38, Bhushan Bharat-R65777 wrote:
> 
> >
> >
> >> -----Original Message-----
> >> From: Alexander Graf [mailto:agraf@suse.de]
> >> Sent: Monday, July 15, 2013 5:02 PM
> >> To: Bhushan Bharat-R65777
> >> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421;
> >> Yoder Stuart-B08248; Bhushan Bharat-R65777
> >> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
> >> unimplemented hcalls in kvm
> >>
> >>
> >> On 15.07.2013, at 13:11, Bharat Bhushan wrote:
> >>
> >>> Exit to guest user space if kvm does not implement the hcall.
> >>>
> >>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> >>> ---
> >>> arch/powerpc/kvm/booke.c   |   47 +++++++++++++++++++++++++++++++++++++-----
> -
> >>> arch/powerpc/kvm/powerpc.c |    1 +
> >>> include/uapi/linux/kvm.h   |    1 +
> >>> 3 files changed, 42 insertions(+), 7 deletions(-)
> >>>
> >>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> >>> index
> >>> 17722d8..c8b41b4 100644
> >>> --- a/arch/powerpc/kvm/booke.c
> >>> +++ b/arch/powerpc/kvm/booke.c
> >>> @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run,
> >>> struct
> >> kvm_vcpu *vcpu,
> >>> 		break;
> >>>
> >>> #ifdef CONFIG_KVM_BOOKE_HV
> >>> -	case BOOKE_INTERRUPT_HV_SYSCALL:
> >>> +	case BOOKE_INTERRUPT_HV_SYSCALL: {
> >>
> >> This is getting large. Please extract hcall handling into its own function.
> >> Maybe you can merge the HV and non-HV case then too.
> >>
> >>> +		int i;
> >>> 		if (!(vcpu->arch.shared->msr & MSR_PR)) {
> >>> -			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
> >>> +			r = kvmppc_kvm_pv(vcpu);
> >>> +			if (r != EV_UNIMPLEMENTED) {
> >>> +				/* except unimplemented return to guest */
> >>> +				kvmppc_set_gpr(vcpu, 3, r);
> >>> +				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> >>> +				r = RESUME_GUEST;
> >>> +				break;
> >>> +			}
> >>> +			/* Exit to userspace for unimplemented hcalls in kvm */
> >>> +			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
> >>> +			run->epapr_hcall.ret = 0;
> >>> +			for (i = 0; i < 8; i++)
> >>> +				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 +
> >> i);
> >>> +			vcpu->arch.hcall_needed = 1;
> >>> +			kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> >>> +			r = RESUME_HOST;
> >>> 		} else {
> >>> 			/*
> >>> 			 * hcall from guest userspace -- send privileged @@ -1016,22
> >>> +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct
> >>> +kvm_vcpu *vcpu,
> >>> 			kvmppc_core_queue_program(vcpu, ESR_PPR);
> >>> 		}
> >>>
> >>> -		r = RESUME_GUEST;
> >>> +		run->exit_reason = KVM_EXIT_EPAPR_HCALL;
> >
> >
> > Oops, what I have done, I wanted this to be kvmppc_account_exit(vcpu,
> > SYSCALL_EXITS);
> >
> > s/ run->exit_reason = KVM_EXIT_EPAPR_HCALL;/ kvmppc_account_exit(vcpu,
> > SYSCALL_EXITS);
> >
> > -Bharat
> >
> >>
> >> This looks odd. Your exit reason only changes when you do the hcall
> >> exiting, right?
> >>
> >> You also need to guard user space hcall exits with an ENABLE_CAP.
> >> Otherwise older user space will break, as it doesn't know about the exit type
> yet.
> >
> > So the user space so make enable_cap also?
> 
> User space needs to call enable_cap on this cap, yes. Otherwise a guest can
> confuse user space with an hcall exit it can't handle.

We do not have enable_cap for book3s. Is there any specific reason why?

-Bharat

> 
> 
> Alex
> 



^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
  2013-07-15 14:50         ` Bhushan Bharat-R65777
@ 2013-07-15 14:56             ` Alexander Graf
  0 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-15 14:56 UTC (permalink / raw)
  To: Bhushan Bharat-R65777
  Cc: kvm, kvm-ppc, Wood Scott-B07421, Yoder Stuart-B08248


On 15.07.2013, at 16:50, Bhushan Bharat-R65777 wrote:

> 
> 
>> -----Original Message-----
>> From: Alexander Graf [mailto:agraf@suse.de]
>> Sent: Monday, July 15, 2013 5:16 PM
>> To: Bhushan Bharat-R65777
>> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
>> Stuart-B08248
>> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls
>> in kvm
>> 
>> 
>> On 15.07.2013, at 13:38, Bhushan Bharat-R65777 wrote:
>> 
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Alexander Graf [mailto:agraf@suse.de]
>>>> Sent: Monday, July 15, 2013 5:02 PM
>>>> To: Bhushan Bharat-R65777
>>>> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421;
>>>> Yoder Stuart-B08248; Bhushan Bharat-R65777
>>>> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
>>>> unimplemented hcalls in kvm
>>>> 
>>>> 
>>>> On 15.07.2013, at 13:11, Bharat Bhushan wrote:
>>>> 
>>>>> Exit to guest user space if kvm does not implement the hcall.
>>>>> 
>>>>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
>>>>> ---
>>>>> arch/powerpc/kvm/booke.c   |   47 +++++++++++++++++++++++++++++++++++++-----
>> -
>>>>> arch/powerpc/kvm/powerpc.c |    1 +
>>>>> include/uapi/linux/kvm.h   |    1 +
>>>>> 3 files changed, 42 insertions(+), 7 deletions(-)
>>>>> 
>>>>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
>>>>> index
>>>>> 17722d8..c8b41b4 100644
>>>>> --- a/arch/powerpc/kvm/booke.c
>>>>> +++ b/arch/powerpc/kvm/booke.c
>>>>> @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run,
>>>>> struct
>>>> kvm_vcpu *vcpu,
>>>>> 		break;
>>>>> 
>>>>> #ifdef CONFIG_KVM_BOOKE_HV
>>>>> -	case BOOKE_INTERRUPT_HV_SYSCALL:
>>>>> +	case BOOKE_INTERRUPT_HV_SYSCALL: {
>>>> 
>>>> This is getting large. Please extract hcall handling into its own function.
>>>> Maybe you can merge the HV and non-HV case then too.
>>>> 
>>>>> +		int i;
>>>>> 		if (!(vcpu->arch.shared->msr & MSR_PR)) {
>>>>> -			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
>>>>> +			r = kvmppc_kvm_pv(vcpu);
>>>>> +			if (r != EV_UNIMPLEMENTED) {
>>>>> +				/* except unimplemented return to guest */
>>>>> +				kvmppc_set_gpr(vcpu, 3, r);
>>>>> +				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
>>>>> +				r = RESUME_GUEST;
>>>>> +				break;
>>>>> +			}
>>>>> +			/* Exit to userspace for unimplemented hcalls in kvm */
>>>>> +			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
>>>>> +			run->epapr_hcall.ret = 0;
>>>>> +			for (i = 0; i < 8; i++)
>>>>> +				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
>>>>> +			vcpu->arch.hcall_needed = 1;
>>>>> +			kvmppc_account_exit(vcpu, SYSCALL_EXITS);
>>>>> +			r = RESUME_HOST;
>>>>> 		} else {
>>>>> 			/*
>>>>> 			 * hcall from guest userspace -- send privileged
>>>>> @@ -1016,22 +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>>>> 			kvmppc_core_queue_program(vcpu, ESR_PPR);
>>>>> 		}
>>>>> 
>>>>> -		r = RESUME_GUEST;
>>>>> +		run->exit_reason = KVM_EXIT_EPAPR_HCALL;
>>> 
>>> 
>>> Oops, what I have done, I wanted this to be kvmppc_account_exit(vcpu,
>>> SYSCALL_EXITS);
>>> 
>>> s/ run->exit_reason = KVM_EXIT_EPAPR_HCALL;/ kvmppc_account_exit(vcpu,
>>> SYSCALL_EXITS);
>>> 
>>> -Bharat
>>> 
>>>> 
>>>> This looks odd. Your exit reason only changes when you do the hcall
>>>> exiting, right?
>>>> 
>>>> You also need to guard user space hcall exits with an ENABLE_CAP.
>>>> Otherwise older user space will break, as it doesn't know about the exit type yet.
>>> 
>>> So the user space so make enable_cap also?
>> 
>> User space needs to call enable_cap on this cap, yes. Otherwise a guest can
>> confuse user space with an hcall exit it can't handle.
> 
> We do not have enable_cap for book3s, any specific reason why ?

We do. If you enable PAPR, you get PAPR hcalls. If you enable OSI, you get OSI hcalls. KVM hcalls on book3s don't return to user space. Which is something we probably want to change along with this patch set.


Alex
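
The opt-in guard discussed above can be sketched in plain C. This is a minimal user-space model, not the kernel code: every `sketch_` name, the `user_hcall_exits` flag, and the numeric value used for the "unimplemented" status are assumptions made for illustration. It shows an hcall helper that exits to user space only when the hcall is unimplemented in the kernel and user space has enabled the capability; otherwise the guest gets the status back in r3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are the kernel's own definitions. */
#define SKETCH_EV_UNIMPLEMENTED 12	/* illustrative status value */

enum sketch_resume { SKETCH_RESUME_GUEST, SKETCH_RESUME_HOST };
enum sketch_exit { SKETCH_EXIT_NONE, SKETCH_EXIT_EPAPR_HCALL };

struct sketch_run {
	enum sketch_exit exit_reason;
	uint64_t nr;
	uint64_t ret;
	uint64_t args[8];
};

struct sketch_vcpu {
	uint64_t gpr[32];
	bool user_hcall_exits;	/* assumed: set once user space enables the cap */
	bool hcall_needed;
};

/* Stand-in for the in-kernel handler: pretend only hcall number 1 exists. */
static uint64_t sketch_kvm_pv(const struct sketch_vcpu *vcpu)
{
	return vcpu->gpr[11] == 1 ? 0 : SKETCH_EV_UNIMPLEMENTED;
}

static enum sketch_resume sketch_handle_hcall(struct sketch_run *run,
					      struct sketch_vcpu *vcpu)
{
	uint64_t r;
	int i;

	assert(run && vcpu);
	r = sketch_kvm_pv(vcpu);

	/* Handled in kernel, or user space never opted in: answer the guest. */
	if (r != SKETCH_EV_UNIMPLEMENTED || !vcpu->user_hcall_exits) {
		vcpu->gpr[3] = r;
		return SKETCH_RESUME_GUEST;
	}

	/* Opted-in user space receives the hcall number (r11) and r3..r10. */
	run->exit_reason = SKETCH_EXIT_EPAPR_HCALL;
	run->nr = vcpu->gpr[11];
	run->ret = 0;
	for (i = 0; i < 8; i++)
		run->args[i] = vcpu->gpr[3 + i];
	vcpu->hcall_needed = true;
	return SKETCH_RESUME_HOST;
}
```

Setting the exit reason only on the exit-to-user-space path also keeps it from changing on the in-kernel and guest-userspace paths, which is the oddity pointed out in the review.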


^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
  2013-07-15 11:50     ` Alexander Graf
  (?)
@ 2013-07-15 15:05     ` Bhushan Bharat-R65777
  2013-07-15 15:09         ` Alexander Graf
  -1 siblings, 1 reply; 103+ messages in thread
From: Bhushan Bharat-R65777 @ 2013-07-15 15:05 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm, kvm-ppc, Wood Scott-B07421, Yoder Stuart-B08248



> -----Original Message-----
> From: Alexander Graf [mailto:agraf@suse.de]
> Sent: Monday, July 15, 2013 5:20 PM
> To: Bhushan Bharat-R65777
> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
> Stuart-B08248; Bhushan Bharat-R65777
> Subject: Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
> 
> 
> On 15.07.2013, at 13:11, Bharat Bhushan wrote:
> 
> > Detect the availability of the reset hcalls by looking at
> > kvm,has-reset property on the /hypervisor node in the device tree
> > passed to the VM and patches the reset mechanism to use reset hcall.
> >
> > This patch uses the reser hcall when kvm,has-reset is there in
> 
> Your patch description is pretty broken :).
> 
> >
> > Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> > ---
> > arch/powerpc/kernel/epapr_paravirt.c |   12 ++++++++++++
> > 1 files changed, 12 insertions(+), 0 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/epapr_paravirt.c b/arch/powerpc/kernel/epapr_paravirt.c
> > index d44a571..651d701 100644
> > --- a/arch/powerpc/kernel/epapr_paravirt.c
> > +++ b/arch/powerpc/kernel/epapr_paravirt.c
> > @@ -22,6 +22,8 @@
> > #include <asm/cacheflush.h>
> > #include <asm/code-patching.h>
> > #include <asm/machdep.h>
> > +#include <asm/kvm_para.h>
> > +#include <asm/kvm_host.h>
> 
> Why would we need kvm_host.h? This is guest code.
> 
> >
> > #if !defined(CONFIG_64BIT) || defined(CONFIG_PPC_BOOK3E_64)
> > extern void epapr_ev_idle(void);
> > @@ -30,6 +32,14 @@ extern u32 epapr_ev_idle_start[];
> >
> > bool epapr_paravirt_enabled;
> >
> > +void epapr_hypercall_reset(char *cmd) {
> > +	long ret;
> > +	ret = kvm_hypercall0(KVM_HC_VM_RESET);
> 
> Is this available without CONFIG_KVM_GUEST? kvm_hypercall() simply returns
> "unimplemented" for everything when that config option is not set.

We get here because we patched ppc_md.restart to point to the new handler.
So I think we should patch ppc_md.restart only if CONFIG_KVM_GUEST is enabled.


> 
> > +	printk("error: system reset returned with error %ld\n", ret);
> 
> So we should fall back to the normal reset handler here.

Do you mean return normally from here, no BUG() etc? 

-Bharat

> 
> 
> Alex
> 
> > +	BUG();
> > +}
> > +
> > static int __init epapr_paravirt_init(void) {
> > 	struct device_node *hyper_node;
> > @@ -58,6 +68,8 @@ static int __init epapr_paravirt_init(void)
> > 	if (of_get_property(hyper_node, "has-idle", NULL))
> > 		ppc_md.power_save = epapr_ev_idle;
> > #endif
> > +	if (of_get_property(hyper_node, "kvm,has-reset", NULL))
> > +		ppc_md.restart = epapr_hypercall_reset;
> >
> > 	epapr_paravirt_enabled = true;
> >
> > --
> > 1.7.0.4
> >
> >
> 

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
  2013-07-15 15:05     ` Bhushan Bharat-R65777
@ 2013-07-15 15:09         ` Alexander Graf
  0 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-15 15:09 UTC (permalink / raw)
  To: Bhushan Bharat-R65777
  Cc: kvm, kvm-ppc, Wood Scott-B07421, Yoder Stuart-B08248


On 15.07.2013, at 17:05, Bhushan Bharat-R65777 wrote:

> 
> 
>> -----Original Message-----
>> From: Alexander Graf [mailto:agraf@suse.de]
>> Sent: Monday, July 15, 2013 5:20 PM
>> To: Bhushan Bharat-R65777
>> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
>> Stuart-B08248; Bhushan Bharat-R65777
>> Subject: Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
>> 
>> 
>> On 15.07.2013, at 13:11, Bharat Bhushan wrote:
>> 
>>> Detect the availability of the reset hcalls by looking at
>>> kvm,has-reset property on the /hypervisor node in the device tree
>>> passed to the VM and patches the reset mechanism to use reset hcall.
>>> 
>>> This patch uses the reser hcall when kvm,has-reset is there in
>> 
>> Your patch description is pretty broken :).
>> 
>>> 
>>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
>>> ---
>>> arch/powerpc/kernel/epapr_paravirt.c |   12 ++++++++++++
>>> 1 files changed, 12 insertions(+), 0 deletions(-)
>>> 
>>> diff --git a/arch/powerpc/kernel/epapr_paravirt.c b/arch/powerpc/kernel/epapr_paravirt.c
>>> index d44a571..651d701 100644
>>> --- a/arch/powerpc/kernel/epapr_paravirt.c
>>> +++ b/arch/powerpc/kernel/epapr_paravirt.c
>>> @@ -22,6 +22,8 @@
>>> #include <asm/cacheflush.h>
>>> #include <asm/code-patching.h>
>>> #include <asm/machdep.h>
>>> +#include <asm/kvm_para.h>
>>> +#include <asm/kvm_host.h>
>> 
>> Why would we need kvm_host.h? This is guest code.
>> 
>>> 
>>> #if !defined(CONFIG_64BIT) || defined(CONFIG_PPC_BOOK3E_64)
>>> extern void epapr_ev_idle(void);
>>> @@ -30,6 +32,14 @@ extern u32 epapr_ev_idle_start[];
>>> 
>>> bool epapr_paravirt_enabled;
>>> 
>>> +void epapr_hypercall_reset(char *cmd) {
>>> +	long ret;
>>> +	ret = kvm_hypercall0(KVM_HC_VM_RESET);
>> 
>> Is this available without CONFIG_KVM_GUEST? kvm_hypercall() simply returns
>> "unimplemented" for everything when that config option is not set.
> 
> We are here because we patched the ppc_md.restart to point to new handler.
> So I think we should patch the ppc_md.restart only if CONFIG_KVM_GUEST is true. 

We should only patch it if kvm_para_available(). That should guard us against everything.

> 
> 
>> 
>>> +	printk("error: system reset returned with error %ld\n", ret);
>> 
>> So we should fall back to the normal reset handler here.
> 
> Do you mean return normally from here, no BUG() etc? 

If we guard the patching against everything, we can treat a broken hcall as a BUG. However, if we don't, we want to fall back to the normal guts-based reset.


Alex
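
As a rough illustration of guarding the patching, here is a self-contained C model. It is a sketch under assumptions: the `sketch_` names are hypothetical, and the two booleans stand in for the result of kvm_para_available() and for the presence of the "kvm,has-reset" property. The restart hook is replaced only when both conditions hold, so the hcall-based handler is never installed on a system where the hypercall cannot work.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the machine-description restart hook. */
typedef void (*sketch_restart_fn)(char *cmd);

struct sketch_machdep {
	sketch_restart_fn restart;
};

static int sketch_guts_resets;
static int sketch_hcall_resets;

/* Stand-in for the platform's normal (guts based) reset. */
static void sketch_guts_restart(char *cmd)
{
	(void)cmd;
	sketch_guts_resets++;
}

/* Stand-in for the hcall-based reset handler. */
static void sketch_hcall_restart(char *cmd)
{
	(void)cmd;
	sketch_hcall_resets++;
}

/*
 * Patch the hook only if the hypervisor interface is usable AND the
 * device tree advertises the reset hcall; otherwise leave it alone.
 */
static void sketch_paravirt_init(struct sketch_machdep *md,
				 bool para_available, bool has_reset_prop)
{
	if (para_available && has_reset_prop)
		md->restart = sketch_hcall_restart;
}
```

With the guard in place, a failing hcall inside the patched handler indicates a genuinely broken hypervisor rather than a configuration mismatch.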

^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
  2013-07-15 14:56             ` Alexander Graf
  (?)
@ 2013-07-15 15:13             ` Bhushan Bharat-R65777
  2013-07-15 15:29                 ` Alexander Graf
  -1 siblings, 1 reply; 103+ messages in thread
From: Bhushan Bharat-R65777 @ 2013-07-15 15:13 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm, kvm-ppc, Wood Scott-B07421, Yoder Stuart-B08248



> -----Original Message-----
> From: Alexander Graf [mailto:agraf@suse.de]
> Sent: Monday, July 15, 2013 8:27 PM
> To: Bhushan Bharat-R65777
> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
> Stuart-B08248
> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls
> in kvm
> 
> 
> On 15.07.2013, at 16:50, Bhushan Bharat-R65777 wrote:
> 
> >
> >
> >> -----Original Message-----
> >> From: Alexander Graf [mailto:agraf@suse.de]
> >> Sent: Monday, July 15, 2013 5:16 PM
> >> To: Bhushan Bharat-R65777
> >> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421;
> >> Yoder
> >> Stuart-B08248
> >> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
> >> unimplemented hcalls in kvm
> >>
> >>
> >> On 15.07.2013, at 13:38, Bhushan Bharat-R65777 wrote:
> >>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Alexander Graf [mailto:agraf@suse.de]
> >>>> Sent: Monday, July 15, 2013 5:02 PM
> >>>> To: Bhushan Bharat-R65777
> >>>> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood
> >>>> Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777
> >>>> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
> >>>> unimplemented hcalls in kvm
> >>>>
> >>>>
> >>>> On 15.07.2013, at 13:11, Bharat Bhushan wrote:
> >>>>
> >>>>> Exit to guest user space if kvm does not implement the hcall.
> >>>>>
> >>>>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> >>>>> ---
> > >>>>> arch/powerpc/kvm/booke.c   |   47 +++++++++++++++++++++++++++++++++++++------
> > >>>>> arch/powerpc/kvm/powerpc.c |    1 +
> > >>>>> include/uapi/linux/kvm.h   |    1 +
> > >>>>> 3 files changed, 42 insertions(+), 7 deletions(-)
> > >>>>>
> > >>>>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> > >>>>> index 17722d8..c8b41b4 100644
> > >>>>> --- a/arch/powerpc/kvm/booke.c
> > >>>>> +++ b/arch/powerpc/kvm/booke.c
> > >>>>> @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
> >>>>> 		break;
> >>>>>
> >>>>> #ifdef CONFIG_KVM_BOOKE_HV
> >>>>> -	case BOOKE_INTERRUPT_HV_SYSCALL:
> >>>>> +	case BOOKE_INTERRUPT_HV_SYSCALL: {
> >>>>
> >>>> This is getting large. Please extract hcall handling into its own function.
> >>>> Maybe you can merge the HV and non-HV case then too.
> >>>>
> >>>>> +		int i;
> >>>>> 		if (!(vcpu->arch.shared->msr & MSR_PR)) {
> >>>>> -			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
> >>>>> +			r = kvmppc_kvm_pv(vcpu);
> >>>>> +			if (r != EV_UNIMPLEMENTED) {
> >>>>> +				/* except unimplemented return to guest */
> >>>>> +				kvmppc_set_gpr(vcpu, 3, r);
> >>>>> +				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> >>>>> +				r = RESUME_GUEST;
> >>>>> +				break;
> >>>>> +			}
> > >>>>> +			/* Exit to userspace for unimplemented hcalls in kvm */
> >>>>> +			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
> >>>>> +			run->epapr_hcall.ret = 0;
> >>>>> +			for (i = 0; i < 8; i++)
> > >>>>> +				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
> >>>>> +			vcpu->arch.hcall_needed = 1;
> >>>>> +			kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> >>>>> +			r = RESUME_HOST;
> >>>>> 		} else {
> >>>>> 			/*
> > >>>>> 			 * hcall from guest userspace -- send privileged
> > >>>>> @@ -1016,22 +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
> >>>>> 			kvmppc_core_queue_program(vcpu, ESR_PPR);
> >>>>> 		}
> >>>>>
> >>>>> -		r = RESUME_GUEST;
> >>>>> +		run->exit_reason = KVM_EXIT_EPAPR_HCALL;
> >>>
> >>>
> >>> Oops, what I have done, I wanted this to be
> >>> kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> >>>
> >>> s/ run->exit_reason = KVM_EXIT_EPAPR_HCALL;/
> >>> kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> >>>
> >>> -Bharat
> >>>
> >>>>
> >>>> This looks odd. Your exit reason only changes when you do the hcall
> >>>> exiting, right?
> >>>>
> >>>> You also need to guard user space hcall exits with an ENABLE_CAP.
> > >>>> Otherwise older user space will break, as it doesn't know about the
> > >>>> exit type yet.
> >>>
> >>> So the user space so make enable_cap also?
> >>
> >> User space needs to call enable_cap on this cap, yes. Otherwise a
> >> guest can confuse user space with an hcall exit it can't handle.
> >
> > We do not have enable_cap for book3s, any specific reason why ?
> 
> We do. If you enable PAPR, you get PAPR hcalls. If you enable OSI, you get OSI
> hcalls.

Oh, we check this on book3s_PR and book3s_HV.

> KVM hcalls on book3s don't return to user space.

It does exit, doesn't it? arch/powerpc/kvm/book3s_pr.c exits with KVM_EXIT_PAPR_HCALL, and the same happens in book3s_hv.

Btw, adding this on booke is not in question. I am just trying to understand book3s.

-Bharat
 

> Which is something we
> probably want to change along with this patch set.
> 
> 
> Alex
> 



^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
  2013-07-15 15:09         ` Alexander Graf
  (?)
@ 2013-07-15 15:16         ` Bhushan Bharat-R65777
  2013-07-15 18:21             ` Scott Wood
  -1 siblings, 1 reply; 103+ messages in thread
From: Bhushan Bharat-R65777 @ 2013-07-15 15:16 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm, kvm-ppc, Wood Scott-B07421, Yoder Stuart-B08248



> -----Original Message-----
> From: Alexander Graf [mailto:agraf@suse.de]
> Sent: Monday, July 15, 2013 8:40 PM
> To: Bhushan Bharat-R65777
> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
> Stuart-B08248
> Subject: Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
> 
> 
> On 15.07.2013, at 17:05, Bhushan Bharat-R65777 wrote:
> 
> >
> >
> >> -----Original Message-----
> >> From: Alexander Graf [mailto:agraf@suse.de]
> >> Sent: Monday, July 15, 2013 5:20 PM
> >> To: Bhushan Bharat-R65777
> >> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421;
> >> Yoder Stuart-B08248; Bhushan Bharat-R65777
> >> Subject: Re: [PATCH 5/5] powerpc: using reset hcall when
> >> kvm,has-reset
> >>
> >>
> >> On 15.07.2013, at 13:11, Bharat Bhushan wrote:
> >>
> >>> Detect the availability of the reset hcalls by looking at
> >>> kvm,has-reset property on the /hypervisor node in the device tree
> >>> passed to the VM and patches the reset mechanism to use reset hcall.
> >>>
> >>> This patch uses the reser hcall when kvm,has-reset is there in
> >>
> >> Your patch description is pretty broken :).
> >>
> >>>
> >>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> >>> ---
> >>> arch/powerpc/kernel/epapr_paravirt.c |   12 ++++++++++++
> >>> 1 files changed, 12 insertions(+), 0 deletions(-)
> >>>
> > >>> diff --git a/arch/powerpc/kernel/epapr_paravirt.c b/arch/powerpc/kernel/epapr_paravirt.c
> >>> index d44a571..651d701 100644
> >>> --- a/arch/powerpc/kernel/epapr_paravirt.c
> >>> +++ b/arch/powerpc/kernel/epapr_paravirt.c
> >>> @@ -22,6 +22,8 @@
> >>> #include <asm/cacheflush.h>
> >>> #include <asm/code-patching.h>
> >>> #include <asm/machdep.h>
> >>> +#include <asm/kvm_para.h>
> >>> +#include <asm/kvm_host.h>
> >>
> >> Why would we need kvm_host.h? This is guest code.
> >>
> >>>
> > >>> #if !defined(CONFIG_64BIT) || defined(CONFIG_PPC_BOOK3E_64)
> > >>> extern void epapr_ev_idle(void);
> > >>> @@ -30,6 +32,14 @@ extern u32 epapr_ev_idle_start[];
> >>>
> >>> bool epapr_paravirt_enabled;
> >>>
> >>> +void epapr_hypercall_reset(char *cmd) {
> >>> +	long ret;
> >>> +	ret = kvm_hypercall0(KVM_HC_VM_RESET);
> >>
> >> Is this available without CONFIG_KVM_GUEST? kvm_hypercall() simply
> >> returns "unimplemented" for everything when that config option is not set.
> >
> > We are here because we patched the ppc_md.restart to point to new handler.
> > > So I think we should patch the ppc_md.restart only if CONFIG_KVM_GUEST is true.
> 
> We should only patch it if kvm_para_available(). That should guard us against
> everything.
> 
> >
> >
> >>
> >>> +	printk("error: system reset returned with error %ld\n", ret);
> >>
> >> So we should fall back to the normal reset handler here.
> >
> > Do you mean return normally from here, no BUG() etc?
> 
> If we guard the patching against everything, we can treat a broken hcall as BUG.
> However, if we don't we want to fall back to the normal guts based reset.

I will let Scott comment on this.

But ppc_md.restart can point to only one handler, and during paravirt patching we changed it to the new handler. So we cannot jump back to the guts-type handler.

-Bharat
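
One possible way around the single-pointer limitation raised above, sketched as a stand-alone C model: remember the previous restart hook before overwriting it, and call it if the reset hcall returns. This is only an illustration, not what the patch does; all `fb_` names are hypothetical, and `fb_hcall_status` is a stand-in for whatever the hypercall would return.

```c
#include <stddef.h>

/* Hypothetical model of the machine-description restart hook. */
typedef void (*fb_restart_fn)(char *cmd);

struct fb_machdep {
	fb_restart_fn restart;
};

static fb_restart_fn fb_prev_restart;	/* hook that was installed before us */
static long fb_hcall_status;		/* value the stand-in hcall returns */
static long fb_last_error;
static int fb_platform_resets;

/* Stand-in for the platform's original reset handler. */
static void fb_platform_restart(char *cmd)
{
	(void)cmd;
	fb_platform_resets++;
}

/* Stand-in for the reset hypercall; a real one would not return on success. */
static long fb_reset_hcall(void)
{
	return fb_hcall_status;
}

static void fb_hypercall_restart(char *cmd)
{
	long ret = fb_reset_hcall();

	/* Reaching this point means the hcall did not reset the system. */
	fb_last_error = ret;
	if (fb_prev_restart)
		fb_prev_restart(cmd);
}

/* Save the old hook, then install the hcall-based one. */
static void fb_install(struct fb_machdep *md)
{
	fb_prev_restart = md->restart;
	md->restart = fb_hypercall_restart;
}
```

Whether such a fallback is wanted at all depends on the other option discussed in the thread: if the patching is fully guarded, a returning hcall can simply be treated as a bug.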

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
  2013-07-15 15:13             ` Bhushan Bharat-R65777
@ 2013-07-15 15:29                 ` Alexander Graf
  0 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-15 15:29 UTC (permalink / raw)
  To: Bhushan Bharat-R65777
  Cc: kvm, kvm-ppc, Wood Scott-B07421, Yoder Stuart-B08248


On 15.07.2013, at 17:13, Bhushan Bharat-R65777 wrote:

> 
> 
>> -----Original Message-----
>> From: Alexander Graf [mailto:agraf@suse.de]
>> Sent: Monday, July 15, 2013 8:27 PM
>> To: Bhushan Bharat-R65777
>> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
>> Stuart-B08248
>> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls
>> in kvm
>> 
>> 
>> On 15.07.2013, at 16:50, Bhushan Bharat-R65777 wrote:
>> 
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Alexander Graf [mailto:agraf@suse.de]
>>>> Sent: Monday, July 15, 2013 5:16 PM
>>>> To: Bhushan Bharat-R65777
>>>> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421;
>>>> Yoder
>>>> Stuart-B08248
>>>> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
>>>> unimplemented hcalls in kvm
>>>> 
>>>> 
>>>> On 15.07.2013, at 13:38, Bhushan Bharat-R65777 wrote:
>>>> 
>>>>> 
>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Alexander Graf [mailto:agraf@suse.de]
>>>>>> Sent: Monday, July 15, 2013 5:02 PM
>>>>>> To: Bhushan Bharat-R65777
>>>>>> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood
>>>>>> Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777
>>>>>> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
>>>>>> unimplemented hcalls in kvm
>>>>>> 
>>>>>> 
>>>>>> On 15.07.2013, at 13:11, Bharat Bhushan wrote:
>>>>>> 
>>>>>>> Exit to guest user space if kvm does not implement the hcall.
>>>>>>> 
>>>>>>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
>>>>>>> ---
>>>>>>> arch/powerpc/kvm/booke.c   |   47 +++++++++++++++++++++++++++++++++++++------
>>>>>>> arch/powerpc/kvm/powerpc.c |    1 +
>>>>>>> include/uapi/linux/kvm.h   |    1 +
>>>>>>> 3 files changed, 42 insertions(+), 7 deletions(-)
>>>>>>> 
>>>>>>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
>>>>>>> index 17722d8..c8b41b4 100644
>>>>>>> --- a/arch/powerpc/kvm/booke.c
>>>>>>> +++ b/arch/powerpc/kvm/booke.c
>>>>>>> @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>>>>>> 		break;
>>>>>>> 
>>>>>>> #ifdef CONFIG_KVM_BOOKE_HV
>>>>>>> -	case BOOKE_INTERRUPT_HV_SYSCALL:
>>>>>>> +	case BOOKE_INTERRUPT_HV_SYSCALL: {
>>>>>> 
>>>>>> This is getting large. Please extract hcall handling into its own function.
>>>>>> Maybe you can merge the HV and non-HV case then too.
>>>>>> 
>>>>>>> +		int i;
>>>>>>> 		if (!(vcpu->arch.shared->msr & MSR_PR)) {
>>>>>>> -			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
>>>>>>> +			r = kvmppc_kvm_pv(vcpu);
>>>>>>> +			if (r != EV_UNIMPLEMENTED) {
>>>>>>> +				/* except unimplemented return to guest */
>>>>>>> +				kvmppc_set_gpr(vcpu, 3, r);
>>>>>>> +				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
>>>>>>> +				r = RESUME_GUEST;
>>>>>>> +				break;
>>>>>>> +			}
>>>>>>> +			/* Exit to userspace for unimplemented hcalls in kvm */
>>>>>>> +			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
>>>>>>> +			run->epapr_hcall.ret = 0;
>>>>>>> +			for (i = 0; i < 8; i++)
>>>>>>> +				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
>>>>>>> +			vcpu->arch.hcall_needed = 1;
>>>>>>> +			kvmppc_account_exit(vcpu, SYSCALL_EXITS);
>>>>>>> +			r = RESUME_HOST;
>>>>>>> 		} else {
>>>>>>> 			/*
>>>>>>> 			 * hcall from guest userspace -- send privileged
>>>>>>> @@ -1016,22 +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>>>>>> 			kvmppc_core_queue_program(vcpu, ESR_PPR);
>>>>>>> 		}
>>>>>>> 
>>>>>>> -		r = RESUME_GUEST;
>>>>>>> +		run->exit_reason = KVM_EXIT_EPAPR_HCALL;
>>>>> 
>>>>> 
>>>>> Oops, what I have done, I wanted this to be
>>>>> kvmppc_account_exit(vcpu, SYSCALL_EXITS);
>>>>> 
>>>>> s/ run->exit_reason = KVM_EXIT_EPAPR_HCALL;/
>>>>> kvmppc_account_exit(vcpu, SYSCALL_EXITS);
>>>>> 
>>>>> -Bharat
>>>>> 
>>>>>> 
>>>>>> This looks odd. Your exit reason only changes when you do the hcall
>>>>>> exiting, right?
>>>>>> 
>>>>>> You also need to guard user space hcall exits with an ENABLE_CAP.
>>>>>> Otherwise older user space will break, as it doesn't know about the
>>>>>> exit type yet.
>>>>> 
>>>>> So the user space so make enable_cap also?
>>>> 
>>>> User space needs to call enable_cap on this cap, yes. Otherwise a
>>>> guest can confuse user space with an hcall exit it can't handle.
>>> 
>>> We do not have enable_cap for book3s, any specific reason why ?
>> 
>> We do. If you enable PAPR, you get PAPR hcalls. If you enable OSI, you get OSI
>> hcalls.
> 
> Oh, We check this on book3s_PR and book3s_HV.
> 
>> KVM hcalls on book3s don't return to user space.
> 
> It does exit, doesn't it? arch/powerpc/kvm/book3s_pr.c exits with KVM_EXIT_PAPR_HCALL, and the same happens in book3s_hv.

It doesn't even start handling the hcall if papr_enabled isn't set ;).


Alex

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
@ 2013-07-15 15:29                 ` Alexander Graf
  0 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-15 15:29 UTC (permalink / raw)
  To: Bhushan Bharat-R65777
  Cc: kvm, kvm-ppc, Wood Scott-B07421, Yoder Stuart-B08248


On 15.07.2013, at 17:13, Bhushan Bharat-R65777 wrote:

> 
> 
>> -----Original Message-----
>> From: Alexander Graf [mailto:agraf@suse.de]
>> Sent: Monday, July 15, 2013 8:27 PM
>> To: Bhushan Bharat-R65777
>> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
>> Stuart-B08248
>> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls
>> in kvm
>> 
>> 
>> On 15.07.2013, at 16:50, Bhushan Bharat-R65777 wrote:
>> 
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Alexander Graf [mailto:agraf@suse.de]
>>>> Sent: Monday, July 15, 2013 5:16 PM
>>>> To: Bhushan Bharat-R65777
>>>> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421;
>>>> Yoder
>>>> Stuart-B08248
>>>> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
>>>> unimplemented hcalls in kvm
>>>> 
>>>> 
>>>> On 15.07.2013, at 13:38, Bhushan Bharat-R65777 wrote:
>>>> 
>>>>> 
>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Alexander Graf [mailto:agraf@suse.de]
>>>>>> Sent: Monday, July 15, 2013 5:02 PM
>>>>>> To: Bhushan Bharat-R65777
>>>>>> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood
>>>>>> Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777
>>>>>> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
>>>>>> unimplemented hcalls in kvm
>>>>>> 
>>>>>> 
>>>>>> On 15.07.2013, at 13:11, Bharat Bhushan wrote:
>>>>>> 
>>>>>>> Exit to guest user space if kvm does not implement the hcall.
>>>>>>> 
>>>>>>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
>>>>>>> ---
>>>>>>> arch/powerpc/kvm/booke.c   |   47 +++++++++++++++++++++++++++++++++++++---
>> --
>>>> -
>>>>>>> arch/powerpc/kvm/powerpc.c |    1 +
>>>>>>> include/uapi/linux/kvm.h   |    1 +
>>>>>>> 3 files changed, 42 insertions(+), 7 deletions(-)
>>>>>>> 
>>>>>>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
>>>>>>> index 17722d8..c8b41b4 100644
>>>>>>> --- a/arch/powerpc/kvm/booke.c
>>>>>>> +++ b/arch/powerpc/kvm/booke.c
>>>>>>> @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>>>>>> 		break;
>>>>>>> 
>>>>>>> #ifdef CONFIG_KVM_BOOKE_HV
>>>>>>> -	case BOOKE_INTERRUPT_HV_SYSCALL:
>>>>>>> +	case BOOKE_INTERRUPT_HV_SYSCALL: {
>>>>>> 
>>>>>> This is getting large. Please extract hcall handling into its own function.
>>>>>> Maybe you can merge the HV and non-HV case then too.
>>>>>> 
>>>>>>> +		int i;
>>>>>>> 		if (!(vcpu->arch.shared->msr & MSR_PR)) {
>>>>>>> -			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
>>>>>>> +			r = kvmppc_kvm_pv(vcpu);
>>>>>>> +			if (r != EV_UNIMPLEMENTED) {
>>>>>>> +				/* except unimplemented return to guest */
>>>>>>> +				kvmppc_set_gpr(vcpu, 3, r);
>>>>>>> +				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
>>>>>>> +				r = RESUME_GUEST;
>>>>>>> +				break;
>>>>>>> +			}
>>>>>>> +			/* Exit to userspace for unimplemented hcalls in kvm */
>>>>>>> +			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
>>>>>>> +			run->epapr_hcall.ret = 0;
>>>>>>> +			for (i = 0; i < 8; i++)
>>>>>>> +				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
>>>>>>> +			vcpu->arch.hcall_needed = 1;
>>>>>>> +			kvmppc_account_exit(vcpu, SYSCALL_EXITS);
>>>>>>> +			r = RESUME_HOST;
>>>>>>> 		} else {
>>>>>>> 			/*
>>>>>>> 			 * hcall from guest userspace -- send privileged
>>>>>>> @@ -1016,22 +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>>>>>> 			kvmppc_core_queue_program(vcpu, ESR_PPR);
>>>>>>> 		}
>>>>>>> 
>>>>>>> -		r = RESUME_GUEST;
>>>>>>> +		run->exit_reason = KVM_EXIT_EPAPR_HCALL;
>>>>> 
>>>>> 
>>>>> Oops, what I have done, I wanted this to be
>>>>> kvmppc_account_exit(vcpu, SYSCALL_EXITS);
>>>>> 
>>>>> s/ run->exit_reason = KVM_EXIT_EPAPR_HCALL;/
>>>>> kvmppc_account_exit(vcpu, SYSCALL_EXITS);
>>>>> 
>>>>> -Bharat
>>>>> 
>>>>>> 
>>>>>> This looks odd. Your exit reason only changes when you do the hcall
>>>>>> exiting, right?
>>>>>> 
>>>>>> You also need to guard user space hcall exits with an ENABLE_CAP.
>>>>>> Otherwise older user space will break, as it doesn't know about the
>>>>>> exit type yet.
>>>>> 
>>>>> So user space should call enable_cap as well?
>>>> 
>>>> User space needs to call enable_cap on this cap, yes. Otherwise a
>>>> guest can confuse user space with an hcall exit it can't handle.
>>> 
>>> We do not have enable_cap for book3s, any specific reason why ?
>> 
>> We do. If you enable PAPR, you get PAPR hcalls. If you enable OSI, you get OSI
>> hcalls.
> 
> Oh, We check this on book3s_PR and book3s_HV.
> 
>> KVM hcalls on book3s don't return to user space.
> 
> It exits, doesn't it? "arch/powerpc/kvm/book3s_pr.c" exits with KVM_EXIT_PAPR_HCALL, and the same happens in book3s_hv.

It doesn't even start handling the hcall if papr_enabled isn't set ;).
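
A minimal sketch of the guard discussed above, assuming a per-vcpu flag
that user space sets through ENABLE_CAP. The names sketch_vcpu,
epapr_hcall_exits_enabled and sketch_route_hcall are hypothetical and do
not come from the patch, and the EV_UNIMPLEMENTED value is a placeholder.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum sketch_resume { SKETCH_RESUME_GUEST, SKETCH_RESUME_HOST };
#define SKETCH_EV_UNIMPLEMENTED 12u	/* placeholder value */

struct sketch_vcpu {
	bool epapr_hcall_exits_enabled;	/* assumed to be set via ENABLE_CAP */
	uint64_t gpr[32];
};

/*
 * in_kernel_ret is what the in-kernel handler returned for this hcall.
 * Only an unimplemented hcall on a vcpu that opted in is sent to user
 * space; everything else is answered directly in r3.
 */
static enum sketch_resume sketch_route_hcall(struct sketch_vcpu *vcpu,
					     uint64_t in_kernel_ret)
{
	if (in_kernel_ret != SKETCH_EV_UNIMPLEMENTED ||
	    !vcpu->epapr_hcall_exits_enabled) {
		vcpu->gpr[3] = in_kernel_ret;
		return SKETCH_RESUME_GUEST;
	}
	return SKETCH_RESUME_HOST;
}
```

With the flag clear, an older user space never sees the new exit type; the
guest simply gets the unimplemented status back in r3.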


Alex


^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
  2013-07-15 15:29                 ` Alexander Graf
  (?)
@ 2013-07-15 15:35                 ` Bhushan Bharat-R65777
  2013-07-15 15:38                     ` Alexander Graf
  -1 siblings, 1 reply; 103+ messages in thread
From: Bhushan Bharat-R65777 @ 2013-07-15 15:35 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm, kvm-ppc, Wood Scott-B07421, Yoder Stuart-B08248



> -----Original Message-----
> From: Alexander Graf [mailto:agraf@suse.de]
> Sent: Monday, July 15, 2013 8:59 PM
> To: Bhushan Bharat-R65777
> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
> Stuart-B08248
> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls
> in kvm
> 
> 
> On 15.07.2013, at 17:13, Bhushan Bharat-R65777 wrote:
> 
> >
> >
> >> -----Original Message-----
> >> From: Alexander Graf [mailto:agraf@suse.de]
> >> Sent: Monday, July 15, 2013 8:27 PM
> >> To: Bhushan Bharat-R65777
> >> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421;
> >> Yoder
> >> Stuart-B08248
> >> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
> >> unimplemented hcalls in kvm
> >>
> >>
> >> On 15.07.2013, at 16:50, Bhushan Bharat-R65777 wrote:
> >>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Alexander Graf [mailto:agraf@suse.de]
> >>>> Sent: Monday, July 15, 2013 5:16 PM
> >>>> To: Bhushan Bharat-R65777
> >>>> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood
> >>>> Scott-B07421; Yoder
> >>>> Stuart-B08248
> >>>> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
> >>>> unimplemented hcalls in kvm
> >>>>
> >>>>
> >>>> On 15.07.2013, at 13:38, Bhushan Bharat-R65777 wrote:
> >>>>
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Alexander Graf [mailto:agraf@suse.de]
> >>>>>> Sent: Monday, July 15, 2013 5:02 PM
> >>>>>> To: Bhushan Bharat-R65777
> >>>>>> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood
> >>>>>> Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777
> >>>>>> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
> >>>>>> unimplemented hcalls in kvm
> >>>>>>
> >>>>>>
> >>>>>> On 15.07.2013, at 13:11, Bharat Bhushan wrote:
> >>>>>>
> >>>>>>> Exit to guest user space if kvm does not implement the hcall.
> >>>>>>>
> >>>>>>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> >>>>>>> ---
> >>>>>>> arch/powerpc/kvm/booke.c   |   47 +++++++++++++++++++++++++++++++++++++------
> >>>>>>> arch/powerpc/kvm/powerpc.c |    1 +
> >>>>>>> include/uapi/linux/kvm.h   |    1 +
> >>>>>>> 3 files changed, 42 insertions(+), 7 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> >>>>>>> index 17722d8..c8b41b4 100644
> >>>>>>> --- a/arch/powerpc/kvm/booke.c
> >>>>>>> +++ b/arch/powerpc/kvm/booke.c
> >>>>>>> @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
> >>>>>>> 		break;
> >>>>>>>
> >>>>>>> #ifdef CONFIG_KVM_BOOKE_HV
> >>>>>>> -	case BOOKE_INTERRUPT_HV_SYSCALL:
> >>>>>>> +	case BOOKE_INTERRUPT_HV_SYSCALL: {
> >>>>>>
> >>>>>> This is getting large. Please extract hcall handling into its own function.
> >>>>>> Maybe you can merge the HV and non-HV case then too.
> >>>>>>
> >>>>>>> +		int i;
> >>>>>>> 		if (!(vcpu->arch.shared->msr & MSR_PR)) {
> >>>>>>> -			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
> >>>>>>> +			r = kvmppc_kvm_pv(vcpu);
> >>>>>>> +			if (r != EV_UNIMPLEMENTED) {
> >>>>>>> +				/* except unimplemented return to guest */
> >>>>>>> +				kvmppc_set_gpr(vcpu, 3, r);
> >>>>>>> +				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> >>>>>>> +				r = RESUME_GUEST;
> >>>>>>> +				break;
> >>>>>>> +			}
> >>>>>>> +			/* Exit to userspace for unimplemented hcalls in kvm */
> >>>>>>> +			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
> >>>>>>> +			run->epapr_hcall.ret = 0;
> >>>>>>> +			for (i = 0; i < 8; i++)
> >>>>>>> +				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
> >>>>>>> +			vcpu->arch.hcall_needed = 1;
> >>>>>>> +			kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> >>>>>>> +			r = RESUME_HOST;
> >>>>>>> 		} else {
> >>>>>>> 			/*
> >>>>>>> 			 * hcall from guest userspace -- send privileged
> >>>>>>> @@ -1016,22 +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
> >>>>>>> 			kvmppc_core_queue_program(vcpu, ESR_PPR);
> >>>>>>> 		}
> >>>>>>>
> >>>>>>> -		r = RESUME_GUEST;
> >>>>>>> +		run->exit_reason = KVM_EXIT_EPAPR_HCALL;
> >>>>>
> >>>>>
> >>>>> Oops, what I have done, I wanted this to be
> >>>>> kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> >>>>>
> >>>>> s/ run->exit_reason = KVM_EXIT_EPAPR_HCALL;/
> >>>>> kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> >>>>>
> >>>>> -Bharat
> >>>>>
> >>>>>>
> >>>>>> This looks odd. Your exit reason only changes when you do the
> >>>>>> hcall exiting, right?
> >>>>>>
> >>>>>> You also need to guard user space hcall exits with an ENABLE_CAP.
> >>>>>> Otherwise older user space will break, as it doesn't know about
> >>>>>> the exit type yet.
> >>>>>
> >>>>> So user space should call enable_cap as well?
> >>>>
> >>>> User space needs to call enable_cap on this cap, yes. Otherwise a
> >>>> guest can confuse user space with an hcall exit it can't handle.
> >>>
> >>> We do not have enable_cap for book3s, any specific reason why ?
> >>
> >> We do. If you enable PAPR, you get PAPR hcalls. If you enable OSI,
> >> you get OSI hcalls.
> >
> > Oh, We check this on book3s_PR and book3s_HV.
> >
> >> KVM hcalls on book3s don't return to user space.
> >
> > It exits, doesn't it? "arch/powerpc/kvm/book3s_pr.c" exits with
> > KVM_EXIT_PAPR_HCALL, and the same happens in book3s_hv.
> 
> It doesn't even start handling the hcall if papr_enabled isn't set ;).

On PR, not HV :-)

-Bharat

> 
> 
> Alex
> 

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
  2013-07-15 15:35                 ` Bhushan Bharat-R65777
@ 2013-07-15 15:38                     ` Alexander Graf
  0 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-15 15:38 UTC (permalink / raw)
  To: Bhushan Bharat-R65777
  Cc: kvm, kvm-ppc, Wood Scott-B07421, Yoder Stuart-B08248


On 15.07.2013, at 17:35, Bhushan Bharat-R65777 wrote:

> 
> 
>> -----Original Message-----
>> From: Alexander Graf [mailto:agraf@suse.de]
>> Sent: Monday, July 15, 2013 8:59 PM
>> To: Bhushan Bharat-R65777
>> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
>> Stuart-B08248
>> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls
>> in kvm
>> 
>> 
>> On 15.07.2013, at 17:13, Bhushan Bharat-R65777 wrote:
>> 
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Alexander Graf [mailto:agraf@suse.de]
>>>> Sent: Monday, July 15, 2013 8:27 PM
>>>> To: Bhushan Bharat-R65777
>>>> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421;
>>>> Yoder
>>>> Stuart-B08248
>>>> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
>>>> unimplemented hcalls in kvm
>>>> 
>>>> 
>>>> On 15.07.2013, at 16:50, Bhushan Bharat-R65777 wrote:
>>>> 
>>>>> 
>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Alexander Graf [mailto:agraf@suse.de]
>>>>>> Sent: Monday, July 15, 2013 5:16 PM
>>>>>> To: Bhushan Bharat-R65777
>>>>>> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood
>>>>>> Scott-B07421; Yoder
>>>>>> Stuart-B08248
>>>>>> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
>>>>>> unimplemented hcalls in kvm
>>>>>> 
>>>>>> 
>>>>>> On 15.07.2013, at 13:38, Bhushan Bharat-R65777 wrote:
>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Alexander Graf [mailto:agraf@suse.de]
>>>>>>>> Sent: Monday, July 15, 2013 5:02 PM
>>>>>>>> To: Bhushan Bharat-R65777
>>>>>>>> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood
>>>>>>>> Scott-B07421; Yoder Stuart-B08248; Bhushan Bharat-R65777
>>>>>>>> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for
>>>>>>>> unimplemented hcalls in kvm
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On 15.07.2013, at 13:11, Bharat Bhushan wrote:
>>>>>>>> 
>>>>>>>>> Exit to guest user space if kvm does not implement the hcall.
>>>>>>>>> 
>>>>>>>>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
>>>>>>>>> ---
>>>>>>>>> arch/powerpc/kvm/booke.c   |   47 +++++++++++++++++++++++++++++++++++++------
>>>>>>>>> arch/powerpc/kvm/powerpc.c |    1 +
>>>>>>>>> include/uapi/linux/kvm.h   |    1 +
>>>>>>>>> 3 files changed, 42 insertions(+), 7 deletions(-)
>>>>>>>>> 
>>>>>>>>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
>>>>>>>>> index 17722d8..c8b41b4 100644
>>>>>>>>> --- a/arch/powerpc/kvm/booke.c
>>>>>>>>> +++ b/arch/powerpc/kvm/booke.c
>>>>>>>>> @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>>>>>>>> 		break;
>>>>>>>>> 
>>>>>>>>> #ifdef CONFIG_KVM_BOOKE_HV
>>>>>>>>> -	case BOOKE_INTERRUPT_HV_SYSCALL:
>>>>>>>>> +	case BOOKE_INTERRUPT_HV_SYSCALL: {
>>>>>>>> 
>>>>>>>> This is getting large. Please extract hcall handling into its own function.
>>>>>>>> Maybe you can merge the HV and non-HV case then too.
>>>>>>>> 
>>>>>>>>> +		int i;
>>>>>>>>> 		if (!(vcpu->arch.shared->msr & MSR_PR)) {
>>>>>>>>> -			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
>>>>>>>>> +			r = kvmppc_kvm_pv(vcpu);
>>>>>>>>> +			if (r != EV_UNIMPLEMENTED) {
>>>>>>>>> +				/* except unimplemented return to guest */
>>>>>>>>> +				kvmppc_set_gpr(vcpu, 3, r);
>>>>>>>>> +				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
>>>>>>>>> +				r = RESUME_GUEST;
>>>>>>>>> +				break;
>>>>>>>>> +			}
>>>>>>>>> +			/* Exit to userspace for unimplemented hcalls in kvm */
>>>>>>>>> +			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
>>>>>>>>> +			run->epapr_hcall.ret = 0;
>>>>>>>>> +			for (i = 0; i < 8; i++)
>>>>>>>>> +				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);
>>>>>>>>> +			vcpu->arch.hcall_needed = 1;
>>>>>>>>> +			kvmppc_account_exit(vcpu, SYSCALL_EXITS);
>>>>>>>>> +			r = RESUME_HOST;
>>>>>>>>> 		} else {
>>>>>>>>> 			/*
>>>>>>>>> 			 * hcall from guest userspace -- send privileged
>>>>>>>>> @@ -1016,22 +1032,39 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>>>>>>>> 			kvmppc_core_queue_program(vcpu, ESR_PPR);
>>>>>>>>> 		}
>>>>>>>>> 
>>>>>>>>> -		r = RESUME_GUEST;
>>>>>>>>> +		run->exit_reason = KVM_EXIT_EPAPR_HCALL;
>>>>>>> 
>>>>>>> 
>>>>>>> Oops, what I have done, I wanted this to be
>>>>>>> kvmppc_account_exit(vcpu, SYSCALL_EXITS);
>>>>>>> 
>>>>>>> s/ run->exit_reason = KVM_EXIT_EPAPR_HCALL;/
>>>>>>> kvmppc_account_exit(vcpu, SYSCALL_EXITS);
>>>>>>> 
>>>>>>> -Bharat
>>>>>>> 
>>>>>>>> 
>>>>>>>> This looks odd. Your exit reason only changes when you do the
>>>>>>>> hcall exiting, right?
>>>>>>>> 
>>>>>>>> You also need to guard user space hcall exits with an ENABLE_CAP.
>>>>>>>> Otherwise older user space will break, as it doesn't know about
>>>>>>>> the exit type yet.
>>>>>>> 
>>>>>>> So user space should call enable_cap as well?
>>>>>> 
>>>>>> User space needs to call enable_cap on this cap, yes. Otherwise a
>>>>>> guest can confuse user space with an hcall exit it can't handle.
>>>>> 
>>>>> We do not have enable_cap for book3s, any specific reason why ?
>>>> 
>>>> We do. If you enable PAPR, you get PAPR hcalls. If you enable OSI,
>>>> you get OSI hcalls.
>>> 
>>> Oh, We check this on book3s_PR and book3s_HV.
>>> 
>>>> KVM hcalls on book3s don't return to user space.
>>> 
>>> It exits, doesn't it? "arch/powerpc/kvm/book3s_pr.c" exits with
>>> KVM_EXIT_PAPR_HCALL, and the same happens in book3s_hv.
>> 
>> It doesn't even start handling the hcall if papr_enabled isn't set ;).
> 
> On PR, not HV :-)

Book3S HV doesn't support non-PAPR ;).


Alex

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
  2013-07-15 11:23   ` Bharat Bhushan
@ 2013-07-15 18:07     ` Scott Wood
  -1 siblings, 0 replies; 103+ messages in thread
From: Scott Wood @ 2013-07-15 18:07 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: kvm, kvm-ppc, agraf, stuart.yoder, Bharat Bhushan, Bharat Bhushan

On 07/15/2013 06:11:16 AM, Bharat Bhushan wrote:
> Exit to guest user space if kvm does not implement the hcall.
> 
> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> ---
>  arch/powerpc/kvm/booke.c   |   47 +++++++++++++++++++++++++++++++++++++------
>  arch/powerpc/kvm/powerpc.c |    1 +
>  include/uapi/linux/kvm.h   |    1 +
>  3 files changed, 42 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 17722d8..c8b41b4 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>  		break;
> 
>  #ifdef CONFIG_KVM_BOOKE_HV
> -	case BOOKE_INTERRUPT_HV_SYSCALL:
> +	case BOOKE_INTERRUPT_HV_SYSCALL: {
> +		int i;
>  		if (!(vcpu->arch.shared->msr & MSR_PR)) {
> -			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
> +			r = kvmppc_kvm_pv(vcpu);
> +			if (r != EV_UNIMPLEMENTED) {
> +				/* except unimplemented return to guest */
> +				kvmppc_set_gpr(vcpu, 3, r);
> +				kvmppc_account_exit(vcpu, SYSCALL_EXITS);
> +				r = RESUME_GUEST;
> +				break;
> +			}
> +			/* Exit to userspace for unimplemented hcalls in kvm */
> +			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
> +			run->epapr_hcall.ret = 0;
> +			for (i = 0; i < 8; i++)
> +				run->epapr_hcall.args[i] = kvmppc_get_gpr(vcpu, 3 + i);

You need to clear the upper half of each register if CONFIG_PPC64=y and  
MSR_CM is not set.
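
A minimal sketch of that masking, assuming a boolean guest_is_64bit that
stands in for "CONFIG_PPC64=y and MSR_CM set". The helper names
sketch_hcall_arg and sketch_collect_args are hypothetical, and the plain
gpr array is only a stand-in for the vcpu register accessors.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: a guest that is not in 64-bit computation mode
 * only defines the low 32 bits of each register, so the upper half is
 * cleared before the value is handed to user space.
 */
static uint64_t sketch_hcall_arg(uint64_t gpr, bool guest_is_64bit)
{
	return guest_is_64bit ? gpr : (gpr & 0xffffffffULL);
}

/* Collect the eight hcall arguments from r3..r10 into args[0..7]. */
static void sketch_collect_args(const uint64_t gpr[32], bool guest_is_64bit,
				uint64_t args[8])
{
	for (int i = 0; i < 8; i++)
		args[i] = sketch_hcall_arg(gpr[3 + i], guest_is_64bit);
}
```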

> +			vcpu->arch.hcall_needed = 1;

The existing code for hcall_needed restores 9 return arguments, rather  
than the 8 that are defined for this interface.  Thus, you'll be  
restoring one word of padding into the guest -- which could be  
arbitrary userspace data that shouldn't be leaked.  r12 is volatile in  
the ePAPR hcall ABI so simply clobbering it isn't a problem, though.
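
A minimal sketch of a restore path that avoids the leak, assuming the
return value goes to r3 and the eight values go to r4..r11, with r12
overwritten by a constant. The struct layout and the names
sketch_epapr_hcall and sketch_restore_hcall_result are hypothetical
illustrations, not the existing kernel code.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_EPAPR_NARGS 8	/* values defined by this interface */

/* Hypothetical mirror of the exit structure, for illustration only. */
struct sketch_epapr_hcall {
	uint64_t nr;
	uint64_t ret;
	uint64_t args[SKETCH_EPAPR_NARGS];
};

static void sketch_restore_hcall_result(uint64_t gpr[32],
					const struct sketch_epapr_hcall *h)
{
	gpr[3] = h->ret;
	/* Copy exactly the defined values, never a ninth word of padding. */
	for (size_t i = 0; i < SKETCH_EPAPR_NARGS; i++)
		gpr[4 + i] = h->args[i];
	/* r12 may be clobbered, so write a constant rather than stale data. */
	gpr[12] = 0;
}
```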

-Scott

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-15 11:30     ` Gleb Natapov
@ 2013-07-15 18:17       ` Scott Wood
  -1 siblings, 0 replies; 103+ messages in thread
From: Scott Wood @ 2013-07-15 18:17 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Bharat Bhushan, kvm, kvm-ppc, agraf, stuart.yoder, Bharat Bhushan

On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
> On Mon, Jul 15, 2013 at 04:41:17PM +0530, Bharat Bhushan wrote:
> > KVM_HC_VM_RESET: Requests that the virtual machine be reset.
> > KVM_HC_VM_SHUTDOWN: Requests that the virtual machine be powered-off/halted.
> >
> > These hcalls are handled by guest userspace.
> >
> > Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> > ---
> >  Documentation/virtual/kvm/hypercalls.txt |   16 ++++++++++++++++
> >  include/uapi/linux/kvm_para.h            |    3 ++-
> >  2 files changed, 18 insertions(+), 1 deletions(-)
> >
> > diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
> > index ea113b5..58acdc1 100644
> > --- a/Documentation/virtual/kvm/hypercalls.txt
> > +++ b/Documentation/virtual/kvm/hypercalls.txt
> > @@ -64,3 +64,19 @@ Purpose: To enable communication between the hypervisor and guest there is a
> >  shared page that contains parts of supervisor visible register state.
> >  The guest can map this shared page to access its supervisor register through
> >  memory using this hypercall.
> > +
> > +5. KVM_HC_VM_RESET
> > +------------------------
> > +Architecture: PPC
> > +Status: active
> > +Purpose:  Requests that the virtual machine be reset.  The hcall takes no
> > +arguments. If successful the hcall does not return. If an error occurs it
> > +returns EV_INTERNAL.
> > +
> > +6. KVM_HC_VM_SHUTDOWN
> > +------------------------
> > +Architecture: PPC
> > +Status: active
> > +Purpose: Requests that the virtual machine be powered-off/halted.
> > +The hcall takes no arguments. If successful the hcall does not return.
> > +If an error occurs it returns EV_INTERNAL.
> > diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
> > index cea2c5c..218882d 100644
> > --- a/include/uapi/linux/kvm_para.h
> > +++ b/include/uapi/linux/kvm_para.h
> > @@ -19,7 +19,8 @@
> >  #define KVM_HC_MMU_OP			2
> >  #define KVM_HC_FEATURES			3
> >  #define KVM_HC_PPC_MAP_MAGIC_PAGE	4
> > -
> > +#define KVM_HC_VM_RESET			5
> > +#define KVM_HC_VM_SHUTDOWN		6
> There is not much sense in sharing hypercalls between architectures. There
> is zero probability x86 will implement those, for instance

This is similar to the question of whether to keep device API  
enumerations per-architecture...  It costs very little to keep it in a  
common place, and it's hard to go back in the other direction if we  
later realize there are things that should be shared.

Keeping it in a common place also makes it more visible to people  
looking to add new hcalls, which could cut down on reinventing the  
wheel.

> (not sure why PPC will want them either instead of emulating devices that do
> shutdown/reset).

Besides what Alex said, for shutdown we don't have any existing device  
to emulate (our real hardware just doesn't have that functionality).   
For reset we currently do emulate, but it's awkward to describe in the  
device tree what we actually emulate since the reset functionality is  
part of a kitchen-sink "device" of which we emulate virtually nothing  
other than the reset.  Currently we advertise the entire thing and just  
ignore the rest, but that causes problems with the guest seeing the  
node and trying to use that functionality.
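
For illustration, a minimal sketch of how user space might dispatch the
two hcalls once they arrive through an hcall exit. The numbers 5 and 6
are taken from the quoted patch; everything else (the sketch_ names, the
action enum, the status values) is a hypothetical stand-in, and the
sketch ignores any vendor-ID encoding of the hcall token.

```c
#include <stdint.h>

#define SKETCH_HC_VM_RESET	5	/* from the quoted patch */
#define SKETCH_HC_VM_SHUTDOWN	6	/* from the quoted patch */
#define SKETCH_EV_INTERNAL	9u	/* placeholder value */
#define SKETCH_EV_UNIMPLEMENTED	12u	/* placeholder value */

enum sketch_action { SKETCH_CONTINUE, SKETCH_DO_RESET, SKETCH_DO_SHUTDOWN };

/* Hypothetical view of the fields user space needs from the exit. */
struct sketch_hcall {
	uint64_t nr;
	uint64_t ret;
};

/* Decide what the VMM should do; unknown hcalls report unimplemented. */
static enum sketch_action sketch_handle_hcall(struct sketch_hcall *h)
{
	switch (h->nr) {
	case SKETCH_HC_VM_RESET:
		return SKETCH_DO_RESET;
	case SKETCH_HC_VM_SHUTDOWN:
		return SKETCH_DO_SHUTDOWN;
	default:
		h->ret = SKETCH_EV_UNIMPLEMENTED;
		return SKETCH_CONTINUE;
	}
}

/* If the requested reset or shutdown fails, the hcall returns an error. */
static void sketch_fail_hcall(struct sketch_hcall *h)
{
	h->ret = SKETCH_EV_INTERNAL;
}
```

On success neither hcall returns to the guest, so ret is only written on
the failure and unknown-hcall paths.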

-Scott

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
@ 2013-07-15 18:17       ` Scott Wood
  0 siblings, 0 replies; 103+ messages in thread
From: Scott Wood @ 2013-07-15 18:17 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Bharat Bhushan, kvm, kvm-ppc, agraf, stuart.yoder, Bharat Bhushan

On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
> On Mon, Jul 15, 2013 at 04:41:17PM +0530, Bharat Bhushan wrote:
> > KVM_HC_VM_RESET: Requests that the virtual machine be reset.
> > KVM_HC_VM_SHUTDOWN: Requests that the virtual machine be  
> powered-off/halted.
> >
> > These hcalls are handled by guest userspace.
> >
> > Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> > ---
> >  Documentation/virtual/kvm/hypercalls.txt |   16 ++++++++++++++++
> >  include/uapi/linux/kvm_para.h            |    3 ++-
> >  2 files changed, 18 insertions(+), 1 deletions(-)
> >
> > diff --git a/Documentation/virtual/kvm/hypercalls.txt  
> b/Documentation/virtual/kvm/hypercalls.txt
> > index ea113b5..58acdc1 100644
> > --- a/Documentation/virtual/kvm/hypercalls.txt
> > +++ b/Documentation/virtual/kvm/hypercalls.txt
> > @@ -64,3 +64,19 @@ Purpose: To enable communication between the  
> hypervisor and guest there is a
> >  shared page that contains parts of supervisor visible register  
> state.
> >  The guest can map this shared page to access its supervisor  
> register through
> >  memory using this hypercall.
> > +
> > +5. KVM_HC_VM_RESET
> > +------------------------
> > +Architecture: PPC
> > +Status: active
> > +Purpose:  Requests that the virtual machine be reset.  The hcall  
> takes no
> > +arguments. If successful the hcall does not return. If an error  
> occurs it
> > +returns EV_INTERNAL.
> > +
> > +6. KVM_HC_VM_SHUTDOWN
> > +------------------------
> > +Architecture: PPC
> > +Status: active
> > +Purpose: Requests that the virtual machine be powered-off/halted.
> > +The hcall takes no arguments. If successful the hcall does not  
> return.
> > +If an error occurs it returns EV_INTERNAL.
> > diff --git a/include/uapi/linux/kvm_para.h  
> b/include/uapi/linux/kvm_para.h
> > index cea2c5c..218882d 100644
> > --- a/include/uapi/linux/kvm_para.h
> > +++ b/include/uapi/linux/kvm_para.h
> > @@ -19,7 +19,8 @@
> >  #define KVM_HC_MMU_OP			2
> >  #define KVM_HC_FEATURES			3
> >  #define KVM_HC_PPC_MAP_MAGIC_PAGE	4
> > -
> > +#define KVM_HC_VM_RESET			5
> > +#define KVM_HC_VM_SHUTDOWN		6
> There is no much sense to share hypercalls between architectures.  
> There
> is zero probability x86 will implement those for instance

This is similar to the question of whether to keep device API  
enumerations per-architecture...  It costs very little to keep it in a  
common place, and it's hard to go back in the other direction if we  
later realize there are things that should be shared.

Keeping it in a common place also makes it more visible to people  
looking to add new hcalls, which could cut down on reinventing the  
wheel.

> (not sure why PPC will want them either instead of emulating devices that do
> shutdown/reset).

Besides what Alex said, for shutdown we don't have any existing device  
to emulate (our real hardware just doesn't have that functionality).   
For reset we currently do emulate, but it's awkward to describe in the  
device tree what we actually emulate since the reset functionality is  
part of a kitchen-sink "device" of which we emulate virtually nothing  
other than the reset.  Currently we advertise the entire thing and just  
ignore the rest, but that causes problems with the guest seeing the  
node and trying to use that functionality.

-Scott

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
  2013-07-15 15:16         ` Bhushan Bharat-R65777
@ 2013-07-15 18:21             ` Scott Wood
  0 siblings, 0 replies; 103+ messages in thread
From: Scott Wood @ 2013-07-15 18:21 UTC (permalink / raw)
  To: Bhushan Bharat-R65777
  Cc: Alexander Graf, kvm, kvm-ppc, Wood Scott-B07421, Yoder Stuart-B08248

On 07/15/2013 10:16:41 AM, Bhushan Bharat-R65777 wrote:
> 
> 
> > -----Original Message-----
> > From: Alexander Graf [mailto:agraf@suse.de]
> > Sent: Monday, July 15, 2013 8:40 PM
> > To: Bhushan Bharat-R65777
> > Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
> > Stuart-B08248
> > Subject: Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
> >
> >
> > On 15.07.2013, at 17:05, Bhushan Bharat-R65777 wrote:
> >
> > >
> > >
> > >> -----Original Message-----
> > >> From: Alexander Graf [mailto:agraf@suse.de]
> > >> Sent: Monday, July 15, 2013 5:20 PM
> > >> To: Bhushan Bharat-R65777
> > >> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421;
> > >> Yoder Stuart-B08248; Bhushan Bharat-R65777
> > >> Subject: Re: [PATCH 5/5] powerpc: using reset hcall when
> > >> kvm,has-reset
> > >>
> > >>
> > >> On 15.07.2013, at 13:11, Bharat Bhushan wrote:
> > >>
> > >>> Detect the availability of the reset hcalls by looking at
> > >>> kvm,has-reset property on the /hypervisor node in the device tree
> > >>> passed to the VM and patches the reset mechanism to use reset hcall.
> > >>>
> > >>> This patch uses the reser hcall when kvm,has-reset is there in
> > >>
> > >> Your patch description is pretty broken :).
> > >>
> > >>>
> > >>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> > >>> ---
> > >>> arch/powerpc/kernel/epapr_paravirt.c |   12 ++++++++++++
> > >>> 1 files changed, 12 insertions(+), 0 deletions(-)
> > >>>
> > >>> diff --git a/arch/powerpc/kernel/epapr_paravirt.c
> > >>> b/arch/powerpc/kernel/epapr_paravirt.c
> > >>> index d44a571..651d701 100644
> > >>> --- a/arch/powerpc/kernel/epapr_paravirt.c
> > >>> +++ b/arch/powerpc/kernel/epapr_paravirt.c
> > >>> @@ -22,6 +22,8 @@
> > >>> #include <asm/cacheflush.h>
> > >>> #include <asm/code-patching.h>
> > >>> #include <asm/machdep.h>
> > >>> +#include <asm/kvm_para.h>
> > >>> +#include <asm/kvm_host.h>
> > >>
> > >> Why would we need kvm_host.h? This is guest code.
> > >>
> > >>>
> > >>> #if !defined(CONFIG_64BIT) || defined(CONFIG_PPC_BOOK3E_64)
> > >>> extern void epapr_ev_idle(void);
> > >>> @@ -30,6 +32,14 @@ extern u32 epapr_ev_idle_start[];
> > >>>
> > >>> bool epapr_paravirt_enabled;
> > >>>
> > >>> +void epapr_hypercall_reset(char *cmd) {
> > >>> +	long ret;
> > >>> +	ret = kvm_hypercall0(KVM_HC_VM_RESET);
> > >>
> > >> Is this available without CONFIG_KVM_GUEST? kvm_hypercall() simply
> > >> returns "unimplemented" for everything when that config option is not set.
> > >
> > > We are here because we patched the ppc_md.restart to point to new handler.
> > > So I think we should patch the ppc_md.restart only if CONFIG_KVM_GUEST is
> > > true.
> >
> > We should only patch it if kvm_para_available(). That should guard us against
> > everything.

It also should depend on whether the reset hcall is advertised in the  
device tree.

> > >>> +	printk("error: system reset returned with error %ld\n", ret);
> > >>
> > >> So we should fall back to the normal reset handler here.
> > >
> > > Do you mean return normally from here, no BUG() etc?
> >
> > If we guard the patching against everything, we can treat a broken hcall as BUG.
> > However, if we don't we want to fall back to the normal guts based reset.
> 
> Will let Scott comment on this?
> 
> But ppc_md.restart can point to only one handler and during paravirt
> patching we changed this to new handler. So we cannot jump back to
> guts type handler

I don't think it's worth implementing a fall-back scheme -- if KVM  
advertises that the reset hcall exists, then it had better exist.

BTW, this is not part of ePAPR so it should not be in  
epapr_paravirt.c.  It should go in kvm.c.

-Scott

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
  2013-07-15 18:21             ` Scott Wood
@ 2013-07-15 20:28               ` Alexander Graf
  -1 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-15 20:28 UTC (permalink / raw)
  To: Scott Wood
  Cc: Bhushan Bharat-R65777, kvm, kvm-ppc, Wood Scott-B07421,
	Yoder Stuart-B08248


On 15.07.2013, at 20:21, Scott Wood wrote:

> On 07/15/2013 10:16:41 AM, Bhushan Bharat-R65777 wrote:
>> > -----Original Message-----
>> > From: Alexander Graf [mailto:agraf@suse.de]
>> > Sent: Monday, July 15, 2013 8:40 PM
>> > To: Bhushan Bharat-R65777
>> > Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421; Yoder
>> > Stuart-B08248
>> > Subject: Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
>> >
>> >
>> > On 15.07.2013, at 17:05, Bhushan Bharat-R65777 wrote:
>> >
>> > >
>> > >
>> > >> -----Original Message-----
>> > >> From: Alexander Graf [mailto:agraf@suse.de]
>> > >> Sent: Monday, July 15, 2013 5:20 PM
>> > >> To: Bhushan Bharat-R65777
>> > >> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Wood Scott-B07421;
>> > >> Yoder Stuart-B08248; Bhushan Bharat-R65777
>> > >> Subject: Re: [PATCH 5/5] powerpc: using reset hcall when
>> > >> kvm,has-reset
>> > >>
>> > >>
>> > >> On 15.07.2013, at 13:11, Bharat Bhushan wrote:
>> > >>
>> > >>> Detect the availability of the reset hcalls by looking at
>> > >>> kvm,has-reset property on the /hypervisor node in the device tree
>> > >>> passed to the VM and patches the reset mechanism to use reset hcall.
>> > >>>
>> > >>> This patch uses the reser hcall when kvm,has-reset is there in
>> > >>
>> > >> Your patch description is pretty broken :).
>> > >>
>> > >>>
>> > >>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
>> > >>> ---
>> > >>> arch/powerpc/kernel/epapr_paravirt.c |   12 ++++++++++++
>> > >>> 1 files changed, 12 insertions(+), 0 deletions(-)
>> > >>>
>> > >>> diff --git a/arch/powerpc/kernel/epapr_paravirt.c
>> > >>> b/arch/powerpc/kernel/epapr_paravirt.c
>> > >>> index d44a571..651d701 100644
>> > >>> --- a/arch/powerpc/kernel/epapr_paravirt.c
>> > >>> +++ b/arch/powerpc/kernel/epapr_paravirt.c
>> > >>> @@ -22,6 +22,8 @@
>> > >>> #include <asm/cacheflush.h>
>> > >>> #include <asm/code-patching.h>
>> > >>> #include <asm/machdep.h>
>> > >>> +#include <asm/kvm_para.h>
>> > >>> +#include <asm/kvm_host.h>
>> > >>
>> > >> Why would we need kvm_host.h? This is guest code.
>> > >>
>> > >>>
>> > >>> #if !defined(CONFIG_64BIT) || defined(CONFIG_PPC_BOOK3E_64)
>> > >>> extern void epapr_ev_idle(void);
>> > >>> @@ -30,6 +32,14 @@ extern u32 epapr_ev_idle_start[];
>> > >>>
>> > >>> bool epapr_paravirt_enabled;
>> > >>>
>> > >>> +void epapr_hypercall_reset(char *cmd) {
>> > >>> +	long ret;
>> > >>> +	ret = kvm_hypercall0(KVM_HC_VM_RESET);
>> > >>
>> > >> Is this available without CONFIG_KVM_GUEST? kvm_hypercall() simply
>> > >> returns "unimplemented" for everything when that config option is not set.
>> > >
>> > > We are here because we patched the ppc_md.restart to point to new handler.
>> > > So I think we should patch the ppc_md.restart only if CONFIG_KVM_GUEST is
>> > true.
>> >
>> > We should only patch it if kvm_para_available(). That should guard us against
>> > everything.
> 
> It also should depend on whether the reset hcall is advertised in the device tree.

Ah, figured that part was obvious :).

> 
>> > >>> +	printk("error: system reset returned with error %ld\n", ret);
>> > >>
>> > >> So we should fall back to the normal reset handler here.
>> > >
>> > > Do you mean return normally from here, no BUG() etc?
>> >
>> > If we guard the patching against everything, we can treat a broken hcall as BUG.
>> > However, if we don't we want to fall back to the normal guts based reset.
>> Will let Scott comment on this?
>> But ppc_md.restart can point to only one handler and during paravirt patching we changed this to new handler. So we cannot jump back to guts type handler
> 
> I don't think it's worth implementing a fall-back scheme -- if KVM advertises that the reset hcall exists, then it had better exist.

If we also check for kvm_para_available() I agree. Otherwise QEMU might advertise the reset hcall, but the guest kernel may not implement KVM hypercalls. In that case the device tree check will succeed, but the actual hypercall will not.
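
A minimal user-space sketch of that two-step guard, not the actual kernel code. Everything named here is a hypothetical stand-in: `struct hv_node` models the /hypervisor node, its `kvm_compatible` field models what kvm_para_available() would report, and `pick_restart()` models the decision of whether to repoint ppc_md.restart. The handler stubs do nothing.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the guest's view of the /hypervisor node. */
struct hv_node {
	bool kvm_compatible;   /* stands in for kvm_para_available() */
	const char *props[4];  /* property names, NULL-terminated */
};

typedef void (*restart_fn)(char *cmd);

/* Stub handlers: the platform ("guts") reset and the hcall-based reset. */
static void guts_restart(char *cmd) { (void)cmd; }
static void kvm_hcall_restart(char *cmd) { (void)cmd; }

/* Return true if the node carries a property with the given name. */
static bool hv_has_prop(const struct hv_node *hv, const char *name)
{
	for (size_t i = 0; i < 4 && hv->props[i]; i++)
		if (strcmp(hv->props[i], name) == 0)
			return true;
	return false;
}

/*
 * Decide which restart handler to install. The KVM check comes first:
 * KVM-specific properties are only interpreted once we know the guest
 * can actually issue KVM hypercalls.
 */
static restart_fn pick_restart(const struct hv_node *hv, restart_fn current)
{
	if (!hv || !hv->kvm_compatible)
		return current;
	if (!hv_has_prop(hv, "kvm,has-reset"))
		return current;
	return kvm_hcall_restart;
}
```

With both checks in place there is no path on which the hcall handler is installed but the hypercall mechanism is missing, so the handler itself would not need a fall-back.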


Alex


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
  2013-07-15 20:28               ` Alexander Graf
@ 2013-07-15 20:52                 ` Scott Wood
  -1 siblings, 0 replies; 103+ messages in thread
From: Scott Wood @ 2013-07-15 20:52 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Bhushan Bharat-R65777, kvm, kvm-ppc, Wood Scott-B07421,
	Yoder Stuart-B08248

On 07/15/2013 03:28:46 PM, Alexander Graf wrote:
> 
> On 15.07.2013, at 20:21, Scott Wood wrote:
> 
> > On 07/15/2013 10:16:41 AM, Bhushan Bharat-R65777 wrote:
> >> > >>> +	printk("error: system reset returned with error %ld\n", ret);
> >> > >>
> >> > >> So we should fall back to the normal reset handler here.
> >> > >
> >> > > Do you mean return normally from here, no BUG() etc?
> >> >
> >> > If we guard the patching against everything, we can treat a broken hcall as BUG.
> >> > However, if we don't we want to fall back to the normal guts based reset.
> >> Will let Scott comment on this?
> >> But ppc_md.restart can point to only one handler and during paravirt
> >> patching we changed this to new handler. So we cannot jump back to
> >> guts type handler
> >
> > I don't think it's worth implementing a fall-back scheme -- if KVM
> > advertises that the reset hcall exists, then it had better exist.
> 
> If we also check for kvm_para_available() I agree. Otherwise QEMU
> might advertise the reset hcall, but the guest kernel may not
> implement KVM hypercalls. In that case the device tree check will
> succeed, but the actual hypercall will not.

Wouldn't that be a bug in QEMU?  Or in KVM for exposing the hcall  
capability without implementing them?

Not that I have anything against checking kvm_para_available()...   
though that (or its functional equivalent that returns a pointer to the  
node) should really be a prerequisite for even being able to interpret  
KVM-specific properties in the hypervisor node.

-Scott

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
  2013-07-15 20:52                 ` Scott Wood
@ 2013-07-15 20:55                   ` Alexander Graf
  -1 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-15 20:55 UTC (permalink / raw)
  To: Scott Wood
  Cc: Bhushan Bharat-R65777, kvm, kvm-ppc, Wood Scott-B07421,
	Yoder Stuart-B08248


On 15.07.2013, at 22:52, Scott Wood wrote:

> On 07/15/2013 03:28:46 PM, Alexander Graf wrote:
>> On 15.07.2013, at 20:21, Scott Wood wrote:
>> > On 07/15/2013 10:16:41 AM, Bhushan Bharat-R65777 wrote:
>> >> > >>> +	printk("error: system reset returned with error %ld\n", ret);
>> >> > >>
>> >> > >> So we should fall back to the normal reset handler here.
>> >> > >
>> >> > > Do you mean return normally from here, no BUG() etc?
>> >> >
>> >> > If we guard the patching against everything, we can treat a broken hcall as BUG.
>> >> > However, if we don't we want to fall back to the normal guts based reset.
>> >> Will let Scott comment on this?
>> >> But ppc_md.restart can point to only one handler and during paravirt patching we changed this to new handler. So we cannot jump back to guts type handler
>> >
>> > I don't think it's worth implementing a fall-back scheme -- if KVM advertises that the reset hcall exists, then it had better exist.
>> If we also check for kvm_para_available() I agree. Otherwise QEMU might advertise the reset hcall, but the guest kernel may not implement KVM hypercalls. In that case the device tree check will succeed, but the actual hypercall will not.
> 
> Wouldn't that be a bug in QEMU?  Or in KVM for exposing the hcall capability without implementing them?

No, because it would be the guest that doesn't know how to handle kvm hypercalls.

> Not that I have anything against checking kvm_para_available()...  though that (or its functional equivalent that returns a pointer to the node) should really be a prerequisite for even being able to interpret KVM-specific properties in the hypervisor node.

That's my point :).


Alex

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
  2013-07-15 20:55                   ` Alexander Graf
@ 2013-07-15 22:23                     ` Scott Wood
  -1 siblings, 0 replies; 103+ messages in thread
From: Scott Wood @ 2013-07-15 22:23 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Bhushan Bharat-R65777, kvm, kvm-ppc, Wood Scott-B07421,
	Yoder Stuart-B08248

On 07/15/2013 03:55:08 PM, Alexander Graf wrote:
> 
> On 15.07.2013, at 22:52, Scott Wood wrote:
> 
> > On 07/15/2013 03:28:46 PM, Alexander Graf wrote:
> >> On 15.07.2013, at 20:21, Scott Wood wrote:
> >> > On 07/15/2013 10:16:41 AM, Bhushan Bharat-R65777 wrote:
> >> >> > >>> +	printk("error: system reset returned with error %ld\n", ret);
> >> >> > >>
> >> >> > >> So we should fall back to the normal reset handler here.
> >> >> > >
> >> >> > > Do you mean return normally from here, no BUG() etc?
> >> >> >
> >> >> > If we guard the patching against everything, we can treat a broken hcall as BUG.
> >> >> > However, if we don't we want to fall back to the normal guts based reset.
> >> >> Will let Scott comment on this?
> >> >> But ppc_md.restart can point to only one handler and during paravirt
> >> >> patching we changed this to new handler. So we cannot jump back to
> >> >> guts type handler
> >> >
> >> > I don't think it's worth implementing a fall-back scheme -- if
> >> > KVM advertises that the reset hcall exists, then it had better exist.
> >> If we also check for kvm_para_available() I agree. Otherwise QEMU
> >> might advertise the reset hcall, but the guest kernel may not
> >> implement KVM hypercalls. In that case the device tree check will
> >> succeed, but the actual hypercall will not.
> >
> > Wouldn't that be a bug in QEMU?  Or in KVM for exposing the hcall
> > capability without implementing them?
> 
> No, because it would be the guest that doesn't know how to handle kvm
> hypercalls.

Oh, I misread "guest kernel" as "host kernel". :-P

Still, I'm not sure what sort of error you're thinking of.  If the  
guest didn't support the hcall mechanism we would have returned from  
the function by that point.  In fact, seeing kvm,has-reset on a  
different hypervisor ought to mean that that hypervisor is emulating  
KVM in this particular respect.

-Scott

^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm
  2013-07-15 18:07     ` Scott Wood
@ 2013-07-16  4:46       ` Bhushan Bharat-R65777
  -1 siblings, 0 replies; 103+ messages in thread
From: Bhushan Bharat-R65777 @ 2013-07-16  4:46 UTC (permalink / raw)
  To: Wood Scott-B07421; +Cc: kvm, kvm-ppc, agraf, Yoder Stuart-B08248



> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Monday, July 15, 2013 11:38 PM
> To: Bhushan Bharat-R65777
> Cc: kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; agraf@suse.de;
> Yoder Stuart-B08248; Bhushan Bharat-R65777; Bhushan Bharat-R65777
> Subject: Re: [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls
> in kvm
> 
> On 07/15/2013 06:11:16 AM, Bharat Bhushan wrote:
> > Exit to guest user space if kvm does not implement the hcall.
> >
> > Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> > ---
> >  arch/powerpc/kvm/booke.c   |   47 +++++++++++++++++++++++++++++++++++++------
> >  arch/powerpc/kvm/powerpc.c |    1 +
> >  include/uapi/linux/kvm.h   |    1 +
> >  3 files changed, 42 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> > index 17722d8..c8b41b4 100644
> > --- a/arch/powerpc/kvm/booke.c
> > +++ b/arch/powerpc/kvm/booke.c
> > @@ -1005,9 +1005,25 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
> >  		break;
> >
> >  #ifdef CONFIG_KVM_BOOKE_HV
> > -	case BOOKE_INTERRUPT_HV_SYSCALL:
> > +	case BOOKE_INTERRUPT_HV_SYSCALL: {
> > +		int i;
> >  		if (!(vcpu->arch.shared->msr & MSR_PR)) {
> > -			kvmppc_set_gpr(vcpu, 3, kvmppc_kvm_pv(vcpu));
> > +			r = kvmppc_kvm_pv(vcpu);
> > +			if (r != EV_UNIMPLEMENTED) {
> > +				/* except unimplemented return to guest
> > */
> > +				kvmppc_set_gpr(vcpu, 3, r);
> > +				kvmppc_account_exit(vcpu,
> > SYSCALL_EXITS);
> > +				r = RESUME_GUEST;
> > +				break;
> > +			}
> > +			/* Exit to userspace for unimplemented hcalls
> > in kvm */
> > +			run->epapr_hcall.nr = kvmppc_get_gpr(vcpu, 11);
> > +			run->epapr_hcall.ret = 0;
> > +			for (i = 0; i < 8; i++)
> > +				run->epapr_hcall.args[i] =
> > kvmppc_get_gpr(vcpu, 3 + i);
> 
> You need to clear the upper half of each register if CONFIG_PPC64=y and MSR_CM
> is not set.
> 
> > +			vcpu->arch.hcall_needed = 1;
> 
> The existing code for hcall_needed restores 9 return arguments, rather than the
> 8 that are defined for this interface.  Thus, you'll be restoring one word of
> padding into the guest -- which could be arbitrary userspace data that shouldn't
> be leaked.  r12 is volatile in the ePAPR hcall ABI so simply clobbering it isn't
> a problem, though.

Oops. Not just that: this currently uses the struct type "papr_hcall", while
on booke we should use epapr_hcall. I will add a function, defined in both
book3s.c and booke.c, to set up the hcall return registers accordingly.
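
For reference, here is a rough user-space model of the two review points
above: truncate each register to 32 bits when the guest is not in 64-bit
mode, and hand back exactly 8 values so that r12 is never written. Every
name prefixed toy_ is hypothetical, including the guest_is_64bit flag that
stands in for the MSR_CM test. It is a sketch of the idea under those
assumptions, not a posted patch.

```c
#include <stdint.h>

#define TOY_EPAPR_NARGS 8	/* the 8 arguments the interface defines */

/* Hypothetical stand-in for vcpu state; not the kernel's struct kvm_vcpu. */
struct toy_vcpu {
	uint64_t gpr[32];
	int guest_is_64bit;	/* models "MSR_CM is set" */
};

/* Hypothetical mirror of the epapr_hcall exit data. */
struct toy_epapr_hcall {
	uint64_t nr;
	uint64_t ret;
	uint64_t args[TOY_EPAPR_NARGS];
};

/* Read a GPR, dropping the upper half for a 32-bit-mode guest. */
static uint64_t toy_get_hcall_gpr(const struct toy_vcpu *vcpu, int n)
{
	uint64_t val = vcpu->gpr[n];

	if (!vcpu->guest_is_64bit)
		val &= 0xffffffffULL;
	return val;
}

/* Exit path: hcall number from r11, arguments from r3..r10. */
static void toy_fill_hcall(const struct toy_vcpu *vcpu,
			   struct toy_epapr_hcall *h)
{
	int i;

	h->nr = toy_get_hcall_gpr(vcpu, 11);
	h->ret = 0;
	for (i = 0; i < TOY_EPAPR_NARGS; i++)
		h->args[i] = toy_get_hcall_gpr(vcpu, 3 + i);
}

/* Re-entry path: status into r3, 8 values into r4..r11, r12 untouched. */
static void toy_complete_hcall(struct toy_vcpu *vcpu,
			       const struct toy_epapr_hcall *h)
{
	int i;

	vcpu->gpr[3] = h->ret;
	for (i = 0; i < TOY_EPAPR_NARGS; i++)
		vcpu->gpr[4 + i] = h->args[i];
}
```

Looping over 8 rather than 9 on re-entry is what keeps the padding word
after args[] from ever reaching the guest.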

-Bharat


> 
> -Scott

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-15 18:17       ` Scott Wood
@ 2013-07-16  6:35         ` Gleb Natapov
  -1 siblings, 0 replies; 103+ messages in thread
From: Gleb Natapov @ 2013-07-16  6:35 UTC (permalink / raw)
  To: Scott Wood
  Cc: Bharat Bhushan, kvm, kvm-ppc, agraf, stuart.yoder, Bharat Bhushan

On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote:
> On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
> >On Mon, Jul 15, 2013 at 04:41:17PM +0530, Bharat Bhushan wrote:
> >> KVM_HC_VM_RESET: Requests that the virtual machine be reset.
> >> KVM_HC_VM_SHUTDOWN: Requests that the virtual machine be
> >powered-off/halted.
> >>
> >> These hcalls are handled by guest userspace.
> >>
> >> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> >> ---
> >>  Documentation/virtual/kvm/hypercalls.txt |   16 ++++++++++++++++
> >>  include/uapi/linux/kvm_para.h            |    3 ++-
> >>  2 files changed, 18 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/Documentation/virtual/kvm/hypercalls.txt
> >b/Documentation/virtual/kvm/hypercalls.txt
> >> index ea113b5..58acdc1 100644
> >> --- a/Documentation/virtual/kvm/hypercalls.txt
> >> +++ b/Documentation/virtual/kvm/hypercalls.txt
> >> @@ -64,3 +64,19 @@ Purpose: To enable communication between the
> >hypervisor and guest there is a
> >>  shared page that contains parts of supervisor visible register
> >state.
> >>  The guest can map this shared page to access its supervisor
> >register through
> >>  memory using this hypercall.
> >> +
> >> +5. KVM_HC_VM_RESET
> >> +------------------------
> >> +Architecture: PPC
> >> +Status: active
> >> +Purpose:  Requests that the virtual machine be reset.  The
> >hcall takes no
> >> +arguments. If successful the hcall does not return. If an error
> >occurs it
> >> +returns EV_INTERNAL.
> >> +
> >> +6. KVM_HC_VM_SHUTDOWN
> >> +------------------------
> >> +Architecture: PPC
> >> +Status: active
> >> +Purpose: Requests that the virtual machine be powered-off/halted.
> >> +The hcall takes no arguments. If successful the hcall does not
> >return.
> >> +If an error occurs it returns EV_INTERNAL.
> >> diff --git a/include/uapi/linux/kvm_para.h
> >b/include/uapi/linux/kvm_para.h
> >> index cea2c5c..218882d 100644
> >> --- a/include/uapi/linux/kvm_para.h
> >> +++ b/include/uapi/linux/kvm_para.h
> >> @@ -19,7 +19,8 @@
> >>  #define KVM_HC_MMU_OP			2
> >>  #define KVM_HC_FEATURES			3
> >>  #define KVM_HC_PPC_MAP_MAGIC_PAGE	4
> >> -
> >> +#define KVM_HC_VM_RESET			5
> >> +#define KVM_HC_VM_SHUTDOWN		6
> >There is no much sense to share hypercalls between architectures.
> >There
> >is zero probability x86 will implement those for instance
> 
> This is similar to the question of whether to keep device API
> enumerations per-architecture...  It costs very little to keep it in
> a common place, and it's hard to go back in the other direction if
> we later realize there are things that should be shared.
>
This is different from the device API, since with the device API all arches
have to create/destroy devices, so it makes sense to put device lifecycle
management into the common code, and the device API has a single entry point
to the code - the device fd ioctl - where it makes sense to handle common
tasks, if any, and dispatch others to the specific device implementation.

This is totally unlike hypercalls, which are, by definition, very
architecture specific (the way they are triggered, the way parameters
are passed from guest to host, what hypercalls the arch needs...). The
entry point of hypercalls is in arch specific code (again unlike the device
API), so the way to reuse code, if the need arises, is different too and does
not require a common name space - just call a common function after
retrieving the hypercall parameters in an arch specific way.
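
To illustrate the last point with a toy model: the arch entry point keeps
its own private hcall numbers and register convention, and only the
action is shared. All names here are made up for the example (none of
them exist in the kernel), and the "pending" out-parameter is just a
stand-in for whatever would be reported to userspace.

```c
#include <stdint.h>

/* Hypothetical common code: an arch-neutral request, no hcall numbers. */
enum toy_vm_request { TOY_VM_NONE, TOY_VM_RESET, TOY_VM_SHUTDOWN };

static int toy_common_vm_request(enum toy_vm_request req,
				 enum toy_vm_request *pending)
{
	*pending = req;		/* record it for userspace to act on */
	return 0;
}

/* Hypothetical arch code: private number space, private register layout. */
#define TOY_PPC_HC_RESET	5
#define TOY_PPC_HC_SHUTDOWN	6

struct toy_ppc_regs {
	uint64_t gpr[32];
};

static int toy_ppc_hcall_entry(const struct toy_ppc_regs *regs,
			       enum toy_vm_request *pending)
{
	/* Arch-specific step: this model takes the number from r11. */
	uint64_t nr = regs->gpr[11];

	switch (nr) {
	case TOY_PPC_HC_RESET:
		return toy_common_vm_request(TOY_VM_RESET, pending);
	case TOY_PPC_HC_SHUTDOWN:
		return toy_common_vm_request(TOY_VM_SHUTDOWN, pending);
	default:
		return -1;	/* not handled in this model */
	}
}
```

Another arch could call toy_common_vm_request() from its own entry point
with completely different numbers, which is why the numbers themselves do
not need to live in a common header for the code to be shared.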

> Keeping it in a common place also makes it more visible to people
> looking to add new hcalls, which could cut down on reinventing the
> wheel.
I do not want other arches to start using hypercalls the way powerpc
started to use them, as a separate device io space, so it is better to hide
this as far away from common code as possible :) But on a more serious
note, hypercalls should be a last resort, added only when no other
possibility exists. People should not look at what hcalls others
implemented so they can add them to their favorite arch; they
should have a problem at hand that they cannot solve without an hcall, and
at that point they will have a pretty good idea what the hcall should do.

> 
> >(not sure why PPC will want them either instead of emulating
> >devices that do
> >shutdown/reset).
> 
> Besides what Alex said, for shutdown we don't have any existing
> device to emulate (our real hardware just doesn't have that
> functionality).  For reset we currently do emulate, but it's awkward
> to describe in the device tree what we actually emulate since the
> reset functionality is part of a kitchen-sink "device" of which we
> emulate virtually nothing other than the reset.  Currently we
> advertise the entire thing and just ignore the rest, but that causes
> problems with the guest seeing the node and trying to use that
> functionality.
> 
What about writing a virtio device for shutdown and adding the missing
emulation to the kitchen-sink device (yeah, I know, easier said than done),
or making the virtio device handle reset too? This of course raises the
question of what address to use for the device, hence the idea to use hcalls
as a separate address space.
 
--
			Gleb.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-16  6:35         ` Gleb Natapov
@ 2013-07-16 23:04           ` Scott Wood
  -1 siblings, 0 replies; 103+ messages in thread
From: Scott Wood @ 2013-07-16 23:04 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Bharat Bhushan, kvm, kvm-ppc, agraf, stuart.yoder, Bharat Bhushan

On 07/16/2013 01:35:55 AM, Gleb Natapov wrote:
> On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote:
> > On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
> > >There is no much sense to share hypercalls between architectures.
> > >There
> > >is zero probability x86 will implement those for instance
> >
> > This is similar to the question of whether to keep device API
> > enumerations per-architecture...  It costs very little to keep it in
> > a common place, and it's hard to go back in the other direction if
> > we later realize there are things that should be shared.
> >
> This is different from device API since with device API all arches  
> have
> to create/destroy devices, so it make sense to put device lifecycle
> management into the common code, and device API has single entry point
> to the code - device fd ioctl - where it makes sense to handle common
> tasks, if any, and despatch others to specific device implementation.
> 
> This is totally unlike hypercalls which are, by definition, very
> architecture specific (the way they are triggered, the way parameter
> are passed from guest to host, what hypercalls arch needs...).

The ABI is architecture specific.  The API doesn't need to be, any more  
than it does with syscalls (I consider the architecture-specific  
definition of syscall numbers and similar constants in Linux to be  
unfortunate, especially for tools such as strace or QEMU's linux-user  
emulation).

> > Keeping it in a common place also makes it more visible to people
> > looking to add new hcalls, which could cut down on reinventing the
> > wheel.
> I do not want other arches to start using hypercalls in the way  
> powerpc
> started to use them: separate device io space, so it is better to hide
> this as far away from common code as possible :) But on a more serious
> note hypercalls should be a last resort and added only when no other
> possibility exists, so people should not look what hcalls others
> implemented, so they can add them to their favorite arch, but they
> should have a problem at hand that they cannot solve without hcall,  
> but
> at this point they will have pretty good idea what this hcall should  
> do.

Why are hcalls such a bad thing?

Should new Linux syscalls be avoided too, in favor of new emulated  
devices exposed via vfio? :-)

> > >(not sure why PPC will want them either instead of emulating
> > >devices that do
> > >shutdown/reset).
> >
> > Besides what Alex said, for shutdown we don't have any existing
> > device to emulate (our real hardware just doesn't have that
> > functionality).  For reset we currently do emulate, but it's awkward
> > to describe in the device tree what we actually emulate since the
> > reset functionality is part of a kitchen-sink "device" of which we
> > emulate virtually nothing other than the reset.  Currently we
> > advertise the entire thing and just ignore the rest, but that causes
> > problems with the guest seeing the node and trying to use that
> > functionality.
> >
> What about writing virtio device for shutdown

That sounds like quite a bit more work than hcalls.  It also ties up a  
virtual PCI slot -- some machines don't have very many (mpc8544ds has  
2, though we could and should expand that in the paravirt e500 machine).

> and add missing emulation
> to kitchen-sink device (yeah I know easily said that done),

Not going to happen... there's lots of low-level and chip-specific  
stuff in there.  We'd have to make several versions, even for the  
paravirt platform so it would correspond to some chip that goes with  
the cpu being used.  Even then we couldn't do everything, at least with  
KVM -- one of the things in there is the ability to freeze the  
timebase, but reading the timebase doesn't trap, and we aren't going to  
freeze the host timebase.

-Scott

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
  2013-07-15 22:23                     ` Scott Wood
@ 2013-07-16 23:21                       ` Alexander Graf
  -1 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-16 23:21 UTC (permalink / raw)
  To: Scott Wood
  Cc: Bhushan Bharat-R65777, kvm, kvm-ppc, Wood Scott-B07421,
	Yoder Stuart-B08248


On 16.07.2013, at 00:23, Scott Wood wrote:

> On 07/15/2013 03:55:08 PM, Alexander Graf wrote:
>> On 15.07.2013, at 22:52, Scott Wood wrote:
>> > On 07/15/2013 03:28:46 PM, Alexander Graf wrote:
>> >> On 15.07.2013, at 20:21, Scott Wood wrote:
>> >> > On 07/15/2013 10:16:41 AM, Bhushan Bharat-R65777 wrote:
>> >> >> > >>> +	printk("error: system reset returned with error %ld\n", ret);
>> >> >> > >>
>> >> >> > >> So we should fall back to the normal reset handler here.
>> >> >> > >
>> >> >> > > Do you mean return normally from here, no BUG() etc?
>> >> >> >
>> >> >> > If we guard the patching against everything, we can treat a broken hcall as BUG.
>> >> >> > However, if we don't we want to fall back to the normal guts based reset.
>> >> >> Will let Scott comment on this?
>> >> >> But ppc_md.restart can point to only one handler and during paravirt patching we changed this to new handler. So we cannot jump back to guts type handler
>> >> >
>> >> > I don't think it's worth implementing a fall-back scheme -- if KVM advertises that the reset hcall exists, then it had better exist.
>> >> If we also check for kvm_para_available() I agree. Otherwise QEMU might advertise the reset hcall, but the guest kernel may not implement KVM hypercalls. In that case the device tree check will succeed, but the actual hypercall will not.
>> >
>> > Wouldn't that be a bug in QEMU?  Or in KVM for exposing the hcall capability without implementing them?
>> No, because it would be the guest that doesn't know how to handle kvm hypercalls.
> 
> Oh, I misread "guest kernel" as "host kernel". :-P
> 
> Still, I'm not sure what sort of error you're thinking of.  If the guest didn't support the hcall mechanism we would have returned from the function by that point.  In fact, seeing kvm,has-reset on a different hypervisor ought to mean that that hypervisor is emulating KVM in this particular respect.

Imagine we're running on KVM with this reset hcall support. Now if the guest has CONFIG_EPAPR_PARAVIRT enabled but CONFIG_KVM_GUEST disabled, we would patch the callback, but kvm_hypercall0() would be implemented as a nop.
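
A stripped-down model of that config combination, to make the failure
mode concrete. The TOY_ names are all hypothetical, and having the stub
return an "unimplemented" code is an assumption of the model; the only
thing taken from the real case is that the stub never reaches the
hypervisor.

```c
/* Model config: ePAPR paravirt on, KVM guest support off. */
#define TOY_CONFIG_EPAPR_PARAVIRT 1
/* TOY_CONFIG_KVM_GUEST is deliberately left undefined. */

#define TOY_EV_UNIMPLEMENTED 12	/* hypothetical error code */

#ifdef TOY_CONFIG_KVM_GUEST
long toy_kvm_hypercall0(unsigned int nr);	/* real trap, not modelled */
#else
/* Stub: compiles and links fine, but never traps to the hypervisor. */
static inline long toy_kvm_hypercall0(unsigned int nr)
{
	(void)nr;
	return TOY_EV_UNIMPLEMENTED;
}
#endif

static int toy_restart_patched;

/* The device tree check only depends on the paravirt option. */
static void toy_epapr_paravirt_init(int dt_has_reset)
{
#ifdef TOY_CONFIG_EPAPR_PARAVIRT
	if (dt_has_reset)
		toy_restart_patched = 1;	/* callback now uses the hcall */
#endif
}

/* Patched restart handler: a working hcall would not come back here. */
static long toy_restart(void)
{
	return toy_kvm_hypercall0(5);
}
```

With this combination toy_restart_patched ends up set, yet toy_restart()
comes straight back with an error and the old handler is no longer
reachable.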


Alex

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
  2013-07-16 23:21                       ` Alexander Graf
@ 2013-07-16 23:26                         ` Scott Wood
  -1 siblings, 0 replies; 103+ messages in thread
From: Scott Wood @ 2013-07-16 23:26 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Bhushan Bharat-R65777, kvm, kvm-ppc, Wood Scott-B07421,
	Yoder Stuart-B08248

On 07/16/2013 06:21:51 PM, Alexander Graf wrote:
> 
> On 16.07.2013, at 00:23, Scott Wood wrote:
> 
> > Still, I'm not sure what sort of error you're thinking of.  If the
> > guest didn't support the hcall mechanism we would have returned from
> > the function by that point.  In fact, seeing kvm,has-reset on a
> > different hypervisor ought to mean that that hypervisor is emulating
> > KVM in this particular respect.
> 
> Imagine we're running on KVM with this reset hcall support. Now if  
> the guest has CONFIG_EPAPR_PARAVIRT enabled but CONFIG_KVM_GUEST  
> disabled, we would patch the callback, but kvm_hypercall0() would be  
> implemented as a nop.

Ugh -- that should be renamed epapr_hypercall and moved to  
epapr_paravirt.c.

-Scott

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset
  2013-07-16 23:26                         ` Scott Wood
@ 2013-07-16 23:37                           ` Scott Wood
  -1 siblings, 0 replies; 103+ messages in thread
From: Scott Wood @ 2013-07-16 23:37 UTC (permalink / raw)
  To: Scott Wood
  Cc: Alexander Graf, Bhushan Bharat-R65777, kvm, kvm-ppc,
	Wood Scott-B07421, Yoder Stuart-B08248

On 07/16/2013 06:26:40 PM, Scott Wood wrote:
> On 07/16/2013 06:21:51 PM, Alexander Graf wrote:
>> 
>> On 16.07.2013, at 00:23, Scott Wood wrote:
>> 
>> > Still, I'm not sure what sort of error you're thinking of.  If the  
>> guest didn't support the hcall mechanism we would have returned from  
>> the function by that point.  In fact, seeing kvm,has-reset on a  
>> different hypervisor ought to mean that that hypervisor is emulating  
>> KVM in this particular respect.
>> 
>> Imagine we're running on KVM with this reset hcall support. Now if  
>> the guest has CONFIG_EPAPR_PARAVIRT enabled but CONFIG_KVM_GUEST  
>> disabled, we would patch the callback, but kvm_hypercall0() would be  
>> implemented as a nop.
> 
> Ugh -- that should be renamed epapr_hypercall and moved to  
> epapr_paravirt.c.

Or rather, kvm_hypercall() should become epapr_hypercall() in  
epapr_paravirt.c -- there's nothing KVM-specific about it.

kvm_hypercall0() and friends could become epapr_hypercall0() in  
epapr_hcalls.h, with the KVM_HCALL_TOKEN() moved to the caller.  Or  
they could stay as they are but depend on CONFIG_EPAPR_PARAVIRT rather  
than CONFIG_KVM_GUEST -- there's no real dependency on the rest of the  
KVM guest code.
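
A rough userspace sketch of the first option, with the token built by the
caller. Everything here is a stand-in for illustration: the single-argument
epapr_hypercall() is a simplification that only records the token, the
EV_HCALL_TOKEN() layout and vendor ID are assumed values, and
KVM_HC_RESET_DEMO is a made-up hcall number.

```c
/* Only the ePAPR option is needed; CONFIG_KVM_GUEST stays undefined. */
#define CONFIG_EPAPR_PARAVIRT 1

/* Assumed token layout: vendor ID in the upper bits, hcall number below. */
#define EV_KVM_VENDOR_ID	42	/* assumed value */
#define EV_HCALL_TOKEN(vendor, num) \
	(((unsigned long)(vendor) << 16) | (unsigned long)(num))
#define KVM_HCALL_TOKEN(num)	EV_HCALL_TOKEN(EV_KVM_VENDOR_ID, num)

#define KVM_HC_RESET_DEMO	7	/* hypothetical hcall number */
#define EV_SUCCESS		0UL	/* illustrative value */
#define EV_UNIMPLEMENTED	12UL	/* illustrative value */

static unsigned long last_token;	/* what the "hypervisor" last saw */

/* Simplified stand-in for the generic entry point in epapr_paravirt.c. */
static inline unsigned long epapr_hypercall(unsigned long token)
{
	last_token = token;
	return EV_SUCCESS;
}

#ifdef CONFIG_EPAPR_PARAVIRT
/* Vendor-neutral wrapper: it takes a finished token, not a KVM number. */
static inline unsigned long epapr_hypercall0(unsigned long token)
{
	return epapr_hypercall(token);
}
#else
static inline unsigned long epapr_hypercall0(unsigned long token)
{
	(void)token;
	return EV_UNIMPLEMENTED;
}
#endif
```

A KVM-specific caller would then write
epapr_hypercall0(KVM_HCALL_TOKEN(KVM_HC_RESET_DEMO)), and the wrapper itself
no longer cares whether CONFIG_KVM_GUEST is set.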

-Scott

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-16 23:04           ` Scott Wood
@ 2013-07-17 11:00             ` Gleb Natapov
  -1 siblings, 0 replies; 103+ messages in thread
From: Gleb Natapov @ 2013-07-17 11:00 UTC (permalink / raw)
  To: Scott Wood
  Cc: Bharat Bhushan, kvm, kvm-ppc, agraf, stuart.yoder, Bharat Bhushan

On Tue, Jul 16, 2013 at 06:04:34PM -0500, Scott Wood wrote:
> On 07/16/2013 01:35:55 AM, Gleb Natapov wrote:
> >On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote:
> >> On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
> >> >There is no much sense to share hypercalls between architectures.
> >> >There
> >> >is zero probability x86 will implement those for instance
> >>
> >> This is similar to the question of whether to keep device API
> >> enumerations per-architecture...  It costs very little to keep it in
> >> a common place, and it's hard to go back in the other direction if
> >> we later realize there are things that should be shared.
> >>
> >This is different from device API since with device API all arches
> >have
> >to create/destroy devices, so it make sense to put device lifecycle
> >management into the common code, and device API has single entry point
> >to the code - device fd ioctl - where it makes sense to handle common
> >tasks, if any, and despatch others to specific device implementation.
> >
> >This is totally unlike hypercalls which are, by definition, very
> >architecture specific (the way they are triggered, the way parameter
> >are passed from guest to host, what hypercalls arch needs...).
> 
> The ABI is architecture specific.  The API doesn't need to be, any
> more than it does with syscalls (I consider the
> architecture-specific definition of syscall numbers and similar
> constants in Linux to be unfortunate, especially for tools such as
> strace or QEMU's linux-user emulation).
> 
Unlike syscalls, different arches have very different ideas about which
hypercalls they need to implement, so while I can see how a unified
syscall space may benefit a (very) small number of tools, I do not see
what advantage a unified hypercall space would give us. The disadvantage
is one more global name space to manage.

> >> Keeping it in a common place also makes it more visible to people
> >> looking to add new hcalls, which could cut down on reinventing the
> >> wheel.
> >I do not want other arches to start using hypercalls in the way
> >powerpc
> >started to use them: separate device io space, so it is better to hide
> >this as far away from common code as possible :) But on a more serious
> >note hypercalls should be a last resort and added only when no other
> >possibility exists, so people should not look what hcalls others
> >implemented, so they can add them to their favorite arch, but they
> >should have a problem at hand that they cannot solve without
> >hcall, but
> >at this point they will have pretty good idea what this hcall
> >should do.
> 
> Why are hcalls such a bad thing?
> 
Because they are often used to do non-architectural things, making OSes
behave differently from how they run on real HW, and real HW is what
OSes are designed and tested for. Example: there once was a KVM hypercall
(XEN has/had a similar one) to accelerate MMU operations.  One thing it
allowed was to flush the TLB without doing an IPI if the vcpu was not
running. Later an optimization was added to the Linux MMU code that
_relies_ on those IPIs for synchronisation. Fortunately, by that point
those hypercalls were already deprecated on KVM (IIRC XEN was broken for
some time in that regard). Which brings me to another point: they often
get obsoleted by code improvements and HW advancement (this happened to
the aforementioned MMU hypercalls), but they are hard to deprecate if the
hypervisor supports live migration; without live migration it is less of
a problem. The next point is that people often try to use them instead of
emulating a PV or real device just because they think it is easier, but
often it is not. Example: the pvpanic device was initially proposed as a
hypercall, so let's say we had implemented it as such. It would have been
KVM specific, the implementation would have touched core guest KVM code,
and it would have been Linux guest specific. Instead it was implemented
as a platform device with a very small platform driver confined to the
drivers/ directory, immediately usable by XEN and QEMU tcg in addition to
KVM, and it will likely gain a Windows driver. No downsides, only upsides.

So, given all that, hypercalls are considered more of a necessary evil in
KVM land :)

> Should new Linux syscalls be avoided too, in favor of new emulated
> devices exposed via vfio? :-)
Try to add a new syscall to Linux and see how simple it is.

> 
> >> >(not sure why PPC will want them either instead of emulating
> >> >devices that do
> >> >shutdown/reset).
> >>
> >> Besides what Alex said, for shutdown we don't have any existing
> >> device to emulate (our real hardware just doesn't have that
> >> functionality).  For reset we currently do emulate, but it's awkward
> >> to describe in the device tree what we actually emulate since the
> >> reset functionality is part of a kitchen-sink "device" of which we
> >> emulate virtually nothing other than the reset.  Currently we
> >> advertise the entire thing and just ignore the rest, but that causes
> >> problems with the guest seeing the node and trying to use that
> >> functionality.
> >>
> >What about writing virtio device for shutdown
> 
> That sounds like quite a bit more work than hcalls.  It also ties up
> a virtual PCI slot -- some machines don't have very many (mpc8544ds
> has 2, though we could and should expand that in the paravirt e500
> machine).
Yes, a virtio device may be more work, but it does not need to be a
complex or high-performance device; having only one outstanding command
will be OK.  The 2-slot limit is too harsh indeed, but since an hcall
implies PV anyway, the device may be made available only on the paravirt
machine. And the device's functionality can be extensible, so you will
not need to write another one and take another slot for each little
thing you want to add. It can advertise capabilities in one BAR and take
commands/return values through a virtio ring.
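
As a rough illustration only (this is not real virtio code, and it does not
model a virtqueue), the protocol could look something like the sketch below.
All names are hypothetical: struct pm_device, the PM_CAP_* bits, and
pm_submit()/pm_complete() are made up to show a capability word plus a
single outstanding command.

```c
#include <stdbool.h>
#include <stdint.h>

/* Capability bits the guest would read from the BAR (hypothetical). */
#define PM_CAP_RESET	(1u << 0)
#define PM_CAP_SHUTDOWN	(1u << 1)

/* Command numbers match the bit positions of the capability bits. */
enum pm_cmd { PM_CMD_RESET = 0, PM_CMD_SHUTDOWN = 1 };
enum pm_status { PM_OK = 0, PM_UNSUPPORTED = 1, PM_BUSY = 2 };

struct pm_device {
	uint32_t caps;	/* advertised capabilities */
	bool busy;	/* true while the one allowed command is in flight */
	int last_cmd;	/* last accepted command, -1 if none */
};

/* Guest side: queue a command if it is advertised and the slot is free. */
static inline enum pm_status pm_submit(struct pm_device *dev, enum pm_cmd cmd)
{
	if (dev->busy)
		return PM_BUSY;
	if (!(dev->caps & (1u << cmd)))
		return PM_UNSUPPORTED;
	dev->busy = true;
	dev->last_cmd = cmd;
	return PM_OK;
}

/* Host side: mark the outstanding command as finished. */
static inline void pm_complete(struct pm_device *dev)
{
	dev->busy = false;
}
```

Adding a new function later means defining one more capability bit and one
more command number; it does not need another device or another slot.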

--
			Gleb.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-17 11:00             ` Gleb Natapov
@ 2013-07-17 12:19               ` Alexander Graf
  -1 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-17 12:19 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Scott Wood, Bharat Bhushan, kvm, kvm-ppc, stuart.yoder, Bharat Bhushan


On 17.07.2013, at 13:00, Gleb Natapov wrote:

> On Tue, Jul 16, 2013 at 06:04:34PM -0500, Scott Wood wrote:
>> On 07/16/2013 01:35:55 AM, Gleb Natapov wrote:
>>> On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote:
>>>> On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
>>>>> There is no much sense to share hypercalls between architectures.
>>>>> There
>>>>> is zero probability x86 will implement those for instance
>>>> 
>>>> This is similar to the question of whether to keep device API
>>>> enumerations per-architecture...  It costs very little to keep it in
>>>> a common place, and it's hard to go back in the other direction if
>>>> we later realize there are things that should be shared.
>>>> 
>>> This is different from device API since with device API all arches
>>> have
>>> to create/destroy devices, so it make sense to put device lifecycle
>>> management into the common code, and device API has single entry point
>>> to the code - device fd ioctl - where it makes sense to handle common
>>> tasks, if any, and despatch others to specific device implementation.
>>> 
>>> This is totally unlike hypercalls which are, by definition, very
>>> architecture specific (the way they are triggered, the way parameter
>>> are passed from guest to host, what hypercalls arch needs...).
>> 
>> The ABI is architecture specific.  The API doesn't need to be, any
>> more than it does with syscalls (I consider the
>> architecture-specific definition of syscall numbers and similar
>> constants in Linux to be unfortunate, especially for tools such as
>> strace or QEMU's linux-user emulation).
>> 
> Unlike syscalls different arches have very different ideas what
> hypercalls they need to implement, so while with unified syscall space I
> can see how it may benefit (very) small number of tools, I do not see
> what advantage it will give us. The disadvantage is one more global name
> space to manage.
> 
>>>> Keeping it in a common place also makes it more visible to people
>>>> looking to add new hcalls, which could cut down on reinventing the
>>>> wheel.
>>> I do not want other arches to start using hypercalls in the way
>>> powerpc
>>> started to use them: separate device io space, so it is better to hide
>>> this as far away from common code as possible :) But on a more serious
>>> note hypercalls should be a last resort and added only when no other
>>> possibility exists, so people should not look what hcalls others
>>> implemented, so they can add them to their favorite arch, but they
>>> should have a problem at hand that they cannot solve without
>>> hcall, but
>>> at this point they will have pretty good idea what this hcall
>>> should do.
>> 
>> Why are hcalls such a bad thing?
>> 
> Because they often used to do non architectural things making OSes
> behave different from how they runs on real HW and real HW is what
> OSes are designed and tested for. Example: there once was a KVM (XEN
> have/had similar one) hypercall to accelerate MMU operation.  One thing it
> allowed is to to flush tlb without doing IPI if vcpu is not running. Later
> optimization was added to Linux MMU code that _relies_ on those IPIs for
> synchronisation. Good that at that point those hypercalls were already
> deprecated on KVM (IIRC XEN was broke for some time in that regard). Which
> brings me to another point: they often get obsoleted by code improvement
> and HW advancement (happened to aforementioned MMU hypercalls), but they
> hard to deprecate if hypervisor supports live migration, without live
> migration it is less of a problem. Next point is that people often try
> to use them instead of emulate PV or real device just because they
> think it is easier, but it is often not so. Example: pvpanic device was
> initially proposed as hypercall, so lets say we would implement it as
> such. It would have been KVM specific, implementation would touch core
> guest KVM code and would have been Linux guest specific. Instead it was
> implemented as platform device with very small platform driver confined
> in drivers/ directory, immediately usable by XEN and QEMU tcg in addition

This is actually a very good point. How do we support reboot and shutdown for TCG guests? We surely don't want to expose TCG as a KVM hypervisor.


Alex

^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-17 12:19               ` Alexander Graf
  (?)
@ 2013-07-17 15:19               ` Yoder Stuart-B08248
  2013-07-17 15:21                   ` Alexander Graf
  -1 siblings, 1 reply; 103+ messages in thread
From: Yoder Stuart-B08248 @ 2013-07-17 15:19 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Wood Scott-B07421, Bhushan Bharat-R65777, kvm, kvm-ppc,
	Bhushan Bharat-R65777, Gleb Natapov



> -----Original Message-----
> From: Alexander Graf [mailto:agraf@suse.de]
> Sent: Wednesday, July 17, 2013 7:19 AM
> To: Gleb Natapov
> Cc: Wood Scott-B07421; Bhushan Bharat-R65777; kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Yoder
> Stuart-B08248; Bhushan Bharat-R65777
> Subject: Re: [PATCH 3/5] booke: define reset and shutdown hcalls
> 
> 
> On 17.07.2013, at 13:00, Gleb Natapov wrote:
> 
> > On Tue, Jul 16, 2013 at 06:04:34PM -0500, Scott Wood wrote:
> >> On 07/16/2013 01:35:55 AM, Gleb Natapov wrote:
> >>> On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote:
> >>>> On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
> >>>>> There is no much sense to share hypercalls between architectures.
> >>>>> There
> >>>>> is zero probability x86 will implement those for instance
> >>>>
> >>>> This is similar to the question of whether to keep device API
> >>>> enumerations per-architecture...  It costs very little to keep it in
> >>>> a common place, and it's hard to go back in the other direction if
> >>>> we later realize there are things that should be shared.
> >>>>
> >>> This is different from device API since with device API all arches
> >>> have
> >>> to create/destroy devices, so it make sense to put device lifecycle
> >>> management into the common code, and device API has single entry point
> >>> to the code - device fd ioctl - where it makes sense to handle common
> >>> tasks, if any, and despatch others to specific device implementation.
> >>>
> >>> This is totally unlike hypercalls which are, by definition, very
> >>> architecture specific (the way they are triggered, the way parameter
> >>> are passed from guest to host, what hypercalls arch needs...).
> >>
> >> The ABI is architecture specific.  The API doesn't need to be, any
> >> more than it does with syscalls (I consider the
> >> architecture-specific definition of syscall numbers and similar
> >> constants in Linux to be unfortunate, especially for tools such as
> >> strace or QEMU's linux-user emulation).
> >>
> > Unlike syscalls different arches have very different ideas what
> > hypercalls they need to implement, so while with unified syscall space I
> > can see how it may benefit (very) small number of tools, I do not see
> > what advantage it will give us. The disadvantage is one more global name
> > space to manage.
> >
> >>>> Keeping it in a common place also makes it more visible to people
> >>>> looking to add new hcalls, which could cut down on reinventing the
> >>>> wheel.
> >>> I do not want other arches to start using hypercalls in the way
> >>> powerpc
> >>> started to use them: separate device io space, so it is better to hide
> >>> this as far away from common code as possible :) But on a more serious
> >>> note hypercalls should be a last resort and added only when no other
> >>> possibility exists, so people should not look what hcalls others
> >>> implemented, so they can add them to their favorite arch, but they
> >>> should have a problem at hand that they cannot solve without
> >>> hcall, but
> >>> at this point they will have pretty good idea what this hcall
> >>> should do.
> >>
> >> Why are hcalls such a bad thing?
> >>
> > Because they often used to do non architectural things making OSes
> > behave different from how they runs on real HW and real HW is what
> > OSes are designed and tested for. Example: there once was a KVM (XEN
> > have/had similar one) hypercall to accelerate MMU operation.  One thing it
> > allowed is to to flush tlb without doing IPI if vcpu is not running. Later
> > optimization was added to Linux MMU code that _relies_ on those IPIs for
> > synchronisation. Good that at that point those hypercalls were already
> > deprecated on KVM (IIRC XEN was broke for some time in that regard). Which
> > brings me to another point: they often get obsoleted by code improvement
> > and HW advancement (happened to aforementioned MMU hypercalls), but they
> > hard to deprecate if hypervisor supports live migration, without live
> > migration it is less of a problem. Next point is that people often try
> > to use them instead of emulate PV or real device just because they
> > think it is easier, but it is often not so. Example: pvpanic device was
> > initially proposed as hypercall, so lets say we would implement it as
> > such. It would have been KVM specific, implementation would touch core
> > guest KVM code and would have been Linux guest specific. Instead it was
> > implemented as platform device with very small platform driver confined
> > in drivers/ directory, immediately usable by XEN and QEMU tcg in addition
> 
> This is actually a very good point. How do we support reboot and shutdown for TCG guests? We surely
> don't want to expose TCG as KVM hypervisor.

Hmm...so are you proposing that we abandon the current approach,
and switch to a device-based mechanism for reboot/shutdown?

Stuart

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-17 15:19               ` Yoder Stuart-B08248
@ 2013-07-17 15:21                   ` Alexander Graf
  0 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-17 15:21 UTC (permalink / raw)
  To: Yoder Stuart-B08248
  Cc: Wood Scott-B07421, Bhushan Bharat-R65777, kvm, kvm-ppc, Gleb Natapov


On 17.07.2013, at 17:19, Yoder Stuart-B08248 wrote:

> 
> 
>> -----Original Message-----
>> From: Alexander Graf [mailto:agraf@suse.de]
>> Sent: Wednesday, July 17, 2013 7:19 AM
>> To: Gleb Natapov
>> Cc: Wood Scott-B07421; Bhushan Bharat-R65777; kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Yoder
>> Stuart-B08248; Bhushan Bharat-R65777
>> Subject: Re: [PATCH 3/5] booke: define reset and shutdown hcalls
>> 
>> 
>> On 17.07.2013, at 13:00, Gleb Natapov wrote:
>> 
>>> On Tue, Jul 16, 2013 at 06:04:34PM -0500, Scott Wood wrote:
>>>> On 07/16/2013 01:35:55 AM, Gleb Natapov wrote:
>>>>> On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote:
>>>>>> On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
>>>>>>> There is no much sense to share hypercalls between architectures.
>>>>>>> There
>>>>>>> is zero probability x86 will implement those for instance
>>>>>> 
>>>>>> This is similar to the question of whether to keep device API
>>>>>> enumerations per-architecture...  It costs very little to keep it in
>>>>>> a common place, and it's hard to go back in the other direction if
>>>>>> we later realize there are things that should be shared.
>>>>>> 
>>>>> This is different from device API since with device API all arches
>>>>> have
>>>>> to create/destroy devices, so it make sense to put device lifecycle
>>>>> management into the common code, and device API has single entry point
>>>>> to the code - device fd ioctl - where it makes sense to handle common
>>>>> tasks, if any, and despatch others to specific device implementation.
>>>>> 
>>>>> This is totally unlike hypercalls which are, by definition, very
>>>>> architecture specific (the way they are triggered, the way parameter
>>>>> are passed from guest to host, what hypercalls arch needs...).
>>>> 
>>>> The ABI is architecture specific.  The API doesn't need to be, any
>>>> more than it does with syscalls (I consider the
>>>> architecture-specific definition of syscall numbers and similar
>>>> constants in Linux to be unfortunate, especially for tools such as
>>>> strace or QEMU's linux-user emulation).
>>>> 
>>> Unlike syscalls different arches have very different ideas what
>>> hypercalls they need to implement, so while with unified syscall space I
>>> can see how it may benefit (very) small number of tools, I do not see
>>> what advantage it will give us. The disadvantage is one more global name
>>> space to manage.
>>> 
>>>>>> Keeping it in a common place also makes it more visible to people
>>>>>> looking to add new hcalls, which could cut down on reinventing the
>>>>>> wheel.
>>>>> I do not want other arches to start using hypercalls in the way
>>>>> powerpc
>>>>> started to use them: separate device io space, so it is better to hide
>>>>> this as far away from common code as possible :) But on a more serious
>>>>> note hypercalls should be a last resort and added only when no other
>>>>> possibility exists, so people should not look what hcalls others
>>>>> implemented, so they can add them to their favorite arch, but they
>>>>> should have a problem at hand that they cannot solve without
>>>>> hcall, but
>>>>> at this point they will have pretty good idea what this hcall
>>>>> should do.
>>>> 
>>>> Why are hcalls such a bad thing?
>>>> 
>>> Because they often used to do non architectural things making OSes
>>> behave different from how they runs on real HW and real HW is what
>>> OSes are designed and tested for. Example: there once was a KVM (XEN
>>> have/had similar one) hypercall to accelerate MMU operation.  One thing it
>>> allowed is to to flush tlb without doing IPI if vcpu is not running. Later
>>> optimization was added to Linux MMU code that _relies_ on those IPIs for
>>> synchronisation. Good that at that point those hypercalls were already
>>> deprecated on KVM (IIRC XEN was broke for some time in that regard). Which
>>> brings me to another point: they often get obsoleted by code improvement
>>> and HW advancement (happened to aforementioned MMU hypercalls), but they
>>> hard to deprecate if hypervisor supports live migration, without live
>>> migration it is less of a problem. Next point is that people often try
>>> to use them instead of emulate PV or real device just because they
>>> think it is easier, but it is often not so. Example: pvpanic device was
>>> initially proposed as hypercall, so lets say we would implement it as
>>> such. It would have been KVM specific, implementation would touch core
>>> guest KVM code and would have been Linux guest specific. Instead it was
>>> implemented as platform device with very small platform driver confined
>>> in drivers/ directory, immediately usable by XEN and QEMU tcg in addition
>> 
>> This is actually a very good point. How do we support reboot and shutdown for TCG guests? We surely
>> don't want to expose TCG as KVM hypervisor.
> 
> Hmm...so are you proposing that we abandon the current approach,
> and switch to a device-based mechanism for reboot/shutdown?

Reading Gleb's email, it sounds like the more future-proof approach, yes. I'm not quite sure yet where we should plug this in, though.


Alex


^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-17 15:21                   ` Alexander Graf
  (?)
@ 2013-07-17 15:36                   ` Yoder Stuart-B08248
  2013-07-17 15:41                       ` Alexander Graf
  -1 siblings, 1 reply; 103+ messages in thread
From: Yoder Stuart-B08248 @ 2013-07-17 15:36 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Wood Scott-B07421, Bhushan Bharat-R65777, kvm, kvm-ppc, Gleb Natapov



> -----Original Message-----
> From: Alexander Graf [mailto:agraf@suse.de]
> Sent: Wednesday, July 17, 2013 10:21 AM
> To: Yoder Stuart-B08248
> Cc: Wood Scott-B07421; Bhushan Bharat-R65777; kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Gleb Natapov
> Subject: Re: [PATCH 3/5] booke: define reset and shutdown hcalls
> 
> 
> On 17.07.2013, at 17:19, Yoder Stuart-B08248 wrote:
> 
> >
> >
> >> -----Original Message-----
> >> From: Alexander Graf [mailto:agraf@suse.de]
> >> Sent: Wednesday, July 17, 2013 7:19 AM
> >> To: Gleb Natapov
> >> Cc: Wood Scott-B07421; Bhushan Bharat-R65777; kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Yoder
> >> Stuart-B08248; Bhushan Bharat-R65777
> >> Subject: Re: [PATCH 3/5] booke: define reset and shutdown hcalls
> >>
> >>
> >> On 17.07.2013, at 13:00, Gleb Natapov wrote:
> >>
> >>> On Tue, Jul 16, 2013 at 06:04:34PM -0500, Scott Wood wrote:
> >>>> On 07/16/2013 01:35:55 AM, Gleb Natapov wrote:
> >>>>> On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote:
> >>>>>> On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
> >>>>>>> There is no much sense to share hypercalls between architectures.
> >>>>>>> There
> >>>>>>> is zero probability x86 will implement those for instance
> >>>>>>
> >>>>>> This is similar to the question of whether to keep device API
> >>>>>> enumerations per-architecture...  It costs very little to keep it in
> >>>>>> a common place, and it's hard to go back in the other direction if
> >>>>>> we later realize there are things that should be shared.
> >>>>>>
> >>>>> This is different from device API since with device API all arches
> >>>>> have
> >>>>> to create/destroy devices, so it make sense to put device lifecycle
> >>>>> management into the common code, and device API has single entry point
> >>>>> to the code - device fd ioctl - where it makes sense to handle common
> >>>>> tasks, if any, and despatch others to specific device implementation.
> >>>>>
> >>>>> This is totally unlike hypercalls which are, by definition, very
> >>>>> architecture specific (the way they are triggered, the way parameter
> >>>>> are passed from guest to host, what hypercalls arch needs...).
> >>>>
> >>>> The ABI is architecture specific.  The API doesn't need to be, any
> >>>> more than it does with syscalls (I consider the
> >>>> architecture-specific definition of syscall numbers and similar
> >>>> constants in Linux to be unfortunate, especially for tools such as
> >>>> strace or QEMU's linux-user emulation).
> >>>>
> >>> Unlike syscalls different arches have very different ideas what
> >>> hypercalls they need to implement, so while with unified syscall space I
> >>> can see how it may benefit (very) small number of tools, I do not see
> >>> what advantage it will give us. The disadvantage is one more global name
> >>> space to manage.
> >>>
> >>>>>> Keeping it in a common place also makes it more visible to people
> >>>>>> looking to add new hcalls, which could cut down on reinventing the
> >>>>>> wheel.
> >>>>> I do not want other arches to start using hypercalls in the way
> >>>>> powerpc
> >>>>> started to use them: separate device io space, so it is better to hide
> >>>>> this as far away from common code as possible :) But on a more serious
> >>>>> note hypercalls should be a last resort and added only when no other
> >>>>> possibility exists, so people should not look what hcalls others
> >>>>> implemented, so they can add them to their favorite arch, but they
> >>>>> should have a problem at hand that they cannot solve without
> >>>>> hcall, but
> >>>>> at this point they will have pretty good idea what this hcall
> >>>>> should do.
> >>>>
> >>>> Why are hcalls such a bad thing?
> >>>>
> >>> Because they often used to do non architectural things making OSes
> >>> behave different from how they runs on real HW and real HW is what
> >>> OSes are designed and tested for. Example: there once was a KVM (XEN
> >>> have/had similar one) hypercall to accelerate MMU operation.  One thing it
> >>> allowed is to to flush tlb without doing IPI if vcpu is not running. Later
> >>> optimization was added to Linux MMU code that _relies_ on those IPIs for
> >>> synchronisation. Good that at that point those hypercalls were already
> >>> deprecated on KVM (IIRC XEN was broke for some time in that regard). Which
> >>> brings me to another point: they often get obsoleted by code improvement
> >>> and HW advancement (happened to aforementioned MMU hypercalls), but they
> >>> hard to deprecate if hypervisor supports live migration, without live
> >>> migration it is less of a problem. Next point is that people often try
> >>> to use them instead of emulate PV or real device just because they
> >>> think it is easier, but it is often not so. Example: pvpanic device was
> >>> initially proposed as hypercall, so lets say we would implement it as
> >>> such. It would have been KVM specific, implementation would touch core
> >>> guest KVM code and would have been Linux guest specific. Instead it was
> >>> implemented as platform device with very small platform driver confined
> >>> in drivers/ directory, immediately usable by XEN and QEMU tcg in addition
> >>
> >> This is actually a very good point. How do we support reboot and shutdown for TCG guests? We surely
> >> don't want to expose TCG as KVM hypervisor.
> >
> > Hmm...so are you proposing that we abandon the current approach,
> > and switch to a device-based mechanism for reboot/shutdown?
> 
> Reading Gleb's email it sounds like the more future proof approach, yes. I'm not quite sure yet where we
> should plug this though.

What do you mean? Where the paravirt device would go in the physical
address map?

Stuart


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-17 15:36                   ` Yoder Stuart-B08248
@ 2013-07-17 15:41                       ` Alexander Graf
  0 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-17 15:41 UTC (permalink / raw)
  To: Yoder Stuart-B08248
  Cc: Wood Scott-B07421, Bhushan Bharat-R65777, kvm, kvm-ppc, Gleb Natapov


On 17.07.2013, at 17:36, Yoder Stuart-B08248 wrote:

> 
> 
>> -----Original Message-----
>> From: Alexander Graf [mailto:agraf@suse.de]
>> Sent: Wednesday, July 17, 2013 10:21 AM
>> To: Yoder Stuart-B08248
>> Cc: Wood Scott-B07421; Bhushan Bharat-R65777; kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Gleb Natapov
>> Subject: Re: [PATCH 3/5] booke: define reset and shutdown hcalls
>> 
>> 
>> On 17.07.2013, at 17:19, Yoder Stuart-B08248 wrote:
>> 
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Alexander Graf [mailto:agraf@suse.de]
>>>> Sent: Wednesday, July 17, 2013 7:19 AM
>>>> To: Gleb Natapov
>>>> Cc: Wood Scott-B07421; Bhushan Bharat-R65777; kvm@vger.kernel.org; kvm-ppc@vger.kernel.org; Yoder
>>>> Stuart-B08248; Bhushan Bharat-R65777
>>>> Subject: Re: [PATCH 3/5] booke: define reset and shutdown hcalls
>>>> 
>>>> 
>>>> On 17.07.2013, at 13:00, Gleb Natapov wrote:
>>>> 
>>>>> On Tue, Jul 16, 2013 at 06:04:34PM -0500, Scott Wood wrote:
>>>>>> On 07/16/2013 01:35:55 AM, Gleb Natapov wrote:
>>>>>>> On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote:
>>>>>>>> On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
>>>>>>>>> There is no much sense to share hypercalls between architectures.
>>>>>>>>> There
>>>>>>>>> is zero probability x86 will implement those for instance
>>>>>>>> 
>>>>>>>> This is similar to the question of whether to keep device API
>>>>>>>> enumerations per-architecture...  It costs very little to keep it in
>>>>>>>> a common place, and it's hard to go back in the other direction if
>>>>>>>> we later realize there are things that should be shared.
>>>>>>>> 
>>>>>>> This is different from device API since with device API all arches
>>>>>>> have
>>>>>>> to create/destroy devices, so it make sense to put device lifecycle
>>>>>>> management into the common code, and device API has single entry point
>>>>>>> to the code - device fd ioctl - where it makes sense to handle common
>>>>>>> tasks, if any, and despatch others to specific device implementation.
>>>>>>> 
>>>>>>> This is totally unlike hypercalls which are, by definition, very
>>>>>>> architecture specific (the way they are triggered, the way parameter
>>>>>>> are passed from guest to host, what hypercalls arch needs...).
>>>>>> 
>>>>>> The ABI is architecture specific.  The API doesn't need to be, any
>>>>>> more than it does with syscalls (I consider the
>>>>>> architecture-specific definition of syscall numbers and similar
>>>>>> constants in Linux to be unfortunate, especially for tools such as
>>>>>> strace or QEMU's linux-user emulation).
>>>>>> 
>>>>> Unlike syscalls different arches have very different ideas what
>>>>> hypercalls they need to implement, so while with unified syscall space I
>>>>> can see how it may benefit (very) small number of tools, I do not see
>>>>> what advantage it will give us. The disadvantage is one more global name
>>>>> space to manage.
>>>>> 
>>>>>>>> Keeping it in a common place also makes it more visible to people
>>>>>>>> looking to add new hcalls, which could cut down on reinventing the
>>>>>>>> wheel.
>>>>>>> I do not want other arches to start using hypercalls in the way
>>>>>>> powerpc
>>>>>>> started to use them: separate device io space, so it is better to hide
>>>>>>> this as far away from common code as possible :) But on a more serious
>>>>>>> note hypercalls should be a last resort and added only when no other
>>>>>>> possibility exists, so people should not look what hcalls others
>>>>>>> implemented, so they can add them to their favorite arch, but they
>>>>>>> should have a problem at hand that they cannot solve without
>>>>>>> hcall, but
>>>>>>> at this point they will have pretty good idea what this hcall
>>>>>>> should do.
>>>>>> 
>>>>>> Why are hcalls such a bad thing?
>>>>>> 
>>>>> Because they often used to do non architectural things making OSes
>>>>> behave different from how they runs on real HW and real HW is what
>>>>> OSes are designed and tested for. Example: there once was a KVM (XEN
>>>>> have/had similar one) hypercall to accelerate MMU operation.  One thing it
>>>>> allowed is to to flush tlb without doing IPI if vcpu is not running. Later
>>>>> optimization was added to Linux MMU code that _relies_ on those IPIs for
>>>>> synchronisation. Good that at that point those hypercalls were already
>>>>> deprecated on KVM (IIRC XEN was broke for some time in that regard). Which
>>>>> brings me to another point: they often get obsoleted by code improvement
>>>>> and HW advancement (happened to aforementioned MMU hypercalls), but they
>>>>> hard to deprecate if hypervisor supports live migration, without live
>>>>> migration it is less of a problem. Next point is that people often try
>>>>> to use them instead of emulate PV or real device just because they
>>>>> think it is easier, but it is often not so. Example: pvpanic device was
>>>>> initially proposed as hypercall, so lets say we would implement it as
>>>>> such. It would have been KVM specific, implementation would touch core
>>>>> guest KVM code and would have been Linux guest specific. Instead it was
>>>>> implemented as platform device with very small platform driver confined
>>>>> in drivers/ directory, immediately usable by XEN and QEMU tcg in addition
>>>> 
>>>> This is actually a very good point. How do we support reboot and shutdown for TCG guests? We surely
>>>> don't want to expose TCG as KVM hypervisor.
>>> 
>>> Hmm...so are you proposing that we abandon the current approach,
>>> and switch to a device-based mechanism for reboot/shutdown?
>> 
>> Reading Gleb's email it sounds like the more future proof approach, yes. I'm not quite sure yet where we
>> should plug this though.
> 
> What do you mean...where the paravirt device would go in the physical
> address map??

Right. Either we

  - let the guest decide (PCI)
  - let QEMU decide, but potentially break the SoC layout (SysBus)
  - let QEMU decide, but only for the virt machine so that we don't break anyone (PlatBus)
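
For illustration only, the device-based alternative discussed above could be modelled on the emulator side roughly as below. This is a minimal sketch and not part of the posted series: the pwrctl_* names, the single command register at offset 0 and the command values are all hypothetical. The point it shows is that the decode path is plain MMIO emulation, so it does not depend on whether the guest runs under KVM or TCG.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register layout: one 32-bit command register at offset 0. */
#define PWRCTL_REG_CMD   0x0u
#define PWRCTL_CMD_RESET 0x1u
#define PWRCTL_CMD_OFF   0x2u

/* What the emulator should do after a guest write (hypothetical). */
enum pwrctl_action {
    PWRCTL_NONE,      /* write ignored */
    PWRCTL_DO_RESET,  /* emulator should reset the VM */
    PWRCTL_DO_OFF,    /* emulator should power the VM off */
};

struct pwrctl_dev {
    enum pwrctl_action pending;  /* last valid request, latched */
};

/*
 * Handle a guest MMIO write into the device window.  Writes to unknown
 * offsets or with unknown command values are ignored and leave any
 * previously latched request untouched.
 */
static enum pwrctl_action pwrctl_mmio_write(struct pwrctl_dev *dev,
                                            uint64_t offset, uint32_t value)
{
    assert(dev != NULL);

    if (offset != PWRCTL_REG_CMD)
        return PWRCTL_NONE;

    switch (value) {
    case PWRCTL_CMD_RESET:
        dev->pending = PWRCTL_DO_RESET;
        break;
    case PWRCTL_CMD_OFF:
        dev->pending = PWRCTL_DO_OFF;
        break;
    default:
        return PWRCTL_NONE;
    }
    return dev->pending;
}
```

Where the register window lives in the guest physical address map is exactly the open question in the three options listed above; the sketch deliberately leaves the base address out.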


Alex

^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-17 15:41                       ` Alexander Graf
  (?)
@ 2013-07-17 15:47                       ` Bhushan Bharat-R65777
  2013-07-17 15:52                           ` Alexander Graf
  -1 siblings, 1 reply; 103+ messages in thread
From: Bhushan Bharat-R65777 @ 2013-07-17 15:47 UTC (permalink / raw)
  To: Alexander Graf, Yoder Stuart-B08248
  Cc: Wood Scott-B07421, kvm, kvm-ppc, Gleb Natapov


> >>>> On 17.07.2013, at 13:00, Gleb Natapov wrote:
> >>>>
> >>>>> On Tue, Jul 16, 2013 at 06:04:34PM -0500, Scott Wood wrote:
> >>>>>> On 07/16/2013 01:35:55 AM, Gleb Natapov wrote:
> >>>>>>> On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote:
> >>>>>>>> On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
> >>>>>>>>> There is no much sense to share hypercalls between architectures.
> >>>>>>>>> There
> >>>>>>>>> is zero probability x86 will implement those for instance
> >>>>>>>>
> >>>>>>>> This is similar to the question of whether to keep device API
> >>>>>>>> enumerations per-architecture...  It costs very little to keep
> >>>>>>>> it in a common place, and it's hard to go back in the other
> >>>>>>>> direction if we later realize there are things that should be shared.
> >>>>>>>>
> >>>>>>> This is different from device API since with device API all
> >>>>>>> arches have to create/destroy devices, so it make sense to put
> >>>>>>> device lifecycle management into the common code, and device API
> >>>>>>> has single entry point to the code - device fd ioctl - where it
> >>>>>>> makes sense to handle common tasks, if any, and despatch others
> >>>>>>> to specific device implementation.
> >>>>>>>
> >>>>>>> This is totally unlike hypercalls which are, by definition, very
> >>>>>>> architecture specific (the way they are triggered, the way
> >>>>>>> parameter are passed from guest to host, what hypercalls arch needs...).
> >>>>>>
> >>>>>> The ABI is architecture specific.  The API doesn't need to be,
> >>>>>> any more than it does with syscalls (I consider the
> >>>>>> architecture-specific definition of syscall numbers and similar
> >>>>>> constants in Linux to be unfortunate, especially for tools such
> >>>>>> as strace or QEMU's linux-user emulation).
> >>>>>>
> >>>>> Unlike syscalls different arches have very different ideas what
> >>>>> hypercalls they need to implement, so while with unified syscall
> >>>>> space I can see how it may benefit (very) small number of tools, I
> >>>>> do not see what advantage it will give us. The disadvantage is one
> >>>>> more global name space to manage.
> >>>>>
> >>>>>>>> Keeping it in a common place also makes it more visible to
> >>>>>>>> people looking to add new hcalls, which could cut down on
> >>>>>>>> reinventing the wheel.
> >>>>>>> I do not want other arches to start using hypercalls in the way
> >>>>>>> powerpc started to use them: separate device io space, so it is
> >>>>>>> better to hide this as far away from common code as possible :)
> >>>>>>> But on a more serious note hypercalls should be a last resort
> >>>>>>> and added only when no other possibility exists, so people
> >>>>>>> should not look what hcalls others implemented, so they can add
> >>>>>>> them to their favorite arch, but they should have a problem at
> >>>>>>> hand that they cannot solve without hcall, but at this point
> >>>>>>> they will have pretty good idea what this hcall should do.
> >>>>>>
> >>>>>> Why are hcalls such a bad thing?
> >>>>>>
> >>>>> Because they often used to do non architectural things making OSes
> >>>>> behave different from how they runs on real HW and real HW is what
> >>>>> OSes are designed and tested for. Example: there once was a KVM
> >>>>> (XEN have/had similar one) hypercall to accelerate MMU operation.
> >>>>> One thing it allowed is to to flush tlb without doing IPI if vcpu
> >>>>> is not running. Later optimization was added to Linux MMU code
> >>>>> that _relies_ on those IPIs for synchronisation. Good that at that
> >>>>> point those hypercalls were already deprecated on KVM (IIRC XEN
> >>>>> was broke for some time in that regard). Which brings me to
> >>>>> another point: they often get obsoleted by code improvement and HW
> >>>>> advancement (happened to aforementioned MMU hypercalls), but they
> >>>>> hard to deprecate if hypervisor supports live migration, without
> >>>>> live migration it is less of a problem. Next point is that people
> >>>>> often try to use them instead of emulate PV or real device just
> >>>>> because they think it is easier, but it is often not so. Example:
> >>>>> pvpanic device was initially proposed as hypercall, so lets say we
> >>>>> would implement it as such. It would have been KVM specific,
> >>>>> implementation would touch core guest KVM code and would have been
> >>>>> Linux guest specific. Instead it was implemented as platform
> >>>>> device with very small platform driver confined in drivers/
> >>>>> directory, immediately usable by XEN and QEMU tcg in addition
> >>>>
> >>>> This is actually a very good point. How do we support reboot and
> >>>> shutdown for TCG guests? We surely don't want to expose TCG as KVM
> >>>> hypervisor.
> >>>
> >>> Hmm...so are you proposing that we abandon the current approach, and
> >>> switch to a device-based mechanism for reboot/shutdown?
> >>
> >> Reading Gleb's email it sounds like the more future proof approach,
> >> yes. I'm not quite sure yet where we should plug this though.
> >
> > What do you mean...where the paravirt device would go in the physical
> > address map??
> 
> Right. Either we
> 
>   - let the guest decide (PCI)
>   - let QEMU decide, but potentially break the SoC layout (SysBus)
>   - let QEMU decide, but only for the virt machine so that we don't break anyone
>     (PlatBus)

Can you please elaborate on the above two points?

-Bharat

> 
> 
> Alex
> 

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-17 15:47                       ` Bhushan Bharat-R65777
@ 2013-07-17 15:52                           ` Alexander Graf
  0 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-17 15:52 UTC (permalink / raw)
  To: Bhushan Bharat-R65777
  Cc: Yoder Stuart-B08248, Wood Scott-B07421, kvm, kvm-ppc, Gleb Natapov


On 17.07.2013, at 17:47, Bhushan Bharat-R65777 wrote:

> 
>>>>>> On 17.07.2013, at 13:00, Gleb Natapov wrote:
>>>>>> 
>>>>>>> On Tue, Jul 16, 2013 at 06:04:34PM -0500, Scott Wood wrote:
>>>>>>>> On 07/16/2013 01:35:55 AM, Gleb Natapov wrote:
>>>>>>>>> On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote:
>>>>>>>>>> On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
>>>>>>>>>>> There is no much sense to share hypercalls between architectures.
>>>>>>>>>>> There
>>>>>>>>>>> is zero probability x86 will implement those for instance
>>>>>>>>>> 
>>>>>>>>>> This is similar to the question of whether to keep device API
>>>>>>>>>> enumerations per-architecture...  It costs very little to keep
>>>>>>>>>> it in a common place, and it's hard to go back in the other
>>>>>>>>>> direction if we later realize there are things that should be shared.
>>>>>>>>>> 
>>>>>>>>> This is different from device API since with device API all
>>>>>>>>> arches have to create/destroy devices, so it make sense to put
>>>>>>>>> device lifecycle management into the common code, and device API
>>>>>>>>> has single entry point to the code - device fd ioctl - where it
>>>>>>>>> makes sense to handle common tasks, if any, and despatch others
>>>>>>>>> to specific device implementation.
>>>>>>>>> 
>>>>>>>>> This is totally unlike hypercalls which are, by definition, very
>>>>>>>>> architecture specific (the way they are triggered, the way
>>>>>>>>> parameter are passed from guest to host, what hypercalls arch needs...).
>>>>>>>> 
>>>>>>>> The ABI is architecture specific.  The API doesn't need to be,
>>>>>>>> any more than it does with syscalls (I consider the
>>>>>>>> architecture-specific definition of syscall numbers and similar
>>>>>>>> constants in Linux to be unfortunate, especially for tools such
>>>>>>>> as strace or QEMU's linux-user emulation).
>>>>>>>> 
>>>>>>> Unlike syscalls different arches have very different ideas what
>>>>>>> hypercalls they need to implement, so while with unified syscall
>>>>>>> space I can see how it may benefit (very) small number of tools, I
>>>>>>> do not see what advantage it will give us. The disadvantage is one
>>>>>>> more global name space to manage.
>>>>>>> 
>>>>>>>>>> Keeping it in a common place also makes it more visible to
>>>>>>>>>> people looking to add new hcalls, which could cut down on
>>>>>>>>>> reinventing the wheel.
>>>>>>>>> I do not want other arches to start using hypercalls in the way
>>>>>>>>> powerpc started to use them: separate device io space, so it is
>>>>>>>>> better to hide this as far away from common code as possible :)
>>>>>>>>> But on a more serious note hypercalls should be a last resort
>>>>>>>>> and added only when no other possibility exists, so people
>>>>>>>>> should not look what hcalls others implemented, so they can add
>>>>>>>>> them to their favorite arch, but they should have a problem at
>>>>>>>>> hand that they cannot solve without hcall, but at this point
>>>>>>>>> they will have pretty good idea what this hcall should do.
>>>>>>>> 
>>>>>>>> Why are hcalls such a bad thing?
>>>>>>>> 
>>>>>>> Because they often used to do non architectural things making OSes
>>>>>>> behave different from how they runs on real HW and real HW is what
>>>>>>> OSes are designed and tested for. Example: there once was a KVM
>>>>>>> (XEN have/had similar one) hypercall to accelerate MMU operation.
>>>>>>> One thing it allowed is to to flush tlb without doing IPI if vcpu
>>>>>>> is not running. Later optimization was added to Linux MMU code
>>>>>>> that _relies_ on those IPIs for synchronisation. Good that at that
>>>>>>> point those hypercalls were already deprecated on KVM (IIRC XEN
>>>>>>> was broke for some time in that regard). Which brings me to
>>>>>>> another point: they often get obsoleted by code improvement and HW
>>>>>>> advancement (happened to aforementioned MMU hypercalls), but they
>>>>>>> hard to deprecate if hypervisor supports live migration, without
>>>>>>> live migration it is less of a problem. Next point is that people
>>>>>>> often try to use them instead of emulate PV or real device just
>>>>>>> because they think it is easier, but it is often not so. Example:
>>>>>>> pvpanic device was initially proposed as hypercall, so lets say we
>>>>>>> would implement it as such. It would have been KVM specific,
>>>>>>> implementation would touch core guest KVM code and would have been
>>>>>>> Linux guest specific. Instead it was implemented as platform
>>>>>>> device with very small platform driver confined in drivers/
>>>>>>> directory, immediately usable by XEN and QEMU tcg in addition
>>>>>> 
>>>>>> This is actually a very good point. How do we support reboot and
>>>>>> shutdown for TCG guests? We surely don't want to expose TCG as KVM
>>>>>> hypervisor.
>>>>> 
>>>>> Hmm...so are you proposing that we abandon the current approach, and
>>>>> switch to a device-based mechanism for reboot/shutdown?
>>>> 
>>>> Reading Gleb's email it sounds like the more future proof approach,
>>>> yes. I'm not quite sure yet where we should plug this though.
>>> 
>>> What do you mean...where the paravirt device would go in the physical
>>> address map??
>> 
>> Right. Either we
>> 
>>  - let the guest decide (PCI)
>>  - let QEMU decide, but potentially break the SoC layout (SysBus)
>>  - let QEMU decide, but only for the virt machine so that we don't break anyone
>>    (PlatBus)
> 
> Can you please elaborate on the above two points?

If we emulate an MPC8544DS, we need to emulate an MPC8544DS. Any time we diverge from the layout of the original chip, things can break.

However, for our PV machine (-M ppce500 / e500plat) we don't care about real hardware layouts. We simply emulate a machine that is 100% described through the device tree. So guests that can't deal with the machine looking different from real hardware don't really matter anyways, since they'd already be broken.
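
A minimal sketch of what "100% described through the device tree" implies for a guest, assuming a simplified in-memory node table rather than a real flattened device tree parser. The struct, the "example,pv-reset" compatible string, and every address below are hypothetical and exist only for illustration. The idea is that the guest looks a device up by its compatible string and takes the address from the node, instead of hardcoding an SoC address:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a device tree node (hypothetical layout). */
struct dt_node {
    const char *compatible;   /* value of the "compatible" property */
    uint64_t    reg_base;     /* base address from the "reg" property */
    uint64_t    reg_size;     /* region size from the "reg" property */
};

/* Example tree for a PV machine; all addresses are made up. */
static const struct dt_node example_tree[] = {
    { "ns16550",          0xe0004500ULL, 0x100 },
    { "example,pv-reset", 0xf0000000ULL, 0x4   },
};

/* Return the first node matching 'compat', or NULL if there is none. */
static const struct dt_node *dt_find_compatible(const struct dt_node *tree,
                                                size_t count,
                                                const char *compat)
{
    assert(tree != NULL && compat != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(tree[i].compatible, compat) == 0)
            return &tree[i];
    }
    return NULL;
}
```

A guest written this way does not care where the device ends up in the address map, so the PV machine is free to place it anywhere. A guest that hardcodes the MPC8544DS layout would miss it, but such a guest is already broken on the PV machine.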


Alex

^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-17 15:52                           ` Alexander Graf
  (?)
@ 2013-07-17 15:59                           ` Bhushan Bharat-R65777
  2013-07-17 16:04                               ` Alexander Graf
  -1 siblings, 1 reply; 103+ messages in thread
From: Bhushan Bharat-R65777 @ 2013-07-17 15:59 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Yoder Stuart-B08248, Wood Scott-B07421, kvm, kvm-ppc, Gleb Natapov



> -----Original Message-----
> From: Alexander Graf [mailto:agraf@suse.de]
> Sent: Wednesday, July 17, 2013 9:22 PM
> To: Bhushan Bharat-R65777
> Cc: Yoder Stuart-B08248; Wood Scott-B07421; kvm@vger.kernel.org; kvm-
> ppc@vger.kernel.org; Gleb Natapov
> Subject: Re: [PATCH 3/5] booke: define reset and shutdown hcalls
> 
> 
> On 17.07.2013, at 17:47, Bhushan Bharat-R65777 wrote:
> 
> >
> >>>>>> On 17.07.2013, at 13:00, Gleb Natapov wrote:
> >>>>>>
> >>>>>>> On Tue, Jul 16, 2013 at 06:04:34PM -0500, Scott Wood wrote:
> >>>>>>>> On 07/16/2013 01:35:55 AM, Gleb Natapov wrote:
> >>>>>>>>> On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote:
> >>>>>>>>>> On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
> >>>>>>>>>>> There is no much sense to share hypercalls between architectures.
> >>>>>>>>>>> There
> >>>>>>>>>>> is zero probability x86 will implement those for instance
> >>>>>>>>>>
> >>>>>>>>>> This is similar to the question of whether to keep device API
> >>>>>>>>>> enumerations per-architecture...  It costs very little to
> >>>>>>>>>> keep it in a common place, and it's hard to go back in the
> >>>>>>>>>> other direction if we later realize there are things that should be
> >>>>>>>>>> shared.
> >>>>>>>>>>
> >>>>>>>>> This is different from device API since with device API all
> >>>>>>>>> arches have to create/destroy devices, so it make sense to put
> >>>>>>>>> device lifecycle management into the common code, and device
> >>>>>>>>> API has single entry point to the code - device fd ioctl -
> >>>>>>>>> where it makes sense to handle common tasks, if any, and
> >>>>>>>>> despatch others to specific device implementation.
> >>>>>>>>>
> >>>>>>>>> This is totally unlike hypercalls which are, by definition,
> >>>>>>>>> very architecture specific (the way they are triggered, the
> >>>>>>>>> way parameter are passed from guest to host, what hypercalls arch
> >>>>>>>>> needs...).
> >>>>>>>>
> >>>>>>>> The ABI is architecture specific.  The API doesn't need to be,
> >>>>>>>> any more than it does with syscalls (I consider the
> >>>>>>>> architecture-specific definition of syscall numbers and similar
> >>>>>>>> constants in Linux to be unfortunate, especially for tools such
> >>>>>>>> as strace or QEMU's linux-user emulation).
> >>>>>>>>
> >>>>>>> Unlike syscalls different arches have very different ideas what
> >>>>>>> hypercalls they need to implement, so while with unified syscall
> >>>>>>> space I can see how it may benefit (very) small number of tools,
> >>>>>>> I do not see what advantage it will give us. The disadvantage is
> >>>>>>> one more global name space to manage.
> >>>>>>>
> >>>>>>>>>> Keeping it in a common place also makes it more visible to
> >>>>>>>>>> people looking to add new hcalls, which could cut down on
> >>>>>>>>>> reinventing the wheel.
> >>>>>>>>> I do not want other arches to start using hypercalls in the
> >>>>>>>>> way powerpc started to use them: separate device io space, so
> >>>>>>>>> it is better to hide this as far away from common code as
> >>>>>>>>> possible :) But on a more serious note hypercalls should be a
> >>>>>>>>> last resort and added only when no other possibility exists,
> >>>>>>>>> so people should not look what hcalls others implemented, so
> >>>>>>>>> they can add them to their favorite arch, but they should have
> >>>>>>>>> a problem at hand that they cannot solve without hcall, but at
> >>>>>>>>> this point they will have pretty good idea what this hcall should do.
> >>>>>>>>
> >>>>>>>> Why are hcalls such a bad thing?
> >>>>>>>>
> >>>>>>> Because they often used to do non architectural things making
> >>>>>>> OSes behave different from how they runs on real HW and real HW
> >>>>>>> is what OSes are designed and tested for. Example: there once
> >>>>>>> was a KVM (XEN have/had similar one) hypercall to accelerate MMU
> >>>>>>> operation.
> >>>>>>> One thing it allowed is to to flush tlb without doing IPI if
> >>>>>>> vcpu is not running. Later optimization was added to Linux MMU
> >>>>>>> code that _relies_ on those IPIs for synchronisation. Good that
> >>>>>>> at that point those hypercalls were already deprecated on KVM
> >>>>>>> (IIRC XEN was broke for some time in that regard). Which brings
> >>>>>>> me to another point: they often get obsoleted by code
> >>>>>>> improvement and HW advancement (happened to aforementioned MMU
> >>>>>>> hypercalls), but they hard to deprecate if hypervisor supports
> >>>>>>> live migration, without live migration it is less of a problem.
> >>>>>>> Next point is that people often try to use them instead of
> >>>>>>> emulate PV or real device just because they think it is easier, but it
> >>>>>>> is often not so. Example:
> >>>>>>> pvpanic device was initially proposed as hypercall, so lets say
> >>>>>>> we would implement it as such. It would have been KVM specific,
> >>>>>>> implementation would touch core guest KVM code and would have
> >>>>>>> been Linux guest specific. Instead it was implemented as
> >>>>>>> platform device with very small platform driver confined in
> >>>>>>> drivers/ directory, immediately usable by XEN and QEMU tcg in
> >>>>>>> addition
> >>>>>>
> >>>>>> This is actually a very good point. How do we support reboot and
> >>>>>> shutdown for TCG guests? We surely don't want to expose TCG as
> >>>>>> KVM hypervisor.
> >>>>>
> >>>>> Hmm...so are you proposing that we abandon the current approach,
> >>>>> and switch to a device-based mechanism for reboot/shutdown?
> >>>>
> >>>> Reading Gleb's email it sounds like the more future proof approach,
> >>>> yes. I'm not quite sure yet where we should plug this though.
> >>>
> >>> What do you mean...where the paravirt device would go in the
> >>> physical address map??
> >>
> >> Right. Either we
> >>
> >>  - let the guest decide (PCI)
> >>  - let QEMU decide, but potentially break the SoC layout (SysBus)
> >>  - let QEMU decide, but only for the virt machine so that we don't
> >>    break anyone (PlatBus)
> >
> > > Can you please elaborate on the above two points?
> 
> If we emulate an MPC8544DS, we need to emulate an MPC8544DS. Any time we diverge
> from the layout of the original chip, things can break.
> 
> However, for our PV machine (-M ppce500 / e500plat) we don't care about real
> hardware layouts. We simply emulate a machine that is 100% described through the
> device tree. So guests that can't deal with the machine looking different from
> real hardware don't really matter anyways, since they'd already be broken.
> 

Ah, so we can choose any address range in the CCSR space of a PV machine (-M ppce500 / e500plat). What about the MPC8544DS machine?

So what is the preferred way: a virtio reset/shutdown device, or the approach mentioned above?
 
Thanks
-Bharat

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-17 15:59                           ` Bhushan Bharat-R65777
@ 2013-07-17 16:04                               ` Alexander Graf
  0 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-17 16:04 UTC (permalink / raw)
  To: Bhushan Bharat-R65777
  Cc: Yoder Stuart-B08248, Wood Scott-B07421, kvm, kvm-ppc, Gleb Natapov


On 17.07.2013, at 17:59, Bhushan Bharat-R65777 wrote:

> 
> 
>> -----Original Message-----
>> From: Alexander Graf [mailto:agraf@suse.de]
>> Sent: Wednesday, July 17, 2013 9:22 PM
>> To: Bhushan Bharat-R65777
>> Cc: Yoder Stuart-B08248; Wood Scott-B07421; kvm@vger.kernel.org; kvm-
>> ppc@vger.kernel.org; Gleb Natapov
>> Subject: Re: [PATCH 3/5] booke: define reset and shutdown hcalls
>> 
>> 
>> On 17.07.2013, at 17:47, Bhushan Bharat-R65777 wrote:
>> 
>>> 
>>>>>>>> On 17.07.2013, at 13:00, Gleb Natapov wrote:
>>>>>>>> 
>>>>>>>>> On Tue, Jul 16, 2013 at 06:04:34PM -0500, Scott Wood wrote:
>>>>>>>>>> On 07/16/2013 01:35:55 AM, Gleb Natapov wrote:
>>>>>>>>>>> On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote:
>>>>>>>>>>>> On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
>>>>>>>>>>>>> There is no much sense to share hypercalls between architectures.
>>>>>>>>>>>>> There
>>>>>>>>>>>>> is zero probability x86 will implement those for instance
>>>>>>>>>>>> 
>>>>>>>>>>>> This is similar to the question of whether to keep device API
>>>>>>>>>>>> enumerations per-architecture...  It costs very little to
>>>>>>>>>>>> keep it in a common place, and it's hard to go back in the
>>>>>>>>>>>> other direction if we later realize there are things that should be
>>>>>>>>>>>> shared.
>>>>>>>>>>>> 
>>>>>>>>>>> This is different from device API since with device API all
>>>>>>>>>>> arches have to create/destroy devices, so it make sense to put
>>>>>>>>>>> device lifecycle management into the common code, and device
>>>>>>>>>>> API has single entry point to the code - device fd ioctl -
>>>>>>>>>>> where it makes sense to handle common tasks, if any, and
>>>>>>>>>>> despatch others to specific device implementation.
>>>>>>>>>>> 
>>>>>>>>>>> This is totally unlike hypercalls which are, by definition,
>>>>>>>>>>> very architecture specific (the way they are triggered, the
>>>>>>>>>>> way parameter are passed from guest to host, what hypercalls arch
>>>>>>>>>>> needs...).
>>>>>>>>>> 
>>>>>>>>>> The ABI is architecture specific.  The API doesn't need to be,
>>>>>>>>>> any more than it does with syscalls (I consider the
>>>>>>>>>> architecture-specific definition of syscall numbers and similar
>>>>>>>>>> constants in Linux to be unfortunate, especially for tools such
>>>>>>>>>> as strace or QEMU's linux-user emulation).
>>>>>>>>>> 
>>>>>>>>> Unlike syscalls different arches have very different ideas what
>>>>>>>>> hypercalls they need to implement, so while with unified syscall
>>>>>>>>> space I can see how it may benefit (very) small number of tools,
>>>>>>>>> I do not see what advantage it will give us. The disadvantage is
>>>>>>>>> one more global name space to manage.
>>>>>>>>> 
>>>>>>>>>>>> Keeping it in a common place also makes it more visible to
>>>>>>>>>>>> people looking to add new hcalls, which could cut down on
>>>>>>>>>>>> reinventing the wheel.
>>>>>>>>>>> I do not want other arches to start using hypercalls in the
>>>>>>>>>>> way powerpc started to use them: separate device io space, so
>>>>>>>>>>> it is better to hide this as far away from common code as
>>>>>>>>>>> possible :) But on a more serious note hypercalls should be a
>>>>>>>>>>> last resort and added only when no other possibility exists,
>>>>>>>>>>> so people should not look what hcalls others implemented, so
>>>>>>>>>>> they can add them to their favorite arch, but they should have
>>>>>>>>>>> a problem at hand that they cannot solve without hcall, but at
>>>>>>>>>>> this point they will have pretty good idea what this hcall should do.
>>>>>>>>>> 
>>>>>>>>>> Why are hcalls such a bad thing?
>>>>>>>>>> 
>>>>>>>>> Because they often used to do non architectural things making
>>>>>>>>> OSes behave different from how they runs on real HW and real HW
>>>>>>>>> is what OSes are designed and tested for. Example: there once
>>>>>>>>> was a KVM (XEN have/had similar one) hypercall to accelerate MMU
>>>>>>>>> operation.
>>>>>>>>> One thing it allowed is to to flush tlb without doing IPI if
>>>>>>>>> vcpu is not running. Later optimization was added to Linux MMU
>>>>>>>>> code that _relies_ on those IPIs for synchronisation. Good that
>>>>>>>>> at that point those hypercalls were already deprecated on KVM
>>>>>>>>> (IIRC XEN was broke for some time in that regard). Which brings
>>>>>>>>> me to another point: they often get obsoleted by code
>>>>>>>>> improvement and HW advancement (happened to aforementioned MMU
>>>>>>>>> hypercalls), but they hard to deprecate if hypervisor supports
>>>>>>>>> live migration, without live migration it is less of a problem.
>>>>>>>>> Next point is that people often try to use them instead of
>>>>>>>>> emulate PV or real device just because they think it is easier, but it
>>>>>>>>> is often not so. Example:
>>>>>>>>> pvpanic device was initially proposed as hypercall, so lets say
>>>>>>>>> we would implement it as such. It would have been KVM specific,
>>>>>>>>> implementation would touch core guest KVM code and would have
>>>>>>>>> been Linux guest specific. Instead it was implemented as
>>>>>>>>> platform device with very small platform driver confined in
>>>>>>>>> drivers/ directory, immediately usable by XEN and QEMU tcg in
>>>>>>>>> addition
>>>>>>>> 
>>>>>>>> This is actually a very good point. How do we support reboot and
>>>>>>>> shutdown for TCG guests? We surely don't want to expose TCG as
>>>>>>>> KVM hypervisor.
>>>>>>> 
>>>>>>> Hmm...so are you proposing that we abandon the current approach,
>>>>>>> and switch to a device-based mechanism for reboot/shutdown?
>>>>>> 
>>>>>> Reading Gleb's email it sounds like the more future proof approach,
>>>>>> yes. I'm not quite sure yet where we should plug this though.
>>>>> 
>>>>> What do you mean...where the paravirt device would go in the
>>>>> physical address map??
>>>> 
>>>> Right. Either we
>>>> 
>>>> - let the guest decide (PCI)
>>>> - let QEMU decide, but potentially break the SoC layout (SysBus)
>>>> - let QEMU decide, but only for the virt machine so that we don't
>>>>   break anyone (PlatBus)
>>> 
>>> Can you please elaborate on the above two points?
>> 
>> If we emulate an MPC8544DS, we need to emulate an MPC8544DS. Any time we diverge
>> from the layout of the original chip, things can break.
>> 
>> However, for our PV machine (-M ppce500 / e500plat) we don't care about real
>> hardware layouts. We simply emulate a machine that is 100% described through the
>> device tree. So guests that can't deal with the machine looking different from
>> real hardware don't really matter anyways, since they'd already be broken.
>> 
> 
> Ah, so we can choose any address range in the CCSR space of a PV machine (-M ppce500 / e500plat).

No, we don't put it in CCSR space. It'd just be orthogonal to CCSR.

> What about the MPC8544DS machine?

I guess we'll have to live with GUTS there.

> So what is the preferred way: a virtio reset/shutdown device, or the approach mentioned above?

A virtio device would clutter our PCI space which we're already pretty tight on. So I'd personally prefer the above mentioned.
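
For illustration only, a rough sketch of how small the device-model side of such a non-PCI reset/shutdown device could be, assuming a single 32-bit command register. The register offset, command values, and all names are hypothetical; nothing here is taken from QEMU or from any existing device:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register layout and command encoding. */
#define PV_RESET_REG_CMD      0x0u
#define PV_RESET_CMD_RESET    0x1u
#define PV_RESET_CMD_SHUTDOWN 0x2u

enum pv_action { PV_ACTION_NONE, PV_ACTION_RESET, PV_ACTION_SHUTDOWN };

struct pv_reset_dev {
    enum pv_action pending;   /* last action requested by the guest */
};

/* Handle a guest store into the device's MMIO window. */
static void pv_reset_write(struct pv_reset_dev *dev, uint64_t offset,
                           uint32_t value)
{
    assert(dev != NULL);
    if (offset != PV_RESET_REG_CMD)
        return;               /* only the command register is modelled */
    switch (value) {
    case PV_RESET_CMD_RESET:
        dev->pending = PV_ACTION_RESET;
        break;
    case PV_RESET_CMD_SHUTDOWN:
        dev->pending = PV_ACTION_SHUTDOWN;
        break;
    default:
        break;                /* unknown commands are ignored */
    }
}
```

The guest side is then a single store to the address found in the device tree, and the same device works under KVM and TCG alike because nothing in it depends on a hypercall.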


Alex

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
@ 2013-07-17 16:04                               ` Alexander Graf
  0 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-17 16:04 UTC (permalink / raw)
  To: Bhushan Bharat-R65777
  Cc: Yoder Stuart-B08248, Wood Scott-B07421, kvm, kvm-ppc, Gleb Natapov


On 17.07.2013, at 17:59, Bhushan Bharat-R65777 wrote:

> 
> 
>> -----Original Message-----
>> From: Alexander Graf [mailto:agraf@suse.de]
>> Sent: Wednesday, July 17, 2013 9:22 PM
>> To: Bhushan Bharat-R65777
>> Cc: Yoder Stuart-B08248; Wood Scott-B07421; kvm@vger.kernel.org; kvm-
>> ppc@vger.kernel.org; Gleb Natapov
>> Subject: Re: [PATCH 3/5] booke: define reset and shutdown hcalls
>> 
>> 
>> On 17.07.2013, at 17:47, Bhushan Bharat-R65777 wrote:
>> 
>>> 
>>>>>>>> On 17.07.2013, at 13:00, Gleb Natapov wrote:
>>>>>>>> 
>>>>>>>>> On Tue, Jul 16, 2013 at 06:04:34PM -0500, Scott Wood wrote:
>>>>>>>>>> On 07/16/2013 01:35:55 AM, Gleb Natapov wrote:
>>>>>>>>>>> On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote:
>>>>>>>>>>>> On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
>>>>>>>>>>>>> There is no much sense to share hypercalls between architectures.
>>>>>>>>>>>>> There
>>>>>>>>>>>>> is zero probability x86 will implement those for instance
>>>>>>>>>>>> 
>>>>>>>>>>>> This is similar to the question of whether to keep device API
>>>>>>>>>>>> enumerations per-architecture...  It costs very little to
>>>>>>>>>>>> keep it in a common place, and it's hard to go back in the
>>>>>>>>>>>> other direction if we later realize there are things that should be
>> shared.
>>>>>>>>>>>> 
>>>>>>>>>>> This is different from device API since with device API all
>>>>>>>>>>> arches have to create/destroy devices, so it make sense to put
>>>>>>>>>>> device lifecycle management into the common code, and device
>>>>>>>>>>> API has single entry point to the code - device fd ioctl -
>>>>>>>>>>> where it makes sense to handle common tasks, if any, and
>>>>>>>>>>> despatch others to specific device implementation.
>>>>>>>>>>> 
>>>>>>>>>>> This is totally unlike hypercalls which are, by definition,
>>>>>>>>>>> very architecture specific (the way they are triggered, the
>>>>>>>>>>> way parameter are passed from guest to host, what hypercalls arch
>> needs...).
>>>>>>>>>> 
>>>>>>>>>> The ABI is architecture specific.  The API doesn't need to be,
>>>>>>>>>> any more than it does with syscalls (I consider the
>>>>>>>>>> architecture-specific definition of syscall numbers and similar
>>>>>>>>>> constants in Linux to be unfortunate, especially for tools such
>>>>>>>>>> as strace or QEMU's linux-user emulation).
>>>>>>>>>> 
>>>>>>>>> Unlike syscalls different arches have very different ideas what
>>>>>>>>> hypercalls they need to implement, so while with unified syscall
>>>>>>>>> space I can see how it may benefit (very) small number of tools,
>>>>>>>>> I do not see what advantage it will give us. The disadvantage is
>>>>>>>>> one more global name space to manage.
>>>>>>>>> 
>>>>>>>>>>>> Keeping it in a common place also makes it more visible to
>>>>>>>>>>>> people looking to add new hcalls, which could cut down on
>>>>>>>>>>>> reinventing the wheel.
>>>>>>>>>>> I do not want other arches to start using hypercalls in the
>>>>>>>>>>> way powerpc started to use them: separate device io space, so
>>>>>>>>>>> it is better to hide this as far away from common code as
>>>>>>>>>>> possible :) But on a more serious note hypercalls should be a
>>>>>>>>>>> last resort and added only when no other possibility exists,
>>>>>>>>>>> so people should not look what hcalls others implemented, so
>>>>>>>>>>> they can add them to their favorite arch, but they should have
>>>>>>>>>>> a problem at hand that they cannot solve without hcall, but at
>>>>>>>>>>> this point they will have pretty good idea what this hcall should do.
>>>>>>>>>> 
>>>>>>>>>> Why are hcalls such a bad thing?
>>>>>>>>>> 
>>>>>>>>> Because they often used to do non architectural things making
>>>>>>>>> OSes behave different from how they runs on real HW and real HW
>>>>>>>>> is what OSes are designed and tested for. Example: there once
>>>>>>>>> was a KVM (XEN have/had similar one) hypercall to accelerate MMU
>>>>>>>>> operation.
>>>>>>>>> One thing it allowed is to flush tlb without doing IPI if
>>>>>>>>> vcpu is not running. Later optimization was added to Linux MMU
>>>>>>>>> code that _relies_ on those IPIs for synchronisation. Good that
>>>>>>>>> at that point those hypercalls were already deprecated on KVM
>>>>>>>>> (IIRC XEN was broke for some time in that regard). Which brings
>>>>>>>>> me to another point: they often get obsoleted by code
>>>>>>>>> improvement and HW advancement (happened to aforementioned MMU
>>>>>>>>> hypercalls), but they hard to deprecate if hypervisor supports
>>>>>>>>> live migration, without live migration it is less of a problem.
>>>>>>>>> Next point is that people often try to use them instead of
>>>>>>>>> emulate PV or real device just because they think it is easier, but it
>>>>>>>>> is often not so. Example:
>>>>>>>>> pvpanic device was initially proposed as hypercall, so lets say
>>>>>>>>> we would implement it as such. It would have been KVM specific,
>>>>>>>>> implementation would touch core guest KVM code and would have
>>>>>>>>> been Linux guest specific. Instead it was implemented as
>>>>>>>>> platform device with very small platform driver confined in
>>>>>>>>> drivers/ directory, immediately usable by XEN and QEMU tcg in
>>>>>>>>> addition
>>>>>>>> 
>>>>>>>> This is actually a very good point. How do we support reboot and
>>>>>>>> shutdown for TCG guests? We surely don't want to expose TCG as
>>>>>>>> KVM hypervisor.
>>>>>>> 
>>>>>>> Hmm...so are you proposing that we abandon the current approach,
>>>>>>> and switch to a device-based mechanism for reboot/shutdown?
>>>>>> 
>>>>>> Reading Gleb's email it sounds like the more future proof approach,
>>>>>> yes. I'm not quite sure yet where we should plug this though.
>>>>> 
>>>>> What do you mean...where the paravirt device would go in the
>>>>> physical address map??
>>>> 
>>>> Right. Either we
>>>> 
>>>> - let the guest decide (PCI)
>>>> - let QEMU decide, but potentially break the SoC layout (SysBus)
>>>> - let QEMU decide, but only for the virt machine so that we don't
>>>>   break anyone (PlatBus)
>>> 
>>> Can you please elaborate above two points ?
>> 
>> If we emulate an MPC8544DS, we need to emulate an MPC8544DS. Any time we diverge
>> from the layout of the original chip, things can break.
>> 
>> However, for our PV machine (-M ppce500 / e500plat) we don't care about real
>> hardware layouts. We simply emulate a machine that is 100% described through the
>> device tree. So guests that can't deal with the machine looking different from
>> real hardware don't really matter anyways, since they'd already be broken.
>> 
> 
> Ah, so we can choose any address range in ccsr space of a PV machine (-M ppce500 / e500plat).

No, we don't put it in CCSR space. It'd just be orthogonal to CCSR.

> What about MPC8544DS machine?.

I guess we'll have to live with GUTS there.

> So what is preferred way, vitio-reset/shutdown device or the above mentioned ?

A virtio device would clutter our PCI space which we're already pretty tight on. So I'd personally prefer the above mentioned.


Alex


^ permalink raw reply	[flat|nested] 103+ messages in thread

* RE: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-17 16:04                               ` Alexander Graf
  (?)
@ 2013-07-17 16:21                               ` Bhushan Bharat-R65777
  2013-07-17 16:23                                   ` Alexander Graf
  -1 siblings, 1 reply; 103+ messages in thread
From: Bhushan Bharat-R65777 @ 2013-07-17 16:21 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Yoder Stuart-B08248, Wood Scott-B07421, kvm, kvm-ppc, Gleb Natapov

> >>>>>>>> On 17.07.2013, at 13:00, Gleb Natapov wrote:
> >>>>>>>>
> >>>>>>>>> On Tue, Jul 16, 2013 at 06:04:34PM -0500, Scott Wood wrote:
> >>>>>>>>>> On 07/16/2013 01:35:55 AM, Gleb Natapov wrote:
> >>>>>>>>>>> On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote:
> >>>>>>>>>>>> On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
> >>>>>>>>>>>>> There is no much sense to share hypercalls between architectures.
> >>>>>>>>>>>>> There
> >>>>>>>>>>>>> is zero probability x86 will implement those for instance
> >>>>>>>>>>>>
> >>>>>>>>>>>> This is similar to the question of whether to keep device
> >>>>>>>>>>>> API enumerations per-architecture...  It costs very little
> >>>>>>>>>>>> to keep it in a common place, and it's hard to go back in
> >>>>>>>>>>>> the other direction if we later realize there are things
> >>>>>>>>>>>> that should be shared.
> >>>>>>>>>>>>
> >>>>>>>>>>> This is different from device API since with device API all
> >>>>>>>>>>> arches have to create/destroy devices, so it make sense to
> >>>>>>>>>>> put device lifecycle management into the common code, and
> >>>>>>>>>>> device API has single entry point to the code - device fd
> >>>>>>>>>>> ioctl - where it makes sense to handle common tasks, if any,
> >>>>>>>>>>> and despatch others to specific device implementation.
> >>>>>>>>>>>
> >>>>>>>>>>> This is totally unlike hypercalls which are, by definition,
> >>>>>>>>>>> very architecture specific (the way they are triggered, the
> >>>>>>>>>>> way parameter are passed from guest to host, what hypercalls
> >>>>>>>>>>> arch needs...).
> >>>>>>>>>>
> >>>>>>>>>> The ABI is architecture specific.  The API doesn't need to
> >>>>>>>>>> be, any more than it does with syscalls (I consider the
> >>>>>>>>>> architecture-specific definition of syscall numbers and
> >>>>>>>>>> similar constants in Linux to be unfortunate, especially for
> >>>>>>>>>> tools such as strace or QEMU's linux-user emulation).
> >>>>>>>>>>
> >>>>>>>>> Unlike syscalls different arches have very different ideas
> >>>>>>>>> what hypercalls they need to implement, so while with unified
> >>>>>>>>> syscall space I can see how it may benefit (very) small number
> >>>>>>>>> of tools, I do not see what advantage it will give us. The
> >>>>>>>>> disadvantage is one more global name space to manage.
> >>>>>>>>>
> >>>>>>>>>>>> Keeping it in a common place also makes it more visible to
> >>>>>>>>>>>> people looking to add new hcalls, which could cut down on
> >>>>>>>>>>>> reinventing the wheel.
> >>>>>>>>>>> I do not want other arches to start using hypercalls in the
> >>>>>>>>>>> way powerpc started to use them: separate device io space,
> >>>>>>>>>>> so it is better to hide this as far away from common code as
> >>>>>>>>>>> possible :) But on a more serious note hypercalls should be
> >>>>>>>>>>> a last resort and added only when no other possibility
> >>>>>>>>>>> exists, so people should not look what hcalls others
> >>>>>>>>>>> implemented, so they can add them to their favorite arch,
> >>>>>>>>>>> but they should have a problem at hand that they cannot
> >>>>>>>>>>> solve without hcall, but at this point they will have pretty good
> >>>>>>>>>>> idea what this hcall should do.
> >>>>>>>>>>
> >>>>>>>>>> Why are hcalls such a bad thing?
> >>>>>>>>>>
> >>>>>>>>> Because they often used to do non architectural things making
> >>>>>>>>> OSes behave different from how they runs on real HW and real
> >>>>>>>>> HW is what OSes are designed and tested for. Example: there
> >>>>>>>>> once was a KVM (XEN have/had similar one) hypercall to
> >>>>>>>>> accelerate MMU operation.
> >>>>>>>>> One thing it allowed is to flush tlb without doing IPI if
> >>>>>>>>> vcpu is not running. Later optimization was added to Linux MMU
> >>>>>>>>> code that _relies_ on those IPIs for synchronisation. Good
> >>>>>>>>> that at that point those hypercalls were already deprecated on
> >>>>>>>>> KVM (IIRC XEN was broke for some time in that regard). Which
> >>>>>>>>> brings me to another point: they often get obsoleted by code
> >>>>>>>>> improvement and HW advancement (happened to aforementioned MMU
> >>>>>>>>> hypercalls), but they hard to deprecate if hypervisor supports
> >>>>>>>>> live migration, without live migration it is less of a problem.
> >>>>>>>>> Next point is that people often try to use them instead of
> >>>>>>>>> emulate PV or real device just because they think it is
> >>>>>>>>> easier, but it is often not so. Example:
> >>>>>>>>> pvpanic device was initially proposed as hypercall, so lets
> >>>>>>>>> say we would implement it as such. It would have been KVM
> >>>>>>>>> specific, implementation would touch core guest KVM code and
> >>>>>>>>> would have been Linux guest specific. Instead it was
> >>>>>>>>> implemented as platform device with very small platform driver
> >>>>>>>>> confined in drivers/ directory, immediately usable by XEN and
> >>>>>>>>> QEMU tcg in addition
> >>>>>>>>
> >>>>>>>> This is actually a very good point. How do we support reboot
> >>>>>>>> and shutdown for TCG guests? We surely don't want to expose TCG
> >>>>>>>> as KVM hypervisor.
> >>>>>>>
> >>>>>>> Hmm...so are you proposing that we abandon the current approach,
> >>>>>>> and switch to a device-based mechanism for reboot/shutdown?
> >>>>>>
> >>>>>> Reading Gleb's email it sounds like the more future proof
> >>>>>> approach, yes. I'm not quite sure yet where we should plug this though.
> >>>>>
> >>>>> What do you mean...where the paravirt device would go in the
> >>>>> physical address map??
> >>>>
> >>>> Right. Either we
> >>>>
> >>>> - let the guest decide (PCI)
> >>>> - let QEMU decide, but potentially break the SoC layout (SysBus)
> >>>> - let QEMU decide, but only for the virt machine so that we don't
> >>>> break anyone
> >>>> (PlatBus)
> >>>
> >>> Can you please elaborate above two points ?
> >>
> >> If we emulate an MPC8544DS, we need to emulate an MPC8544DS. Any time
> >> we diverge from the layout of the original chip, things can break.
> >>
> >> However, for our PV machine (-M ppce500 / e500plat) we don't care
> >> about real hardware layouts. We simply emulate a machine that is 100%
> >> described through the device tree. So guests that can't deal with the
> >> machine looking different from real hardware don't really matter anyways,
> >> since they'd already be broken.
> >>
> >
> > Ah, so we can choose any address range in ccsr space of a PV machine (-M
> > ppce500 / e500plat).
> 
> No, we don't put it in CCSR space. It'd just be orthogonal to CCSR.

All devices are represented in the guest device tree, so how will we represent this device there?

-Bharat

> 
> > What about MPC8544DS machine?.
> 
> I guess we'll have to live with GUTS there.
> 
> > So what is preferred way, vitio-reset/shutdown device or the above mentioned ?
> 
> A virtio device would clutter our PCI space which we're already pretty tight on.
> So I'd personally prefer the above mentioned.
> 
> 
> Alex
> 

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-17 16:21                               ` Bhushan Bharat-R65777
@ 2013-07-17 16:23                                   ` Alexander Graf
  0 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-17 16:23 UTC (permalink / raw)
  To: Bhushan Bharat-R65777
  Cc: Yoder Stuart-B08248, Wood Scott-B07421, kvm, kvm-ppc, Gleb Natapov


On 17.07.2013, at 18:21, Bhushan Bharat-R65777 wrote:

>>>>>>>>>> On 17.07.2013, at 13:00, Gleb Natapov wrote:
>>>>>>>>>> 
>>>>>>>>>>> On Tue, Jul 16, 2013 at 06:04:34PM -0500, Scott Wood wrote:
>>>>>>>>>>>> On 07/16/2013 01:35:55 AM, Gleb Natapov wrote:
>>>>>>>>>>>>> On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote:
>>>>>>>>>>>>>> On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
>>>>>>>>>>>>>>> There is no much sense to share hypercalls between architectures.
>>>>>>>>>>>>>>> There
>>>>>>>>>>>>>>> is zero probability x86 will implement those for instance
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> This is similar to the question of whether to keep device
>>>>>>>>>>>>>> API enumerations per-architecture...  It costs very little
>>>>>>>>>>>>>> to keep it in a common place, and it's hard to go back in
>>>>>>>>>>>>>> the other direction if we later realize there are things
>>>>>>>>>>>>>> that should be shared.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> This is different from device API since with device API all
>>>>>>>>>>>>> arches have to create/destroy devices, so it make sense to
>>>>>>>>>>>>> put device lifecycle management into the common code, and
>>>>>>>>>>>>> device API has single entry point to the code - device fd
>>>>>>>>>>>>> ioctl - where it makes sense to handle common tasks, if any,
>>>>>>>>>>>>> and despatch others to specific device implementation.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> This is totally unlike hypercalls which are, by definition,
>>>>>>>>>>>>> very architecture specific (the way they are triggered, the
>>>>>>>>>>>>> way parameter are passed from guest to host, what hypercalls
>>>>>>>>>>>>> arch needs...).
>>>>>>>>>>>> 
>>>>>>>>>>>> The ABI is architecture specific.  The API doesn't need to
>>>>>>>>>>>> be, any more than it does with syscalls (I consider the
>>>>>>>>>>>> architecture-specific definition of syscall numbers and
>>>>>>>>>>>> similar constants in Linux to be unfortunate, especially for
>>>>>>>>>>>> tools such as strace or QEMU's linux-user emulation).
>>>>>>>>>>>> 
>>>>>>>>>>> Unlike syscalls different arches have very different ideas
>>>>>>>>>>> what hypercalls they need to implement, so while with unified
>>>>>>>>>>> syscall space I can see how it may benefit (very) small number
>>>>>>>>>>> of tools, I do not see what advantage it will give us. The
>>>>>>>>>>> disadvantage is one more global name space to manage.
>>>>>>>>>>> 
>>>>>>>>>>>>>> Keeping it in a common place also makes it more visible to
>>>>>>>>>>>>>> people looking to add new hcalls, which could cut down on
>>>>>>>>>>>>>> reinventing the wheel.
>>>>>>>>>>>>> I do not want other arches to start using hypercalls in the
>>>>>>>>>>>>> way powerpc started to use them: separate device io space,
>>>>>>>>>>>>> so it is better to hide this as far away from common code as
>>>>>>>>>>>>> possible :) But on a more serious note hypercalls should be
>>>>>>>>>>>>> a last resort and added only when no other possibility
>>>>>>>>>>>>> exists, so people should not look what hcalls others
>>>>>>>>>>>>> implemented, so they can add them to their favorite arch,
>>>>>>>>>>>>> but they should have a problem at hand that they cannot
>>>>>>>>>>>>> solve without hcall, but at this point they will have pretty good
>>>>>>>>>>>>> idea what this hcall should do.
>>>>>>>>>>>> 
>>>>>>>>>>>> Why are hcalls such a bad thing?
>>>>>>>>>>>> 
>>>>>>>>>>> Because they often used to do non architectural things making
>>>>>>>>>>> OSes behave different from how they runs on real HW and real
>>>>>>>>>>> HW is what OSes are designed and tested for. Example: there
>>>>>>>>>>> once was a KVM (XEN have/had similar one) hypercall to
>>>>>>>>>>> accelerate MMU operation.
>>>>>>>>>>> One thing it allowed is to flush tlb without doing IPI if
>>>>>>>>>>> vcpu is not running. Later optimization was added to Linux MMU
>>>>>>>>>>> code that _relies_ on those IPIs for synchronisation. Good
>>>>>>>>>>> that at that point those hypercalls were already deprecated on
>>>>>>>>>>> KVM (IIRC XEN was broke for some time in that regard). Which
>>>>>>>>>>> brings me to another point: they often get obsoleted by code
>>>>>>>>>>> improvement and HW advancement (happened to aforementioned MMU
>>>>>>>>>>> hypercalls), but they hard to deprecate if hypervisor supports
>>>>>>>>>>> live migration, without live migration it is less of a problem.
>>>>>>>>>>> Next point is that people often try to use them instead of
>>>>>>>>>>> emulate PV or real device just because they think it is
>>>>>>>>>>> easier, but it is often not so. Example:
>>>>>>>>>>> pvpanic device was initially proposed as hypercall, so lets
>>>>>>>>>>> say we would implement it as such. It would have been KVM
>>>>>>>>>>> specific, implementation would touch core guest KVM code and
>>>>>>>>>>> would have been Linux guest specific. Instead it was
>>>>>>>>>>> implemented as platform device with very small platform driver
>>>>>>>>>>> confined in drivers/ directory, immediately usable by XEN and
>>>>>>>>>>> QEMU tcg in addition
>>>>>>>>>> 
>>>>>>>>>> This is actually a very good point. How do we support reboot
>>>>>>>>>> and shutdown for TCG guests? We surely don't want to expose TCG
>>>>>>>>>> as KVM hypervisor.
>>>>>>>>> 
>>>>>>>>> Hmm...so are you proposing that we abandon the current approach,
>>>>>>>>> and switch to a device-based mechanism for reboot/shutdown?
>>>>>>>> 
>>>>>>>> Reading Gleb's email it sounds like the more future proof
>>>>>>>> approach, yes. I'm not quite sure yet where we should plug this though.
>>>>>>> 
>>>>>>> What do you mean...where the paravirt device would go in the
>>>>>>> physical address map??
>>>>>> 
>>>>>> Right. Either we
>>>>>> 
>>>>>> - let the guest decide (PCI)
>>>>>> - let QEMU decide, but potentially break the SoC layout (SysBus)
>>>>>> - let QEMU decide, but only for the virt machine so that we don't
>>>>>> break anyone
>>>>>> (PlatBus)
>>>>> 
>>>>> Can you please elaborate above two points ?
>>>> 
>>>> If we emulate an MPC8544DS, we need to emulate an MPC8544DS. Any time
>>>> we diverge from the layout of the original chip, things can break.
>>>> 
>>>> However, for our PV machine (-M ppce500 / e500plat) we don't care
>>>> about real hardware layouts. We simply emulate a machine that is 100%
>>>> described through the device tree. So guests that can't deal with the
>>>> machine looking different from real hardware don't really matter anyways,
>>>> since they'd already be broken.
>>>> 
>>> 
>>> Ah, so we can choose any address range in ccsr space of a PV machine (-M
>>> ppce500 / e500plat).
>> 
>> No, we don't put it in CCSR space. It'd just be orthogonal to CCSR.
> 
> All devices are represented in guest device tree, so how we will represent this device in guest Device Tree?

Not inside of the CCSR node :).
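
To make "not inside the CCSR node" concrete in terms of the address map, here is a minimal C sketch: a node that sits next to the CCSR node rather than under it would carry a reg range that does not intersect the CCSR window. The CCSR base and size below are assumed values for illustration, and the device address is purely hypothetical; nothing here was proposed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed CCSR window for illustration only (1 MiB at 0xE0000000). */
#define CCSR_BASE 0xE0000000ULL
#define CCSR_SIZE 0x00100000ULL

/* True if the half-open ranges [a_base, a_base + a_size) and
 * [b_base, b_base + b_size) share at least one byte. */
static bool ranges_overlap(uint64_t a_base, uint64_t a_size,
                           uint64_t b_base, uint64_t b_size)
{
    assert(a_size > 0 && b_size > 0);
    return a_base < b_base + b_size && b_base < a_base + a_size;
}

/* True if a device range lies entirely outside the CCSR window,
 * i.e. it could be described by a node that is not a child of CCSR. */
static bool outside_ccsr(uint64_t dev_base, uint64_t dev_size)
{
    return !ranges_overlap(CCSR_BASE, CCSR_SIZE, dev_base, dev_size);
}
```

With these assumed numbers, a hypothetical 4 KiB device at 0xF0000000 would qualify, while one at 0xE00F0000 would fall inside the CCSR window.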


Alex


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-17 16:04                               ` Alexander Graf
@ 2013-07-17 16:59                                 ` Scott Wood
  -1 siblings, 0 replies; 103+ messages in thread
From: Scott Wood @ 2013-07-17 16:59 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Bhushan Bharat-R65777, Yoder Stuart-B08248, Wood Scott-B07421,
	kvm, kvm-ppc, Gleb Natapov

On 07/17/2013 11:04:08 AM, Alexander Graf wrote:
> 
> On 17.07.2013, at 17:59, Bhushan Bharat-R65777 wrote:
> 
> > Ah, so we can choose any address range in ccsr space of a PV
> > machine (-M ppce500 / e500plat).
> 
> No, we don't put it in CCSR space. It'd just be orthogonal to CCSR.

I'd rather we put it in CCSR, especially if/when we implement LAWs and  
CCSRBAR which gives the guest control of its address space.
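
A small C sketch of why CCSRBAR matters here: a block that lives inside CCSR is addressed relative to CCSRBAR, so it follows the guest whenever the guest relocates CCSR. The offset chosen for the reset block and the function name are hypothetical, picked only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical offset of a PV reset block inside CCSR; not from any spec. */
#define PV_RESET_CCSR_OFFSET 0x000F0000u

/* Guest-physical address of a block that lives inside CCSR.
 * The result is always CCSRBAR-relative, so the block moves together
 * with CCSR when the guest reprograms CCSRBAR. */
static uint64_t ccsr_block_addr(uint64_t ccsrbar, uint32_t offset,
                                uint32_t ccsr_size)
{
    assert(offset < ccsr_size);
    return ccsrbar + offset;
}
```

A device placed outside CCSR would instead need its own fixed address, or a separate mechanism to move it, once the guest controls its address space.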

-Scott

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-17 16:59                                 ` Scott Wood
@ 2013-07-17 17:05                                   ` Alexander Graf
  -1 siblings, 0 replies; 103+ messages in thread
From: Alexander Graf @ 2013-07-17 17:05 UTC (permalink / raw)
  To: Scott Wood
  Cc: Bhushan Bharat-R65777, Yoder Stuart-B08248, Wood Scott-B07421,
	kvm, kvm-ppc, Gleb Natapov


On 17.07.2013, at 18:59, Scott Wood wrote:

> On 07/17/2013 11:04:08 AM, Alexander Graf wrote:
>> On 17.07.2013, at 17:59, Bhushan Bharat-R65777 wrote:
>> > Ah, so we can choose any address range in ccsr space of a PV machine (-M ppce500 / e500plat).
>> No, we don't put it in CCSR space. It'd just be orthogonal to CCSR.
> 
> I'd rather we put it in CCSR, especially if/when we implement LAWs and CCSRBAR which gives the guest control of its address space.

Do we have space in CCSR?


Alex

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 3/5] booke: define reset and shutdown hcalls
  2013-07-17 17:05                                   ` Alexander Graf
@ 2013-07-17 17:09                                     ` Scott Wood
  -1 siblings, 0 replies; 103+ messages in thread
From: Scott Wood @ 2013-07-17 17:09 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Bhushan Bharat-R65777, Yoder Stuart-B08248, Wood Scott-B07421,
	kvm, kvm-ppc, Gleb Natapov

On 07/17/2013 12:05:41 PM, Alexander Graf wrote:
> 
> On 17.07.2013, at 18:59, Scott Wood wrote:
> 
> > On 07/17/2013 11:04:08 AM, Alexander Graf wrote:
> >> On 17.07.2013, at 17:59, Bhushan Bharat-R65777 wrote:
> >> > Ah, so we can choose any address range in ccsr space of a PV
> >> > machine (-M ppce500 / e500plat).
> >> No, we don't put it in CCSR space. It'd just be orthogonal to CCSR.
> >
> > I'd rather we put it in CCSR, especially if/when we implement LAWs
> > and CCSRBAR which gives the guest control of its address space.
> 
> Do we have space in CCSR?

Sure.  Even on real hardware there are gaps, and on the paravirt  
platform we have loads of space. :-)
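
As a rough sketch of what picking a gap could look like, the C helper below scans a sorted list of occupied CCSR blocks and returns the lowest aligned offset where a new block fits. The struct, the function name, and any block offsets fed to it are hypothetical and are not the real MPC8544 layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One occupied register block inside CCSR (offsets relative to CCSRBAR). */
struct ccsr_block {
    uint32_t offset;
    uint32_t size;
};

/* Return the lowest `align`-aligned offset at which `size` bytes fit
 * without touching any occupied block, or UINT32_MAX if there is none.
 * `blocks` must be sorted by offset; `align` must be a power of two. */
static uint32_t ccsr_find_gap(const struct ccsr_block *blocks, size_t n,
                              uint32_t ccsr_size, uint32_t size,
                              uint32_t align)
{
    uint32_t candidate = 0;

    assert(size > 0 && align > 0 && (align & (align - 1)) == 0);

    for (size_t i = 0; i < n; i++) {
        /* Enough room between the candidate and the next block? */
        if (blocks[i].offset >= candidate &&
            blocks[i].offset - candidate >= size)
            return candidate;

        /* Otherwise skip past this block and realign. */
        uint32_t end = blocks[i].offset + blocks[i].size;
        if (end > candidate)
            candidate = (end + align - 1) & ~(align - 1);
    }

    /* Room left between the last block and the end of CCSR? */
    if (candidate <= ccsr_size && ccsr_size - candidate >= size)
        return candidate;
    return UINT32_MAX;
}
```

On the paravirt platform the occupied list would be short, which is the "loads of space" case; on a machine that mirrors real hardware the same scan would only ever land in the holes the SoC leaves unused.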

This does raise the question of which compatible string the ccsr node
should have on the paravirt platform.  Currently it's still labelled
"fsl,mpc8544-immr", which is clearly wrong.  If we add CCSRBAR support,
should the paravirt platform have it as well?  If so, what is the size
of CCSR on the paravirt platform?

-Scott

^ permalink raw reply	[flat|nested] 103+ messages in thread

end of thread, other threads:[~2013-07-17 17:09 UTC | newest]

Thread overview: 103+ messages
2013-07-15 11:11 [PATCH 0/5] powerpc: implement reset/shutdown hcalls Bharat Bhushan
2013-07-15 11:23 ` Bharat Bhushan
2013-07-15 11:11 ` [PATCH 1/5] powerpc: define ePAPR hcall exit interface Bharat Bhushan
2013-07-15 11:23   ` Bharat Bhushan
2013-07-15 11:21   ` Alexander Graf
2013-07-15 11:21     ` Alexander Graf
2013-07-15 11:32     ` Bhushan Bharat-R65777
2013-07-15 11:11 ` [PATCH 2/5] booke: exit to guest userspace for unimplemented hcalls in kvm Bharat Bhushan
2013-07-15 11:23   ` Bharat Bhushan
2013-07-15 11:31   ` Alexander Graf
2013-07-15 11:31     ` Alexander Graf
2013-07-15 11:38     ` Bhushan Bharat-R65777
2013-07-15 11:38       ` Bhushan Bharat-R65777
2013-07-15 11:46       ` Alexander Graf
2013-07-15 11:46         ` Alexander Graf
2013-07-15 14:50         ` Bhushan Bharat-R65777
2013-07-15 14:56           ` Alexander Graf
2013-07-15 14:56             ` Alexander Graf
2013-07-15 15:13             ` Bhushan Bharat-R65777
2013-07-15 15:29               ` Alexander Graf
2013-07-15 15:29                 ` Alexander Graf
2013-07-15 15:35                 ` Bhushan Bharat-R65777
2013-07-15 15:38                   ` Alexander Graf
2013-07-15 15:38                     ` Alexander Graf
2013-07-15 18:07   ` Scott Wood
2013-07-15 18:07     ` Scott Wood
2013-07-16  4:46     ` Bhushan Bharat-R65777
2013-07-16  4:46       ` Bhushan Bharat-R65777
2013-07-15 11:11 ` [PATCH 3/5] booke: define reset and shutdown hcalls Bharat Bhushan
2013-07-15 11:23   ` Bharat Bhushan
2013-07-15 11:30   ` Gleb Natapov
2013-07-15 11:30     ` Gleb Natapov
2013-07-15 11:44     ` Alexander Graf
2013-07-15 11:44       ` Alexander Graf
2013-07-15 12:15       ` Gleb Natapov
2013-07-15 12:15         ` Gleb Natapov
2013-07-15 12:21         ` Alexander Graf
2013-07-15 12:21           ` Alexander Graf
2013-07-15 12:24           ` Gleb Natapov
2013-07-15 12:24             ` Gleb Natapov
2013-07-15 12:26             ` Alexander Graf
2013-07-15 12:26               ` Alexander Graf
2013-07-15 12:31               ` Gleb Natapov
2013-07-15 12:31                 ` Gleb Natapov
2013-07-15 18:17     ` Scott Wood
2013-07-15 18:17       ` Scott Wood
2013-07-16  6:35       ` Gleb Natapov
2013-07-16  6:35         ` Gleb Natapov
2013-07-16 23:04         ` Scott Wood
2013-07-16 23:04           ` Scott Wood
2013-07-17 11:00           ` Gleb Natapov
2013-07-17 11:00             ` Gleb Natapov
2013-07-17 12:19             ` Alexander Graf
2013-07-17 12:19               ` Alexander Graf
2013-07-17 15:19               ` Yoder Stuart-B08248
2013-07-17 15:21                 ` Alexander Graf
2013-07-17 15:21                   ` Alexander Graf
2013-07-17 15:36                   ` Yoder Stuart-B08248
2013-07-17 15:41                     ` Alexander Graf
2013-07-17 15:41                       ` Alexander Graf
2013-07-17 15:47                       ` Bhushan Bharat-R65777
2013-07-17 15:52                         ` Alexander Graf
2013-07-17 15:52                           ` Alexander Graf
2013-07-17 15:59                           ` Bhushan Bharat-R65777
2013-07-17 16:04                             ` Alexander Graf
2013-07-17 16:04                               ` Alexander Graf
2013-07-17 16:21                               ` Bhushan Bharat-R65777
2013-07-17 16:23                                 ` Alexander Graf
2013-07-17 16:23                                   ` Alexander Graf
2013-07-17 16:59                               ` Scott Wood
2013-07-17 16:59                                 ` Scott Wood
2013-07-17 17:05                                 ` Alexander Graf
2013-07-17 17:05                                   ` Alexander Graf
2013-07-17 17:09                                   ` Scott Wood
2013-07-17 17:09                                     ` Scott Wood
2013-07-15 11:11 ` [PATCH 4/5] powerpc: Resolve KVM_HC_FEATURES compilation dependeny Bharat Bhushan
2013-07-15 11:23   ` Bharat Bhushan
2013-07-15 11:46   ` Alexander Graf
2013-07-15 11:46     ` Alexander Graf
2013-07-15 11:11 ` [PATCH 5/5] powerpc: using reset hcall when kvm,has-reset Bharat Bhushan
2013-07-15 11:23   ` Bharat Bhushan
2013-07-15 11:50   ` Alexander Graf
2013-07-15 11:50     ` Alexander Graf
2013-07-15 15:05     ` Bhushan Bharat-R65777
2013-07-15 15:09       ` Alexander Graf
2013-07-15 15:09         ` Alexander Graf
2013-07-15 15:16         ` Bhushan Bharat-R65777
2013-07-15 18:21           ` Scott Wood
2013-07-15 18:21             ` Scott Wood
2013-07-15 20:28             ` Alexander Graf
2013-07-15 20:28               ` Alexander Graf
2013-07-15 20:52               ` Scott Wood
2013-07-15 20:52                 ` Scott Wood
2013-07-15 20:55                 ` Alexander Graf
2013-07-15 20:55                   ` Alexander Graf
2013-07-15 22:23                   ` Scott Wood
2013-07-15 22:23                     ` Scott Wood
2013-07-16 23:21                     ` Alexander Graf
2013-07-16 23:21                       ` Alexander Graf
2013-07-16 23:26                       ` Scott Wood
2013-07-16 23:26                         ` Scott Wood
2013-07-16 23:37                         ` Scott Wood
2013-07-16 23:37                           ` Scott Wood
