All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/41] kvm updates for 2.6.22
@ 2007-04-01 14:34 ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:34 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel

Following is my current 2.6.22 kvm queue.  It contains userspace interface
updates, improved guest support, cleanups, and plain bugfixes.  It will
likely grow slightly by the time the merge window opens.

Avi Kivity (34):
      KVM: Use own minor number
      KVM: Export <linux/kvm.h>
      KVM: Fix bogus sign extension in mmu mapping audit
      KVM: Use a shared page for kernel/user communication when runing a vcpu
      KVM: Do not communicate to userspace through cpu registers during PIO
      KVM: Handle cpuid in the kernel instead of punting to userspace
      KVM: Remove the 'emulated' field from the userspace interface
      KVM: Remove minor wart from KVM_CREATE_VCPU ioctl
      KVM: Renumber ioctls
      KVM: Add method to check for backwards-compatible API extensions
      KVM: Allow userspace to process hypercalls which have no kernel handler
      KVM: Fold kvm_run::exit_type into kvm_run::exit_reason
      KVM: Add a special exit reason when exiting due to an interrupt
      KVM: Initialize the apic_base msr on svm too
      KVM: Add guest mode signal mask
      KVM: Allow kernel to select size of mmap() buffer
      KVM: Future-proof argument-less ioctls
      KVM: Avoid guest virtual addresses in string pio userspace interface
      KVM: MMU: Remove unnecessary check for pdptr access
      KVM: MMU: Remove global pte tracking
      KVM: Workaround vmx inability to virtualize the reset state
      KVM: Remove set_cr0_no_modeswitch() arch op
      KVM: Modify guest segments after potentially switching modes
      KVM: Hack real-mode segments on vmx from KVM_SET_SREGS
      KVM: Don't allow the guest to turn off the cpu cache
      KVM: Remove unused and write-only variables
      KVM: MMU: Fix hugepage pdes mapping same physical address with different access
      KVM: SVM: Ensure timestamp counter monotonicity
      KVM: Use list_move()
      KVM: Remove debug message
      KVM: x86 emulator: fix bit string operations operand size
      KVM: Simply gfn_to_page()
      KVM: Add physical memory aliasing feature
      KVM: Add fpu get/set operations

Dor Laor (3):
      KVM: Fix guest register corruption on paravirt hypercall
      KVM: Use the generic skip_emulated_instruction() in hypercall code
      KVM: Add mmu cache clear function

Joerg Roedel (2):
      KVM: SVM: forbid guest to execute monitor/mwait
      KVM: SVM: enable LBRV virtualization if available

Michal Piotrowski (1):
      KVM: Remove unused function

Sergey Kiselev (1):
      KVM: Handle writes to MCG_STATUS msr

 drivers/kvm/kvm.h          |   57 +++-
 drivers/kvm/kvm_main.c     |  649 ++++++++++++++++++++++++++++++++++++++++----
 drivers/kvm/kvm_svm.h      |    2 -
 drivers/kvm/mmu.c          |   69 +++--
 drivers/kvm/paging_tmpl.h  |   10 +-
 drivers/kvm/svm.c          |  111 +++++---
 drivers/kvm/svm.h          |    6 +
 drivers/kvm/vmx.c          |   88 +++----
 drivers/kvm/x86_emulate.c  |    5 +-
 include/linux/Kbuild       |    1 +
 include/linux/kvm.h        |  135 +++++++---
 include/linux/miscdevice.h |    1 +
 12 files changed, 898 insertions(+), 236 deletions(-)

^ permalink raw reply	[flat|nested] 84+ messages in thread

* [PATCH 00/41] kvm updates for 2.6.22
@ 2007-04-01 14:34 ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:34 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

Following is my current 2.6.22 kvm queue.  It contains userspace interface
updates, improved guest support, cleanups, and plain bugfixes.  It will
likely grow slightly by the time the merge window opens.

Avi Kivity (34):
      KVM: Use own minor number
      KVM: Export <linux/kvm.h>
      KVM: Fix bogus sign extension in mmu mapping audit
      KVM: Use a shared page for kernel/user communication when runing a vcpu
      KVM: Do not communicate to userspace through cpu registers during PIO
      KVM: Handle cpuid in the kernel instead of punting to userspace
      KVM: Remove the 'emulated' field from the userspace interface
      KVM: Remove minor wart from KVM_CREATE_VCPU ioctl
      KVM: Renumber ioctls
      KVM: Add method to check for backwards-compatible API extensions
      KVM: Allow userspace to process hypercalls which have no kernel handler
      KVM: Fold kvm_run::exit_type into kvm_run::exit_reason
      KVM: Add a special exit reason when exiting due to an interrupt
      KVM: Initialize the apic_base msr on svm too
      KVM: Add guest mode signal mask
      KVM: Allow kernel to select size of mmap() buffer
      KVM: Future-proof argument-less ioctls
      KVM: Avoid guest virtual addresses in string pio userspace interface
      KVM: MMU: Remove unnecessary check for pdptr access
      KVM: MMU: Remove global pte tracking
      KVM: Workaround vmx inability to virtualize the reset state
      KVM: Remove set_cr0_no_modeswitch() arch op
      KVM: Modify guest segments after potentially switching modes
      KVM: Hack real-mode segments on vmx from KVM_SET_SREGS
      KVM: Don't allow the guest to turn off the cpu cache
      KVM: Remove unused and write-only variables
      KVM: MMU: Fix hugepage pdes mapping same physical address with different access
      KVM: SVM: Ensure timestamp counter monotonicity
      KVM: Use list_move()
      KVM: Remove debug message
      KVM: x86 emulator: fix bit string operations operand size
      KVM: Simply gfn_to_page()
      KVM: Add physical memory aliasing feature
      KVM: Add fpu get/set operations

Dor Laor (3):
      KVM: Fix guest register corruption on paravirt hypercall
      KVM: Use the generic skip_emulated_instruction() in hypercall code
      KVM: Add mmu cache clear function

Joerg Roedel (2):
      KVM: SVM: forbid guest to execute monitor/mwait
      KVM: SVM: enable LBRV virtualization if available

Michal Piotrowski (1):
      KVM: Remove unused function

Sergey Kiselev (1):
      KVM: Handle writes to MCG_STATUS msr

 drivers/kvm/kvm.h          |   57 +++-
 drivers/kvm/kvm_main.c     |  649 ++++++++++++++++++++++++++++++++++++++++----
 drivers/kvm/kvm_svm.h      |    2 -
 drivers/kvm/mmu.c          |   69 +++--
 drivers/kvm/paging_tmpl.h  |   10 +-
 drivers/kvm/svm.c          |  111 +++++---
 drivers/kvm/svm.h          |    6 +
 drivers/kvm/vmx.c          |   88 +++----
 drivers/kvm/x86_emulate.c  |    5 +-
 include/linux/Kbuild       |    1 +
 include/linux/kvm.h        |  135 +++++++---
 include/linux/miscdevice.h |    1 +
 12 files changed, 898 insertions(+), 236 deletions(-)

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply	[flat|nested] 84+ messages in thread

* [PATCH 01/41] KVM: Fix guest register corruption on paravirt hypercall
@ 2007-04-01 14:34   ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:34 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Dor Laor, Avi Kivity

From: Dor Laor <dor.laor@qumranet.com>

The hypercall code mixes up the ->cache_regs() and ->decache_regs()
callbacks, resulting in guest register corruption.

Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm_main.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index dc7a8c7..ff7c836 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1177,7 +1177,7 @@ int kvm_hypercall(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
 	unsigned long nr, a0, a1, a2, a3, a4, a5, ret;
 
-	kvm_arch_ops->decache_regs(vcpu);
+	kvm_arch_ops->cache_regs(vcpu);
 	ret = -KVM_EINVAL;
 #ifdef CONFIG_X86_64
 	if (is_long_mode(vcpu)) {
@@ -1204,7 +1204,7 @@ int kvm_hypercall(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		;
 	}
 	vcpu->regs[VCPU_REGS_RAX] = ret;
-	kvm_arch_ops->cache_regs(vcpu);
+	kvm_arch_ops->decache_regs(vcpu);
 	return 1;
 }
 EXPORT_SYMBOL_GPL(kvm_hypercall);
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 01/41] KVM: Fix guest register corruption on paravirt hypercall
@ 2007-04-01 14:34   ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:34 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

From: Dor Laor <dor.laor-atKUWr5tajBWk0Htik3J/w@public.gmane.org>

The hypercall code mixes up the ->cache_regs() and ->decache_regs()
callbacks, resulting in guest register corruption.

Signed-off-by: Dor Laor <dor.laor-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm_main.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index dc7a8c7..ff7c836 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1177,7 +1177,7 @@ int kvm_hypercall(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
 	unsigned long nr, a0, a1, a2, a3, a4, a5, ret;
 
-	kvm_arch_ops->decache_regs(vcpu);
+	kvm_arch_ops->cache_regs(vcpu);
 	ret = -KVM_EINVAL;
 #ifdef CONFIG_X86_64
 	if (is_long_mode(vcpu)) {
@@ -1204,7 +1204,7 @@ int kvm_hypercall(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		;
 	}
 	vcpu->regs[VCPU_REGS_RAX] = ret;
-	kvm_arch_ops->cache_regs(vcpu);
+	kvm_arch_ops->decache_regs(vcpu);
 	return 1;
 }
 EXPORT_SYMBOL_GPL(kvm_hypercall);
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 02/41] KVM: Use the generic skip_emulated_instruction() in hypercall code
@ 2007-04-01 14:34     ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:34 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Dor Laor, Avi Kivity

From: Dor Laor <dor.laor@qumranet.com>

Instead of twiddling the rip registers directly, use the
skip_emulated_instruction() function to do that for us.

Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/svm.c |    3 ++-
 drivers/kvm/vmx.c |    2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 3d8ea7a..6787f11 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -1078,7 +1078,8 @@ static int halt_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 static int vmmcall_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
-	vcpu->svm->vmcb->save.rip += 3;
+	vcpu->svm->next_rip = vcpu->svm->vmcb->save.rip + 3;
+	skip_emulated_instruction(vcpu);
 	return kvm_hypercall(vcpu, kvm_run);
 }
 
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index fbbf9d6..a721b60 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1658,7 +1658,7 @@ static int handle_halt(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 static int handle_vmcall(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
-	vmcs_writel(GUEST_RIP, vmcs_readl(GUEST_RIP)+3);
+	skip_emulated_instruction(vcpu);
 	return kvm_hypercall(vcpu, kvm_run);
 }
 
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 02/41] KVM: Use the generic skip_emulated_instruction() in hypercall code
@ 2007-04-01 14:34     ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:34 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

From: Dor Laor <dor.laor-atKUWr5tajBWk0Htik3J/w@public.gmane.org>

Instead of twiddling the rip registers directly, use the
skip_emulated_instruction() function to do that for us.

Signed-off-by: Dor Laor <dor.laor-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/svm.c |    3 ++-
 drivers/kvm/vmx.c |    2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 3d8ea7a..6787f11 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -1078,7 +1078,8 @@ static int halt_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 static int vmmcall_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
-	vcpu->svm->vmcb->save.rip += 3;
+	vcpu->svm->next_rip = vcpu->svm->vmcb->save.rip + 3;
+	skip_emulated_instruction(vcpu);
 	return kvm_hypercall(vcpu, kvm_run);
 }
 
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index fbbf9d6..a721b60 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1658,7 +1658,7 @@ static int handle_halt(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 static int handle_vmcall(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
-	vmcs_writel(GUEST_RIP, vmcs_readl(GUEST_RIP)+3);
+	skip_emulated_instruction(vcpu);
 	return kvm_hypercall(vcpu, kvm_run);
 }
 
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 03/41] KVM: Use own minor number
@ 2007-04-01 14:35       ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Use the minor number (232) allocated to kvm by lanana.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm_main.c     |    2 +-
 include/linux/miscdevice.h |    1 +
 2 files changed, 2 insertions(+), 1 deletions(-)
 mode change 100644 => 100755 drivers/kvm/kvm_main.c

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
old mode 100644
new mode 100755
index ff7c836..946ed86
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -2299,7 +2299,7 @@ static struct file_operations kvm_chardev_ops = {
 };
 
 static struct miscdevice kvm_dev = {
-	MISC_DYNAMIC_MINOR,
+	KVM_MINOR,
 	"kvm",
 	&kvm_chardev_ops,
 };
diff --git a/include/linux/miscdevice.h b/include/linux/miscdevice.h
index 326da7d..dff9ea3 100644
--- a/include/linux/miscdevice.h
+++ b/include/linux/miscdevice.h
@@ -29,6 +29,7 @@
 
 #define TUN_MINOR	     200
 #define	HPET_MINOR	     228
+#define KVM_MINOR            232
 
 struct device;
 
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 03/41] KVM: Use own minor number
@ 2007-04-01 14:35       ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

Use the minor number (232) allocated to kvm by lanana.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm_main.c     |    2 +-
 include/linux/miscdevice.h |    1 +
 2 files changed, 2 insertions(+), 1 deletions(-)
 mode change 100644 => 100755 drivers/kvm/kvm_main.c

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
old mode 100644
new mode 100755
index ff7c836..946ed86
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -2299,7 +2299,7 @@ static struct file_operations kvm_chardev_ops = {
 };
 
 static struct miscdevice kvm_dev = {
-	MISC_DYNAMIC_MINOR,
+	KVM_MINOR,
 	"kvm",
 	&kvm_chardev_ops,
 };
diff --git a/include/linux/miscdevice.h b/include/linux/miscdevice.h
index 326da7d..dff9ea3 100644
--- a/include/linux/miscdevice.h
+++ b/include/linux/miscdevice.h
@@ -29,6 +29,7 @@
 
 #define TUN_MINOR	     200
 #define	HPET_MINOR	     228
+#define KVM_MINOR            232
 
 struct device;
 
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 04/41] KVM: Export <linux/kvm.h>
@ 2007-04-01 14:35         ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

This allows users to actually build prgrams that use kvm without
the entire source tree.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 include/linux/Kbuild |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/Kbuild b/include/linux/Kbuild
index e81e301..b35b593 100644
--- a/include/linux/Kbuild
+++ b/include/linux/Kbuild
@@ -99,6 +99,7 @@ header-y += iso_fs.h
 header-y += ixjuser.h
 header-y += jffs2.h
 header-y += keyctl.h
+header-y += kvm.h
 header-y += limits.h
 header-y += lock_dlm_plock.h
 header-y += magic.h
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 04/41] KVM: Export <linux/kvm.h>
@ 2007-04-01 14:35         ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

This allows users to actually build prgrams that use kvm without
the entire source tree.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 include/linux/Kbuild |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/Kbuild b/include/linux/Kbuild
index e81e301..b35b593 100644
--- a/include/linux/Kbuild
+++ b/include/linux/Kbuild
@@ -99,6 +99,7 @@ header-y += iso_fs.h
 header-y += ixjuser.h
 header-y += jffs2.h
 header-y += keyctl.h
+header-y += kvm.h
 header-y += limits.h
 header-y += lock_dlm_plock.h
 header-y += magic.h
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 05/41] KVM: Fix bogus sign extension in mmu mapping audit
@ 2007-04-01 14:35           ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

When auditing a 32-bit guest on a 64-bit host, sign extension of the page
table directory pointer table index caused bogus addresses to be shown on
audit errors.

Fix by declaring the index unsigned.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/mmu.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index e85b4c7..266444a 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -1359,7 +1359,7 @@ static void audit_mappings_page(struct kvm_vcpu *vcpu, u64 page_pte,
 
 static void audit_mappings(struct kvm_vcpu *vcpu)
 {
-	int i;
+	unsigned i;
 
 	if (vcpu->mmu.root_level == 4)
 		audit_mappings_page(vcpu, vcpu->mmu.root_hpa, 0, 4);
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 05/41] KVM: Fix bogus sign extension in mmu mapping audit
@ 2007-04-01 14:35           ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

When auditing a 32-bit guest on a 64-bit host, sign extension of the page
table directory pointer table index caused bogus addresses to be shown on
audit errors.

Fix by declaring the index unsigned.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/mmu.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index e85b4c7..266444a 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -1359,7 +1359,7 @@ static void audit_mappings_page(struct kvm_vcpu *vcpu, u64 page_pte,
 
 static void audit_mappings(struct kvm_vcpu *vcpu)
 {
-	int i;
+	unsigned i;
 
 	if (vcpu->mmu.root_level == 4)
 		audit_mappings_page(vcpu, vcpu->mmu.root_hpa, 0, 4);
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 06/41] KVM: Use a shared page for kernel/user communication when runing a vcpu
@ 2007-04-01 14:35             ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Instead of passing a 'struct kvm_run' back and forth between the kernel and
userspace, allocate a page and allow the user to mmap() it.  This reduces
needless copying and makes the interface expandable by providing lots of
free space.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h      |    1 +
 drivers/kvm/kvm_main.c |   54 +++++++++++++++++++++++++++++++++++------------
 include/linux/kvm.h    |    6 ++--
 3 files changed, 44 insertions(+), 17 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 0d122bf..901b8d9 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -228,6 +228,7 @@ struct kvm_vcpu {
 	struct mutex mutex;
 	int   cpu;
 	int   launched;
+	struct kvm_run *run;
 	int interrupt_window_open;
 	unsigned long irq_summary; /* bit vector: 1 per word in irq_pending */
 #define NR_IRQ_WORDS KVM_IRQ_BITMAP_SIZE(unsigned long)
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 946ed86..42be8a8 100755
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -355,6 +355,8 @@ static void kvm_free_vcpu(struct kvm_vcpu *vcpu)
 	kvm_mmu_destroy(vcpu);
 	vcpu_put(vcpu);
 	kvm_arch_ops->vcpu_free(vcpu);
+	free_page((unsigned long)vcpu->run);
+	vcpu->run = NULL;
 }
 
 static void kvm_free_vcpus(struct kvm *kvm)
@@ -1887,6 +1889,33 @@ static int kvm_vcpu_ioctl_debug_guest(struct kvm_vcpu *vcpu,
 	return r;
 }
 
+static struct page *kvm_vcpu_nopage(struct vm_area_struct *vma,
+				    unsigned long address,
+				    int *type)
+{
+	struct kvm_vcpu *vcpu = vma->vm_file->private_data;
+	unsigned long pgoff;
+	struct page *page;
+
+	*type = VM_FAULT_MINOR;
+	pgoff = ((address - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
+	if (pgoff != 0)
+		return NOPAGE_SIGBUS;
+	page = virt_to_page(vcpu->run);
+	get_page(page);
+	return page;
+}
+
+static struct vm_operations_struct kvm_vcpu_vm_ops = {
+	.nopage = kvm_vcpu_nopage,
+};
+
+static int kvm_vcpu_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	vma->vm_ops = &kvm_vcpu_vm_ops;
+	return 0;
+}
+
 static int kvm_vcpu_release(struct inode *inode, struct file *filp)
 {
 	struct kvm_vcpu *vcpu = filp->private_data;
@@ -1899,6 +1928,7 @@ static struct file_operations kvm_vcpu_fops = {
 	.release        = kvm_vcpu_release,
 	.unlocked_ioctl = kvm_vcpu_ioctl,
 	.compat_ioctl   = kvm_vcpu_ioctl,
+	.mmap           = kvm_vcpu_mmap,
 };
 
 /*
@@ -1947,6 +1977,7 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, int n)
 {
 	int r;
 	struct kvm_vcpu *vcpu;
+	struct page *page;
 
 	r = -EINVAL;
 	if (!valid_vcpu(n))
@@ -1961,6 +1992,12 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, int n)
 		return -EEXIST;
 	}
 
+	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	r = -ENOMEM;
+	if (!page)
+		goto out_unlock;
+	vcpu->run = page_address(page);
+
 	vcpu->host_fx_image = (char*)ALIGN((hva_t)vcpu->fx_buf,
 					   FX_IMAGE_ALIGN);
 	vcpu->guest_fx_image = vcpu->host_fx_image + FX_IMAGE_SIZE;
@@ -1990,6 +2027,7 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, int n)
 
 out_free_vcpus:
 	kvm_free_vcpu(vcpu);
+out_unlock:
 	mutex_unlock(&vcpu->mutex);
 out:
 	return r;
@@ -2003,21 +2041,9 @@ static long kvm_vcpu_ioctl(struct file *filp,
 	int r = -EINVAL;
 
 	switch (ioctl) {
-	case KVM_RUN: {
-		struct kvm_run kvm_run;
-
-		r = -EFAULT;
-		if (copy_from_user(&kvm_run, argp, sizeof kvm_run))
-			goto out;
-		r = kvm_vcpu_ioctl_run(vcpu, &kvm_run);
-		if (r < 0 &&  r != -EINTR)
-			goto out;
-		if (copy_to_user(argp, &kvm_run, sizeof kvm_run)) {
-			r = -EFAULT;
-			goto out;
-		}
+	case KVM_RUN:
+		r = kvm_vcpu_ioctl_run(vcpu, vcpu->run);
 		break;
-	}
 	case KVM_GET_REGS: {
 		struct kvm_regs kvm_regs;
 
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 275354f..d88e750 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -11,7 +11,7 @@
 #include <asm/types.h>
 #include <linux/ioctl.h>
 
-#define KVM_API_VERSION 4
+#define KVM_API_VERSION 5
 
 /*
  * Architectural interrupt line count, and the size of the bitmap needed
@@ -49,7 +49,7 @@ enum kvm_exit_reason {
 	KVM_EXIT_SHUTDOWN         = 8,
 };
 
-/* for KVM_RUN */
+/* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */
 struct kvm_run {
 	/* in */
 	__u32 emulated;  /* skip current instruction */
@@ -233,7 +233,7 @@ struct kvm_dirty_log {
 /*
  * ioctls for vcpu fds
  */
-#define KVM_RUN                   _IOWR(KVMIO, 2, struct kvm_run)
+#define KVM_RUN                   _IO(KVMIO, 16)
 #define KVM_GET_REGS              _IOR(KVMIO, 3, struct kvm_regs)
 #define KVM_SET_REGS              _IOW(KVMIO, 4, struct kvm_regs)
 #define KVM_GET_SREGS             _IOR(KVMIO, 5, struct kvm_sregs)
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 06/41] KVM: Use a shared page for kernel/user communication when runing a vcpu
@ 2007-04-01 14:35             ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

Instead of passing a 'struct kvm_run' back and forth between the kernel and
userspace, allocate a page and allow the user to mmap() it.  This reduces
needless copying and makes the interface expandable by providing lots of
free space.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h      |    1 +
 drivers/kvm/kvm_main.c |   54 +++++++++++++++++++++++++++++++++++------------
 include/linux/kvm.h    |    6 ++--
 3 files changed, 44 insertions(+), 17 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 0d122bf..901b8d9 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -228,6 +228,7 @@ struct kvm_vcpu {
 	struct mutex mutex;
 	int   cpu;
 	int   launched;
+	struct kvm_run *run;
 	int interrupt_window_open;
 	unsigned long irq_summary; /* bit vector: 1 per word in irq_pending */
 #define NR_IRQ_WORDS KVM_IRQ_BITMAP_SIZE(unsigned long)
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 946ed86..42be8a8 100755
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -355,6 +355,8 @@ static void kvm_free_vcpu(struct kvm_vcpu *vcpu)
 	kvm_mmu_destroy(vcpu);
 	vcpu_put(vcpu);
 	kvm_arch_ops->vcpu_free(vcpu);
+	free_page((unsigned long)vcpu->run);
+	vcpu->run = NULL;
 }
 
 static void kvm_free_vcpus(struct kvm *kvm)
@@ -1887,6 +1889,33 @@ static int kvm_vcpu_ioctl_debug_guest(struct kvm_vcpu *vcpu,
 	return r;
 }
 
+static struct page *kvm_vcpu_nopage(struct vm_area_struct *vma,
+				    unsigned long address,
+				    int *type)
+{
+	struct kvm_vcpu *vcpu = vma->vm_file->private_data;
+	unsigned long pgoff;
+	struct page *page;
+
+	*type = VM_FAULT_MINOR;
+	pgoff = ((address - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
+	if (pgoff != 0)
+		return NOPAGE_SIGBUS;
+	page = virt_to_page(vcpu->run);
+	get_page(page);
+	return page;
+}
+
+static struct vm_operations_struct kvm_vcpu_vm_ops = {
+	.nopage = kvm_vcpu_nopage,
+};
+
+static int kvm_vcpu_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	vma->vm_ops = &kvm_vcpu_vm_ops;
+	return 0;
+}
+
 static int kvm_vcpu_release(struct inode *inode, struct file *filp)
 {
 	struct kvm_vcpu *vcpu = filp->private_data;
@@ -1899,6 +1928,7 @@ static struct file_operations kvm_vcpu_fops = {
 	.release        = kvm_vcpu_release,
 	.unlocked_ioctl = kvm_vcpu_ioctl,
 	.compat_ioctl   = kvm_vcpu_ioctl,
+	.mmap           = kvm_vcpu_mmap,
 };
 
 /*
@@ -1947,6 +1977,7 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, int n)
 {
 	int r;
 	struct kvm_vcpu *vcpu;
+	struct page *page;
 
 	r = -EINVAL;
 	if (!valid_vcpu(n))
@@ -1961,6 +1992,12 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, int n)
 		return -EEXIST;
 	}
 
+	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	r = -ENOMEM;
+	if (!page)
+		goto out_unlock;
+	vcpu->run = page_address(page);
+
 	vcpu->host_fx_image = (char*)ALIGN((hva_t)vcpu->fx_buf,
 					   FX_IMAGE_ALIGN);
 	vcpu->guest_fx_image = vcpu->host_fx_image + FX_IMAGE_SIZE;
@@ -1990,6 +2027,7 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, int n)
 
 out_free_vcpus:
 	kvm_free_vcpu(vcpu);
+out_unlock:
 	mutex_unlock(&vcpu->mutex);
 out:
 	return r;
@@ -2003,21 +2041,9 @@ static long kvm_vcpu_ioctl(struct file *filp,
 	int r = -EINVAL;
 
 	switch (ioctl) {
-	case KVM_RUN: {
-		struct kvm_run kvm_run;
-
-		r = -EFAULT;
-		if (copy_from_user(&kvm_run, argp, sizeof kvm_run))
-			goto out;
-		r = kvm_vcpu_ioctl_run(vcpu, &kvm_run);
-		if (r < 0 &&  r != -EINTR)
-			goto out;
-		if (copy_to_user(argp, &kvm_run, sizeof kvm_run)) {
-			r = -EFAULT;
-			goto out;
-		}
+	case KVM_RUN:
+		r = kvm_vcpu_ioctl_run(vcpu, vcpu->run);
 		break;
-	}
 	case KVM_GET_REGS: {
 		struct kvm_regs kvm_regs;
 
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 275354f..d88e750 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -11,7 +11,7 @@
 #include <asm/types.h>
 #include <linux/ioctl.h>
 
-#define KVM_API_VERSION 4
+#define KVM_API_VERSION 5
 
 /*
  * Architectural interrupt line count, and the size of the bitmap needed
@@ -49,7 +49,7 @@ enum kvm_exit_reason {
 	KVM_EXIT_SHUTDOWN         = 8,
 };
 
-/* for KVM_RUN */
+/* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */
 struct kvm_run {
 	/* in */
 	__u32 emulated;  /* skip current instruction */
@@ -233,7 +233,7 @@ struct kvm_dirty_log {
 /*
  * ioctls for vcpu fds
  */
-#define KVM_RUN                   _IOWR(KVMIO, 2, struct kvm_run)
+#define KVM_RUN                   _IO(KVMIO, 16)
 #define KVM_GET_REGS              _IOR(KVMIO, 3, struct kvm_regs)
 #define KVM_SET_REGS              _IOW(KVMIO, 4, struct kvm_regs)
 #define KVM_GET_SREGS             _IOR(KVMIO, 5, struct kvm_sregs)
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 07/41] KVM: Do not communicate to userspace through cpu registers during PIO
@ 2007-04-01 14:35               ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Currently when passing the a PIO emulation request to userspace, we
rely on userspace updating %rax (on 'in' instructions) and %rsi/%rdi/%rcx
(on string instructions).  This (a) requires two extra ioctls for getting
and setting the registers and (b) is unfriendly to non-x86 archs, when
they get kvm ports.

So fix by doing the register fixups in the kernel and passing to userspace
only an abstract description of the PIO to be done.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h      |    1 +
 drivers/kvm/kvm_main.c |   48 +++++++++++++++++++++++++++++++++++++++++++++---
 drivers/kvm/svm.c      |    2 ++
 drivers/kvm/vmx.c      |    2 ++
 include/linux/kvm.h    |    6 +++---
 5 files changed, 53 insertions(+), 6 deletions(-)
 mode change 100755 => 100644 drivers/kvm/kvm_main.c

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 901b8d9..59cbc5b 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -274,6 +274,7 @@ struct kvm_vcpu {
 	int mmio_size;
 	unsigned char mmio_data[8];
 	gpa_t mmio_phys_addr;
+	int pio_pending;
 
 	struct {
 		int active;
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
old mode 100755
new mode 100644
index 42be8a8..ff8bcfe
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1504,6 +1504,44 @@ void save_msrs(struct vmx_msr_entry *e, int n)
 }
 EXPORT_SYMBOL_GPL(save_msrs);
 
+static void complete_pio(struct kvm_vcpu *vcpu)
+{
+	struct kvm_io *io = &vcpu->run->io;
+	long delta;
+
+	kvm_arch_ops->cache_regs(vcpu);
+
+	if (!io->string) {
+		if (io->direction == KVM_EXIT_IO_IN)
+			memcpy(&vcpu->regs[VCPU_REGS_RAX], &io->value,
+			       io->size);
+	} else {
+		delta = 1;
+		if (io->rep) {
+			delta *= io->count;
+			/*
+			 * The size of the register should really depend on
+			 * current address size.
+			 */
+			vcpu->regs[VCPU_REGS_RCX] -= delta;
+		}
+		if (io->string_down)
+			delta = -delta;
+		delta *= io->size;
+		if (io->direction == KVM_EXIT_IO_IN)
+			vcpu->regs[VCPU_REGS_RDI] += delta;
+		else
+			vcpu->regs[VCPU_REGS_RSI] += delta;
+	}
+
+	vcpu->pio_pending = 0;
+	vcpu->run->io_completed = 0;
+
+	kvm_arch_ops->decache_regs(vcpu);
+
+	kvm_arch_ops->skip_emulated_instruction(vcpu);
+}
+
 static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	int r;
@@ -1518,9 +1556,13 @@ static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		kvm_run->emulated = 0;
 	}
 
-	if (kvm_run->mmio_completed) {
-		memcpy(vcpu->mmio_data, kvm_run->mmio.data, 8);
-		vcpu->mmio_read_completed = 1;
+	if (kvm_run->io_completed) {
+		if (vcpu->pio_pending)
+			complete_pio(vcpu);
+		else {
+			memcpy(vcpu->mmio_data, kvm_run->mmio.data, 8);
+			vcpu->mmio_read_completed = 1;
+		}
 	}
 
 	vcpu->mmio_needed = 0;
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 6787f11..c35b8c8 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -1037,6 +1037,7 @@ static int io_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	kvm_run->io.size = ((io_info & SVM_IOIO_SIZE_MASK) >> SVM_IOIO_SIZE_SHIFT);
 	kvm_run->io.string = (io_info & SVM_IOIO_STR_MASK) != 0;
 	kvm_run->io.rep = (io_info & SVM_IOIO_REP_MASK) != 0;
+	kvm_run->io.count = 1;
 
 	if (kvm_run->io.string) {
 		unsigned addr_mask;
@@ -1056,6 +1057,7 @@ static int io_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		}
 	} else
 		kvm_run->io.value = vcpu->svm->vmcb->save.rax;
+	vcpu->pio_pending = 1;
 	return 0;
 }
 
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index a721b60..4d5f40f 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1459,12 +1459,14 @@ static int handle_io(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		= (vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_DF) != 0;
 	kvm_run->io.rep = (exit_qualification & 32) != 0;
 	kvm_run->io.port = exit_qualification >> 16;
+	kvm_run->io.count = 1;
 	if (kvm_run->io.string) {
 		if (!get_io_count(vcpu, &kvm_run->io.count))
 			return 1;
 		kvm_run->io.address = vmcs_readl(GUEST_LINEAR_ADDRESS);
 	} else
 		kvm_run->io.value = vcpu->regs[VCPU_REGS_RAX]; /* rax */
+	vcpu->pio_pending = 1;
 	return 0;
 }
 
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index d88e750..19aeb33 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -11,7 +11,7 @@
 #include <asm/types.h>
 #include <linux/ioctl.h>
 
-#define KVM_API_VERSION 5
+#define KVM_API_VERSION 6
 
 /*
  * Architectural interrupt line count, and the size of the bitmap needed
@@ -53,7 +53,7 @@ enum kvm_exit_reason {
 struct kvm_run {
 	/* in */
 	__u32 emulated;  /* skip current instruction */
-	__u32 mmio_completed; /* mmio request completed */
+	__u32 io_completed; /* mmio/pio request completed */
 	__u8 request_interrupt_window;
 	__u8 padding1[7];
 
@@ -80,7 +80,7 @@ struct kvm_run {
 			__u32 error_code;
 		} ex;
 		/* KVM_EXIT_IO */
-		struct {
+		struct kvm_io {
 #define KVM_EXIT_IO_IN  0
 #define KVM_EXIT_IO_OUT 1
 			__u8 direction;
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 07/41] KVM: Do not communicate to userspace through cpu registers during PIO
@ 2007-04-01 14:35               ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

Currently when passing the a PIO emulation request to userspace, we
rely on userspace updating %rax (on 'in' instructions) and %rsi/%rdi/%rcx
(on string instructions).  This (a) requires two extra ioctls for getting
and setting the registers and (b) is unfriendly to non-x86 archs, when
they get kvm ports.

So fix by doing the register fixups in the kernel and passing to userspace
only an abstract description of the PIO to be done.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h      |    1 +
 drivers/kvm/kvm_main.c |   48 +++++++++++++++++++++++++++++++++++++++++++++---
 drivers/kvm/svm.c      |    2 ++
 drivers/kvm/vmx.c      |    2 ++
 include/linux/kvm.h    |    6 +++---
 5 files changed, 53 insertions(+), 6 deletions(-)
 mode change 100755 => 100644 drivers/kvm/kvm_main.c

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 901b8d9..59cbc5b 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -274,6 +274,7 @@ struct kvm_vcpu {
 	int mmio_size;
 	unsigned char mmio_data[8];
 	gpa_t mmio_phys_addr;
+	int pio_pending;
 
 	struct {
 		int active;
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
old mode 100755
new mode 100644
index 42be8a8..ff8bcfe
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1504,6 +1504,44 @@ void save_msrs(struct vmx_msr_entry *e, int n)
 }
 EXPORT_SYMBOL_GPL(save_msrs);
 
+static void complete_pio(struct kvm_vcpu *vcpu)
+{
+	struct kvm_io *io = &vcpu->run->io;
+	long delta;
+
+	kvm_arch_ops->cache_regs(vcpu);
+
+	if (!io->string) {
+		if (io->direction == KVM_EXIT_IO_IN)
+			memcpy(&vcpu->regs[VCPU_REGS_RAX], &io->value,
+			       io->size);
+	} else {
+		delta = 1;
+		if (io->rep) {
+			delta *= io->count;
+			/*
+			 * The size of the register should really depend on
+			 * current address size.
+			 */
+			vcpu->regs[VCPU_REGS_RCX] -= delta;
+		}
+		if (io->string_down)
+			delta = -delta;
+		delta *= io->size;
+		if (io->direction == KVM_EXIT_IO_IN)
+			vcpu->regs[VCPU_REGS_RDI] += delta;
+		else
+			vcpu->regs[VCPU_REGS_RSI] += delta;
+	}
+
+	vcpu->pio_pending = 0;
+	vcpu->run->io_completed = 0;
+
+	kvm_arch_ops->decache_regs(vcpu);
+
+	kvm_arch_ops->skip_emulated_instruction(vcpu);
+}
+
 static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	int r;
@@ -1518,9 +1556,13 @@ static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		kvm_run->emulated = 0;
 	}
 
-	if (kvm_run->mmio_completed) {
-		memcpy(vcpu->mmio_data, kvm_run->mmio.data, 8);
-		vcpu->mmio_read_completed = 1;
+	if (kvm_run->io_completed) {
+		if (vcpu->pio_pending)
+			complete_pio(vcpu);
+		else {
+			memcpy(vcpu->mmio_data, kvm_run->mmio.data, 8);
+			vcpu->mmio_read_completed = 1;
+		}
 	}
 
 	vcpu->mmio_needed = 0;
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 6787f11..c35b8c8 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -1037,6 +1037,7 @@ static int io_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	kvm_run->io.size = ((io_info & SVM_IOIO_SIZE_MASK) >> SVM_IOIO_SIZE_SHIFT);
 	kvm_run->io.string = (io_info & SVM_IOIO_STR_MASK) != 0;
 	kvm_run->io.rep = (io_info & SVM_IOIO_REP_MASK) != 0;
+	kvm_run->io.count = 1;
 
 	if (kvm_run->io.string) {
 		unsigned addr_mask;
@@ -1056,6 +1057,7 @@ static int io_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		}
 	} else
 		kvm_run->io.value = vcpu->svm->vmcb->save.rax;
+	vcpu->pio_pending = 1;
 	return 0;
 }
 
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index a721b60..4d5f40f 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1459,12 +1459,14 @@ static int handle_io(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		= (vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_DF) != 0;
 	kvm_run->io.rep = (exit_qualification & 32) != 0;
 	kvm_run->io.port = exit_qualification >> 16;
+	kvm_run->io.count = 1;
 	if (kvm_run->io.string) {
 		if (!get_io_count(vcpu, &kvm_run->io.count))
 			return 1;
 		kvm_run->io.address = vmcs_readl(GUEST_LINEAR_ADDRESS);
 	} else
 		kvm_run->io.value = vcpu->regs[VCPU_REGS_RAX]; /* rax */
+	vcpu->pio_pending = 1;
 	return 0;
 }
 
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index d88e750..19aeb33 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -11,7 +11,7 @@
 #include <asm/types.h>
 #include <linux/ioctl.h>
 
-#define KVM_API_VERSION 5
+#define KVM_API_VERSION 6
 
 /*
  * Architectural interrupt line count, and the size of the bitmap needed
@@ -53,7 +53,7 @@ enum kvm_exit_reason {
 struct kvm_run {
 	/* in */
 	__u32 emulated;  /* skip current instruction */
-	__u32 mmio_completed; /* mmio request completed */
+	__u32 io_completed; /* mmio/pio request completed */
 	__u8 request_interrupt_window;
 	__u8 padding1[7];
 
@@ -80,7 +80,7 @@ struct kvm_run {
 			__u32 error_code;
 		} ex;
 		/* KVM_EXIT_IO */
-		struct {
+		struct kvm_io {
 #define KVM_EXIT_IO_IN  0
 #define KVM_EXIT_IO_OUT 1
 			__u8 direction;
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 08/41] KVM: Handle cpuid in the kernel instead of punting to userspace
@ 2007-04-01 14:35                 ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

KVM used to handle cpuid by letting userspace decide what values to
return to the guest.  We now handle cpuid completely in the kernel.  We
still let userspace decide which values the guest will see by having
userspace set up the value table beforehand (this is necessary to allow
management software to set the cpu features to the least common denominator,
so that live migration can work).

The motivation for the change is that kvm kernel code can be impacted by
cpuid features, for example the x86 emulator.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h      |    5 +++
 drivers/kvm/kvm_main.c |   69 ++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/kvm/svm.c      |    4 +-
 drivers/kvm/vmx.c      |    4 +-
 include/linux/kvm.h    |   18 ++++++++++++-
 5 files changed, 95 insertions(+), 5 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 59cbc5b..be3a0e7 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -55,6 +55,7 @@
 #define KVM_NUM_MMU_PAGES 256
 #define KVM_MIN_FREE_MMU_PAGES 5
 #define KVM_REFILL_PAGES 25
+#define KVM_MAX_CPUID_ENTRIES 40
 
 #define FX_IMAGE_SIZE 512
 #define FX_IMAGE_ALIGN 16
@@ -286,6 +287,9 @@ struct kvm_vcpu {
 			u32 ar;
 		} tr, es, ds, fs, gs;
 	} rmode;
+
+	int cpuid_nent;
+	struct kvm_cpuid_entry cpuid_entries[KVM_MAX_CPUID_ENTRIES];
 };
 
 struct kvm_memory_slot {
@@ -446,6 +450,7 @@ void realmode_set_cr(struct kvm_vcpu *vcpu, int cr, unsigned long value,
 
 struct x86_emulate_ctxt;
 
+void kvm_emulate_cpuid(struct kvm_vcpu *vcpu);
 int emulate_invlpg(struct kvm_vcpu *vcpu, gva_t address);
 int emulate_clts(struct kvm_vcpu *vcpu);
 int emulator_get_dr(struct x86_emulate_ctxt* ctxt, int dr,
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index ff8bcfe..caec54f 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1504,6 +1504,43 @@ void save_msrs(struct vmx_msr_entry *e, int n)
 }
 EXPORT_SYMBOL_GPL(save_msrs);
 
+void kvm_emulate_cpuid(struct kvm_vcpu *vcpu)
+{
+	int i;
+	u32 function;
+	struct kvm_cpuid_entry *e, *best;
+
+	kvm_arch_ops->cache_regs(vcpu);
+	function = vcpu->regs[VCPU_REGS_RAX];
+	vcpu->regs[VCPU_REGS_RAX] = 0;
+	vcpu->regs[VCPU_REGS_RBX] = 0;
+	vcpu->regs[VCPU_REGS_RCX] = 0;
+	vcpu->regs[VCPU_REGS_RDX] = 0;
+	best = NULL;
+	for (i = 0; i < vcpu->cpuid_nent; ++i) {
+		e = &vcpu->cpuid_entries[i];
+		if (e->function == function) {
+			best = e;
+			break;
+		}
+		/*
+		 * Both basic or both extended?
+		 */
+		if (((e->function ^ function) & 0x80000000) == 0)
+			if (!best || e->function > best->function)
+				best = e;
+	}
+	if (best) {
+		vcpu->regs[VCPU_REGS_RAX] = best->eax;
+		vcpu->regs[VCPU_REGS_RBX] = best->ebx;
+		vcpu->regs[VCPU_REGS_RCX] = best->ecx;
+		vcpu->regs[VCPU_REGS_RDX] = best->edx;
+	}
+	kvm_arch_ops->decache_regs(vcpu);
+	kvm_arch_ops->skip_emulated_instruction(vcpu);
+}
+EXPORT_SYMBOL_GPL(kvm_emulate_cpuid);
+
 static void complete_pio(struct kvm_vcpu *vcpu)
 {
 	struct kvm_io *io = &vcpu->run->io;
@@ -2075,6 +2112,26 @@ out:
 	return r;
 }
 
+static int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
+				    struct kvm_cpuid *cpuid,
+				    struct kvm_cpuid_entry __user *entries)
+{
+	int r;
+
+	r = -E2BIG;
+	if (cpuid->nent > KVM_MAX_CPUID_ENTRIES)
+		goto out;
+	r = -EFAULT;
+	if (copy_from_user(&vcpu->cpuid_entries, entries,
+			   cpuid->nent * sizeof(struct kvm_cpuid_entry)))
+		goto out;
+	vcpu->cpuid_nent = cpuid->nent;
+	return 0;
+
+out:
+	return r;
+}
+
 static long kvm_vcpu_ioctl(struct file *filp,
 			   unsigned int ioctl, unsigned long arg)
 {
@@ -2181,6 +2238,18 @@ static long kvm_vcpu_ioctl(struct file *filp,
 	case KVM_SET_MSRS:
 		r = msr_io(vcpu, argp, do_set_msr, 0);
 		break;
+	case KVM_SET_CPUID: {
+		struct kvm_cpuid __user *cpuid_arg = argp;
+		struct kvm_cpuid cpuid;
+
+		r = -EFAULT;
+		if (copy_from_user(&cpuid, cpuid_arg, sizeof cpuid))
+			goto out;
+		r = kvm_vcpu_ioctl_set_cpuid(vcpu, &cpuid, cpuid_arg->entries);
+		if (r)
+			goto out;
+		break;
+	}
 	default:
 		;
 	}
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index c35b8c8..d4b2936 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -1101,8 +1101,8 @@ static int task_switch_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_r
 static int cpuid_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	vcpu->svm->next_rip = vcpu->svm->vmcb->save.rip + 2;
-	kvm_run->exit_reason = KVM_EXIT_CPUID;
-	return 0;
+	kvm_emulate_cpuid(vcpu);
+	return 1;
 }
 
 static int emulate_on_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 4d5f40f..71410a6 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1585,8 +1585,8 @@ static int handle_dr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 static int handle_cpuid(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
-	kvm_run->exit_reason = KVM_EXIT_CPUID;
-	return 0;
+	kvm_emulate_cpuid(vcpu);
+	return 1;
 }
 
 static int handle_rdmsr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 19aeb33..15e23bc 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -41,7 +41,6 @@ enum kvm_exit_reason {
 	KVM_EXIT_UNKNOWN          = 0,
 	KVM_EXIT_EXCEPTION        = 1,
 	KVM_EXIT_IO               = 2,
-	KVM_EXIT_CPUID            = 3,
 	KVM_EXIT_DEBUG            = 4,
 	KVM_EXIT_HLT              = 5,
 	KVM_EXIT_MMIO             = 6,
@@ -210,6 +209,22 @@ struct kvm_dirty_log {
 	};
 };
 
+struct kvm_cpuid_entry {
+	__u32 function;
+	__u32 eax;
+	__u32 ebx;
+	__u32 ecx;
+	__u32 edx;
+	__u32 padding;
+};
+
+/* for KVM_SET_CPUID */
+struct kvm_cpuid {
+	__u32 nent;
+	__u32 padding;
+	struct kvm_cpuid_entry entries[0];
+};
+
 #define KVMIO 0xAE
 
 /*
@@ -243,5 +258,6 @@ struct kvm_dirty_log {
 #define KVM_DEBUG_GUEST           _IOW(KVMIO, 9, struct kvm_debug_guest)
 #define KVM_GET_MSRS              _IOWR(KVMIO, 13, struct kvm_msrs)
 #define KVM_SET_MSRS              _IOW(KVMIO, 14, struct kvm_msrs)
+#define KVM_SET_CPUID             _IOW(KVMIO, 17, struct kvm_cpuid)
 
 #endif
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 08/41] KVM: Handle cpuid in the kernel instead of punting to userspace
@ 2007-04-01 14:35                 ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

KVM used to handle cpuid by letting userspace decide what values to
return to the guest.  We now handle cpuid completely in the kernel.  We
still let userspace decide which values the guest will see by having
userspace set up the value table beforehand (this is necessary to allow
management software to set the cpu features to the least common denominator,
so that live migration can work).

The motivation for the change is that kvm kernel code can be impacted by
cpuid features, for example the x86 emulator.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h      |    5 +++
 drivers/kvm/kvm_main.c |   69 ++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/kvm/svm.c      |    4 +-
 drivers/kvm/vmx.c      |    4 +-
 include/linux/kvm.h    |   18 ++++++++++++-
 5 files changed, 95 insertions(+), 5 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 59cbc5b..be3a0e7 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -55,6 +55,7 @@
 #define KVM_NUM_MMU_PAGES 256
 #define KVM_MIN_FREE_MMU_PAGES 5
 #define KVM_REFILL_PAGES 25
+#define KVM_MAX_CPUID_ENTRIES 40
 
 #define FX_IMAGE_SIZE 512
 #define FX_IMAGE_ALIGN 16
@@ -286,6 +287,9 @@ struct kvm_vcpu {
 			u32 ar;
 		} tr, es, ds, fs, gs;
 	} rmode;
+
+	int cpuid_nent;
+	struct kvm_cpuid_entry cpuid_entries[KVM_MAX_CPUID_ENTRIES];
 };
 
 struct kvm_memory_slot {
@@ -446,6 +450,7 @@ void realmode_set_cr(struct kvm_vcpu *vcpu, int cr, unsigned long value,
 
 struct x86_emulate_ctxt;
 
+void kvm_emulate_cpuid(struct kvm_vcpu *vcpu);
 int emulate_invlpg(struct kvm_vcpu *vcpu, gva_t address);
 int emulate_clts(struct kvm_vcpu *vcpu);
 int emulator_get_dr(struct x86_emulate_ctxt* ctxt, int dr,
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index ff8bcfe..caec54f 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1504,6 +1504,43 @@ void save_msrs(struct vmx_msr_entry *e, int n)
 }
 EXPORT_SYMBOL_GPL(save_msrs);
 
+void kvm_emulate_cpuid(struct kvm_vcpu *vcpu)
+{
+	int i;
+	u32 function;
+	struct kvm_cpuid_entry *e, *best;
+
+	kvm_arch_ops->cache_regs(vcpu);
+	function = vcpu->regs[VCPU_REGS_RAX];
+	vcpu->regs[VCPU_REGS_RAX] = 0;
+	vcpu->regs[VCPU_REGS_RBX] = 0;
+	vcpu->regs[VCPU_REGS_RCX] = 0;
+	vcpu->regs[VCPU_REGS_RDX] = 0;
+	best = NULL;
+	for (i = 0; i < vcpu->cpuid_nent; ++i) {
+		e = &vcpu->cpuid_entries[i];
+		if (e->function == function) {
+			best = e;
+			break;
+		}
+		/*
+		 * Both basic or both extended?
+		 */
+		if (((e->function ^ function) & 0x80000000) == 0)
+			if (!best || e->function > best->function)
+				best = e;
+	}
+	if (best) {
+		vcpu->regs[VCPU_REGS_RAX] = best->eax;
+		vcpu->regs[VCPU_REGS_RBX] = best->ebx;
+		vcpu->regs[VCPU_REGS_RCX] = best->ecx;
+		vcpu->regs[VCPU_REGS_RDX] = best->edx;
+	}
+	kvm_arch_ops->decache_regs(vcpu);
+	kvm_arch_ops->skip_emulated_instruction(vcpu);
+}
+EXPORT_SYMBOL_GPL(kvm_emulate_cpuid);
+
 static void complete_pio(struct kvm_vcpu *vcpu)
 {
 	struct kvm_io *io = &vcpu->run->io;
@@ -2075,6 +2112,26 @@ out:
 	return r;
 }
 
+static int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
+				    struct kvm_cpuid *cpuid,
+				    struct kvm_cpuid_entry __user *entries)
+{
+	int r;
+
+	r = -E2BIG;
+	if (cpuid->nent > KVM_MAX_CPUID_ENTRIES)
+		goto out;
+	r = -EFAULT;
+	if (copy_from_user(&vcpu->cpuid_entries, entries,
+			   cpuid->nent * sizeof(struct kvm_cpuid_entry)))
+		goto out;
+	vcpu->cpuid_nent = cpuid->nent;
+	return 0;
+
+out:
+	return r;
+}
+
 static long kvm_vcpu_ioctl(struct file *filp,
 			   unsigned int ioctl, unsigned long arg)
 {
@@ -2181,6 +2238,18 @@ static long kvm_vcpu_ioctl(struct file *filp,
 	case KVM_SET_MSRS:
 		r = msr_io(vcpu, argp, do_set_msr, 0);
 		break;
+	case KVM_SET_CPUID: {
+		struct kvm_cpuid __user *cpuid_arg = argp;
+		struct kvm_cpuid cpuid;
+
+		r = -EFAULT;
+		if (copy_from_user(&cpuid, cpuid_arg, sizeof cpuid))
+			goto out;
+		r = kvm_vcpu_ioctl_set_cpuid(vcpu, &cpuid, cpuid_arg->entries);
+		if (r)
+			goto out;
+		break;
+	}
 	default:
 		;
 	}
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index c35b8c8..d4b2936 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -1101,8 +1101,8 @@ static int task_switch_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_r
 static int cpuid_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	vcpu->svm->next_rip = vcpu->svm->vmcb->save.rip + 2;
-	kvm_run->exit_reason = KVM_EXIT_CPUID;
-	return 0;
+	kvm_emulate_cpuid(vcpu);
+	return 1;
 }
 
 static int emulate_on_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 4d5f40f..71410a6 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1585,8 +1585,8 @@ static int handle_dr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 static int handle_cpuid(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
-	kvm_run->exit_reason = KVM_EXIT_CPUID;
-	return 0;
+	kvm_emulate_cpuid(vcpu);
+	return 1;
 }
 
 static int handle_rdmsr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 19aeb33..15e23bc 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -41,7 +41,6 @@ enum kvm_exit_reason {
 	KVM_EXIT_UNKNOWN          = 0,
 	KVM_EXIT_EXCEPTION        = 1,
 	KVM_EXIT_IO               = 2,
-	KVM_EXIT_CPUID            = 3,
 	KVM_EXIT_DEBUG            = 4,
 	KVM_EXIT_HLT              = 5,
 	KVM_EXIT_MMIO             = 6,
@@ -210,6 +209,22 @@ struct kvm_dirty_log {
 	};
 };
 
+struct kvm_cpuid_entry {
+	__u32 function;
+	__u32 eax;
+	__u32 ebx;
+	__u32 ecx;
+	__u32 edx;
+	__u32 padding;
+};
+
+/* for KVM_SET_CPUID */
+struct kvm_cpuid {
+	__u32 nent;
+	__u32 padding;
+	struct kvm_cpuid_entry entries[0];
+};
+
 #define KVMIO 0xAE
 
 /*
@@ -243,5 +258,6 @@ struct kvm_dirty_log {
 #define KVM_DEBUG_GUEST           _IOW(KVMIO, 9, struct kvm_debug_guest)
 #define KVM_GET_MSRS              _IOWR(KVMIO, 13, struct kvm_msrs)
 #define KVM_SET_MSRS              _IOW(KVMIO, 14, struct kvm_msrs)
+#define KVM_SET_CPUID             _IOW(KVMIO, 17, struct kvm_cpuid)
 
 #endif
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 09/41] KVM: Remove the 'emulated' field from the userspace interface
@ 2007-04-01 14:35                   ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

We no longer emulate single instructions in userspace.  Instead, we service
mmio or pio requests.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm_main.c |    5 -----
 include/linux/kvm.h    |    3 +--
 2 files changed, 1 insertions(+), 7 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index caec54f..5d24203 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1588,11 +1588,6 @@ static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	/* re-sync apic's tpr */
 	vcpu->cr8 = kvm_run->cr8;
 
-	if (kvm_run->emulated) {
-		kvm_arch_ops->skip_emulated_instruction(vcpu);
-		kvm_run->emulated = 0;
-	}
-
 	if (kvm_run->io_completed) {
 		if (vcpu->pio_pending)
 			complete_pio(vcpu);
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 15e23bc..c6dd4a7 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -51,10 +51,9 @@ enum kvm_exit_reason {
 /* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */
 struct kvm_run {
 	/* in */
-	__u32 emulated;  /* skip current instruction */
 	__u32 io_completed; /* mmio/pio request completed */
 	__u8 request_interrupt_window;
-	__u8 padding1[7];
+	__u8 padding1[3];
 
 	/* out */
 	__u32 exit_type;
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 09/41] KVM: Remove the 'emulated' field from the userspace interface
@ 2007-04-01 14:35                   ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

We no longer emulate single instructions in userspace.  Instead, we service
mmio or pio requests.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm_main.c |    5 -----
 include/linux/kvm.h    |    3 +--
 2 files changed, 1 insertions(+), 7 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index caec54f..5d24203 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1588,11 +1588,6 @@ static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	/* re-sync apic's tpr */
 	vcpu->cr8 = kvm_run->cr8;
 
-	if (kvm_run->emulated) {
-		kvm_arch_ops->skip_emulated_instruction(vcpu);
-		kvm_run->emulated = 0;
-	}
-
 	if (kvm_run->io_completed) {
 		if (vcpu->pio_pending)
 			complete_pio(vcpu);
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 15e23bc..c6dd4a7 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -51,10 +51,9 @@ enum kvm_exit_reason {
 /* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */
 struct kvm_run {
 	/* in */
-	__u32 emulated;  /* skip current instruction */
 	__u32 io_completed; /* mmio/pio request completed */
 	__u8 request_interrupt_window;
-	__u8 padding1[7];
+	__u8 padding1[3];
 
 	/* out */
 	__u32 exit_type;
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 10/41] KVM: Remove minor wart from KVM_CREATE_VCPU ioctl
@ 2007-04-01 14:35                     ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

That ioctl does not transfer any data, so it should be an _IO rather than an
_IOW.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 include/linux/kvm.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index c6dd4a7..d89189a 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -241,7 +241,7 @@ struct kvm_cpuid {
  * KVM_CREATE_VCPU receives as a parameter the vcpu slot, and returns
  * a vcpu fd.
  */
-#define KVM_CREATE_VCPU           _IOW(KVMIO, 11, int)
+#define KVM_CREATE_VCPU           _IO(KVMIO, 11)
 #define KVM_GET_DIRTY_LOG         _IOW(KVMIO, 12, struct kvm_dirty_log)
 
 /*
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 10/41] KVM: Remove minor wart from KVM_CREATE_VCPU ioctl
@ 2007-04-01 14:35                     ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

That ioctl does not transfer any data, so it should be an _IO rather than an
_IOW.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 include/linux/kvm.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index c6dd4a7..d89189a 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -241,7 +241,7 @@ struct kvm_cpuid {
  * KVM_CREATE_VCPU receives as a parameter the vcpu slot, and returns
  * a vcpu fd.
  */
-#define KVM_CREATE_VCPU           _IOW(KVMIO, 11, int)
+#define KVM_CREATE_VCPU           _IO(KVMIO, 11)
 #define KVM_GET_DIRTY_LOG         _IOW(KVMIO, 12, struct kvm_dirty_log)
 
 /*
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 11/41] KVM: Renumber ioctls
@ 2007-04-01 14:35                       ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

The recent changes have left the ioctl numbers in complete disarray.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 include/linux/kvm.h |   34 +++++++++++++++++-----------------
 1 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index d89189a..93472da 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -229,34 +229,34 @@ struct kvm_cpuid {
 /*
  * ioctls for /dev/kvm fds:
  */
-#define KVM_GET_API_VERSION       _IO(KVMIO, 1)
-#define KVM_CREATE_VM             _IO(KVMIO, 2) /* returns a VM fd */
-#define KVM_GET_MSR_INDEX_LIST    _IOWR(KVMIO, 15, struct kvm_msr_list)
+#define KVM_GET_API_VERSION       _IO(KVMIO,   0x00)
+#define KVM_CREATE_VM             _IO(KVMIO,   0x01) /* returns a VM fd */
+#define KVM_GET_MSR_INDEX_LIST    _IOWR(KVMIO, 0x02, struct kvm_msr_list)
 
 /*
  * ioctls for VM fds
  */
-#define KVM_SET_MEMORY_REGION     _IOW(KVMIO, 10, struct kvm_memory_region)
+#define KVM_SET_MEMORY_REGION     _IOW(KVMIO, 0x40, struct kvm_memory_region)
 /*
  * KVM_CREATE_VCPU receives as a parameter the vcpu slot, and returns
  * a vcpu fd.
  */
-#define KVM_CREATE_VCPU           _IO(KVMIO, 11)
-#define KVM_GET_DIRTY_LOG         _IOW(KVMIO, 12, struct kvm_dirty_log)
+#define KVM_CREATE_VCPU           _IO(KVMIO,  0x41)
+#define KVM_GET_DIRTY_LOG         _IOW(KVMIO, 0x42, struct kvm_dirty_log)
 
 /*
  * ioctls for vcpu fds
  */
-#define KVM_RUN                   _IO(KVMIO, 16)
-#define KVM_GET_REGS              _IOR(KVMIO, 3, struct kvm_regs)
-#define KVM_SET_REGS              _IOW(KVMIO, 4, struct kvm_regs)
-#define KVM_GET_SREGS             _IOR(KVMIO, 5, struct kvm_sregs)
-#define KVM_SET_SREGS             _IOW(KVMIO, 6, struct kvm_sregs)
-#define KVM_TRANSLATE             _IOWR(KVMIO, 7, struct kvm_translation)
-#define KVM_INTERRUPT             _IOW(KVMIO, 8, struct kvm_interrupt)
-#define KVM_DEBUG_GUEST           _IOW(KVMIO, 9, struct kvm_debug_guest)
-#define KVM_GET_MSRS              _IOWR(KVMIO, 13, struct kvm_msrs)
-#define KVM_SET_MSRS              _IOW(KVMIO, 14, struct kvm_msrs)
-#define KVM_SET_CPUID             _IOW(KVMIO, 17, struct kvm_cpuid)
+#define KVM_RUN                   _IO(KVMIO,   0x80)
+#define KVM_GET_REGS              _IOR(KVMIO,  0x81, struct kvm_regs)
+#define KVM_SET_REGS              _IOW(KVMIO,  0x82, struct kvm_regs)
+#define KVM_GET_SREGS             _IOR(KVMIO,  0x83, struct kvm_sregs)
+#define KVM_SET_SREGS             _IOW(KVMIO,  0x84, struct kvm_sregs)
+#define KVM_TRANSLATE             _IOWR(KVMIO, 0x85, struct kvm_translation)
+#define KVM_INTERRUPT             _IOW(KVMIO,  0x86, struct kvm_interrupt)
+#define KVM_DEBUG_GUEST           _IOW(KVMIO,  0x87, struct kvm_debug_guest)
+#define KVM_GET_MSRS              _IOWR(KVMIO, 0x88, struct kvm_msrs)
+#define KVM_SET_MSRS              _IOW(KVMIO,  0x89, struct kvm_msrs)
+#define KVM_SET_CPUID             _IOW(KVMIO,  0x8a, struct kvm_cpuid)
 
 #endif
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 11/41] KVM: Renumber ioctls
@ 2007-04-01 14:35                       ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

The recent changes have left the ioctl numbers in complete disarray.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 include/linux/kvm.h |   34 +++++++++++++++++-----------------
 1 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index d89189a..93472da 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -229,34 +229,34 @@ struct kvm_cpuid {
 /*
  * ioctls for /dev/kvm fds:
  */
-#define KVM_GET_API_VERSION       _IO(KVMIO, 1)
-#define KVM_CREATE_VM             _IO(KVMIO, 2) /* returns a VM fd */
-#define KVM_GET_MSR_INDEX_LIST    _IOWR(KVMIO, 15, struct kvm_msr_list)
+#define KVM_GET_API_VERSION       _IO(KVMIO,   0x00)
+#define KVM_CREATE_VM             _IO(KVMIO,   0x01) /* returns a VM fd */
+#define KVM_GET_MSR_INDEX_LIST    _IOWR(KVMIO, 0x02, struct kvm_msr_list)
 
 /*
  * ioctls for VM fds
  */
-#define KVM_SET_MEMORY_REGION     _IOW(KVMIO, 10, struct kvm_memory_region)
+#define KVM_SET_MEMORY_REGION     _IOW(KVMIO, 0x40, struct kvm_memory_region)
 /*
  * KVM_CREATE_VCPU receives as a parameter the vcpu slot, and returns
  * a vcpu fd.
  */
-#define KVM_CREATE_VCPU           _IO(KVMIO, 11)
-#define KVM_GET_DIRTY_LOG         _IOW(KVMIO, 12, struct kvm_dirty_log)
+#define KVM_CREATE_VCPU           _IO(KVMIO,  0x41)
+#define KVM_GET_DIRTY_LOG         _IOW(KVMIO, 0x42, struct kvm_dirty_log)
 
 /*
  * ioctls for vcpu fds
  */
-#define KVM_RUN                   _IO(KVMIO, 16)
-#define KVM_GET_REGS              _IOR(KVMIO, 3, struct kvm_regs)
-#define KVM_SET_REGS              _IOW(KVMIO, 4, struct kvm_regs)
-#define KVM_GET_SREGS             _IOR(KVMIO, 5, struct kvm_sregs)
-#define KVM_SET_SREGS             _IOW(KVMIO, 6, struct kvm_sregs)
-#define KVM_TRANSLATE             _IOWR(KVMIO, 7, struct kvm_translation)
-#define KVM_INTERRUPT             _IOW(KVMIO, 8, struct kvm_interrupt)
-#define KVM_DEBUG_GUEST           _IOW(KVMIO, 9, struct kvm_debug_guest)
-#define KVM_GET_MSRS              _IOWR(KVMIO, 13, struct kvm_msrs)
-#define KVM_SET_MSRS              _IOW(KVMIO, 14, struct kvm_msrs)
-#define KVM_SET_CPUID             _IOW(KVMIO, 17, struct kvm_cpuid)
+#define KVM_RUN                   _IO(KVMIO,   0x80)
+#define KVM_GET_REGS              _IOR(KVMIO,  0x81, struct kvm_regs)
+#define KVM_SET_REGS              _IOW(KVMIO,  0x82, struct kvm_regs)
+#define KVM_GET_SREGS             _IOR(KVMIO,  0x83, struct kvm_sregs)
+#define KVM_SET_SREGS             _IOW(KVMIO,  0x84, struct kvm_sregs)
+#define KVM_TRANSLATE             _IOWR(KVMIO, 0x85, struct kvm_translation)
+#define KVM_INTERRUPT             _IOW(KVMIO,  0x86, struct kvm_interrupt)
+#define KVM_DEBUG_GUEST           _IOW(KVMIO,  0x87, struct kvm_debug_guest)
+#define KVM_GET_MSRS              _IOWR(KVMIO, 0x88, struct kvm_msrs)
+#define KVM_SET_MSRS              _IOW(KVMIO,  0x89, struct kvm_msrs)
+#define KVM_SET_CPUID             _IOW(KVMIO,  0x8a, struct kvm_cpuid)
 
 #endif
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 12/41] KVM: Add method to check for backwards-compatible API extensions
@ 2007-04-01 14:35                         ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm_main.c |    6 ++++++
 include/linux/kvm.h    |    5 +++++
 2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 5d24203..39cf8fd 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -2416,6 +2416,12 @@ static long kvm_dev_ioctl(struct file *filp,
 		r = 0;
 		break;
 	}
+	case KVM_CHECK_EXTENSION:
+		/*
+		 * No extensions defined at present.
+		 */
+		r = 0;
+		break;
 	default:
 		;
 	}
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 93472da..c93cf53 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -232,6 +232,11 @@ struct kvm_cpuid {
 #define KVM_GET_API_VERSION       _IO(KVMIO,   0x00)
 #define KVM_CREATE_VM             _IO(KVMIO,   0x01) /* returns a VM fd */
 #define KVM_GET_MSR_INDEX_LIST    _IOWR(KVMIO, 0x02, struct kvm_msr_list)
+/*
+ * Check if a kvm extension is available.  Argument is extension number,
+ * return is 1 (yes) or 0 (no, sorry).
+ */
+#define KVM_CHECK_EXTENSION       _IO(KVMIO,   0x03)
 
 /*
  * ioctls for VM fds
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 12/41] KVM: Add method to check for backwards-compatible API extensions
@ 2007-04-01 14:35                         ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm_main.c |    6 ++++++
 include/linux/kvm.h    |    5 +++++
 2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 5d24203..39cf8fd 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -2416,6 +2416,12 @@ static long kvm_dev_ioctl(struct file *filp,
 		r = 0;
 		break;
 	}
+	case KVM_CHECK_EXTENSION:
+		/*
+		 * No extensions defined at present.
+		 */
+		r = 0;
+		break;
 	default:
 		;
 	}
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 93472da..c93cf53 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -232,6 +232,11 @@ struct kvm_cpuid {
 #define KVM_GET_API_VERSION       _IO(KVMIO,   0x00)
 #define KVM_CREATE_VM             _IO(KVMIO,   0x01) /* returns a VM fd */
 #define KVM_GET_MSR_INDEX_LIST    _IOWR(KVMIO, 0x02, struct kvm_msr_list)
+/*
+ * Check if a kvm extension is available.  Argument is extension number,
+ * return is 1 (yes) or 0 (no, sorry).
+ */
+#define KVM_CHECK_EXTENSION       _IO(KVMIO,   0x03)
 
 /*
  * ioctls for VM fds
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 13/41] KVM: Allow userspace to process hypercalls which have no kernel handler
@ 2007-04-01 14:35                           ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

This is useful for paravirtualized graphics devices, for example.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm_main.c |   18 +++++++++++++++++-
 include/linux/kvm.h    |   10 +++++++++-
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 39cf8fd..de93117 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1203,7 +1203,16 @@ int kvm_hypercall(struct kvm_vcpu *vcpu, struct kvm_run *run)
 	}
 	switch (nr) {
 	default:
-		;
+		run->hypercall.args[0] = a0;
+		run->hypercall.args[1] = a1;
+		run->hypercall.args[2] = a2;
+		run->hypercall.args[3] = a3;
+		run->hypercall.args[4] = a4;
+		run->hypercall.args[5] = a5;
+		run->hypercall.ret = ret;
+		run->hypercall.longmode = is_long_mode(vcpu);
+		kvm_arch_ops->decache_regs(vcpu);
+		return 0;
 	}
 	vcpu->regs[VCPU_REGS_RAX] = ret;
 	kvm_arch_ops->decache_regs(vcpu);
@@ -1599,6 +1608,13 @@ static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 	vcpu->mmio_needed = 0;
 
+	if (kvm_run->exit_type == KVM_EXIT_TYPE_VM_EXIT
+	    && kvm_run->exit_reason == KVM_EXIT_HYPERCALL) {
+		kvm_arch_ops->cache_regs(vcpu);
+		vcpu->regs[VCPU_REGS_RAX] = kvm_run->hypercall.ret;
+		kvm_arch_ops->decache_regs(vcpu);
+	}
+
 	r = kvm_arch_ops->run(vcpu, kvm_run);
 
 	vcpu_put(vcpu);
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index c93cf53..9151ebf 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -11,7 +11,7 @@
 #include <asm/types.h>
 #include <linux/ioctl.h>
 
-#define KVM_API_VERSION 6
+#define KVM_API_VERSION 7
 
 /*
  * Architectural interrupt line count, and the size of the bitmap needed
@@ -41,6 +41,7 @@ enum kvm_exit_reason {
 	KVM_EXIT_UNKNOWN          = 0,
 	KVM_EXIT_EXCEPTION        = 1,
 	KVM_EXIT_IO               = 2,
+	KVM_EXIT_HYPERCALL        = 3,
 	KVM_EXIT_DEBUG            = 4,
 	KVM_EXIT_HLT              = 5,
 	KVM_EXIT_MMIO             = 6,
@@ -103,6 +104,13 @@ struct kvm_run {
 			__u32 len;
 			__u8  is_write;
 		} mmio;
+		/* KVM_EXIT_HYPERCALL */
+		struct {
+			__u64 args[6];
+			__u64 ret;
+			__u32 longmode;
+			__u32 pad;
+		} hypercall;
 	};
 };
 
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 13/41] KVM: Allow userspace to process hypercalls which have no kernel handler
@ 2007-04-01 14:35                           ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

This is useful for paravirtualized graphics devices, for example.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm_main.c |   18 +++++++++++++++++-
 include/linux/kvm.h    |   10 +++++++++-
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 39cf8fd..de93117 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1203,7 +1203,16 @@ int kvm_hypercall(struct kvm_vcpu *vcpu, struct kvm_run *run)
 	}
 	switch (nr) {
 	default:
-		;
+		run->hypercall.args[0] = a0;
+		run->hypercall.args[1] = a1;
+		run->hypercall.args[2] = a2;
+		run->hypercall.args[3] = a3;
+		run->hypercall.args[4] = a4;
+		run->hypercall.args[5] = a5;
+		run->hypercall.ret = ret;
+		run->hypercall.longmode = is_long_mode(vcpu);
+		kvm_arch_ops->decache_regs(vcpu);
+		return 0;
 	}
 	vcpu->regs[VCPU_REGS_RAX] = ret;
 	kvm_arch_ops->decache_regs(vcpu);
@@ -1599,6 +1608,13 @@ static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 	vcpu->mmio_needed = 0;
 
+	if (kvm_run->exit_type == KVM_EXIT_TYPE_VM_EXIT
+	    && kvm_run->exit_reason == KVM_EXIT_HYPERCALL) {
+		kvm_arch_ops->cache_regs(vcpu);
+		vcpu->regs[VCPU_REGS_RAX] = kvm_run->hypercall.ret;
+		kvm_arch_ops->decache_regs(vcpu);
+	}
+
 	r = kvm_arch_ops->run(vcpu, kvm_run);
 
 	vcpu_put(vcpu);
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index c93cf53..9151ebf 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -11,7 +11,7 @@
 #include <asm/types.h>
 #include <linux/ioctl.h>
 
-#define KVM_API_VERSION 6
+#define KVM_API_VERSION 7
 
 /*
  * Architectural interrupt line count, and the size of the bitmap needed
@@ -41,6 +41,7 @@ enum kvm_exit_reason {
 	KVM_EXIT_UNKNOWN          = 0,
 	KVM_EXIT_EXCEPTION        = 1,
 	KVM_EXIT_IO               = 2,
+	KVM_EXIT_HYPERCALL        = 3,
 	KVM_EXIT_DEBUG            = 4,
 	KVM_EXIT_HLT              = 5,
 	KVM_EXIT_MMIO             = 6,
@@ -103,6 +104,13 @@ struct kvm_run {
 			__u32 len;
 			__u8  is_write;
 		} mmio;
+		/* KVM_EXIT_HYPERCALL */
+		struct {
+			__u64 args[6];
+			__u64 ret;
+			__u32 longmode;
+			__u32 pad;
+		} hypercall;
 	};
 };
 
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 14/41] KVM: Fold kvm_run::exit_type into kvm_run::exit_reason
@ 2007-04-01 14:35                             ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Currently, userspace is told about the nature of the last exit from the
guest using two fields, exit_type and exit_reason, where exit_type has
just two enumerations (and no need for more).  So fold exit_type into
exit_reason, reducing the complexity of determining what really happened.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm_main.c |    3 +--
 drivers/kvm/svm.c      |    7 +++----
 drivers/kvm/vmx.c      |    7 +++----
 include/linux/kvm.h    |   15 ++++++++-------
 4 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index de93117..ac44df5 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1608,8 +1608,7 @@ static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 	vcpu->mmio_needed = 0;
 
-	if (kvm_run->exit_type == KVM_EXIT_TYPE_VM_EXIT
-	    && kvm_run->exit_reason == KVM_EXIT_HYPERCALL) {
+	if (kvm_run->exit_reason == KVM_EXIT_HYPERCALL) {
 		kvm_arch_ops->cache_regs(vcpu);
 		vcpu->regs[VCPU_REGS_RAX] = kvm_run->hypercall.ret;
 		kvm_arch_ops->decache_regs(vcpu);
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index d4b2936..b09928f 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -1298,8 +1298,6 @@ static int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	u32 exit_code = vcpu->svm->vmcb->control.exit_code;
 
-	kvm_run->exit_type = KVM_EXIT_TYPE_VM_EXIT;
-
 	if (is_external_interrupt(vcpu->svm->vmcb->control.exit_int_info) &&
 	    exit_code != SVM_EXIT_EXCP_BASE + PF_VECTOR)
 		printk(KERN_ERR "%s: unexpected exit_ini_info 0x%x "
@@ -1609,8 +1607,9 @@ again:
 	vcpu->svm->next_rip = 0;
 
 	if (vcpu->svm->vmcb->control.exit_code == SVM_EXIT_ERR) {
-		kvm_run->exit_type = KVM_EXIT_TYPE_FAIL_ENTRY;
-		kvm_run->exit_reason = vcpu->svm->vmcb->control.exit_code;
+		kvm_run->exit_reason = KVM_EXIT_FAIL_ENTRY;
+		kvm_run->fail_entry.hardware_entry_failure_reason
+			= vcpu->svm->vmcb->control.exit_code;
 		post_kvm_run_save(vcpu, kvm_run);
 		return 0;
 	}
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 71410a6..cf9568f 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1922,10 +1922,10 @@ again:
 
 	asm ("mov %0, %%ds; mov %0, %%es" : : "r"(__USER_DS));
 
-	kvm_run->exit_type = 0;
 	if (fail) {
-		kvm_run->exit_type = KVM_EXIT_TYPE_FAIL_ENTRY;
-		kvm_run->exit_reason = vmcs_read32(VM_INSTRUCTION_ERROR);
+		kvm_run->exit_reason = KVM_EXIT_FAIL_ENTRY;
+		kvm_run->fail_entry.hardware_entry_failure_reason
+			= vmcs_read32(VM_INSTRUCTION_ERROR);
 		r = 0;
 	} else {
 		/*
@@ -1935,7 +1935,6 @@ again:
 			profile_hit(KVM_PROFILING, (void *)vmcs_readl(GUEST_RIP));
 
 		vcpu->launched = 1;
-		kvm_run->exit_type = KVM_EXIT_TYPE_VM_EXIT;
 		r = kvm_handle_exit(kvm_run, vcpu);
 		if (r > 0) {
 			/* Give scheduler a change to reschedule. */
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 9151ebf..57f47ef 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -11,7 +11,7 @@
 #include <asm/types.h>
 #include <linux/ioctl.h>
 
-#define KVM_API_VERSION 7
+#define KVM_API_VERSION 8
 
 /*
  * Architectural interrupt line count, and the size of the bitmap needed
@@ -34,9 +34,6 @@ struct kvm_memory_region {
 #define KVM_MEM_LOG_DIRTY_PAGES  1UL
 
 
-#define KVM_EXIT_TYPE_FAIL_ENTRY 1
-#define KVM_EXIT_TYPE_VM_EXIT    2
-
 enum kvm_exit_reason {
 	KVM_EXIT_UNKNOWN          = 0,
 	KVM_EXIT_EXCEPTION        = 1,
@@ -47,6 +44,7 @@ enum kvm_exit_reason {
 	KVM_EXIT_MMIO             = 6,
 	KVM_EXIT_IRQ_WINDOW_OPEN  = 7,
 	KVM_EXIT_SHUTDOWN         = 8,
+	KVM_EXIT_FAIL_ENTRY       = 9,
 };
 
 /* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */
@@ -57,12 +55,11 @@ struct kvm_run {
 	__u8 padding1[3];
 
 	/* out */
-	__u32 exit_type;
 	__u32 exit_reason;
 	__u32 instruction_length;
 	__u8 ready_for_interrupt_injection;
 	__u8 if_flag;
-	__u16 padding2;
+	__u8 padding2[6];
 
 	/* in (pre_kvm_run), out (post_kvm_run) */
 	__u64 cr8;
@@ -71,8 +68,12 @@ struct kvm_run {
 	union {
 		/* KVM_EXIT_UNKNOWN */
 		struct {
-			__u32 hardware_exit_reason;
+			__u64 hardware_exit_reason;
 		} hw;
+		/* KVM_EXIT_FAIL_ENTRY */
+		struct {
+			__u64 hardware_entry_failure_reason;
+		} fail_entry;
 		/* KVM_EXIT_EXCEPTION */
 		struct {
 			__u32 exception;
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 14/41] KVM: Fold kvm_run::exit_type into kvm_run::exit_reason
@ 2007-04-01 14:35                             ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

Currently, userspace is told about the nature of the last exit from the
guest using two fields, exit_type and exit_reason, where exit_type has
just two enumerations (and no need for more).  So fold exit_type into
exit_reason, reducing the complexity of determining what really happened.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm_main.c |    3 +--
 drivers/kvm/svm.c      |    7 +++----
 drivers/kvm/vmx.c      |    7 +++----
 include/linux/kvm.h    |   15 ++++++++-------
 4 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index de93117..ac44df5 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1608,8 +1608,7 @@ static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 	vcpu->mmio_needed = 0;
 
-	if (kvm_run->exit_type == KVM_EXIT_TYPE_VM_EXIT
-	    && kvm_run->exit_reason == KVM_EXIT_HYPERCALL) {
+	if (kvm_run->exit_reason == KVM_EXIT_HYPERCALL) {
 		kvm_arch_ops->cache_regs(vcpu);
 		vcpu->regs[VCPU_REGS_RAX] = kvm_run->hypercall.ret;
 		kvm_arch_ops->decache_regs(vcpu);
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index d4b2936..b09928f 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -1298,8 +1298,6 @@ static int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	u32 exit_code = vcpu->svm->vmcb->control.exit_code;
 
-	kvm_run->exit_type = KVM_EXIT_TYPE_VM_EXIT;
-
 	if (is_external_interrupt(vcpu->svm->vmcb->control.exit_int_info) &&
 	    exit_code != SVM_EXIT_EXCP_BASE + PF_VECTOR)
 		printk(KERN_ERR "%s: unexpected exit_ini_info 0x%x "
@@ -1609,8 +1607,9 @@ again:
 	vcpu->svm->next_rip = 0;
 
 	if (vcpu->svm->vmcb->control.exit_code == SVM_EXIT_ERR) {
-		kvm_run->exit_type = KVM_EXIT_TYPE_FAIL_ENTRY;
-		kvm_run->exit_reason = vcpu->svm->vmcb->control.exit_code;
+		kvm_run->exit_reason = KVM_EXIT_FAIL_ENTRY;
+		kvm_run->fail_entry.hardware_entry_failure_reason
+			= vcpu->svm->vmcb->control.exit_code;
 		post_kvm_run_save(vcpu, kvm_run);
 		return 0;
 	}
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 71410a6..cf9568f 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1922,10 +1922,10 @@ again:
 
 	asm ("mov %0, %%ds; mov %0, %%es" : : "r"(__USER_DS));
 
-	kvm_run->exit_type = 0;
 	if (fail) {
-		kvm_run->exit_type = KVM_EXIT_TYPE_FAIL_ENTRY;
-		kvm_run->exit_reason = vmcs_read32(VM_INSTRUCTION_ERROR);
+		kvm_run->exit_reason = KVM_EXIT_FAIL_ENTRY;
+		kvm_run->fail_entry.hardware_entry_failure_reason
+			= vmcs_read32(VM_INSTRUCTION_ERROR);
 		r = 0;
 	} else {
 		/*
@@ -1935,7 +1935,6 @@ again:
 			profile_hit(KVM_PROFILING, (void *)vmcs_readl(GUEST_RIP));
 
 		vcpu->launched = 1;
-		kvm_run->exit_type = KVM_EXIT_TYPE_VM_EXIT;
 		r = kvm_handle_exit(kvm_run, vcpu);
 		if (r > 0) {
 			/* Give scheduler a change to reschedule. */
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 9151ebf..57f47ef 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -11,7 +11,7 @@
 #include <asm/types.h>
 #include <linux/ioctl.h>
 
-#define KVM_API_VERSION 7
+#define KVM_API_VERSION 8
 
 /*
  * Architectural interrupt line count, and the size of the bitmap needed
@@ -34,9 +34,6 @@ struct kvm_memory_region {
 #define KVM_MEM_LOG_DIRTY_PAGES  1UL
 
 
-#define KVM_EXIT_TYPE_FAIL_ENTRY 1
-#define KVM_EXIT_TYPE_VM_EXIT    2
-
 enum kvm_exit_reason {
 	KVM_EXIT_UNKNOWN          = 0,
 	KVM_EXIT_EXCEPTION        = 1,
@@ -47,6 +44,7 @@ enum kvm_exit_reason {
 	KVM_EXIT_MMIO             = 6,
 	KVM_EXIT_IRQ_WINDOW_OPEN  = 7,
 	KVM_EXIT_SHUTDOWN         = 8,
+	KVM_EXIT_FAIL_ENTRY       = 9,
 };
 
 /* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */
@@ -57,12 +55,11 @@ struct kvm_run {
 	__u8 padding1[3];
 
 	/* out */
-	__u32 exit_type;
 	__u32 exit_reason;
 	__u32 instruction_length;
 	__u8 ready_for_interrupt_injection;
 	__u8 if_flag;
-	__u16 padding2;
+	__u8 padding2[6];
 
 	/* in (pre_kvm_run), out (post_kvm_run) */
 	__u64 cr8;
@@ -71,8 +68,12 @@ struct kvm_run {
 	union {
 		/* KVM_EXIT_UNKNOWN */
 		struct {
-			__u32 hardware_exit_reason;
+			__u64 hardware_exit_reason;
 		} hw;
+		/* KVM_EXIT_FAIL_ENTRY */
+		struct {
+			__u64 hardware_entry_failure_reason;
+		} fail_entry;
 		/* KVM_EXIT_EXCEPTION */
 		struct {
 			__u32 exception;
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 15/41] KVM: Add a special exit reason when exiting due to an interrupt
@ 2007-04-01 14:35                               ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

This is redundant, as we also return -EINTR from the ioctl, but it
allows us to examine the exit_reason field on resume without seeing
old data.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/svm.c   |    2 ++
 drivers/kvm/vmx.c   |    2 ++
 include/linux/kvm.h |    3 ++-
 3 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index b09928f..0311665 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -1619,12 +1619,14 @@ again:
 		if (signal_pending(current)) {
 			++kvm_stat.signal_exits;
 			post_kvm_run_save(vcpu, kvm_run);
+			kvm_run->exit_reason = KVM_EXIT_INTR;
 			return -EINTR;
 		}
 
 		if (dm_request_for_irq_injection(vcpu, kvm_run)) {
 			++kvm_stat.request_irq_exits;
 			post_kvm_run_save(vcpu, kvm_run);
+			kvm_run->exit_reason = KVM_EXIT_INTR;
 			return -EINTR;
 		}
 		kvm_resched(vcpu);
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index cf9568f..e69bab6 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1941,12 +1941,14 @@ again:
 			if (signal_pending(current)) {
 				++kvm_stat.signal_exits;
 				post_kvm_run_save(vcpu, kvm_run);
+				kvm_run->exit_reason = KVM_EXIT_INTR;
 				return -EINTR;
 			}
 
 			if (dm_request_for_irq_injection(vcpu, kvm_run)) {
 				++kvm_stat.request_irq_exits;
 				post_kvm_run_save(vcpu, kvm_run);
+				kvm_run->exit_reason = KVM_EXIT_INTR;
 				return -EINTR;
 			}
 
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 57f47ef..b3af92e 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -11,7 +11,7 @@
 #include <asm/types.h>
 #include <linux/ioctl.h>
 
-#define KVM_API_VERSION 8
+#define KVM_API_VERSION 9
 
 /*
  * Architectural interrupt line count, and the size of the bitmap needed
@@ -45,6 +45,7 @@ enum kvm_exit_reason {
 	KVM_EXIT_IRQ_WINDOW_OPEN  = 7,
 	KVM_EXIT_SHUTDOWN         = 8,
 	KVM_EXIT_FAIL_ENTRY       = 9,
+	KVM_EXIT_INTR             = 10,
 };
 
 /* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 15/41] KVM: Add a special exit reason when exiting due to an interrupt
@ 2007-04-01 14:35                               ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

This is redundant, as we also return -EINTR from the ioctl, but it
allows us to examine the exit_reason field on resume without seeing
old data.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/svm.c   |    2 ++
 drivers/kvm/vmx.c   |    2 ++
 include/linux/kvm.h |    3 ++-
 3 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index b09928f..0311665 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -1619,12 +1619,14 @@ again:
 		if (signal_pending(current)) {
 			++kvm_stat.signal_exits;
 			post_kvm_run_save(vcpu, kvm_run);
+			kvm_run->exit_reason = KVM_EXIT_INTR;
 			return -EINTR;
 		}
 
 		if (dm_request_for_irq_injection(vcpu, kvm_run)) {
 			++kvm_stat.request_irq_exits;
 			post_kvm_run_save(vcpu, kvm_run);
+			kvm_run->exit_reason = KVM_EXIT_INTR;
 			return -EINTR;
 		}
 		kvm_resched(vcpu);
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index cf9568f..e69bab6 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1941,12 +1941,14 @@ again:
 			if (signal_pending(current)) {
 				++kvm_stat.signal_exits;
 				post_kvm_run_save(vcpu, kvm_run);
+				kvm_run->exit_reason = KVM_EXIT_INTR;
 				return -EINTR;
 			}
 
 			if (dm_request_for_irq_injection(vcpu, kvm_run)) {
 				++kvm_stat.request_irq_exits;
 				post_kvm_run_save(vcpu, kvm_run);
+				kvm_run->exit_reason = KVM_EXIT_INTR;
 				return -EINTR;
 			}
 
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 57f47ef..b3af92e 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -11,7 +11,7 @@
 #include <asm/types.h>
 #include <linux/ioctl.h>
 
-#define KVM_API_VERSION 8
+#define KVM_API_VERSION 9
 
 /*
  * Architectural interrupt line count, and the size of the bitmap needed
@@ -45,6 +45,7 @@ enum kvm_exit_reason {
 	KVM_EXIT_IRQ_WINDOW_OPEN  = 7,
 	KVM_EXIT_SHUTDOWN         = 8,
 	KVM_EXIT_FAIL_ENTRY       = 9,
+	KVM_EXIT_INTR             = 10,
 };
 
 /* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 16/41] KVM: Initialize the apic_base msr on svm too
@ 2007-04-01 14:35                                 ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Older userspace didn't care, but newer userspace (with the cpuid changes)
does.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/svm.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 0311665..2396ada 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -582,6 +582,9 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 	init_vmcb(vcpu->svm->vmcb);
 
 	fx_init(vcpu);
+	vcpu->apic_base = 0xfee00000 |
+			/*for vcpu 0*/ MSR_IA32_APICBASE_BSP |
+			MSR_IA32_APICBASE_ENABLE;
 
 	return 0;
 
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 16/41] KVM: Initialize the apic_base msr on svm too
@ 2007-04-01 14:35                                 ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

Older userspace didn't care, but newer userspace (with the cpuid changes)
does.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/svm.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 0311665..2396ada 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -582,6 +582,9 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 	init_vmcb(vcpu->svm->vmcb);
 
 	fx_init(vcpu);
+	vcpu->apic_base = 0xfee00000 |
+			/*for vcpu 0*/ MSR_IA32_APICBASE_BSP |
+			MSR_IA32_APICBASE_ENABLE;
 
 	return 0;
 
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 17/41] KVM: Add guest mode signal mask
@ 2007-04-01 14:35                                   ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Allow a special signal mask to be used while executing in guest mode.  This
allows signals to be used to interrupt a vcpu without requiring signal
delivery to a userspace handler, which is quite expensive.  Userspace still
receives -EINTR and can get the signal via sigwait().

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h      |    3 +++
 drivers/kvm/kvm_main.c |   41 +++++++++++++++++++++++++++++++++++++++++
 include/linux/kvm.h    |    7 +++++++
 3 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index be3a0e7..1c4a581 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -277,6 +277,9 @@ struct kvm_vcpu {
 	gpa_t mmio_phys_addr;
 	int pio_pending;
 
+	int sigset_active;
+	sigset_t sigset;
+
 	struct {
 		int active;
 		u8 save_iopl;
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index ac44df5..df85f5f 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1591,9 +1591,13 @@ static void complete_pio(struct kvm_vcpu *vcpu)
 static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	int r;
+	sigset_t sigsaved;
 
 	vcpu_load(vcpu);
 
+	if (vcpu->sigset_active)
+		sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved);
+
 	/* re-sync apic's tpr */
 	vcpu->cr8 = kvm_run->cr8;
 
@@ -1616,6 +1620,9 @@ static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 	r = kvm_arch_ops->run(vcpu, kvm_run);
 
+	if (vcpu->sigset_active)
+		sigprocmask(SIG_SETMASK, &sigsaved, NULL);
+
 	vcpu_put(vcpu);
 	return r;
 }
@@ -2142,6 +2149,17 @@ out:
 	return r;
 }
 
+static int kvm_vcpu_ioctl_set_sigmask(struct kvm_vcpu *vcpu, sigset_t *sigset)
+{
+	if (sigset) {
+		sigdelsetmask(sigset, sigmask(SIGKILL)|sigmask(SIGSTOP));
+		vcpu->sigset_active = 1;
+		vcpu->sigset = *sigset;
+	} else
+		vcpu->sigset_active = 0;
+	return 0;
+}
+
 static long kvm_vcpu_ioctl(struct file *filp,
 			   unsigned int ioctl, unsigned long arg)
 {
@@ -2260,6 +2278,29 @@ static long kvm_vcpu_ioctl(struct file *filp,
 			goto out;
 		break;
 	}
+	case KVM_SET_SIGNAL_MASK: {
+		struct kvm_signal_mask __user *sigmask_arg = argp;
+		struct kvm_signal_mask kvm_sigmask;
+		sigset_t sigset, *p;
+
+		p = NULL;
+		if (argp) {
+			r = -EFAULT;
+			if (copy_from_user(&kvm_sigmask, argp,
+					   sizeof kvm_sigmask))
+				goto out;
+			r = -EINVAL;
+			if (kvm_sigmask.len != sizeof sigset)
+				goto out;
+			r = -EFAULT;
+			if (copy_from_user(&sigset, sigmask_arg->sigset,
+					   sizeof sigset))
+				goto out;
+			p = &sigset;
+		}
+		r = kvm_vcpu_ioctl_set_sigmask(vcpu, &sigset);
+		break;
+	}
 	default:
 		;
 	}
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index b3af92e..c0d10cd 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -234,6 +234,12 @@ struct kvm_cpuid {
 	struct kvm_cpuid_entry entries[0];
 };
 
+/* for KVM_SET_SIGNAL_MASK */
+struct kvm_signal_mask {
+	__u32 len;
+	__u8  sigset[0];
+};
+
 #define KVMIO 0xAE
 
 /*
@@ -273,5 +279,6 @@ struct kvm_cpuid {
 #define KVM_GET_MSRS              _IOWR(KVMIO, 0x88, struct kvm_msrs)
 #define KVM_SET_MSRS              _IOW(KVMIO,  0x89, struct kvm_msrs)
 #define KVM_SET_CPUID             _IOW(KVMIO,  0x8a, struct kvm_cpuid)
+#define KVM_SET_SIGNAL_MASK       _IOW(KVMIO,  0x8b, struct kvm_signal_mask)
 
 #endif
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 17/41] KVM: Add guest mode signal mask
@ 2007-04-01 14:35                                   ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

Allow a special signal mask to be used while executing in guest mode.  This
allows signals to be used to interrupt a vcpu without requiring signal
delivery to a userspace handler, which is quite expensive.  Userspace still
receives -EINTR and can get the signal via sigwait().

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h      |    3 +++
 drivers/kvm/kvm_main.c |   41 +++++++++++++++++++++++++++++++++++++++++
 include/linux/kvm.h    |    7 +++++++
 3 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index be3a0e7..1c4a581 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -277,6 +277,9 @@ struct kvm_vcpu {
 	gpa_t mmio_phys_addr;
 	int pio_pending;
 
+	int sigset_active;
+	sigset_t sigset;
+
 	struct {
 		int active;
 		u8 save_iopl;
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index ac44df5..df85f5f 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1591,9 +1591,13 @@ static void complete_pio(struct kvm_vcpu *vcpu)
 static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	int r;
+	sigset_t sigsaved;
 
 	vcpu_load(vcpu);
 
+	if (vcpu->sigset_active)
+		sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved);
+
 	/* re-sync apic's tpr */
 	vcpu->cr8 = kvm_run->cr8;
 
@@ -1616,6 +1620,9 @@ static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 	r = kvm_arch_ops->run(vcpu, kvm_run);
 
+	if (vcpu->sigset_active)
+		sigprocmask(SIG_SETMASK, &sigsaved, NULL);
+
 	vcpu_put(vcpu);
 	return r;
 }
@@ -2142,6 +2149,17 @@ out:
 	return r;
 }
 
+static int kvm_vcpu_ioctl_set_sigmask(struct kvm_vcpu *vcpu, sigset_t *sigset)
+{
+	if (sigset) {
+		sigdelsetmask(sigset, sigmask(SIGKILL)|sigmask(SIGSTOP));
+		vcpu->sigset_active = 1;
+		vcpu->sigset = *sigset;
+	} else
+		vcpu->sigset_active = 0;
+	return 0;
+}
+
 static long kvm_vcpu_ioctl(struct file *filp,
 			   unsigned int ioctl, unsigned long arg)
 {
@@ -2260,6 +2278,29 @@ static long kvm_vcpu_ioctl(struct file *filp,
 			goto out;
 		break;
 	}
+	case KVM_SET_SIGNAL_MASK: {
+		struct kvm_signal_mask __user *sigmask_arg = argp;
+		struct kvm_signal_mask kvm_sigmask;
+		sigset_t sigset, *p;
+
+		p = NULL;
+		if (argp) {
+			r = -EFAULT;
+			if (copy_from_user(&kvm_sigmask, argp,
+					   sizeof kvm_sigmask))
+				goto out;
+			r = -EINVAL;
+			if (kvm_sigmask.len != sizeof sigset)
+				goto out;
+			r = -EFAULT;
+			if (copy_from_user(&sigset, sigmask_arg->sigset,
+					   sizeof sigset))
+				goto out;
+			p = &sigset;
+		}
+		r = kvm_vcpu_ioctl_set_sigmask(vcpu, &sigset);
+		break;
+	}
 	default:
 		;
 	}
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index b3af92e..c0d10cd 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -234,6 +234,12 @@ struct kvm_cpuid {
 	struct kvm_cpuid_entry entries[0];
 };
 
+/* for KVM_SET_SIGNAL_MASK */
+struct kvm_signal_mask {
+	__u32 len;
+	__u8  sigset[0];
+};
+
 #define KVMIO 0xAE
 
 /*
@@ -273,5 +279,6 @@ struct kvm_cpuid {
 #define KVM_GET_MSRS              _IOWR(KVMIO, 0x88, struct kvm_msrs)
 #define KVM_SET_MSRS              _IOW(KVMIO,  0x89, struct kvm_msrs)
 #define KVM_SET_CPUID             _IOW(KVMIO,  0x8a, struct kvm_cpuid)
+#define KVM_SET_SIGNAL_MASK       _IOW(KVMIO,  0x8b, struct kvm_signal_mask)
 
 #endif
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 18/41] KVM: Allow kernel to select size of mmap() buffer
@ 2007-04-01 14:35                                     ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

This allows us to store offsets in the kernel/user kvm_run area, and be
sure that userspace has them mapped.  As offsets can be outside the
kvm_run struct, userspace has no way of knowing how much to mmap.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm_main.c |    8 +++++++-
 include/linux/kvm.h    |    4 ++++
 2 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index df85f5f..cba0b87 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -2436,7 +2436,7 @@ static long kvm_dev_ioctl(struct file *filp,
 			  unsigned int ioctl, unsigned long arg)
 {
 	void __user *argp = (void __user *)arg;
-	int r = -EINVAL;
+	long r = -EINVAL;
 
 	switch (ioctl) {
 	case KVM_GET_API_VERSION:
@@ -2478,6 +2478,12 @@ static long kvm_dev_ioctl(struct file *filp,
 		 */
 		r = 0;
 		break;
+	case KVM_GET_VCPU_MMAP_SIZE:
+		r = -EINVAL;
+		if (arg)
+			goto out;
+		r = PAGE_SIZE;
+		break;
 	default:
 		;
 	}
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index c0d10cd..dad9081 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -253,6 +253,10 @@ struct kvm_signal_mask {
  * return is 1 (yes) or 0 (no, sorry).
  */
 #define KVM_CHECK_EXTENSION       _IO(KVMIO,   0x03)
+/*
+ * Get size for mmap(vcpu_fd)
+ */
+#define KVM_GET_VCPU_MMAP_SIZE    _IO(KVMIO,   0x04) /* in bytes */
 
 /*
  * ioctls for VM fds
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 18/41] KVM: Allow kernel to select size of mmap() buffer
@ 2007-04-01 14:35                                     ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

This allows us to store offsets in the kernel/user kvm_run area, and be
sure that userspace has them mapped.  As offsets can be outside the
kvm_run struct, userspace has no way of knowing how much to mmap.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm_main.c |    8 +++++++-
 include/linux/kvm.h    |    4 ++++
 2 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index df85f5f..cba0b87 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -2436,7 +2436,7 @@ static long kvm_dev_ioctl(struct file *filp,
 			  unsigned int ioctl, unsigned long arg)
 {
 	void __user *argp = (void __user *)arg;
-	int r = -EINVAL;
+	long r = -EINVAL;
 
 	switch (ioctl) {
 	case KVM_GET_API_VERSION:
@@ -2478,6 +2478,12 @@ static long kvm_dev_ioctl(struct file *filp,
 		 */
 		r = 0;
 		break;
+	case KVM_GET_VCPU_MMAP_SIZE:
+		r = -EINVAL;
+		if (arg)
+			goto out;
+		r = PAGE_SIZE;
+		break;
 	default:
 		;
 	}
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index c0d10cd..dad9081 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -253,6 +253,10 @@ struct kvm_signal_mask {
  * return is 1 (yes) or 0 (no, sorry).
  */
 #define KVM_CHECK_EXTENSION       _IO(KVMIO,   0x03)
+/*
+ * Get size for mmap(vcpu_fd)
+ */
+#define KVM_GET_VCPU_MMAP_SIZE    _IO(KVMIO,   0x04) /* in bytes */
 
 /*
  * ioctls for VM fds
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 19/41] KVM: Future-proof argument-less ioctls
@ 2007-04-01 14:35                                       ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Some ioctls ignore their arguments.  By requiring them to be zero now,
we allow a nonzero value to have some special meaning in the future.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm_main.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index cba0b87..ba7f43a 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -2169,6 +2169,9 @@ static long kvm_vcpu_ioctl(struct file *filp,
 
 	switch (ioctl) {
 	case KVM_RUN:
+		r = -EINVAL;
+		if (arg)
+			goto out;
 		r = kvm_vcpu_ioctl_run(vcpu, vcpu->run);
 		break;
 	case KVM_GET_REGS: {
@@ -2440,9 +2443,15 @@ static long kvm_dev_ioctl(struct file *filp,
 
 	switch (ioctl) {
 	case KVM_GET_API_VERSION:
+		r = -EINVAL;
+		if (arg)
+			goto out;
 		r = KVM_API_VERSION;
 		break;
 	case KVM_CREATE_VM:
+		r = -EINVAL;
+		if (arg)
+			goto out;
 		r = kvm_dev_ioctl_create_vm();
 		break;
 	case KVM_GET_MSR_INDEX_LIST: {
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 19/41] KVM: Future-proof argument-less ioctls
@ 2007-04-01 14:35                                       ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

Some ioctls ignore their arguments.  By requiring them to be zero now,
we allow a nonzero value to have some special meaning in the future.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm_main.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index cba0b87..ba7f43a 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -2169,6 +2169,9 @@ static long kvm_vcpu_ioctl(struct file *filp,
 
 	switch (ioctl) {
 	case KVM_RUN:
+		r = -EINVAL;
+		if (arg)
+			goto out;
 		r = kvm_vcpu_ioctl_run(vcpu, vcpu->run);
 		break;
 	case KVM_GET_REGS: {
@@ -2440,9 +2443,15 @@ static long kvm_dev_ioctl(struct file *filp,
 
 	switch (ioctl) {
 	case KVM_GET_API_VERSION:
+		r = -EINVAL;
+		if (arg)
+			goto out;
 		r = KVM_API_VERSION;
 		break;
 	case KVM_CREATE_VM:
+		r = -EINVAL;
+		if (arg)
+			goto out;
 		r = kvm_dev_ioctl_create_vm();
 		break;
 	case KVM_GET_MSR_INDEX_LIST: {
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 20/41] KVM: Avoid guest virtual addresses in string pio userspace interface
@ 2007-04-01 14:35                                         ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

The current string pio interface communicates using guest virtual addresses,
relying on userspace to translate addresses and to check permissions.  This
interface cannot fully support guest smp, as the check needs to take into
account two pages at one in case an unaligned string transfer straddles a
page boundary.

Change the interface not to communicate guest addresses at all; instead use
a buffer page (mmaped by userspace) and do transfers there.  The kernel
manages the virtual to physical translation and can perform the checks
atomically by taking the appropriate locks.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h      |   21 ++++++-
 drivers/kvm/kvm_main.c |  174 +++++++++++++++++++++++++++++++++++++++++++----
 drivers/kvm/mmu.c      |    9 +++
 drivers/kvm/svm.c      |   40 +++++------
 drivers/kvm/vmx.c      |   40 ++++++------
 include/linux/kvm.h    |   11 +---
 6 files changed, 229 insertions(+), 66 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 1c4a581..7866b34 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -74,6 +74,8 @@
 
 #define IOPL_SHIFT 12
 
+#define KVM_PIO_PAGE_OFFSET 1
+
 /*
  * Address types:
  *
@@ -220,6 +222,18 @@ enum {
 	VCPU_SREG_LDTR,
 };
 
+struct kvm_pio_request {
+	unsigned long count;
+	int cur_count;
+	struct page *guest_pages[2];
+	unsigned guest_page_offset;
+	int in;
+	int size;
+	int string;
+	int down;
+	int rep;
+};
+
 struct kvm_vcpu {
 	struct kvm *kvm;
 	union {
@@ -275,7 +289,8 @@ struct kvm_vcpu {
 	int mmio_size;
 	unsigned char mmio_data[8];
 	gpa_t mmio_phys_addr;
-	int pio_pending;
+	struct kvm_pio_request pio;
+	void *pio_data;
 
 	int sigset_active;
 	sigset_t sigset;
@@ -421,6 +436,7 @@ hpa_t gpa_to_hpa(struct kvm_vcpu *vcpu, gpa_t gpa);
 #define HPA_ERR_MASK ((hpa_t)1 << HPA_MSB)
 static inline int is_error_hpa(hpa_t hpa) { return hpa >> HPA_MSB; }
 hpa_t gva_to_hpa(struct kvm_vcpu *vcpu, gva_t gva);
+struct page *gva_to_page(struct kvm_vcpu *vcpu, gva_t gva);
 
 void kvm_emulator_want_group7_invlpg(void);
 
@@ -453,6 +469,9 @@ void realmode_set_cr(struct kvm_vcpu *vcpu, int cr, unsigned long value,
 
 struct x86_emulate_ctxt;
 
+int kvm_setup_pio(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
+		  int size, unsigned long count, int string, int down,
+		  gva_t address, int rep, unsigned port);
 void kvm_emulate_cpuid(struct kvm_vcpu *vcpu);
 int emulate_invlpg(struct kvm_vcpu *vcpu, gva_t address);
 int emulate_clts(struct kvm_vcpu *vcpu);
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index ba7f43a..0c30e1b 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -346,6 +346,17 @@ static void kvm_free_physmem(struct kvm *kvm)
 		kvm_free_physmem_slot(&kvm->memslots[i], NULL);
 }
 
+static void free_pio_guest_pages(struct kvm_vcpu *vcpu)
+{
+	int i;
+
+	for (i = 0; i < 2; ++i)
+		if (vcpu->pio.guest_pages[i]) {
+			__free_page(vcpu->pio.guest_pages[i]);
+			vcpu->pio.guest_pages[i] = NULL;
+		}
+}
+
 static void kvm_free_vcpu(struct kvm_vcpu *vcpu)
 {
 	if (!vcpu->vmcs)
@@ -357,6 +368,9 @@ static void kvm_free_vcpu(struct kvm_vcpu *vcpu)
 	kvm_arch_ops->vcpu_free(vcpu);
 	free_page((unsigned long)vcpu->run);
 	vcpu->run = NULL;
+	free_page((unsigned long)vcpu->pio_data);
+	vcpu->pio_data = NULL;
+	free_pio_guest_pages(vcpu);
 }
 
 static void kvm_free_vcpus(struct kvm *kvm)
@@ -1550,44 +1564,159 @@ void kvm_emulate_cpuid(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvm_emulate_cpuid);
 
-static void complete_pio(struct kvm_vcpu *vcpu)
+static int pio_copy_data(struct kvm_vcpu *vcpu)
+{
+	void *p = vcpu->pio_data;
+	void *q;
+	unsigned bytes;
+	int nr_pages = vcpu->pio.guest_pages[1] ? 2 : 1;
+
+	kvm_arch_ops->vcpu_put(vcpu);
+	q = vmap(vcpu->pio.guest_pages, nr_pages, VM_READ|VM_WRITE,
+		 PAGE_KERNEL);
+	if (!q) {
+		kvm_arch_ops->vcpu_load(vcpu);
+		return -ENOMEM;
+	}
+	q += vcpu->pio.guest_page_offset;
+	bytes = vcpu->pio.size * vcpu->pio.cur_count;
+	if (vcpu->pio.in)
+		memcpy(q, p, bytes);
+	else
+		memcpy(p, q, bytes);
+	q -= vcpu->pio.guest_page_offset;
+	vunmap(q);
+	kvm_arch_ops->vcpu_load(vcpu);
+	return 0;
+}
+
+static int complete_pio(struct kvm_vcpu *vcpu)
 {
-	struct kvm_io *io = &vcpu->run->io;
+	struct kvm_pio_request *io = &vcpu->pio;
 	long delta;
+	int r;
 
 	kvm_arch_ops->cache_regs(vcpu);
 
+	io->count -= io->cur_count;
 	if (!io->string) {
-		if (io->direction == KVM_EXIT_IO_IN)
-			memcpy(&vcpu->regs[VCPU_REGS_RAX], &io->value,
+		if (io->in)
+			memcpy(&vcpu->regs[VCPU_REGS_RAX], vcpu->pio_data,
 			       io->size);
 	} else {
+		if (io->in) {
+			r = pio_copy_data(vcpu);
+			if (r) {
+				kvm_arch_ops->cache_regs(vcpu);
+				return r;
+			}
+		}
+
 		delta = 1;
 		if (io->rep) {
-			delta *= io->count;
+			delta *= io->cur_count;
 			/*
 			 * The size of the register should really depend on
 			 * current address size.
 			 */
 			vcpu->regs[VCPU_REGS_RCX] -= delta;
 		}
-		if (io->string_down)
+		if (io->down)
 			delta = -delta;
 		delta *= io->size;
-		if (io->direction == KVM_EXIT_IO_IN)
+		if (io->in)
 			vcpu->regs[VCPU_REGS_RDI] += delta;
 		else
 			vcpu->regs[VCPU_REGS_RSI] += delta;
 	}
 
-	vcpu->pio_pending = 0;
 	vcpu->run->io_completed = 0;
 
 	kvm_arch_ops->decache_regs(vcpu);
 
-	kvm_arch_ops->skip_emulated_instruction(vcpu);
+	if (!io->count)
+		kvm_arch_ops->skip_emulated_instruction(vcpu);
+	return 0;
 }
 
+int kvm_setup_pio(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
+		  int size, unsigned long count, int string, int down,
+		  gva_t address, int rep, unsigned port)
+{
+	unsigned now, in_page;
+	int i;
+	int nr_pages = 1;
+	struct page *page;
+
+	vcpu->run->exit_reason = KVM_EXIT_IO;
+	vcpu->run->io.direction = in ? KVM_EXIT_IO_IN : KVM_EXIT_IO_OUT;
+	vcpu->run->io.size = size;
+	vcpu->run->io.data_offset = KVM_PIO_PAGE_OFFSET * PAGE_SIZE;
+	vcpu->run->io.count = count;
+	vcpu->run->io.port = port;
+	vcpu->pio.count = count;
+	vcpu->pio.cur_count = count;
+	vcpu->pio.size = size;
+	vcpu->pio.in = in;
+	vcpu->pio.string = string;
+	vcpu->pio.down = down;
+	vcpu->pio.guest_page_offset = offset_in_page(address);
+	vcpu->pio.rep = rep;
+
+	if (!string) {
+		kvm_arch_ops->cache_regs(vcpu);
+		memcpy(vcpu->pio_data, &vcpu->regs[VCPU_REGS_RAX], 4);
+		kvm_arch_ops->decache_regs(vcpu);
+		return 0;
+	}
+
+	now = min(count, PAGE_SIZE / size);
+
+	if (!down)
+		in_page = PAGE_SIZE - offset_in_page(address);
+	else
+		in_page = offset_in_page(address) + size;
+	now = min(count, (unsigned long)in_page / size);
+	if (!now) {
+		/*
+		 * String I/O straddles page boundary.  Pin two guest pages
+		 * so that we satisfy atomicity constraints.  Do just one
+		 * transaction to avoid complexity.
+		 */
+		nr_pages = 2;
+		now = 1;
+	}
+	if (down) {
+		/*
+		 * String I/O in reverse.  Yuck.  Kill the guest, fix later.
+		 */
+		printk(KERN_ERR "kvm: guest string pio down\n");
+		inject_gp(vcpu);
+		return 1;
+	}
+	vcpu->run->io.count = now;
+	vcpu->pio.cur_count = now;
+
+	for (i = 0; i < nr_pages; ++i) {
+		spin_lock(&vcpu->kvm->lock);
+		page = gva_to_page(vcpu, address + i * PAGE_SIZE);
+		if (page)
+			get_page(page);
+		vcpu->pio.guest_pages[i] = page;
+		spin_unlock(&vcpu->kvm->lock);
+		if (!page) {
+			inject_gp(vcpu);
+			free_pio_guest_pages(vcpu);
+			return 1;
+		}
+	}
+
+	if (!vcpu->pio.in)
+		return pio_copy_data(vcpu);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_setup_pio);
+
 static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	int r;
@@ -1602,9 +1731,11 @@ static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	vcpu->cr8 = kvm_run->cr8;
 
 	if (kvm_run->io_completed) {
-		if (vcpu->pio_pending)
-			complete_pio(vcpu);
-		else {
+		if (vcpu->pio.count) {
+			r = complete_pio(vcpu);
+			if (r)
+				goto out;
+		} else {
 			memcpy(vcpu->mmio_data, kvm_run->mmio.data, 8);
 			vcpu->mmio_read_completed = 1;
 		}
@@ -1620,6 +1751,7 @@ static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 	r = kvm_arch_ops->run(vcpu, kvm_run);
 
+out:
 	if (vcpu->sigset_active)
 		sigprocmask(SIG_SETMASK, &sigsaved, NULL);
 
@@ -1995,9 +2127,12 @@ static struct page *kvm_vcpu_nopage(struct vm_area_struct *vma,
 
 	*type = VM_FAULT_MINOR;
 	pgoff = ((address - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
-	if (pgoff != 0)
+	if (pgoff == 0)
+		page = virt_to_page(vcpu->run);
+	else if (pgoff == KVM_PIO_PAGE_OFFSET)
+		page = virt_to_page(vcpu->pio_data);
+	else
 		return NOPAGE_SIGBUS;
-	page = virt_to_page(vcpu->run);
 	get_page(page);
 	return page;
 }
@@ -2094,6 +2229,12 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, int n)
 		goto out_unlock;
 	vcpu->run = page_address(page);
 
+	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	r = -ENOMEM;
+	if (!page)
+		goto out_free_run;
+	vcpu->pio_data = page_address(page);
+
 	vcpu->host_fx_image = (char*)ALIGN((hva_t)vcpu->fx_buf,
 					   FX_IMAGE_ALIGN);
 	vcpu->guest_fx_image = vcpu->host_fx_image + FX_IMAGE_SIZE;
@@ -2123,6 +2264,9 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, int n)
 
 out_free_vcpus:
 	kvm_free_vcpu(vcpu);
+out_free_run:
+	free_page((unsigned long)vcpu->run);
+	vcpu->run = NULL;
 out_unlock:
 	mutex_unlock(&vcpu->mutex);
 out:
@@ -2491,7 +2635,7 @@ static long kvm_dev_ioctl(struct file *filp,
 		r = -EINVAL;
 		if (arg)
 			goto out;
-		r = PAGE_SIZE;
+		r = 2 * PAGE_SIZE;
 		break;
 	default:
 		;
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 266444a..e8e2750 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -735,6 +735,15 @@ hpa_t gva_to_hpa(struct kvm_vcpu *vcpu, gva_t gva)
 	return gpa_to_hpa(vcpu, gpa);
 }
 
+struct page *gva_to_page(struct kvm_vcpu *vcpu, gva_t gva)
+{
+	gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, gva);
+
+	if (gpa == UNMAPPED_GVA)
+		return NULL;
+	return pfn_to_page(gpa_to_hpa(vcpu, gpa) >> PAGE_SHIFT);
+}
+
 static void nonpaging_new_cr3(struct kvm_vcpu *vcpu)
 {
 }
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 2396ada..64afc5c 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -984,7 +984,7 @@ static int io_get_override(struct kvm_vcpu *vcpu,
 	return 0;
 }
 
-static unsigned long io_adress(struct kvm_vcpu *vcpu, int ins, u64 *address)
+static unsigned long io_adress(struct kvm_vcpu *vcpu, int ins, gva_t *address)
 {
 	unsigned long addr_mask;
 	unsigned long *reg;
@@ -1028,40 +1028,38 @@ static unsigned long io_adress(struct kvm_vcpu *vcpu, int ins, u64 *address)
 static int io_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	u32 io_info = vcpu->svm->vmcb->control.exit_info_1; //address size bug?
-	int _in = io_info & SVM_IOIO_TYPE_MASK;
+	int size, down, in, string, rep;
+	unsigned port;
+	unsigned long count;
+	gva_t address = 0;
 
 	++kvm_stat.io_exits;
 
 	vcpu->svm->next_rip = vcpu->svm->vmcb->control.exit_info_2;
 
-	kvm_run->exit_reason = KVM_EXIT_IO;
-	kvm_run->io.port = io_info >> 16;
-	kvm_run->io.direction = (_in) ? KVM_EXIT_IO_IN : KVM_EXIT_IO_OUT;
-	kvm_run->io.size = ((io_info & SVM_IOIO_SIZE_MASK) >> SVM_IOIO_SIZE_SHIFT);
-	kvm_run->io.string = (io_info & SVM_IOIO_STR_MASK) != 0;
-	kvm_run->io.rep = (io_info & SVM_IOIO_REP_MASK) != 0;
-	kvm_run->io.count = 1;
+	in = (io_info & SVM_IOIO_TYPE_MASK) != 0;
+	port = io_info >> 16;
+	size = (io_info & SVM_IOIO_SIZE_MASK) >> SVM_IOIO_SIZE_SHIFT;
+	string = (io_info & SVM_IOIO_STR_MASK) != 0;
+	rep = (io_info & SVM_IOIO_REP_MASK) != 0;
+	count = 1;
+	down = (vcpu->svm->vmcb->save.rflags & X86_EFLAGS_DF) != 0;
 
-	if (kvm_run->io.string) {
+	if (string) {
 		unsigned addr_mask;
 
-		addr_mask = io_adress(vcpu, _in, &kvm_run->io.address);
+		addr_mask = io_adress(vcpu, in, &address);
 		if (!addr_mask) {
 			printk(KERN_DEBUG "%s: get io address failed\n",
 			       __FUNCTION__);
 			return 1;
 		}
 
-		if (kvm_run->io.rep) {
-			kvm_run->io.count
-				= vcpu->regs[VCPU_REGS_RCX] & addr_mask;
-			kvm_run->io.string_down = (vcpu->svm->vmcb->save.rflags
-						   & X86_EFLAGS_DF) != 0;
-		}
-	} else
-		kvm_run->io.value = vcpu->svm->vmcb->save.rax;
-	vcpu->pio_pending = 1;
-	return 0;
+		if (rep)
+			count = vcpu->regs[VCPU_REGS_RCX] & addr_mask;
+	}
+	return kvm_setup_pio(vcpu, kvm_run, in, size, count, string, down,
+			     address, rep, port);
 }
 
 static int nop_on_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index e69bab6..0d9bf0b 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1394,7 +1394,7 @@ static int handle_triple_fault(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	return 0;
 }
 
-static int get_io_count(struct kvm_vcpu *vcpu, u64 *count)
+static int get_io_count(struct kvm_vcpu *vcpu, unsigned long *count)
 {
 	u64 inst;
 	gva_t rip;
@@ -1439,35 +1439,35 @@ static int get_io_count(struct kvm_vcpu *vcpu, u64 *count)
 done:
 	countr_size *= 8;
 	*count = vcpu->regs[VCPU_REGS_RCX] & (~0ULL >> (64 - countr_size));
+	//printk("cx: %lx\n", vcpu->regs[VCPU_REGS_RCX]);
 	return 1;
 }
 
 static int handle_io(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	u64 exit_qualification;
+	int size, down, in, string, rep;
+	unsigned port;
+	unsigned long count;
+	gva_t address;
 
 	++kvm_stat.io_exits;
 	exit_qualification = vmcs_read64(EXIT_QUALIFICATION);
-	kvm_run->exit_reason = KVM_EXIT_IO;
-	if (exit_qualification & 8)
-		kvm_run->io.direction = KVM_EXIT_IO_IN;
-	else
-		kvm_run->io.direction = KVM_EXIT_IO_OUT;
-	kvm_run->io.size = (exit_qualification & 7) + 1;
-	kvm_run->io.string = (exit_qualification & 16) != 0;
-	kvm_run->io.string_down
-		= (vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_DF) != 0;
-	kvm_run->io.rep = (exit_qualification & 32) != 0;
-	kvm_run->io.port = exit_qualification >> 16;
-	kvm_run->io.count = 1;
-	if (kvm_run->io.string) {
-		if (!get_io_count(vcpu, &kvm_run->io.count))
+	in = (exit_qualification & 8) != 0;
+	size = (exit_qualification & 7) + 1;
+	string = (exit_qualification & 16) != 0;
+	down = (vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_DF) != 0;
+	count = 1;
+	rep = (exit_qualification & 32) != 0;
+	port = exit_qualification >> 16;
+	address = 0;
+	if (string) {
+		if (rep && !get_io_count(vcpu, &count))
 			return 1;
-		kvm_run->io.address = vmcs_readl(GUEST_LINEAR_ADDRESS);
-	} else
-		kvm_run->io.value = vcpu->regs[VCPU_REGS_RAX]; /* rax */
-	vcpu->pio_pending = 1;
-	return 0;
+		address = vmcs_readl(GUEST_LINEAR_ADDRESS);
+	}
+	return kvm_setup_pio(vcpu, kvm_run, in, size, count, string, down,
+			     address, rep, port);
 }
 
 static void
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index dad9081..728b24c 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -86,16 +86,9 @@ struct kvm_run {
 #define KVM_EXIT_IO_OUT 1
 			__u8 direction;
 			__u8 size; /* bytes */
-			__u8 string;
-			__u8 string_down;
-			__u8 rep;
-			__u8 pad;
 			__u16 port;
-			__u64 count;
-			union {
-				__u64 address;
-				__u32 value;
-			};
+			__u32 count;
+			__u64 data_offset; /* relative to kvm_run start */
 		} io;
 		struct {
 		} debug;
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 20/41] KVM: Avoid guest virtual addresses in string pio userspace interface
@ 2007-04-01 14:35                                         ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

The current string pio interface communicates using guest virtual addresses,
relying on userspace to translate addresses and to check permissions.  This
interface cannot fully support guest smp, as the check needs to take into
account two pages at one in case an unaligned string transfer straddles a
page boundary.

Change the interface not to communicate guest addresses at all; instead use
a buffer page (mmaped by userspace) and do transfers there.  The kernel
manages the virtual to physical translation and can perform the checks
atomically by taking the appropriate locks.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h      |   21 ++++++-
 drivers/kvm/kvm_main.c |  174 +++++++++++++++++++++++++++++++++++++++++++----
 drivers/kvm/mmu.c      |    9 +++
 drivers/kvm/svm.c      |   40 +++++------
 drivers/kvm/vmx.c      |   40 ++++++------
 include/linux/kvm.h    |   11 +---
 6 files changed, 229 insertions(+), 66 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 1c4a581..7866b34 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -74,6 +74,8 @@
 
 #define IOPL_SHIFT 12
 
+#define KVM_PIO_PAGE_OFFSET 1
+
 /*
  * Address types:
  *
@@ -220,6 +222,18 @@ enum {
 	VCPU_SREG_LDTR,
 };
 
+struct kvm_pio_request {
+	unsigned long count;
+	int cur_count;
+	struct page *guest_pages[2];
+	unsigned guest_page_offset;
+	int in;
+	int size;
+	int string;
+	int down;
+	int rep;
+};
+
 struct kvm_vcpu {
 	struct kvm *kvm;
 	union {
@@ -275,7 +289,8 @@ struct kvm_vcpu {
 	int mmio_size;
 	unsigned char mmio_data[8];
 	gpa_t mmio_phys_addr;
-	int pio_pending;
+	struct kvm_pio_request pio;
+	void *pio_data;
 
 	int sigset_active;
 	sigset_t sigset;
@@ -421,6 +436,7 @@ hpa_t gpa_to_hpa(struct kvm_vcpu *vcpu, gpa_t gpa);
 #define HPA_ERR_MASK ((hpa_t)1 << HPA_MSB)
 static inline int is_error_hpa(hpa_t hpa) { return hpa >> HPA_MSB; }
 hpa_t gva_to_hpa(struct kvm_vcpu *vcpu, gva_t gva);
+struct page *gva_to_page(struct kvm_vcpu *vcpu, gva_t gva);
 
 void kvm_emulator_want_group7_invlpg(void);
 
@@ -453,6 +469,9 @@ void realmode_set_cr(struct kvm_vcpu *vcpu, int cr, unsigned long value,
 
 struct x86_emulate_ctxt;
 
+int kvm_setup_pio(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
+		  int size, unsigned long count, int string, int down,
+		  gva_t address, int rep, unsigned port);
 void kvm_emulate_cpuid(struct kvm_vcpu *vcpu);
 int emulate_invlpg(struct kvm_vcpu *vcpu, gva_t address);
 int emulate_clts(struct kvm_vcpu *vcpu);
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index ba7f43a..0c30e1b 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -346,6 +346,17 @@ static void kvm_free_physmem(struct kvm *kvm)
 		kvm_free_physmem_slot(&kvm->memslots[i], NULL);
 }
 
+static void free_pio_guest_pages(struct kvm_vcpu *vcpu)
+{
+	int i;
+
+	for (i = 0; i < 2; ++i)
+		if (vcpu->pio.guest_pages[i]) {
+			__free_page(vcpu->pio.guest_pages[i]);
+			vcpu->pio.guest_pages[i] = NULL;
+		}
+}
+
 static void kvm_free_vcpu(struct kvm_vcpu *vcpu)
 {
 	if (!vcpu->vmcs)
@@ -357,6 +368,9 @@ static void kvm_free_vcpu(struct kvm_vcpu *vcpu)
 	kvm_arch_ops->vcpu_free(vcpu);
 	free_page((unsigned long)vcpu->run);
 	vcpu->run = NULL;
+	free_page((unsigned long)vcpu->pio_data);
+	vcpu->pio_data = NULL;
+	free_pio_guest_pages(vcpu);
 }
 
 static void kvm_free_vcpus(struct kvm *kvm)
@@ -1550,44 +1564,159 @@ void kvm_emulate_cpuid(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvm_emulate_cpuid);
 
-static void complete_pio(struct kvm_vcpu *vcpu)
+static int pio_copy_data(struct kvm_vcpu *vcpu)
+{
+	void *p = vcpu->pio_data;
+	void *q;
+	unsigned bytes;
+	int nr_pages = vcpu->pio.guest_pages[1] ? 2 : 1;
+
+	kvm_arch_ops->vcpu_put(vcpu);
+	q = vmap(vcpu->pio.guest_pages, nr_pages, VM_READ|VM_WRITE,
+		 PAGE_KERNEL);
+	if (!q) {
+		kvm_arch_ops->vcpu_load(vcpu);
+		return -ENOMEM;
+	}
+	q += vcpu->pio.guest_page_offset;
+	bytes = vcpu->pio.size * vcpu->pio.cur_count;
+	if (vcpu->pio.in)
+		memcpy(q, p, bytes);
+	else
+		memcpy(p, q, bytes);
+	q -= vcpu->pio.guest_page_offset;
+	vunmap(q);
+	kvm_arch_ops->vcpu_load(vcpu);
+	return 0;
+}
+
+static int complete_pio(struct kvm_vcpu *vcpu)
 {
-	struct kvm_io *io = &vcpu->run->io;
+	struct kvm_pio_request *io = &vcpu->pio;
 	long delta;
+	int r;
 
 	kvm_arch_ops->cache_regs(vcpu);
 
+	io->count -= io->cur_count;
 	if (!io->string) {
-		if (io->direction == KVM_EXIT_IO_IN)
-			memcpy(&vcpu->regs[VCPU_REGS_RAX], &io->value,
+		if (io->in)
+			memcpy(&vcpu->regs[VCPU_REGS_RAX], vcpu->pio_data,
 			       io->size);
 	} else {
+		if (io->in) {
+			r = pio_copy_data(vcpu);
+			if (r) {
+				kvm_arch_ops->cache_regs(vcpu);
+				return r;
+			}
+		}
+
 		delta = 1;
 		if (io->rep) {
-			delta *= io->count;
+			delta *= io->cur_count;
 			/*
 			 * The size of the register should really depend on
 			 * current address size.
 			 */
 			vcpu->regs[VCPU_REGS_RCX] -= delta;
 		}
-		if (io->string_down)
+		if (io->down)
 			delta = -delta;
 		delta *= io->size;
-		if (io->direction == KVM_EXIT_IO_IN)
+		if (io->in)
 			vcpu->regs[VCPU_REGS_RDI] += delta;
 		else
 			vcpu->regs[VCPU_REGS_RSI] += delta;
 	}
 
-	vcpu->pio_pending = 0;
 	vcpu->run->io_completed = 0;
 
 	kvm_arch_ops->decache_regs(vcpu);
 
-	kvm_arch_ops->skip_emulated_instruction(vcpu);
+	if (!io->count)
+		kvm_arch_ops->skip_emulated_instruction(vcpu);
+	return 0;
 }
 
+int kvm_setup_pio(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
+		  int size, unsigned long count, int string, int down,
+		  gva_t address, int rep, unsigned port)
+{
+	unsigned now, in_page;
+	int i;
+	int nr_pages = 1;
+	struct page *page;
+
+	vcpu->run->exit_reason = KVM_EXIT_IO;
+	vcpu->run->io.direction = in ? KVM_EXIT_IO_IN : KVM_EXIT_IO_OUT;
+	vcpu->run->io.size = size;
+	vcpu->run->io.data_offset = KVM_PIO_PAGE_OFFSET * PAGE_SIZE;
+	vcpu->run->io.count = count;
+	vcpu->run->io.port = port;
+	vcpu->pio.count = count;
+	vcpu->pio.cur_count = count;
+	vcpu->pio.size = size;
+	vcpu->pio.in = in;
+	vcpu->pio.string = string;
+	vcpu->pio.down = down;
+	vcpu->pio.guest_page_offset = offset_in_page(address);
+	vcpu->pio.rep = rep;
+
+	if (!string) {
+		kvm_arch_ops->cache_regs(vcpu);
+		memcpy(vcpu->pio_data, &vcpu->regs[VCPU_REGS_RAX], 4);
+		kvm_arch_ops->decache_regs(vcpu);
+		return 0;
+	}
+
+	now = min(count, PAGE_SIZE / size);
+
+	if (!down)
+		in_page = PAGE_SIZE - offset_in_page(address);
+	else
+		in_page = offset_in_page(address) + size;
+	now = min(count, (unsigned long)in_page / size);
+	if (!now) {
+		/*
+		 * String I/O straddles page boundary.  Pin two guest pages
+		 * so that we satisfy atomicity constraints.  Do just one
+		 * transaction to avoid complexity.
+		 */
+		nr_pages = 2;
+		now = 1;
+	}
+	if (down) {
+		/*
+		 * String I/O in reverse.  Yuck.  Kill the guest, fix later.
+		 */
+		printk(KERN_ERR "kvm: guest string pio down\n");
+		inject_gp(vcpu);
+		return 1;
+	}
+	vcpu->run->io.count = now;
+	vcpu->pio.cur_count = now;
+
+	for (i = 0; i < nr_pages; ++i) {
+		spin_lock(&vcpu->kvm->lock);
+		page = gva_to_page(vcpu, address + i * PAGE_SIZE);
+		if (page)
+			get_page(page);
+		vcpu->pio.guest_pages[i] = page;
+		spin_unlock(&vcpu->kvm->lock);
+		if (!page) {
+			inject_gp(vcpu);
+			free_pio_guest_pages(vcpu);
+			return 1;
+		}
+	}
+
+	if (!vcpu->pio.in)
+		return pio_copy_data(vcpu);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_setup_pio);
+
 static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	int r;
@@ -1602,9 +1731,11 @@ static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	vcpu->cr8 = kvm_run->cr8;
 
 	if (kvm_run->io_completed) {
-		if (vcpu->pio_pending)
-			complete_pio(vcpu);
-		else {
+		if (vcpu->pio.count) {
+			r = complete_pio(vcpu);
+			if (r)
+				goto out;
+		} else {
 			memcpy(vcpu->mmio_data, kvm_run->mmio.data, 8);
 			vcpu->mmio_read_completed = 1;
 		}
@@ -1620,6 +1751,7 @@ static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 	r = kvm_arch_ops->run(vcpu, kvm_run);
 
+out:
 	if (vcpu->sigset_active)
 		sigprocmask(SIG_SETMASK, &sigsaved, NULL);
 
@@ -1995,9 +2127,12 @@ static struct page *kvm_vcpu_nopage(struct vm_area_struct *vma,
 
 	*type = VM_FAULT_MINOR;
 	pgoff = ((address - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
-	if (pgoff != 0)
+	if (pgoff == 0)
+		page = virt_to_page(vcpu->run);
+	else if (pgoff == KVM_PIO_PAGE_OFFSET)
+		page = virt_to_page(vcpu->pio_data);
+	else
 		return NOPAGE_SIGBUS;
-	page = virt_to_page(vcpu->run);
 	get_page(page);
 	return page;
 }
@@ -2094,6 +2229,12 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, int n)
 		goto out_unlock;
 	vcpu->run = page_address(page);
 
+	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	r = -ENOMEM;
+	if (!page)
+		goto out_free_run;
+	vcpu->pio_data = page_address(page);
+
 	vcpu->host_fx_image = (char*)ALIGN((hva_t)vcpu->fx_buf,
 					   FX_IMAGE_ALIGN);
 	vcpu->guest_fx_image = vcpu->host_fx_image + FX_IMAGE_SIZE;
@@ -2123,6 +2264,9 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, int n)
 
 out_free_vcpus:
 	kvm_free_vcpu(vcpu);
+out_free_run:
+	free_page((unsigned long)vcpu->run);
+	vcpu->run = NULL;
 out_unlock:
 	mutex_unlock(&vcpu->mutex);
 out:
@@ -2491,7 +2635,7 @@ static long kvm_dev_ioctl(struct file *filp,
 		r = -EINVAL;
 		if (arg)
 			goto out;
-		r = PAGE_SIZE;
+		r = 2 * PAGE_SIZE;
 		break;
 	default:
 		;
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 266444a..e8e2750 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -735,6 +735,15 @@ hpa_t gva_to_hpa(struct kvm_vcpu *vcpu, gva_t gva)
 	return gpa_to_hpa(vcpu, gpa);
 }
 
+struct page *gva_to_page(struct kvm_vcpu *vcpu, gva_t gva)
+{
+	gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, gva);
+
+	if (gpa == UNMAPPED_GVA)
+		return NULL;
+	return pfn_to_page(gpa_to_hpa(vcpu, gpa) >> PAGE_SHIFT);
+}
+
 static void nonpaging_new_cr3(struct kvm_vcpu *vcpu)
 {
 }
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 2396ada..64afc5c 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -984,7 +984,7 @@ static int io_get_override(struct kvm_vcpu *vcpu,
 	return 0;
 }
 
-static unsigned long io_adress(struct kvm_vcpu *vcpu, int ins, u64 *address)
+static unsigned long io_adress(struct kvm_vcpu *vcpu, int ins, gva_t *address)
 {
 	unsigned long addr_mask;
 	unsigned long *reg;
@@ -1028,40 +1028,38 @@ static unsigned long io_adress(struct kvm_vcpu *vcpu, int ins, u64 *address)
 static int io_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	u32 io_info = vcpu->svm->vmcb->control.exit_info_1; //address size bug?
-	int _in = io_info & SVM_IOIO_TYPE_MASK;
+	int size, down, in, string, rep;
+	unsigned port;
+	unsigned long count;
+	gva_t address = 0;
 
 	++kvm_stat.io_exits;
 
 	vcpu->svm->next_rip = vcpu->svm->vmcb->control.exit_info_2;
 
-	kvm_run->exit_reason = KVM_EXIT_IO;
-	kvm_run->io.port = io_info >> 16;
-	kvm_run->io.direction = (_in) ? KVM_EXIT_IO_IN : KVM_EXIT_IO_OUT;
-	kvm_run->io.size = ((io_info & SVM_IOIO_SIZE_MASK) >> SVM_IOIO_SIZE_SHIFT);
-	kvm_run->io.string = (io_info & SVM_IOIO_STR_MASK) != 0;
-	kvm_run->io.rep = (io_info & SVM_IOIO_REP_MASK) != 0;
-	kvm_run->io.count = 1;
+	in = (io_info & SVM_IOIO_TYPE_MASK) != 0;
+	port = io_info >> 16;
+	size = (io_info & SVM_IOIO_SIZE_MASK) >> SVM_IOIO_SIZE_SHIFT;
+	string = (io_info & SVM_IOIO_STR_MASK) != 0;
+	rep = (io_info & SVM_IOIO_REP_MASK) != 0;
+	count = 1;
+	down = (vcpu->svm->vmcb->save.rflags & X86_EFLAGS_DF) != 0;
 
-	if (kvm_run->io.string) {
+	if (string) {
 		unsigned addr_mask;
 
-		addr_mask = io_adress(vcpu, _in, &kvm_run->io.address);
+		addr_mask = io_adress(vcpu, in, &address);
 		if (!addr_mask) {
 			printk(KERN_DEBUG "%s: get io address failed\n",
 			       __FUNCTION__);
 			return 1;
 		}
 
-		if (kvm_run->io.rep) {
-			kvm_run->io.count
-				= vcpu->regs[VCPU_REGS_RCX] & addr_mask;
-			kvm_run->io.string_down = (vcpu->svm->vmcb->save.rflags
-						   & X86_EFLAGS_DF) != 0;
-		}
-	} else
-		kvm_run->io.value = vcpu->svm->vmcb->save.rax;
-	vcpu->pio_pending = 1;
-	return 0;
+		if (rep)
+			count = vcpu->regs[VCPU_REGS_RCX] & addr_mask;
+	}
+	return kvm_setup_pio(vcpu, kvm_run, in, size, count, string, down,
+			     address, rep, port);
 }
 
 static int nop_on_interception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index e69bab6..0d9bf0b 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1394,7 +1394,7 @@ static int handle_triple_fault(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	return 0;
 }
 
-static int get_io_count(struct kvm_vcpu *vcpu, u64 *count)
+static int get_io_count(struct kvm_vcpu *vcpu, unsigned long *count)
 {
 	u64 inst;
 	gva_t rip;
@@ -1439,35 +1439,35 @@ static int get_io_count(struct kvm_vcpu *vcpu, u64 *count)
 done:
 	countr_size *= 8;
 	*count = vcpu->regs[VCPU_REGS_RCX] & (~0ULL >> (64 - countr_size));
+	//printk("cx: %lx\n", vcpu->regs[VCPU_REGS_RCX]);
 	return 1;
 }
 
 static int handle_io(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	u64 exit_qualification;
+	int size, down, in, string, rep;
+	unsigned port;
+	unsigned long count;
+	gva_t address;
 
 	++kvm_stat.io_exits;
 	exit_qualification = vmcs_read64(EXIT_QUALIFICATION);
-	kvm_run->exit_reason = KVM_EXIT_IO;
-	if (exit_qualification & 8)
-		kvm_run->io.direction = KVM_EXIT_IO_IN;
-	else
-		kvm_run->io.direction = KVM_EXIT_IO_OUT;
-	kvm_run->io.size = (exit_qualification & 7) + 1;
-	kvm_run->io.string = (exit_qualification & 16) != 0;
-	kvm_run->io.string_down
-		= (vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_DF) != 0;
-	kvm_run->io.rep = (exit_qualification & 32) != 0;
-	kvm_run->io.port = exit_qualification >> 16;
-	kvm_run->io.count = 1;
-	if (kvm_run->io.string) {
-		if (!get_io_count(vcpu, &kvm_run->io.count))
+	in = (exit_qualification & 8) != 0;
+	size = (exit_qualification & 7) + 1;
+	string = (exit_qualification & 16) != 0;
+	down = (vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_DF) != 0;
+	count = 1;
+	rep = (exit_qualification & 32) != 0;
+	port = exit_qualification >> 16;
+	address = 0;
+	if (string) {
+		if (rep && !get_io_count(vcpu, &count))
 			return 1;
-		kvm_run->io.address = vmcs_readl(GUEST_LINEAR_ADDRESS);
-	} else
-		kvm_run->io.value = vcpu->regs[VCPU_REGS_RAX]; /* rax */
-	vcpu->pio_pending = 1;
-	return 0;
+		address = vmcs_readl(GUEST_LINEAR_ADDRESS);
+	}
+	return kvm_setup_pio(vcpu, kvm_run, in, size, count, string, down,
+			     address, rep, port);
 }
 
 static void
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index dad9081..728b24c 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -86,16 +86,9 @@ struct kvm_run {
 #define KVM_EXIT_IO_OUT 1
 			__u8 direction;
 			__u8 size; /* bytes */
-			__u8 string;
-			__u8 string_down;
-			__u8 rep;
-			__u8 pad;
 			__u16 port;
-			__u64 count;
-			union {
-				__u64 address;
-				__u32 value;
-			};
+			__u32 count;
+			__u64 data_offset; /* relative to kvm_run start */
 		} io;
 		struct {
 		} debug;
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 21/41] KVM: MMU: Remove unnecessary check for pdptr access
@ 2007-04-01 14:35                                           ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

We already special case the pdptr access, so no need to check it again.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/paging_tmpl.h |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index f3bcee9..17bd440 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -148,8 +148,7 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 			break;
 		}
 
-		if (walker->level != 3 || is_long_mode(vcpu))
-			walker->inherited_ar &= walker->table[index];
+		walker->inherited_ar &= walker->table[index];
 		table_gfn = (*ptep & PT_BASE_ADDR_MASK) >> PAGE_SHIFT;
 		paddr = safe_gpa_to_hpa(vcpu, *ptep & PT_BASE_ADDR_MASK);
 		kunmap_atomic(walker->table, KM_USER0);
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 21/41] KVM: MMU: Remove unnecessary check for pdptr access
@ 2007-04-01 14:35                                           ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

We already special case the pdptr access, so no need to check it again.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/paging_tmpl.h |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index f3bcee9..17bd440 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -148,8 +148,7 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 			break;
 		}
 
-		if (walker->level != 3 || is_long_mode(vcpu))
-			walker->inherited_ar &= walker->table[index];
+		walker->inherited_ar &= walker->table[index];
 		table_gfn = (*ptep & PT_BASE_ADDR_MASK) >> PAGE_SHIFT;
 		paddr = safe_gpa_to_hpa(vcpu, *ptep & PT_BASE_ADDR_MASK);
 		kunmap_atomic(walker->table, KM_USER0);
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 22/41] KVM: MMU: Remove global pte tracking
@ 2007-04-01 14:35                                             ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

The initial, noncaching, version of the kvm mmu flushed the all nonglobal
shadow page table translations (much like a native tlb flush).  The new
implementation flushes translations only when they change, rendering global
pte tracking superfluous.

This removes the unused tracking mechanism and storage space.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h |    1 -
 drivers/kvm/mmu.c |    9 ---------
 2 files changed, 0 insertions(+), 10 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 7866b34..a4331da 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -136,7 +136,6 @@ struct kvm_mmu_page {
 	unsigned long slot_bitmap; /* One bit set per slot which has memory
 				    * in this shadow page.
 				    */
-	int global;              /* Set if all ptes in this page are global */
 	int multimapped;         /* More than one parent_pte? */
 	int root_count;          /* Currently serving as active root */
 	union {
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index e8e2750..b181106 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -461,7 +461,6 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
 	list_add(&page->link, &vcpu->kvm->active_mmu_pages);
 	ASSERT(is_empty_shadow_page(page->page_hpa));
 	page->slot_bitmap = 0;
-	page->global = 1;
 	page->multimapped = 0;
 	page->parent_pte = parent_pte;
 	--vcpu->kvm->n_free_mmu_pages;
@@ -927,11 +926,6 @@ static void paging_new_cr3(struct kvm_vcpu *vcpu)
 	kvm_arch_ops->set_cr3(vcpu, vcpu->mmu.root_hpa);
 }
 
-static void mark_pagetable_nonglobal(void *shadow_pte)
-{
-	page_header(__pa(shadow_pte))->global = 0;
-}
-
 static inline void set_pte_common(struct kvm_vcpu *vcpu,
 			     u64 *shadow_pte,
 			     gpa_t gaddr,
@@ -949,9 +943,6 @@ static inline void set_pte_common(struct kvm_vcpu *vcpu,
 
 	*shadow_pte |= access_bits;
 
-	if (!(*shadow_pte & PT_GLOBAL_MASK))
-		mark_pagetable_nonglobal(shadow_pte);
-
 	if (is_error_hpa(paddr)) {
 		*shadow_pte |= gaddr;
 		*shadow_pte |= PT_SHADOW_IO_MARK;
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 22/41] KVM: MMU: Remove global pte tracking
@ 2007-04-01 14:35                                             ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

The initial, noncaching, version of the kvm mmu flushed the all nonglobal
shadow page table translations (much like a native tlb flush).  The new
implementation flushes translations only when they change, rendering global
pte tracking superfluous.

This removes the unused tracking mechanism and storage space.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h |    1 -
 drivers/kvm/mmu.c |    9 ---------
 2 files changed, 0 insertions(+), 10 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 7866b34..a4331da 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -136,7 +136,6 @@ struct kvm_mmu_page {
 	unsigned long slot_bitmap; /* One bit set per slot which has memory
 				    * in this shadow page.
 				    */
-	int global;              /* Set if all ptes in this page are global */
 	int multimapped;         /* More than one parent_pte? */
 	int root_count;          /* Currently serving as active root */
 	union {
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index e8e2750..b181106 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -461,7 +461,6 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
 	list_add(&page->link, &vcpu->kvm->active_mmu_pages);
 	ASSERT(is_empty_shadow_page(page->page_hpa));
 	page->slot_bitmap = 0;
-	page->global = 1;
 	page->multimapped = 0;
 	page->parent_pte = parent_pte;
 	--vcpu->kvm->n_free_mmu_pages;
@@ -927,11 +926,6 @@ static void paging_new_cr3(struct kvm_vcpu *vcpu)
 	kvm_arch_ops->set_cr3(vcpu, vcpu->mmu.root_hpa);
 }
 
-static void mark_pagetable_nonglobal(void *shadow_pte)
-{
-	page_header(__pa(shadow_pte))->global = 0;
-}
-
 static inline void set_pte_common(struct kvm_vcpu *vcpu,
 			     u64 *shadow_pte,
 			     gpa_t gaddr,
@@ -949,9 +943,6 @@ static inline void set_pte_common(struct kvm_vcpu *vcpu,
 
 	*shadow_pte |= access_bits;
 
-	if (!(*shadow_pte & PT_GLOBAL_MASK))
-		mark_pagetable_nonglobal(shadow_pte);
-
 	if (is_error_hpa(paddr)) {
 		*shadow_pte |= gaddr;
 		*shadow_pte |= PT_SHADOW_IO_MARK;
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 23/41] KVM: Workaround vmx inability to virtualize the reset state
@ 2007-04-01 14:35                                               ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

The reset state has cs.selector == 0xf000 and cs.base == 0xffff0000,
which aren't compatible with vm86 mode, which is used for real mode
virtualization.

When we create a vcpu, we set cs.base to 0xf0000, but if we get there by
way of a reset, the values are inconsistent and vmx refuses to enter
guest mode.

Workaround by detecting the state and munging it appropriately.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/vmx.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 0d9bf0b..aa7e2ba 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -712,6 +712,8 @@ static void enter_rmode(struct kvm_vcpu *vcpu)
 
 	vmcs_write32(GUEST_CS_AR_BYTES, 0xf3);
 	vmcs_write32(GUEST_CS_LIMIT, 0xffff);
+	if (vmcs_readl(GUEST_CS_BASE) == 0xffff0000)
+		vmcs_writel(GUEST_CS_BASE, 0xf0000);
 	vmcs_write16(GUEST_CS_SELECTOR, vmcs_readl(GUEST_CS_BASE) >> 4);
 
 	fix_rmode_seg(VCPU_SREG_ES, &vcpu->rmode.es);
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 23/41] KVM: Workaround vmx inability to virtualize the reset state
@ 2007-04-01 14:35                                               ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

The reset state has cs.selector == 0xf000 and cs.base == 0xffff0000,
which aren't compatible with vm86 mode, which is used for real mode
virtualization.

When we create a vcpu, we set cs.base to 0xf0000, but if we get there by
way of a reset, the values are inconsistent and vmx refuses to enter
guest mode.

Workaround by detecting the state and munging it appropriately.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/vmx.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 0d9bf0b..aa7e2ba 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -712,6 +712,8 @@ static void enter_rmode(struct kvm_vcpu *vcpu)
 
 	vmcs_write32(GUEST_CS_AR_BYTES, 0xf3);
 	vmcs_write32(GUEST_CS_LIMIT, 0xffff);
+	if (vmcs_readl(GUEST_CS_BASE) == 0xffff0000)
+		vmcs_writel(GUEST_CS_BASE, 0xf0000);
 	vmcs_write16(GUEST_CS_SELECTOR, vmcs_readl(GUEST_CS_BASE) >> 4);
 
 	fix_rmode_seg(VCPU_SREG_ES, &vcpu->rmode.es);
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 24/41] KVM: Remove set_cr0_no_modeswitch() arch op
@ 2007-04-01 14:35                                                 ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

set_cr0_no_modeswitch() was a hack to avoid corrupting segment registers.
As we now cache the protected mode values on entry to real mode, this
isn't an issue anymore, and it interferes with reboot (which usually _is_
a modeswitch).

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h      |    2 --
 drivers/kvm/kvm_main.c |    2 +-
 drivers/kvm/svm.c      |    1 -
 drivers/kvm/vmx.c      |   17 -----------------
 4 files changed, 1 insertions(+), 21 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index a4331da..7361c45 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -383,8 +383,6 @@ struct kvm_arch_ops {
 	void (*get_cs_db_l_bits)(struct kvm_vcpu *vcpu, int *db, int *l);
 	void (*decache_cr0_cr4_guest_bits)(struct kvm_vcpu *vcpu);
 	void (*set_cr0)(struct kvm_vcpu *vcpu, unsigned long cr0);
-	void (*set_cr0_no_modeswitch)(struct kvm_vcpu *vcpu,
-				      unsigned long cr0);
 	void (*set_cr3)(struct kvm_vcpu *vcpu, unsigned long cr3);
 	void (*set_cr4)(struct kvm_vcpu *vcpu, unsigned long cr4);
 	void (*set_efer)(struct kvm_vcpu *vcpu, u64 efer);
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 0c30e1b..5ada7aa 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1927,7 +1927,7 @@ static int kvm_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 	kvm_arch_ops->decache_cr0_cr4_guest_bits(vcpu);
 
 	mmu_reset_needed |= vcpu->cr0 != sregs->cr0;
-	kvm_arch_ops->set_cr0_no_modeswitch(vcpu, sregs->cr0);
+	kvm_arch_ops->set_cr0(vcpu, sregs->cr0);
 
 	mmu_reset_needed |= vcpu->cr4 != sregs->cr4;
 	kvm_arch_ops->set_cr4(vcpu, sregs->cr4);
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 64afc5c..d3cc115 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -1716,7 +1716,6 @@ static struct kvm_arch_ops svm_arch_ops = {
 	.get_cs_db_l_bits = svm_get_cs_db_l_bits,
 	.decache_cr0_cr4_guest_bits = svm_decache_cr0_cr4_guest_bits,
 	.set_cr0 = svm_set_cr0,
-	.set_cr0_no_modeswitch = svm_set_cr0,
 	.set_cr3 = svm_set_cr3,
 	.set_cr4 = svm_set_cr4,
 	.set_efer = svm_set_efer,
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index aa7e2ba..027a962 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -788,22 +788,6 @@ static void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 	vcpu->cr0 = cr0;
 }
 
-/*
- * Used when restoring the VM to avoid corrupting segment registers
- */
-static void vmx_set_cr0_no_modeswitch(struct kvm_vcpu *vcpu, unsigned long cr0)
-{
-	if (!vcpu->rmode.active && !(cr0 & CR0_PE_MASK))
-		enter_rmode(vcpu);
-
-	vcpu->rmode.active = ((cr0 & CR0_PE_MASK) == 0);
-	update_exception_bitmap(vcpu);
-	vmcs_writel(CR0_READ_SHADOW, cr0);
-	vmcs_writel(GUEST_CR0,
-		    (cr0 & ~KVM_GUEST_CR0_MASK) | KVM_VM_CR0_ALWAYS_ON);
-	vcpu->cr0 = cr0;
-}
-
 static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 {
 	vmcs_writel(GUEST_CR3, cr3);
@@ -2069,7 +2053,6 @@ static struct kvm_arch_ops vmx_arch_ops = {
 	.get_cs_db_l_bits = vmx_get_cs_db_l_bits,
 	.decache_cr0_cr4_guest_bits = vmx_decache_cr0_cr4_guest_bits,
 	.set_cr0 = vmx_set_cr0,
-	.set_cr0_no_modeswitch = vmx_set_cr0_no_modeswitch,
 	.set_cr3 = vmx_set_cr3,
 	.set_cr4 = vmx_set_cr4,
 #ifdef CONFIG_X86_64
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 24/41] KVM: Remove set_cr0_no_modeswitch() arch op
@ 2007-04-01 14:35                                                 ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

set_cr0_no_modeswitch() was a hack to avoid corrupting segment registers.
As we now cache the protected mode values on entry to real mode, this
isn't an issue anymore, and it interferes with reboot (which usually _is_
a modeswitch).

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h      |    2 --
 drivers/kvm/kvm_main.c |    2 +-
 drivers/kvm/svm.c      |    1 -
 drivers/kvm/vmx.c      |   17 -----------------
 4 files changed, 1 insertions(+), 21 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index a4331da..7361c45 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -383,8 +383,6 @@ struct kvm_arch_ops {
 	void (*get_cs_db_l_bits)(struct kvm_vcpu *vcpu, int *db, int *l);
 	void (*decache_cr0_cr4_guest_bits)(struct kvm_vcpu *vcpu);
 	void (*set_cr0)(struct kvm_vcpu *vcpu, unsigned long cr0);
-	void (*set_cr0_no_modeswitch)(struct kvm_vcpu *vcpu,
-				      unsigned long cr0);
 	void (*set_cr3)(struct kvm_vcpu *vcpu, unsigned long cr3);
 	void (*set_cr4)(struct kvm_vcpu *vcpu, unsigned long cr4);
 	void (*set_efer)(struct kvm_vcpu *vcpu, u64 efer);
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 0c30e1b..5ada7aa 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1927,7 +1927,7 @@ static int kvm_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 	kvm_arch_ops->decache_cr0_cr4_guest_bits(vcpu);
 
 	mmu_reset_needed |= vcpu->cr0 != sregs->cr0;
-	kvm_arch_ops->set_cr0_no_modeswitch(vcpu, sregs->cr0);
+	kvm_arch_ops->set_cr0(vcpu, sregs->cr0);
 
 	mmu_reset_needed |= vcpu->cr4 != sregs->cr4;
 	kvm_arch_ops->set_cr4(vcpu, sregs->cr4);
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 64afc5c..d3cc115 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -1716,7 +1716,6 @@ static struct kvm_arch_ops svm_arch_ops = {
 	.get_cs_db_l_bits = svm_get_cs_db_l_bits,
 	.decache_cr0_cr4_guest_bits = svm_decache_cr0_cr4_guest_bits,
 	.set_cr0 = svm_set_cr0,
-	.set_cr0_no_modeswitch = svm_set_cr0,
 	.set_cr3 = svm_set_cr3,
 	.set_cr4 = svm_set_cr4,
 	.set_efer = svm_set_efer,
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index aa7e2ba..027a962 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -788,22 +788,6 @@ static void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 	vcpu->cr0 = cr0;
 }
 
-/*
- * Used when restoring the VM to avoid corrupting segment registers
- */
-static void vmx_set_cr0_no_modeswitch(struct kvm_vcpu *vcpu, unsigned long cr0)
-{
-	if (!vcpu->rmode.active && !(cr0 & CR0_PE_MASK))
-		enter_rmode(vcpu);
-
-	vcpu->rmode.active = ((cr0 & CR0_PE_MASK) == 0);
-	update_exception_bitmap(vcpu);
-	vmcs_writel(CR0_READ_SHADOW, cr0);
-	vmcs_writel(GUEST_CR0,
-		    (cr0 & ~KVM_GUEST_CR0_MASK) | KVM_VM_CR0_ALWAYS_ON);
-	vcpu->cr0 = cr0;
-}
-
 static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 {
 	vmcs_writel(GUEST_CR3, cr3);
@@ -2069,7 +2053,6 @@ static struct kvm_arch_ops vmx_arch_ops = {
 	.get_cs_db_l_bits = vmx_get_cs_db_l_bits,
 	.decache_cr0_cr4_guest_bits = vmx_decache_cr0_cr4_guest_bits,
 	.set_cr0 = vmx_set_cr0,
-	.set_cr0_no_modeswitch = vmx_set_cr0_no_modeswitch,
 	.set_cr3 = vmx_set_cr3,
 	.set_cr4 = vmx_set_cr4,
 #ifdef CONFIG_X86_64
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 25/41] KVM: Modify guest segments after potentially switching modes
@ 2007-04-01 14:35                                                   ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

The SET_SREGS ioctl modifies both cr0.pe (real mode/protected mode) and
guest segment registers.  Since segment handling is modified by the mode on
Intel procesors, update the segment registers after the mode switch has taken
place.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm_main.c |   20 ++++++++++----------
 1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 5ada7aa..12c3388 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1895,16 +1895,6 @@ static int kvm_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 
 	vcpu_load(vcpu);
 
-	set_segment(vcpu, &sregs->cs, VCPU_SREG_CS);
-	set_segment(vcpu, &sregs->ds, VCPU_SREG_DS);
-	set_segment(vcpu, &sregs->es, VCPU_SREG_ES);
-	set_segment(vcpu, &sregs->fs, VCPU_SREG_FS);
-	set_segment(vcpu, &sregs->gs, VCPU_SREG_GS);
-	set_segment(vcpu, &sregs->ss, VCPU_SREG_SS);
-
-	set_segment(vcpu, &sregs->tr, VCPU_SREG_TR);
-	set_segment(vcpu, &sregs->ldt, VCPU_SREG_LDTR);
-
 	dt.limit = sregs->idt.limit;
 	dt.base = sregs->idt.base;
 	kvm_arch_ops->set_idt(vcpu, &dt);
@@ -1944,6 +1934,16 @@ static int kvm_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 		if (vcpu->irq_pending[i])
 			__set_bit(i, &vcpu->irq_summary);
 
+	set_segment(vcpu, &sregs->cs, VCPU_SREG_CS);
+	set_segment(vcpu, &sregs->ds, VCPU_SREG_DS);
+	set_segment(vcpu, &sregs->es, VCPU_SREG_ES);
+	set_segment(vcpu, &sregs->fs, VCPU_SREG_FS);
+	set_segment(vcpu, &sregs->gs, VCPU_SREG_GS);
+	set_segment(vcpu, &sregs->ss, VCPU_SREG_SS);
+
+	set_segment(vcpu, &sregs->tr, VCPU_SREG_TR);
+	set_segment(vcpu, &sregs->ldt, VCPU_SREG_LDTR);
+
 	vcpu_put(vcpu);
 
 	return 0;
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 25/41] KVM: Modify guest segments after potentially switching modes
@ 2007-04-01 14:35                                                   ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

The SET_SREGS ioctl modifies both cr0.pe (real mode/protected mode) and
guest segment registers.  Since segment handling is modified by the mode on
Intel procesors, update the segment registers after the mode switch has taken
place.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm_main.c |   20 ++++++++++----------
 1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 5ada7aa..12c3388 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1895,16 +1895,6 @@ static int kvm_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 
 	vcpu_load(vcpu);
 
-	set_segment(vcpu, &sregs->cs, VCPU_SREG_CS);
-	set_segment(vcpu, &sregs->ds, VCPU_SREG_DS);
-	set_segment(vcpu, &sregs->es, VCPU_SREG_ES);
-	set_segment(vcpu, &sregs->fs, VCPU_SREG_FS);
-	set_segment(vcpu, &sregs->gs, VCPU_SREG_GS);
-	set_segment(vcpu, &sregs->ss, VCPU_SREG_SS);
-
-	set_segment(vcpu, &sregs->tr, VCPU_SREG_TR);
-	set_segment(vcpu, &sregs->ldt, VCPU_SREG_LDTR);
-
 	dt.limit = sregs->idt.limit;
 	dt.base = sregs->idt.base;
 	kvm_arch_ops->set_idt(vcpu, &dt);
@@ -1944,6 +1934,16 @@ static int kvm_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 		if (vcpu->irq_pending[i])
 			__set_bit(i, &vcpu->irq_summary);
 
+	set_segment(vcpu, &sregs->cs, VCPU_SREG_CS);
+	set_segment(vcpu, &sregs->ds, VCPU_SREG_DS);
+	set_segment(vcpu, &sregs->es, VCPU_SREG_ES);
+	set_segment(vcpu, &sregs->fs, VCPU_SREG_FS);
+	set_segment(vcpu, &sregs->gs, VCPU_SREG_GS);
+	set_segment(vcpu, &sregs->ss, VCPU_SREG_SS);
+
+	set_segment(vcpu, &sregs->tr, VCPU_SREG_TR);
+	set_segment(vcpu, &sregs->ldt, VCPU_SREG_LDTR);
+
 	vcpu_put(vcpu);
 
 	return 0;
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 26/41] KVM: Hack real-mode segments on vmx from KVM_SET_SREGS
@ 2007-04-01 14:35                                                     ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

As usual, we need to mangle segment registers when emulating real mode
as vm86 has specific constraints.  We special case the reset segment base,
and set the "access rights" (or descriptor flags) to vm86 comaptible values.

This fixes reboot on vmx.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/vmx.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 027a962..578dff5 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -864,7 +864,14 @@ static void vmx_set_segment(struct kvm_vcpu *vcpu,
 	vmcs_writel(sf->base, var->base);
 	vmcs_write32(sf->limit, var->limit);
 	vmcs_write16(sf->selector, var->selector);
-	if (var->unusable)
+	if (vcpu->rmode.active && var->s) {
+		/*
+		 * Hack real-mode segments into vm86 compatibility.
+		 */
+		if (var->base == 0xffff0000 && var->selector == 0xf000)
+			vmcs_writel(sf->base, 0xf0000);
+		ar = 0xf3;
+	} else if (var->unusable)
 		ar = 1 << 16;
 	else {
 		ar = var->type & 15;
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 26/41] KVM: Hack real-mode segments on vmx from KVM_SET_SREGS
@ 2007-04-01 14:35                                                     ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

As usual, we need to mangle segment registers when emulating real mode
as vm86 has specific constraints.  We special case the reset segment base,
and set the "access rights" (or descriptor flags) to vm86 comaptible values.

This fixes reboot on vmx.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/vmx.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 027a962..578dff5 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -864,7 +864,14 @@ static void vmx_set_segment(struct kvm_vcpu *vcpu,
 	vmcs_writel(sf->base, var->base);
 	vmcs_write32(sf->limit, var->limit);
 	vmcs_write16(sf->selector, var->selector);
-	if (var->unusable)
+	if (vcpu->rmode.active && var->s) {
+		/*
+		 * Hack real-mode segments into vm86 compatibility.
+		 */
+		if (var->base == 0xffff0000 && var->selector == 0xf000)
+			vmcs_writel(sf->base, 0xf0000);
+		ar = 0xf3;
+	} else if (var->unusable)
 		ar = 1 << 16;
 	else {
 		ar = var->type & 15;
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 27/41] KVM: Don't allow the guest to turn off the cpu cache
@ 2007-04-01 14:35                                                       ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

The cpu cache is a host resource; the guest should not be able to turn
it off (even for itself).

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/svm.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index d3cc115..191bc45 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -737,8 +737,10 @@ static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 	}
 #endif
 	vcpu->svm->cr0 = cr0;
-	vcpu->svm->vmcb->save.cr0 = cr0 | CR0_PG_MASK | CR0_WP_MASK;
 	vcpu->cr0 = cr0;
+	cr0 |= CR0_PG_MASK | CR0_WP_MASK;
+	cr0 &= ~(CR0_CD_MASK | CR0_NW_MASK);
+	vcpu->svm->vmcb->save.cr0 = cr0;
 }
 
 static void svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 27/41] KVM: Don't allow the guest to turn off the cpu cache
@ 2007-04-01 14:35                                                       ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

The cpu cache is a host resource; the guest should not be able to turn
it off (even for itself).

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/svm.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index d3cc115..191bc45 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -737,8 +737,10 @@ static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 	}
 #endif
 	vcpu->svm->cr0 = cr0;
-	vcpu->svm->vmcb->save.cr0 = cr0 | CR0_PG_MASK | CR0_WP_MASK;
 	vcpu->cr0 = cr0;
+	cr0 |= CR0_PG_MASK | CR0_WP_MASK;
+	cr0 &= ~(CR0_CD_MASK | CR0_NW_MASK);
+	vcpu->svm->vmcb->save.cr0 = cr0;
 }
 
 static void svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 28/41] KVM: Remove unused and write-only variables
@ 2007-04-01 14:35                                                         ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Trivial cleanup.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm_svm.h |    2 --
 drivers/kvm/svm.c     |    2 --
 2 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/drivers/kvm/kvm_svm.h b/drivers/kvm/kvm_svm.h
index 624f1ca..a1a9eba 100644
--- a/drivers/kvm/kvm_svm.h
+++ b/drivers/kvm/kvm_svm.h
@@ -28,8 +28,6 @@ struct vcpu_svm {
 	struct svm_cpu_data *svm_data;
 	uint64_t asid_generation;
 
-	unsigned long cr0;
-	unsigned long cr4;
 	unsigned long db_regs[NUM_DB_REGS];
 
 	u64 next_rip;
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 191bc45..ddc0505 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -576,7 +576,6 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 	vcpu->svm->vmcb = page_address(page);
 	memset(vcpu->svm->vmcb, 0, PAGE_SIZE);
 	vcpu->svm->vmcb_pa = page_to_pfn(page) << PAGE_SHIFT;
-	vcpu->svm->cr0 = 0x00000010;
 	vcpu->svm->asid_generation = 0;
 	memset(vcpu->svm->db_regs, 0, sizeof(vcpu->svm->db_regs));
 	init_vmcb(vcpu->svm->vmcb);
@@ -736,7 +735,6 @@ static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 		}
 	}
 #endif
-	vcpu->svm->cr0 = cr0;
 	vcpu->cr0 = cr0;
 	cr0 |= CR0_PG_MASK | CR0_WP_MASK;
 	cr0 &= ~(CR0_CD_MASK | CR0_NW_MASK);
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 28/41] KVM: Remove unused and write-only variables
@ 2007-04-01 14:35                                                         ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

Trivial cleanup.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm_svm.h |    2 --
 drivers/kvm/svm.c     |    2 --
 2 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/drivers/kvm/kvm_svm.h b/drivers/kvm/kvm_svm.h
index 624f1ca..a1a9eba 100644
--- a/drivers/kvm/kvm_svm.h
+++ b/drivers/kvm/kvm_svm.h
@@ -28,8 +28,6 @@ struct vcpu_svm {
 	struct svm_cpu_data *svm_data;
 	uint64_t asid_generation;
 
-	unsigned long cr0;
-	unsigned long cr4;
 	unsigned long db_regs[NUM_DB_REGS];
 
 	u64 next_rip;
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 191bc45..ddc0505 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -576,7 +576,6 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 	vcpu->svm->vmcb = page_address(page);
 	memset(vcpu->svm->vmcb, 0, PAGE_SIZE);
 	vcpu->svm->vmcb_pa = page_to_pfn(page) << PAGE_SHIFT;
-	vcpu->svm->cr0 = 0x00000010;
 	vcpu->svm->asid_generation = 0;
 	memset(vcpu->svm->db_regs, 0, sizeof(vcpu->svm->db_regs));
 	init_vmcb(vcpu->svm->vmcb);
@@ -736,7 +735,6 @@ static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 		}
 	}
 #endif
-	vcpu->svm->cr0 = cr0;
 	vcpu->cr0 = cr0;
 	cr0 |= CR0_PG_MASK | CR0_WP_MASK;
 	cr0 &= ~(CR0_CD_MASK | CR0_NW_MASK);
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 29/41] KVM: Handle writes to MCG_STATUS msr
@ 2007-04-01 14:35                                                           ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Sergey Kiselev, Avi Kivity

From: Sergey Kiselev <sergey.kiselev@intel.com>

Some older (~2.6.7) kernels write MCG_STATUS register during kernel
boot (mce_clear_all() function, called from mce_init()). It's not
currently handled by kvm and will cause it to inject a GPF.
Following patch adds a "nop" handler for this.

Signed-off-by: Sergey Kiselev <sergey.kiselev@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm_main.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 12c3388..1ef5e9a 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1467,6 +1467,10 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		printk(KERN_WARNING "%s: MSR_IA32_MC0_STATUS 0x%llx, nop\n",
 		       __FUNCTION__, data);
 		break;
+	case MSR_IA32_MCG_STATUS:
+		printk(KERN_WARNING "%s: MSR_IA32_MCG_STATUS 0x%llx, nop\n",
+			__FUNCTION__, data);
+		break;
 	case MSR_IA32_UCODE_REV:
 	case MSR_IA32_UCODE_WRITE:
 	case 0x200 ... 0x2ff: /* MTRRs */
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 29/41] KVM: Handle writes to MCG_STATUS msr
@ 2007-04-01 14:35                                                           ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, Sergey Kiselev

From: Sergey Kiselev <sergey.kiselev-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Some older (~2.6.7) kernels write MCG_STATUS register during kernel
boot (mce_clear_all() function, called from mce_init()). It's not
currently handled by kvm and will cause it to inject a GPF.
Following patch adds a "nop" handler for this.

Signed-off-by: Sergey Kiselev <sergey.kiselev-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm_main.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 12c3388..1ef5e9a 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1467,6 +1467,10 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		printk(KERN_WARNING "%s: MSR_IA32_MC0_STATUS 0x%llx, nop\n",
 		       __FUNCTION__, data);
 		break;
+	case MSR_IA32_MCG_STATUS:
+		printk(KERN_WARNING "%s: MSR_IA32_MCG_STATUS 0x%llx, nop\n",
+			__FUNCTION__, data);
+		break;
 	case MSR_IA32_UCODE_REV:
 	case MSR_IA32_UCODE_WRITE:
 	case 0x200 ... 0x2ff: /* MTRRs */
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 30/41] KVM: SVM: forbid guest to execute monitor/mwait
@ 2007-04-01 14:35                                                             ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Joerg Roedel, Avi Kivity

From: Joerg Roedel <joerg.roedel@amd.com>

This patch forbids the guest to execute monitor/mwait instructions on
SVM. This is necessary because the guest can execute these instructions
if they are available even if the kvm cpuid doesn't report its
existence.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/svm.c |    6 +++++-
 drivers/kvm/svm.h |    6 ++++++
 2 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index ddc0505..0542d33 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -511,7 +511,9 @@ static void init_vmcb(struct vmcb *vmcb)
 				(1ULL << INTERCEPT_VMSAVE) |
 				(1ULL << INTERCEPT_STGI) |
 				(1ULL << INTERCEPT_CLGI) |
-				(1ULL << INTERCEPT_SKINIT);
+				(1ULL << INTERCEPT_SKINIT) |
+				(1ULL << INTERCEPT_MONITOR) |
+				(1ULL << INTERCEPT_MWAIT);
 
 	control->iopm_base_pa = iopm_base;
 	control->msrpm_base_pa = msrpm_base;
@@ -1292,6 +1294,8 @@ static int (*svm_exit_handlers[])(struct kvm_vcpu *vcpu,
 	[SVM_EXIT_STGI]				= invalid_op_interception,
 	[SVM_EXIT_CLGI]				= invalid_op_interception,
 	[SVM_EXIT_SKINIT]			= invalid_op_interception,
+	[SVM_EXIT_MONITOR]			= invalid_op_interception,
+	[SVM_EXIT_MWAIT]			= invalid_op_interception,
 };
 
 
diff --git a/drivers/kvm/svm.h b/drivers/kvm/svm.h
index df731c3..5e93814 100644
--- a/drivers/kvm/svm.h
+++ b/drivers/kvm/svm.h
@@ -44,6 +44,9 @@ enum {
 	INTERCEPT_RDTSCP,
 	INTERCEPT_ICEBP,
 	INTERCEPT_WBINVD,
+	INTERCEPT_MONITOR,
+	INTERCEPT_MWAIT,
+	INTERCEPT_MWAIT_COND,
 };
 
 
@@ -298,6 +301,9 @@ struct __attribute__ ((__packed__)) vmcb {
 #define SVM_EXIT_RDTSCP		0x087
 #define SVM_EXIT_ICEBP		0x088
 #define SVM_EXIT_WBINVD		0x089
+#define SVM_EXIT_MONITOR	0x08a
+#define SVM_EXIT_MWAIT		0x08b
+#define SVM_EXIT_MWAIT_COND	0x08c
 #define SVM_EXIT_NPF  		0x400
 
 #define SVM_EXIT_ERR		-1
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 30/41] KVM: SVM: forbid guest to execute monitor/mwait
@ 2007-04-01 14:35                                                             ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

From: Joerg Roedel <joerg.roedel-5C7GfCeVMHo@public.gmane.org>

This patch forbids the guest to execute monitor/mwait instructions on
SVM. This is necessary because the guest can execute these instructions
if they are available even if the kvm cpuid doesn't report its
existence.

Signed-off-by: Joerg Roedel <joerg.roedel-5C7GfCeVMHo@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/svm.c |    6 +++++-
 drivers/kvm/svm.h |    6 ++++++
 2 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index ddc0505..0542d33 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -511,7 +511,9 @@ static void init_vmcb(struct vmcb *vmcb)
 				(1ULL << INTERCEPT_VMSAVE) |
 				(1ULL << INTERCEPT_STGI) |
 				(1ULL << INTERCEPT_CLGI) |
-				(1ULL << INTERCEPT_SKINIT);
+				(1ULL << INTERCEPT_SKINIT) |
+				(1ULL << INTERCEPT_MONITOR) |
+				(1ULL << INTERCEPT_MWAIT);
 
 	control->iopm_base_pa = iopm_base;
 	control->msrpm_base_pa = msrpm_base;
@@ -1292,6 +1294,8 @@ static int (*svm_exit_handlers[])(struct kvm_vcpu *vcpu,
 	[SVM_EXIT_STGI]				= invalid_op_interception,
 	[SVM_EXIT_CLGI]				= invalid_op_interception,
 	[SVM_EXIT_SKINIT]			= invalid_op_interception,
+	[SVM_EXIT_MONITOR]			= invalid_op_interception,
+	[SVM_EXIT_MWAIT]			= invalid_op_interception,
 };
 
 
diff --git a/drivers/kvm/svm.h b/drivers/kvm/svm.h
index df731c3..5e93814 100644
--- a/drivers/kvm/svm.h
+++ b/drivers/kvm/svm.h
@@ -44,6 +44,9 @@ enum {
 	INTERCEPT_RDTSCP,
 	INTERCEPT_ICEBP,
 	INTERCEPT_WBINVD,
+	INTERCEPT_MONITOR,
+	INTERCEPT_MWAIT,
+	INTERCEPT_MWAIT_COND,
 };
 
 
@@ -298,6 +301,9 @@ struct __attribute__ ((__packed__)) vmcb {
 #define SVM_EXIT_RDTSCP		0x087
 #define SVM_EXIT_ICEBP		0x088
 #define SVM_EXIT_WBINVD		0x089
+#define SVM_EXIT_MONITOR	0x08a
+#define SVM_EXIT_MWAIT		0x08b
+#define SVM_EXIT_MWAIT_COND	0x08c
 #define SVM_EXIT_NPF  		0x400
 
 #define SVM_EXIT_ERR		-1
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 31/41] KVM: MMU: Fix hugepage pdes mapping same physical address with different access
@ 2007-04-01 14:35                                                               ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

The kvm mmu keeps a shadow page for hugepage pdes; if several such pdes map
the same physical address, they share the same shadow page.  This is a fairly
common case (kernel mappings on i386 nonpae Linux, for example).

However, if the two pdes map the same memory but with different permissions, kvm
will happily use the cached shadow page.  If the access through the more
permissive pde will occur after the access to the strict pde, an endless pagefault
loop will be generated and the guest will make no progress.

Fix by making the access permissions part of the cache lookup key.

The fix allows Xen pae to boot on kvm and run guest domains.

Thanks to Jeremy Fitzhardinge for reporting the bug and testing the fix.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h         |    2 ++
 drivers/kvm/mmu.c         |    8 +++++---
 drivers/kvm/paging_tmpl.h |    7 ++++++-
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 7361c45..f5e343c 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -109,6 +109,7 @@ struct kvm_pte_chain {
  *   bits 4:7 - page table level for this shadow (1-4)
  *   bits 8:9 - page table quadrant for 2-level guests
  *   bit   16 - "metaphysical" - gfn is not a real page (huge page/real mode)
+ *   bits 17:18 - "access" - the user and writable bits of a huge page pde
  */
 union kvm_mmu_page_role {
 	unsigned word;
@@ -118,6 +119,7 @@ union kvm_mmu_page_role {
 		unsigned quadrant : 2;
 		unsigned pad_for_nice_hex_output : 6;
 		unsigned metaphysical : 1;
+		unsigned hugepage_access : 2;
 	};
 };
 
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index b181106..0216b77 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -568,6 +568,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 					     gva_t gaddr,
 					     unsigned level,
 					     int metaphysical,
+					     unsigned hugepage_access,
 					     u64 *parent_pte)
 {
 	union kvm_mmu_page_role role;
@@ -581,6 +582,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 	role.glevels = vcpu->mmu.root_level;
 	role.level = level;
 	role.metaphysical = metaphysical;
+	role.hugepage_access = hugepage_access;
 	if (vcpu->mmu.root_level <= PT32_ROOT_LEVEL) {
 		quadrant = gaddr >> (PAGE_SHIFT + (PT64_PT_BITS * level));
 		quadrant &= (1 << ((PT32_PT_BITS - PT64_PT_BITS) * level)) - 1;
@@ -780,7 +782,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, hpa_t p)
 				>> PAGE_SHIFT;
 			new_table = kvm_mmu_get_page(vcpu, pseudo_gfn,
 						     v, level - 1,
-						     1, &table[index]);
+						     1, 0, &table[index]);
 			if (!new_table) {
 				pgprintk("nonpaging_map: ENOMEM\n");
 				return -ENOMEM;
@@ -835,7 +837,7 @@ static void mmu_alloc_roots(struct kvm_vcpu *vcpu)
 
 		ASSERT(!VALID_PAGE(root));
 		page = kvm_mmu_get_page(vcpu, root_gfn, 0,
-					PT64_ROOT_LEVEL, 0, NULL);
+					PT64_ROOT_LEVEL, 0, 0, NULL);
 		root = page->page_hpa;
 		++page->root_count;
 		vcpu->mmu.root_hpa = root;
@@ -852,7 +854,7 @@ static void mmu_alloc_roots(struct kvm_vcpu *vcpu)
 			root_gfn = 0;
 		page = kvm_mmu_get_page(vcpu, root_gfn, i << 30,
 					PT32_ROOT_LEVEL, !is_paging(vcpu),
-					NULL);
+					0, NULL);
 		root = page->page_hpa;
 		++page->root_count;
 		vcpu->mmu.pae_root[i] = root | PT_PRESENT_MASK;
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 17bd440..b94010d 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -247,6 +247,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 		u64 shadow_pte;
 		int metaphysical;
 		gfn_t table_gfn;
+		unsigned hugepage_access = 0;
 
 		if (is_present_pte(*shadow_ent) || is_io_pte(*shadow_ent)) {
 			if (level == PT_PAGE_TABLE_LEVEL)
@@ -276,6 +277,9 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 		if (level - 1 == PT_PAGE_TABLE_LEVEL
 		    && walker->level == PT_DIRECTORY_LEVEL) {
 			metaphysical = 1;
+			hugepage_access = *guest_ent;
+			hugepage_access &= PT_USER_MASK | PT_WRITABLE_MASK;
+			hugepage_access >>= PT_WRITABLE_SHIFT;
 			table_gfn = (*guest_ent & PT_BASE_ADDR_MASK)
 				>> PAGE_SHIFT;
 		} else {
@@ -283,7 +287,8 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 			table_gfn = walker->table_gfn[level - 2];
 		}
 		shadow_page = kvm_mmu_get_page(vcpu, table_gfn, addr, level-1,
-					       metaphysical, shadow_ent);
+					       metaphysical, hugepage_access,
+					       shadow_ent);
 		shadow_addr = shadow_page->page_hpa;
 		shadow_pte = shadow_addr | PT_PRESENT_MASK | PT_ACCESSED_MASK
 			| PT_WRITABLE_MASK | PT_USER_MASK;
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 31/41] KVM: MMU: Fix hugepage pdes mapping same physical address with different access
@ 2007-04-01 14:35                                                               ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

The kvm mmu keeps a shadow page for hugepage pdes; if several such pdes map
the same physical address, they share the same shadow page.  This is a fairly
common case (kernel mappings on i386 nonpae Linux, for example).

However, if the two pdes map the same memory but with different permissions, kvm
will happily use the cached shadow page.  If the access through the more
permissive pde will occur after the access to the strict pde, an endless pagefault
loop will be generated and the guest will make no progress.

Fix by making the access permissions part of the cache lookup key.

The fix allows Xen pae to boot on kvm and run guest domains.

Thanks to Jeremy Fitzhardinge for reporting the bug and testing the fix.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h         |    2 ++
 drivers/kvm/mmu.c         |    8 +++++---
 drivers/kvm/paging_tmpl.h |    7 ++++++-
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 7361c45..f5e343c 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -109,6 +109,7 @@ struct kvm_pte_chain {
  *   bits 4:7 - page table level for this shadow (1-4)
  *   bits 8:9 - page table quadrant for 2-level guests
  *   bit   16 - "metaphysical" - gfn is not a real page (huge page/real mode)
+ *   bits 17:18 - "access" - the user and writable bits of a huge page pde
  */
 union kvm_mmu_page_role {
 	unsigned word;
@@ -118,6 +119,7 @@ union kvm_mmu_page_role {
 		unsigned quadrant : 2;
 		unsigned pad_for_nice_hex_output : 6;
 		unsigned metaphysical : 1;
+		unsigned hugepage_access : 2;
 	};
 };
 
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index b181106..0216b77 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -568,6 +568,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 					     gva_t gaddr,
 					     unsigned level,
 					     int metaphysical,
+					     unsigned hugepage_access,
 					     u64 *parent_pte)
 {
 	union kvm_mmu_page_role role;
@@ -581,6 +582,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 	role.glevels = vcpu->mmu.root_level;
 	role.level = level;
 	role.metaphysical = metaphysical;
+	role.hugepage_access = hugepage_access;
 	if (vcpu->mmu.root_level <= PT32_ROOT_LEVEL) {
 		quadrant = gaddr >> (PAGE_SHIFT + (PT64_PT_BITS * level));
 		quadrant &= (1 << ((PT32_PT_BITS - PT64_PT_BITS) * level)) - 1;
@@ -780,7 +782,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, hpa_t p)
 				>> PAGE_SHIFT;
 			new_table = kvm_mmu_get_page(vcpu, pseudo_gfn,
 						     v, level - 1,
-						     1, &table[index]);
+						     1, 0, &table[index]);
 			if (!new_table) {
 				pgprintk("nonpaging_map: ENOMEM\n");
 				return -ENOMEM;
@@ -835,7 +837,7 @@ static void mmu_alloc_roots(struct kvm_vcpu *vcpu)
 
 		ASSERT(!VALID_PAGE(root));
 		page = kvm_mmu_get_page(vcpu, root_gfn, 0,
-					PT64_ROOT_LEVEL, 0, NULL);
+					PT64_ROOT_LEVEL, 0, 0, NULL);
 		root = page->page_hpa;
 		++page->root_count;
 		vcpu->mmu.root_hpa = root;
@@ -852,7 +854,7 @@ static void mmu_alloc_roots(struct kvm_vcpu *vcpu)
 			root_gfn = 0;
 		page = kvm_mmu_get_page(vcpu, root_gfn, i << 30,
 					PT32_ROOT_LEVEL, !is_paging(vcpu),
-					NULL);
+					0, NULL);
 		root = page->page_hpa;
 		++page->root_count;
 		vcpu->mmu.pae_root[i] = root | PT_PRESENT_MASK;
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 17bd440..b94010d 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -247,6 +247,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 		u64 shadow_pte;
 		int metaphysical;
 		gfn_t table_gfn;
+		unsigned hugepage_access = 0;
 
 		if (is_present_pte(*shadow_ent) || is_io_pte(*shadow_ent)) {
 			if (level == PT_PAGE_TABLE_LEVEL)
@@ -276,6 +277,9 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 		if (level - 1 == PT_PAGE_TABLE_LEVEL
 		    && walker->level == PT_DIRECTORY_LEVEL) {
 			metaphysical = 1;
+			hugepage_access = *guest_ent;
+			hugepage_access &= PT_USER_MASK | PT_WRITABLE_MASK;
+			hugepage_access >>= PT_WRITABLE_SHIFT;
 			table_gfn = (*guest_ent & PT_BASE_ADDR_MASK)
 				>> PAGE_SHIFT;
 		} else {
@@ -283,7 +287,8 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 			table_gfn = walker->table_gfn[level - 2];
 		}
 		shadow_page = kvm_mmu_get_page(vcpu, table_gfn, addr, level-1,
-					       metaphysical, shadow_ent);
+					       metaphysical, hugepage_access,
+					       shadow_ent);
 		shadow_addr = shadow_page->page_hpa;
 		shadow_pte = shadow_addr | PT_PRESENT_MASK | PT_ACCESSED_MASK
 			| PT_WRITABLE_MASK | PT_USER_MASK;
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 32/41] KVM: SVM: Ensure timestamp counter monotonicity
@ 2007-04-01 14:35                                                                 ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

When a vcpu is migrated from one cpu to another, its timestamp counter
may lose its monotonic property if the host has unsynced timestamp counters.
This can confuse the guest, sometimes to the point of refusing to boot.

As the rdtsc instruction is rather fast on AMD processors (7-10 cycles),
we can simply record the last host tsc when we drop the cpu, and adjust
the vcpu tsc offset when we detect that we've migrated to a different cpu.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h |    1 +
 drivers/kvm/svm.c |   21 +++++++++++++++++----
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index f5e343c..6d0bd7a 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -244,6 +244,7 @@ struct kvm_vcpu {
 	struct mutex mutex;
 	int   cpu;
 	int   launched;
+	u64 host_tsc;
 	struct kvm_run *run;
 	int interrupt_window_open;
 	unsigned long irq_summary; /* bit vector: 1 per word in irq_pending */
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 0542d33..ca2642f 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -459,7 +459,6 @@ static void init_vmcb(struct vmcb *vmcb)
 {
 	struct vmcb_control_area *control = &vmcb->control;
 	struct vmcb_save_area *save = &vmcb->save;
-	u64 tsc;
 
 	control->intercept_cr_read = 	INTERCEPT_CR0_MASK |
 					INTERCEPT_CR3_MASK |
@@ -517,8 +516,7 @@ static void init_vmcb(struct vmcb *vmcb)
 
 	control->iopm_base_pa = iopm_base;
 	control->msrpm_base_pa = msrpm_base;
-	rdtscll(tsc);
-	control->tsc_offset = -tsc;
+	control->tsc_offset = 0;
 	control->int_ctl = V_INTR_MASKING_MASK;
 
 	init_seg(&save->es);
@@ -606,11 +604,26 @@ static void svm_free_vcpu(struct kvm_vcpu *vcpu)
 
 static void svm_vcpu_load(struct kvm_vcpu *vcpu)
 {
-	get_cpu();
+	int cpu;
+
+	cpu = get_cpu();
+	if (unlikely(cpu != vcpu->cpu)) {
+		u64 tsc_this, delta;
+
+		/*
+		 * Make sure that the guest sees a monotonically
+		 * increasing TSC.
+		 */
+		rdtscll(tsc_this);
+		delta = vcpu->host_tsc - tsc_this;
+		vcpu->svm->vmcb->control.tsc_offset += delta;
+		vcpu->cpu = cpu;
+	}
 }
 
 static void svm_vcpu_put(struct kvm_vcpu *vcpu)
 {
+	rdtscll(vcpu->host_tsc);
 	put_cpu();
 }
 
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 32/41] KVM: SVM: Ensure timestamp counter monotonicity
@ 2007-04-01 14:35                                                                 ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

When a vcpu is migrated from one cpu to another, its timestamp counter
may lose its monotonic property if the host has unsynced timestamp counters.
This can confuse the guest, sometimes to the point of refusing to boot.

As the rdtsc instruction is rather fast on AMD processors (7-10 cycles),
we can simply record the last host tsc when we drop the cpu, and adjust
the vcpu tsc offset when we detect that we've migrated to a different cpu.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h |    1 +
 drivers/kvm/svm.c |   21 +++++++++++++++++----
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index f5e343c..6d0bd7a 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -244,6 +244,7 @@ struct kvm_vcpu {
 	struct mutex mutex;
 	int   cpu;
 	int   launched;
+	u64 host_tsc;
 	struct kvm_run *run;
 	int interrupt_window_open;
 	unsigned long irq_summary; /* bit vector: 1 per word in irq_pending */
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 0542d33..ca2642f 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -459,7 +459,6 @@ static void init_vmcb(struct vmcb *vmcb)
 {
 	struct vmcb_control_area *control = &vmcb->control;
 	struct vmcb_save_area *save = &vmcb->save;
-	u64 tsc;
 
 	control->intercept_cr_read = 	INTERCEPT_CR0_MASK |
 					INTERCEPT_CR3_MASK |
@@ -517,8 +516,7 @@ static void init_vmcb(struct vmcb *vmcb)
 
 	control->iopm_base_pa = iopm_base;
 	control->msrpm_base_pa = msrpm_base;
-	rdtscll(tsc);
-	control->tsc_offset = -tsc;
+	control->tsc_offset = 0;
 	control->int_ctl = V_INTR_MASKING_MASK;
 
 	init_seg(&save->es);
@@ -606,11 +604,26 @@ static void svm_free_vcpu(struct kvm_vcpu *vcpu)
 
 static void svm_vcpu_load(struct kvm_vcpu *vcpu)
 {
-	get_cpu();
+	int cpu;
+
+	cpu = get_cpu();
+	if (unlikely(cpu != vcpu->cpu)) {
+		u64 tsc_this, delta;
+
+		/*
+		 * Make sure that the guest sees a monotonically
+		 * increasing TSC.
+		 */
+		rdtscll(tsc_this);
+		delta = vcpu->host_tsc - tsc_this;
+		vcpu->svm->vmcb->control.tsc_offset += delta;
+		vcpu->cpu = cpu;
+	}
 }
 
 static void svm_vcpu_put(struct kvm_vcpu *vcpu)
 {
+	rdtscll(vcpu->host_tsc);
 	put_cpu();
 }
 
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 33/41] KVM: Remove unused function
@ 2007-04-01 14:35                                                                   ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Michal Piotrowski, Avi Kivity

From: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>

Remove unused function

CC      drivers/kvm/svm.o
drivers/kvm/svm.c:207: warning: ‘inject_db’ defined but not used

Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/svm.c |    7 -------
 1 files changed, 0 insertions(+), 7 deletions(-)

diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index ca2642f..303e959 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -203,13 +203,6 @@ static void inject_ud(struct kvm_vcpu *vcpu)
 						UD_VECTOR;
 }
 
-static void inject_db(struct kvm_vcpu *vcpu)
-{
-	vcpu->svm->vmcb->control.event_inj = 	SVM_EVTINJ_VALID |
-						SVM_EVTINJ_TYPE_EXEPT |
-						DB_VECTOR;
-}
-
 static int is_page_fault(uint32_t info)
 {
 	info &= SVM_EVTINJ_VEC_MASK | SVM_EVTINJ_TYPE_MASK | SVM_EVTINJ_VALID;
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 33/41] KVM: Remove unused function
@ 2007-04-01 14:35                                                                   ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, Michal Piotrowski

From: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>

Remove unused function

CC      drivers/kvm/svm.o
drivers/kvm/svm.c:207: warning: ‘inject_db’ defined but not used

Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/svm.c |    7 -------
 1 files changed, 0 insertions(+), 7 deletions(-)

diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index ca2642f..303e959 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -203,13 +203,6 @@ static void inject_ud(struct kvm_vcpu *vcpu)
 						UD_VECTOR;
 }
 
-static void inject_db(struct kvm_vcpu *vcpu)
-{
-	vcpu->svm->vmcb->control.event_inj = 	SVM_EVTINJ_VALID |
-						SVM_EVTINJ_TYPE_EXEPT |
-						DB_VECTOR;
-}
-
 static int is_page_fault(uint32_t info)
 {
 	info &= SVM_EVTINJ_VEC_MASK | SVM_EVTINJ_TYPE_MASK | SVM_EVTINJ_VALID;
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 34/41] KVM: Use list_move()
@ 2007-04-01 14:35                                                                     ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Use list_move() where possible.  Noticed by Dor Laor.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/mmu.c |   12 ++++--------
 1 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 0216b77..c2487b6 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -437,9 +437,8 @@ static void kvm_mmu_free_page(struct kvm_vcpu *vcpu, hpa_t page_hpa)
 	struct kvm_mmu_page *page_head = page_header(page_hpa);
 
 	ASSERT(is_empty_shadow_page(page_hpa));
-	list_del(&page_head->link);
 	page_head->page_hpa = page_hpa;
-	list_add(&page_head->link, &vcpu->free_pages);
+	list_move(&page_head->link, &vcpu->free_pages);
 	++vcpu->kvm->n_free_mmu_pages;
 }
 
@@ -457,8 +456,7 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
 		return NULL;
 
 	page = list_entry(vcpu->free_pages.next, struct kvm_mmu_page, link);
-	list_del(&page->link);
-	list_add(&page->link, &vcpu->kvm->active_mmu_pages);
+	list_move(&page->link, &vcpu->kvm->active_mmu_pages);
 	ASSERT(is_empty_shadow_page(page->page_hpa));
 	page->slot_bitmap = 0;
 	page->multimapped = 0;
@@ -670,10 +668,8 @@ static void kvm_mmu_zap_page(struct kvm_vcpu *vcpu,
 	if (!page->root_count) {
 		hlist_del(&page->hash_link);
 		kvm_mmu_free_page(vcpu, page->page_hpa);
-	} else {
-		list_del(&page->link);
-		list_add(&page->link, &vcpu->kvm->active_mmu_pages);
-	}
+	} else
+		list_move(&page->link, &vcpu->kvm->active_mmu_pages);
 }
 
 static int kvm_mmu_unprotect_page(struct kvm_vcpu *vcpu, gfn_t gfn)
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 34/41] KVM: Use list_move()
@ 2007-04-01 14:35                                                                     ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

Use list_move() where possible.  Noticed by Dor Laor.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/mmu.c |   12 ++++--------
 1 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 0216b77..c2487b6 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -437,9 +437,8 @@ static void kvm_mmu_free_page(struct kvm_vcpu *vcpu, hpa_t page_hpa)
 	struct kvm_mmu_page *page_head = page_header(page_hpa);
 
 	ASSERT(is_empty_shadow_page(page_hpa));
-	list_del(&page_head->link);
 	page_head->page_hpa = page_hpa;
-	list_add(&page_head->link, &vcpu->free_pages);
+	list_move(&page_head->link, &vcpu->free_pages);
 	++vcpu->kvm->n_free_mmu_pages;
 }
 
@@ -457,8 +456,7 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
 		return NULL;
 
 	page = list_entry(vcpu->free_pages.next, struct kvm_mmu_page, link);
-	list_del(&page->link);
-	list_add(&page->link, &vcpu->kvm->active_mmu_pages);
+	list_move(&page->link, &vcpu->kvm->active_mmu_pages);
 	ASSERT(is_empty_shadow_page(page->page_hpa));
 	page->slot_bitmap = 0;
 	page->multimapped = 0;
@@ -670,10 +668,8 @@ static void kvm_mmu_zap_page(struct kvm_vcpu *vcpu,
 	if (!page->root_count) {
 		hlist_del(&page->hash_link);
 		kvm_mmu_free_page(vcpu, page->page_hpa);
-	} else {
-		list_del(&page->link);
-		list_add(&page->link, &vcpu->kvm->active_mmu_pages);
-	}
+	} else
+		list_move(&page->link, &vcpu->kvm->active_mmu_pages);
 }
 
 static int kvm_mmu_unprotect_page(struct kvm_vcpu *vcpu, gfn_t gfn)
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 35/41] KVM: Remove debug message
@ 2007-04-01 14:35                                                                       ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

No longer interesting.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/vmx.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 578dff5..b64b7b7 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1131,7 +1131,6 @@ static int vmx_vcpu_setup(struct kvm_vcpu *vcpu)
 		vcpu->guest_msrs[j] = vcpu->host_msrs[j];
 		++vcpu->nmsrs;
 	}
-	printk(KERN_DEBUG "kvm: msrs: %d\n", vcpu->nmsrs);
 
 	nr_good_msrs = vcpu->nmsrs - NR_BAD_MSRS;
 	vmcs_writel(VM_ENTRY_MSR_LOAD_ADDR,
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 35/41] KVM: Remove debug message
@ 2007-04-01 14:35                                                                       ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

No longer interesting.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/vmx.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 578dff5..b64b7b7 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1131,7 +1131,6 @@ static int vmx_vcpu_setup(struct kvm_vcpu *vcpu)
 		vcpu->guest_msrs[j] = vcpu->host_msrs[j];
 		++vcpu->nmsrs;
 	}
-	printk(KERN_DEBUG "kvm: msrs: %d\n", vcpu->nmsrs);
 
 	nr_good_msrs = vcpu->nmsrs - NR_BAD_MSRS;
 	vmcs_writel(VM_ENTRY_MSR_LOAD_ADDR,
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 36/41] KVM: x86 emulator: fix bit string operations operand size
@ 2007-04-01 14:35                                                                         ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

On x86, bit operations operate on a string of bits that can reside in
multiple words.  For example, 'btsl %eax, (blah)' will touch the word
at blah+4 if %eax is between 32 and 63.

The x86 emulator compensates for that by advancing the operand address
by (bit offset / BITS_PER_LONG) and truncating the bit offset to the
range (0..BITS_PER_LONG-1).  This has a side effect of forcing the operand
size to 8 bytes on 64-bit hosts.

Now, a 32-bit guest goes and fork()s a process.  It write protects a stack
page at 0xbffff000 using the 'btr' instruction, at offset 0xffc in the page
table, with bit offset 1 (for the write permission bit).

The emulator now forces the operand size to 8 bytes as previously described,
and an innocent page table update turns into a cross-page-boundary write,
which is assumed by the mmu code not to be a page table, so it doesn't
actually clear the corresponding shadow page table entry.  The guest and
host permissions are out of sync and guest memory is corrupted soon
afterwards, leading to guest failure.

Fix by not using BITS_PER_LONG as the word size; instead use the actual
operand size, so we get a 32-bit write in that case.

Note we still have to teach the mmu to handle cross-page-boundary writes
to guest page table; but for now this allows Damn Small Linux 0.4 (2.4.20)
to boot.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/x86_emulate.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index 7513cdd..bcf872b 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -833,8 +833,9 @@ done_prefixes:
 		dst.ptr = (unsigned long *)cr2;
 		dst.bytes = (d & ByteOp) ? 1 : op_bytes;
 		if (d & BitOp) {
-			dst.ptr += src.val / BITS_PER_LONG;
-			dst.bytes = sizeof(long);
+			unsigned long mask = ~(dst.bytes * 8 - 1);
+
+			dst.ptr = (void *)dst.ptr + (src.val & mask) / 8;
 		}
 		if (!(d & Mov) && /* optimisation - avoid slow emulated read */
 		    ((rc = ops->read_emulated((unsigned long)dst.ptr,
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 36/41] KVM: x86 emulator: fix bit string operations operand size
@ 2007-04-01 14:35                                                                         ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

On x86, bit operations operate on a string of bits that can reside in
multiple words.  For example, 'btsl %eax, (blah)' will touch the word
at blah+4 if %eax is between 32 and 63.

The x86 emulator compensates for that by advancing the operand address
by (bit offset / BITS_PER_LONG) and truncating the bit offset to the
range (0..BITS_PER_LONG-1).  This has a side effect of forcing the operand
size to 8 bytes on 64-bit hosts.

Now, a 32-bit guest goes and fork()s a process.  It write protects a stack
page at 0xbffff000 using the 'btr' instruction, at offset 0xffc in the page
table, with bit offset 1 (for the write permission bit).

The emulator now forces the operand size to 8 bytes as previously described,
and an innocent page table update turns into a cross-page-boundary write,
which is assumed by the mmu code not to be a page table, so it doesn't
actually clear the corresponding shadow page table entry.  The guest and
host permissions are out of sync and guest memory is corrupted soon
afterwards, leading to guest failure.

Fix by not using BITS_PER_LONG as the word size; instead use the actual
operand size, so we get a 32-bit write in that case.

Note we still have to teach the mmu to handle cross-page-boundary writes
to guest page table; but for now this allows Damn Small Linux 0.4 (2.4.20)
to boot.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/x86_emulate.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index 7513cdd..bcf872b 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -833,8 +833,9 @@ done_prefixes:
 		dst.ptr = (unsigned long *)cr2;
 		dst.bytes = (d & ByteOp) ? 1 : op_bytes;
 		if (d & BitOp) {
-			dst.ptr += src.val / BITS_PER_LONG;
-			dst.bytes = sizeof(long);
+			unsigned long mask = ~(dst.bytes * 8 - 1);
+
+			dst.ptr = (void *)dst.ptr + (src.val & mask) / 8;
 		}
 		if (!(d & Mov) && /* optimisation - avoid slow emulated read */
 		    ((rc = ops->read_emulated((unsigned long)dst.ptr,
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 37/41] KVM: Add mmu cache clear function
@ 2007-04-01 14:35                                                                           ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Dor Laor, Avi Kivity

From: Dor Laor <dor.laor@qumranet.com>

Functions that play around with the physical memory map
need a way to clear mappings to possibly nonexistent or
invalid memory.  Both the mmu cache and the processor tlb
are cleared.

Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h |    1 +
 drivers/kvm/mmu.c |   17 +++++++++++++++++
 2 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 6d0bd7a..59357be 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -430,6 +430,7 @@ int kvm_mmu_setup(struct kvm_vcpu *vcpu);
 
 int kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
 void kvm_mmu_slot_remove_write_access(struct kvm_vcpu *vcpu, int slot);
+void kvm_mmu_zap_all(struct kvm_vcpu *vcpu);
 
 hpa_t gpa_to_hpa(struct kvm_vcpu *vcpu, gpa_t gpa);
 #define HPA_MSB ((sizeof(hpa_t) * 8) - 1)
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index c2487b6..707f63c 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -1313,6 +1313,23 @@ void kvm_mmu_slot_remove_write_access(struct kvm_vcpu *vcpu, int slot)
 	}
 }
 
+void kvm_mmu_zap_all(struct kvm_vcpu *vcpu)
+{
+	destroy_kvm_mmu(vcpu);
+
+	while (!list_empty(&vcpu->kvm->active_mmu_pages)) {
+		struct kvm_mmu_page *page;
+
+		page = container_of(vcpu->kvm->active_mmu_pages.next,
+				    struct kvm_mmu_page, link);
+		kvm_mmu_zap_page(vcpu, page);
+	}
+
+	mmu_free_memory_caches(vcpu);
+	kvm_arch_ops->tlb_flush(vcpu);
+	init_kvm_mmu(vcpu);
+}
+
 #ifdef AUDIT
 
 static const char *audit_msg;
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 37/41] KVM: Add mmu cache clear function
@ 2007-04-01 14:35                                                                           ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

From: Dor Laor <dor.laor-atKUWr5tajBWk0Htik3J/w@public.gmane.org>

Functions that play around with the physical memory map
need a way to clear mappings to possibly nonexistent or
invalid memory.  Both the mmu cache and the processor tlb
are cleared.

Signed-off-by: Dor Laor <dor.laor-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h |    1 +
 drivers/kvm/mmu.c |   17 +++++++++++++++++
 2 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 6d0bd7a..59357be 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -430,6 +430,7 @@ int kvm_mmu_setup(struct kvm_vcpu *vcpu);
 
 int kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
 void kvm_mmu_slot_remove_write_access(struct kvm_vcpu *vcpu, int slot);
+void kvm_mmu_zap_all(struct kvm_vcpu *vcpu);
 
 hpa_t gpa_to_hpa(struct kvm_vcpu *vcpu, gpa_t gpa);
 #define HPA_MSB ((sizeof(hpa_t) * 8) - 1)
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index c2487b6..707f63c 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -1313,6 +1313,23 @@ void kvm_mmu_slot_remove_write_access(struct kvm_vcpu *vcpu, int slot)
 	}
 }
 
+void kvm_mmu_zap_all(struct kvm_vcpu *vcpu)
+{
+	destroy_kvm_mmu(vcpu);
+
+	while (!list_empty(&vcpu->kvm->active_mmu_pages)) {
+		struct kvm_mmu_page *page;
+
+		page = container_of(vcpu->kvm->active_mmu_pages.next,
+				    struct kvm_mmu_page, link);
+		kvm_mmu_zap_page(vcpu, page);
+	}
+
+	mmu_free_memory_caches(vcpu);
+	kvm_arch_ops->tlb_flush(vcpu);
+	init_kvm_mmu(vcpu);
+}
+
 #ifdef AUDIT
 
 static const char *audit_msg;
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 38/41] KVM: Simply gfn_to_page()
@ 2007-04-01 14:35                                                                             ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

Mapping a guest page to a host page is a common operation.  Currently,
one has first to find the memory slot where the page belongs (gfn_to_memslot),
then locate the page itself (gfn_to_page()).

This is clumsy, and also won't work well with memory aliases.  So simplify
gfn_to_page() not to require memory slot translation first, and instead do it
internally.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h      |   12 +-----------
 drivers/kvm/kvm_main.c |   45 +++++++++++++++++++++++++--------------------
 drivers/kvm/mmu.c      |   12 ++++--------
 drivers/kvm/vmx.c      |    6 +++---
 4 files changed, 33 insertions(+), 42 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 59357be..d19985a 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -443,11 +443,7 @@ void kvm_emulator_want_group7_invlpg(void);
 
 extern hpa_t bad_page_address;
 
-static inline struct page *gfn_to_page(struct kvm_memory_slot *slot, gfn_t gfn)
-{
-	return slot->phys_mem[gfn - slot->base_gfn];
-}
-
+struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
 struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn);
 void mark_page_dirty(struct kvm *kvm, gfn_t gfn);
 
@@ -523,12 +519,6 @@ static inline int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
 	return vcpu->mmu.page_fault(vcpu, gva, error_code);
 }
 
-static inline struct page *_gfn_to_page(struct kvm *kvm, gfn_t gfn)
-{
-	struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);
-	return (slot) ? slot->phys_mem[gfn - slot->base_gfn] : NULL;
-}
-
 static inline int is_long_mode(struct kvm_vcpu *vcpu)
 {
 #ifdef CONFIG_X86_64
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 1ef5e9a..aaaf306 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -420,12 +420,12 @@ static int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
 	u64 pdpte;
 	u64 *pdpt;
 	int ret;
-	struct kvm_memory_slot *memslot;
+	struct page *page;
 
 	spin_lock(&vcpu->kvm->lock);
-	memslot = gfn_to_memslot(vcpu->kvm, pdpt_gfn);
-	/* FIXME: !memslot - emulate? 0xff? */
-	pdpt = kmap_atomic(gfn_to_page(memslot, pdpt_gfn), KM_USER0);
+	page = gfn_to_page(vcpu->kvm, pdpt_gfn);
+	/* FIXME: !page - emulate? 0xff? */
+	pdpt = kmap_atomic(page, KM_USER0);
 
 	ret = 1;
 	for (i = 0; i < 4; ++i) {
@@ -861,6 +861,17 @@ struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
 }
 EXPORT_SYMBOL_GPL(gfn_to_memslot);
 
+struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
+{
+	struct kvm_memory_slot *slot;
+
+	slot = gfn_to_memslot(kvm, gfn);
+	if (!slot)
+		return NULL;
+	return slot->phys_mem[gfn - slot->base_gfn];
+}
+EXPORT_SYMBOL_GPL(gfn_to_page);
+
 void mark_page_dirty(struct kvm *kvm, gfn_t gfn)
 {
 	int i;
@@ -899,20 +910,20 @@ static int emulator_read_std(unsigned long addr,
 		unsigned offset = addr & (PAGE_SIZE-1);
 		unsigned tocopy = min(bytes, (unsigned)PAGE_SIZE - offset);
 		unsigned long pfn;
-		struct kvm_memory_slot *memslot;
-		void *page;
+		struct page *page;
+		void *page_virt;
 
 		if (gpa == UNMAPPED_GVA)
 			return X86EMUL_PROPAGATE_FAULT;
 		pfn = gpa >> PAGE_SHIFT;
-		memslot = gfn_to_memslot(vcpu->kvm, pfn);
-		if (!memslot)
+		page = gfn_to_page(vcpu->kvm, pfn);
+		if (!page)
 			return X86EMUL_UNHANDLEABLE;
-		page = kmap_atomic(gfn_to_page(memslot, pfn), KM_USER0);
+		page_virt = kmap_atomic(page, KM_USER0);
 
-		memcpy(data, page + offset, tocopy);
+		memcpy(data, page_virt + offset, tocopy);
 
-		kunmap_atomic(page, KM_USER0);
+		kunmap_atomic(page_virt, KM_USER0);
 
 		bytes -= tocopy;
 		data += tocopy;
@@ -963,16 +974,14 @@ static int emulator_read_emulated(unsigned long addr,
 static int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa,
 			       unsigned long val, int bytes)
 {
-	struct kvm_memory_slot *m;
 	struct page *page;
 	void *virt;
 
 	if (((gpa + bytes - 1) >> PAGE_SHIFT) != (gpa >> PAGE_SHIFT))
 		return 0;
-	m = gfn_to_memslot(vcpu->kvm, gpa >> PAGE_SHIFT);
-	if (!m)
+	page = gfn_to_page(vcpu->kvm, gpa >> PAGE_SHIFT);
+	if (!page)
 		return 0;
-	page = gfn_to_page(m, gpa >> PAGE_SHIFT);
 	kvm_mmu_pre_write(vcpu, gpa, bytes);
 	mark_page_dirty(vcpu->kvm, gpa >> PAGE_SHIFT);
 	virt = kmap_atomic(page, KM_USER0);
@@ -2507,15 +2516,11 @@ static struct page *kvm_vm_nopage(struct vm_area_struct *vma,
 {
 	struct kvm *kvm = vma->vm_file->private_data;
 	unsigned long pgoff;
-	struct kvm_memory_slot *slot;
 	struct page *page;
 
 	*type = VM_FAULT_MINOR;
 	pgoff = ((address - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
-	slot = gfn_to_memslot(kvm, pgoff);
-	if (!slot)
-		return NOPAGE_SIGBUS;
-	page = gfn_to_page(slot, pgoff);
+	page = gfn_to_page(kvm, pgoff);
 	if (!page)
 		return NOPAGE_SIGBUS;
 	get_page(page);
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 707f63c..35135ec 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -390,13 +390,11 @@ static void rmap_write_protect(struct kvm_vcpu *vcpu, u64 gfn)
 {
 	struct kvm *kvm = vcpu->kvm;
 	struct page *page;
-	struct kvm_memory_slot *slot;
 	struct kvm_rmap_desc *desc;
 	u64 *spte;
 
-	slot = gfn_to_memslot(kvm, gfn);
-	BUG_ON(!slot);
-	page = gfn_to_page(slot, gfn);
+	page = gfn_to_page(kvm, gfn);
+	BUG_ON(!page);
 
 	while (page_private(page)) {
 		if (!(page_private(page) & 1))
@@ -711,14 +709,12 @@ hpa_t safe_gpa_to_hpa(struct kvm_vcpu *vcpu, gpa_t gpa)
 
 hpa_t gpa_to_hpa(struct kvm_vcpu *vcpu, gpa_t gpa)
 {
-	struct kvm_memory_slot *slot;
 	struct page *page;
 
 	ASSERT((gpa & HPA_ERR_MASK) == 0);
-	slot = gfn_to_memslot(vcpu->kvm, gpa >> PAGE_SHIFT);
-	if (!slot)
+	page = gfn_to_page(vcpu->kvm, gpa >> PAGE_SHIFT);
+	if (!page)
 		return gpa | HPA_ERR_MASK;
-	page = gfn_to_page(slot, gpa >> PAGE_SHIFT);
 	return ((hpa_t)page_to_pfn(page) << PAGE_SHIFT)
 		| (gpa & (PAGE_SIZE-1));
 }
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index b64b7b7..61a6116 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -926,9 +926,9 @@ static int init_rmode_tss(struct kvm* kvm)
 	gfn_t fn = rmode_tss_base(kvm) >> PAGE_SHIFT;
 	char *page;
 
-	p1 = _gfn_to_page(kvm, fn++);
-	p2 = _gfn_to_page(kvm, fn++);
-	p3 = _gfn_to_page(kvm, fn);
+	p1 = gfn_to_page(kvm, fn++);
+	p2 = gfn_to_page(kvm, fn++);
+	p3 = gfn_to_page(kvm, fn);
 
 	if (!p1 || !p2 || !p3) {
 		kvm_printf(kvm,"%s: gfn_to_page failed\n", __FUNCTION__);
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 38/41] KVM: Simply gfn_to_page()
@ 2007-04-01 14:35                                                                             ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

Mapping a guest page to a host page is a common operation.  Currently,
one has first to find the memory slot where the page belongs (gfn_to_memslot),
then locate the page itself (gfn_to_page()).

This is clumsy, and also won't work well with memory aliases.  So simplify
gfn_to_page() not to require memory slot translation first, and instead do it
internally.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h      |   12 +-----------
 drivers/kvm/kvm_main.c |   45 +++++++++++++++++++++++++--------------------
 drivers/kvm/mmu.c      |   12 ++++--------
 drivers/kvm/vmx.c      |    6 +++---
 4 files changed, 33 insertions(+), 42 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 59357be..d19985a 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -443,11 +443,7 @@ void kvm_emulator_want_group7_invlpg(void);
 
 extern hpa_t bad_page_address;
 
-static inline struct page *gfn_to_page(struct kvm_memory_slot *slot, gfn_t gfn)
-{
-	return slot->phys_mem[gfn - slot->base_gfn];
-}
-
+struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
 struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn);
 void mark_page_dirty(struct kvm *kvm, gfn_t gfn);
 
@@ -523,12 +519,6 @@ static inline int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
 	return vcpu->mmu.page_fault(vcpu, gva, error_code);
 }
 
-static inline struct page *_gfn_to_page(struct kvm *kvm, gfn_t gfn)
-{
-	struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);
-	return (slot) ? slot->phys_mem[gfn - slot->base_gfn] : NULL;
-}
-
 static inline int is_long_mode(struct kvm_vcpu *vcpu)
 {
 #ifdef CONFIG_X86_64
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 1ef5e9a..aaaf306 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -420,12 +420,12 @@ static int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
 	u64 pdpte;
 	u64 *pdpt;
 	int ret;
-	struct kvm_memory_slot *memslot;
+	struct page *page;
 
 	spin_lock(&vcpu->kvm->lock);
-	memslot = gfn_to_memslot(vcpu->kvm, pdpt_gfn);
-	/* FIXME: !memslot - emulate? 0xff? */
-	pdpt = kmap_atomic(gfn_to_page(memslot, pdpt_gfn), KM_USER0);
+	page = gfn_to_page(vcpu->kvm, pdpt_gfn);
+	/* FIXME: !page - emulate? 0xff? */
+	pdpt = kmap_atomic(page, KM_USER0);
 
 	ret = 1;
 	for (i = 0; i < 4; ++i) {
@@ -861,6 +861,17 @@ struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
 }
 EXPORT_SYMBOL_GPL(gfn_to_memslot);
 
+struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
+{
+	struct kvm_memory_slot *slot;
+
+	slot = gfn_to_memslot(kvm, gfn);
+	if (!slot)
+		return NULL;
+	return slot->phys_mem[gfn - slot->base_gfn];
+}
+EXPORT_SYMBOL_GPL(gfn_to_page);
+
 void mark_page_dirty(struct kvm *kvm, gfn_t gfn)
 {
 	int i;
@@ -899,20 +910,20 @@ static int emulator_read_std(unsigned long addr,
 		unsigned offset = addr & (PAGE_SIZE-1);
 		unsigned tocopy = min(bytes, (unsigned)PAGE_SIZE - offset);
 		unsigned long pfn;
-		struct kvm_memory_slot *memslot;
-		void *page;
+		struct page *page;
+		void *page_virt;
 
 		if (gpa == UNMAPPED_GVA)
 			return X86EMUL_PROPAGATE_FAULT;
 		pfn = gpa >> PAGE_SHIFT;
-		memslot = gfn_to_memslot(vcpu->kvm, pfn);
-		if (!memslot)
+		page = gfn_to_page(vcpu->kvm, pfn);
+		if (!page)
 			return X86EMUL_UNHANDLEABLE;
-		page = kmap_atomic(gfn_to_page(memslot, pfn), KM_USER0);
+		page_virt = kmap_atomic(page, KM_USER0);
 
-		memcpy(data, page + offset, tocopy);
+		memcpy(data, page_virt + offset, tocopy);
 
-		kunmap_atomic(page, KM_USER0);
+		kunmap_atomic(page_virt, KM_USER0);
 
 		bytes -= tocopy;
 		data += tocopy;
@@ -963,16 +974,14 @@ static int emulator_read_emulated(unsigned long addr,
 static int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa,
 			       unsigned long val, int bytes)
 {
-	struct kvm_memory_slot *m;
 	struct page *page;
 	void *virt;
 
 	if (((gpa + bytes - 1) >> PAGE_SHIFT) != (gpa >> PAGE_SHIFT))
 		return 0;
-	m = gfn_to_memslot(vcpu->kvm, gpa >> PAGE_SHIFT);
-	if (!m)
+	page = gfn_to_page(vcpu->kvm, gpa >> PAGE_SHIFT);
+	if (!page)
 		return 0;
-	page = gfn_to_page(m, gpa >> PAGE_SHIFT);
 	kvm_mmu_pre_write(vcpu, gpa, bytes);
 	mark_page_dirty(vcpu->kvm, gpa >> PAGE_SHIFT);
 	virt = kmap_atomic(page, KM_USER0);
@@ -2507,15 +2516,11 @@ static struct page *kvm_vm_nopage(struct vm_area_struct *vma,
 {
 	struct kvm *kvm = vma->vm_file->private_data;
 	unsigned long pgoff;
-	struct kvm_memory_slot *slot;
 	struct page *page;
 
 	*type = VM_FAULT_MINOR;
 	pgoff = ((address - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
-	slot = gfn_to_memslot(kvm, pgoff);
-	if (!slot)
-		return NOPAGE_SIGBUS;
-	page = gfn_to_page(slot, pgoff);
+	page = gfn_to_page(kvm, pgoff);
 	if (!page)
 		return NOPAGE_SIGBUS;
 	get_page(page);
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 707f63c..35135ec 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -390,13 +390,11 @@ static void rmap_write_protect(struct kvm_vcpu *vcpu, u64 gfn)
 {
 	struct kvm *kvm = vcpu->kvm;
 	struct page *page;
-	struct kvm_memory_slot *slot;
 	struct kvm_rmap_desc *desc;
 	u64 *spte;
 
-	slot = gfn_to_memslot(kvm, gfn);
-	BUG_ON(!slot);
-	page = gfn_to_page(slot, gfn);
+	page = gfn_to_page(kvm, gfn);
+	BUG_ON(!page);
 
 	while (page_private(page)) {
 		if (!(page_private(page) & 1))
@@ -711,14 +709,12 @@ hpa_t safe_gpa_to_hpa(struct kvm_vcpu *vcpu, gpa_t gpa)
 
 hpa_t gpa_to_hpa(struct kvm_vcpu *vcpu, gpa_t gpa)
 {
-	struct kvm_memory_slot *slot;
 	struct page *page;
 
 	ASSERT((gpa & HPA_ERR_MASK) == 0);
-	slot = gfn_to_memslot(vcpu->kvm, gpa >> PAGE_SHIFT);
-	if (!slot)
+	page = gfn_to_page(vcpu->kvm, gpa >> PAGE_SHIFT);
+	if (!page)
 		return gpa | HPA_ERR_MASK;
-	page = gfn_to_page(slot, gpa >> PAGE_SHIFT);
 	return ((hpa_t)page_to_pfn(page) << PAGE_SHIFT)
 		| (gpa & (PAGE_SIZE-1));
 }
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index b64b7b7..61a6116 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -926,9 +926,9 @@ static int init_rmode_tss(struct kvm* kvm)
 	gfn_t fn = rmode_tss_base(kvm) >> PAGE_SHIFT;
 	char *page;
 
-	p1 = _gfn_to_page(kvm, fn++);
-	p2 = _gfn_to_page(kvm, fn++);
-	p3 = _gfn_to_page(kvm, fn);
+	p1 = gfn_to_page(kvm, fn++);
+	p2 = gfn_to_page(kvm, fn++);
+	p3 = gfn_to_page(kvm, fn);
 
 	if (!p1 || !p2 || !p3) {
 		kvm_printf(kvm,"%s: gfn_to_page failed\n", __FUNCTION__);
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 39/41] KVM: Add physical memory aliasing feature
@ 2007-04-01 14:35                                                                               ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

With this, we can specify that accesses to one physical memory range will
be remapped to another.  This is useful for the vga window at 0xa0000 which
is used as a movable window into the (much larger) framebuffer.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm.h      |    9 +++++
 drivers/kvm/kvm_main.c |   89 ++++++++++++++++++++++++++++++++++++++++++++++--
 include/linux/kvm.h    |   10 +++++-
 3 files changed, 104 insertions(+), 4 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index d19985a..fceeb84 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -51,6 +51,7 @@
 #define UNMAPPED_GVA (~(gpa_t)0)
 
 #define KVM_MAX_VCPUS 1
+#define KVM_ALIAS_SLOTS 4
 #define KVM_MEMORY_SLOTS 4
 #define KVM_NUM_MMU_PAGES 256
 #define KVM_MIN_FREE_MMU_PAGES 5
@@ -312,6 +313,12 @@ struct kvm_vcpu {
 	struct kvm_cpuid_entry cpuid_entries[KVM_MAX_CPUID_ENTRIES];
 };
 
+struct kvm_mem_alias {
+	gfn_t base_gfn;
+	unsigned long npages;
+	gfn_t target_gfn;
+};
+
 struct kvm_memory_slot {
 	gfn_t base_gfn;
 	unsigned long npages;
@@ -322,6 +329,8 @@ struct kvm_memory_slot {
 
 struct kvm {
 	spinlock_t lock; /* protects everything except vcpus */
+	int naliases;
+	struct kvm_mem_alias aliases[KVM_ALIAS_SLOTS];
 	int nmemslots;
 	struct kvm_memory_slot memslots[KVM_MEMORY_SLOTS];
 	/*
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index aaaf306..56ddbc1 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -846,7 +846,73 @@ out:
 	return r;
 }
 
-struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
+/*
+ * Set a new alias region.  Aliases map a portion of physical memory into
+ * another portion.  This is useful for memory windows, for example the PC
+ * VGA region.
+ */
+static int kvm_vm_ioctl_set_memory_alias(struct kvm *kvm,
+					 struct kvm_memory_alias *alias)
+{
+	int r, n;
+	struct kvm_mem_alias *p;
+
+	r = -EINVAL;
+	/* General sanity checks */
+	if (alias->memory_size & (PAGE_SIZE - 1))
+		goto out;
+	if (alias->guest_phys_addr & (PAGE_SIZE - 1))
+		goto out;
+	if (alias->slot >= KVM_ALIAS_SLOTS)
+		goto out;
+	if (alias->guest_phys_addr + alias->memory_size
+	    < alias->guest_phys_addr)
+		goto out;
+	if (alias->target_phys_addr + alias->memory_size
+	    < alias->target_phys_addr)
+		goto out;
+
+	spin_lock(&kvm->lock);
+
+	p = &kvm->aliases[alias->slot];
+	p->base_gfn = alias->guest_phys_addr >> PAGE_SHIFT;
+	p->npages = alias->memory_size >> PAGE_SHIFT;
+	p->target_gfn = alias->target_phys_addr >> PAGE_SHIFT;
+
+	for (n = KVM_ALIAS_SLOTS; n > 0; --n)
+		if (kvm->aliases[n - 1].npages)
+			break;
+	kvm->naliases = n;
+
+	spin_unlock(&kvm->lock);
+
+	vcpu_load(&kvm->vcpus[0]);
+	spin_lock(&kvm->lock);
+	kvm_mmu_zap_all(&kvm->vcpus[0]);
+	spin_unlock(&kvm->lock);
+	vcpu_put(&kvm->vcpus[0]);
+
+	return 0;
+
+out:
+	return r;
+}
+
+static gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn)
+{
+	int i;
+	struct kvm_mem_alias *alias;
+
+	for (i = 0; i < kvm->naliases; ++i) {
+		alias = &kvm->aliases[i];
+		if (gfn >= alias->base_gfn
+		    && gfn < alias->base_gfn + alias->npages)
+			return alias->target_gfn + gfn - alias->base_gfn;
+	}
+	return gfn;
+}
+
+static struct kvm_memory_slot *__gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
 {
 	int i;
 
@@ -859,13 +925,19 @@ struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
 	}
 	return NULL;
 }
-EXPORT_SYMBOL_GPL(gfn_to_memslot);
+
+struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
+{
+	gfn = unalias_gfn(kvm, gfn);
+	return __gfn_to_memslot(kvm, gfn);
+}
 
 struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
 {
 	struct kvm_memory_slot *slot;
 
-	slot = gfn_to_memslot(kvm, gfn);
+	gfn = unalias_gfn(kvm, gfn);
+	slot = __gfn_to_memslot(kvm, gfn);
 	if (!slot)
 		return NULL;
 	return slot->phys_mem[gfn - slot->base_gfn];
@@ -2503,6 +2575,17 @@ static long kvm_vm_ioctl(struct file *filp,
 			goto out;
 		break;
 	}
+	case KVM_SET_MEMORY_ALIAS: {
+		struct kvm_memory_alias alias;
+
+		r = -EFAULT;
+		if (copy_from_user(&alias, argp, sizeof alias))
+			goto out;
+		r = kvm_vm_ioctl_set_memory_alias(kvm, &alias);
+		if (r)
+			goto out;
+		break;
+	}
 	default:
 		;
 	}
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 728b24c..da9b23f 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -11,7 +11,7 @@
 #include <asm/types.h>
 #include <linux/ioctl.h>
 
-#define KVM_API_VERSION 9
+#define KVM_API_VERSION 10
 
 /*
  * Architectural interrupt line count, and the size of the bitmap needed
@@ -33,6 +33,13 @@ struct kvm_memory_region {
 /* for kvm_memory_region::flags */
 #define KVM_MEM_LOG_DIRTY_PAGES  1UL
 
+struct kvm_memory_alias {
+	__u32 slot;  /* this has a different namespace than memory slots */
+	__u32 flags;
+	__u64 guest_phys_addr;
+	__u64 memory_size;
+	__u64 target_phys_addr;
+};
 
 enum kvm_exit_reason {
 	KVM_EXIT_UNKNOWN          = 0,
@@ -261,6 +268,7 @@ struct kvm_signal_mask {
  */
 #define KVM_CREATE_VCPU           _IO(KVMIO,  0x41)
 #define KVM_GET_DIRTY_LOG         _IOW(KVMIO, 0x42, struct kvm_dirty_log)
+#define KVM_SET_MEMORY_ALIAS      _IOW(KVMIO, 0x43, struct kvm_memory_alias)
 
 /*
  * ioctls for vcpu fds
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 39/41] KVM: Add physical memory aliasing feature
@ 2007-04-01 14:35                                                                               ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

With this, we can specify that accesses to one physical memory range will
be remapped to another.  This is useful for the vga window at 0xa0000 which
is used as a movable window into the (much larger) framebuffer.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h      |    9 +++++
 drivers/kvm/kvm_main.c |   89 ++++++++++++++++++++++++++++++++++++++++++++++--
 include/linux/kvm.h    |   10 +++++-
 3 files changed, 104 insertions(+), 4 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index d19985a..fceeb84 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -51,6 +51,7 @@
 #define UNMAPPED_GVA (~(gpa_t)0)
 
 #define KVM_MAX_VCPUS 1
+#define KVM_ALIAS_SLOTS 4
 #define KVM_MEMORY_SLOTS 4
 #define KVM_NUM_MMU_PAGES 256
 #define KVM_MIN_FREE_MMU_PAGES 5
@@ -312,6 +313,12 @@ struct kvm_vcpu {
 	struct kvm_cpuid_entry cpuid_entries[KVM_MAX_CPUID_ENTRIES];
 };
 
+struct kvm_mem_alias {
+	gfn_t base_gfn;
+	unsigned long npages;
+	gfn_t target_gfn;
+};
+
 struct kvm_memory_slot {
 	gfn_t base_gfn;
 	unsigned long npages;
@@ -322,6 +329,8 @@ struct kvm_memory_slot {
 
 struct kvm {
 	spinlock_t lock; /* protects everything except vcpus */
+	int naliases;
+	struct kvm_mem_alias aliases[KVM_ALIAS_SLOTS];
 	int nmemslots;
 	struct kvm_memory_slot memslots[KVM_MEMORY_SLOTS];
 	/*
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index aaaf306..56ddbc1 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -846,7 +846,73 @@ out:
 	return r;
 }
 
-struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
+/*
+ * Set a new alias region.  Aliases map a portion of physical memory into
+ * another portion.  This is useful for memory windows, for example the PC
+ * VGA region.
+ */
+static int kvm_vm_ioctl_set_memory_alias(struct kvm *kvm,
+					 struct kvm_memory_alias *alias)
+{
+	int r, n;
+	struct kvm_mem_alias *p;
+
+	r = -EINVAL;
+	/* General sanity checks */
+	if (alias->memory_size & (PAGE_SIZE - 1))
+		goto out;
+	if (alias->guest_phys_addr & (PAGE_SIZE - 1))
+		goto out;
+	if (alias->slot >= KVM_ALIAS_SLOTS)
+		goto out;
+	if (alias->guest_phys_addr + alias->memory_size
+	    < alias->guest_phys_addr)
+		goto out;
+	if (alias->target_phys_addr + alias->memory_size
+	    < alias->target_phys_addr)
+		goto out;
+
+	spin_lock(&kvm->lock);
+
+	p = &kvm->aliases[alias->slot];
+	p->base_gfn = alias->guest_phys_addr >> PAGE_SHIFT;
+	p->npages = alias->memory_size >> PAGE_SHIFT;
+	p->target_gfn = alias->target_phys_addr >> PAGE_SHIFT;
+
+	for (n = KVM_ALIAS_SLOTS; n > 0; --n)
+		if (kvm->aliases[n - 1].npages)
+			break;
+	kvm->naliases = n;
+
+	spin_unlock(&kvm->lock);
+
+	vcpu_load(&kvm->vcpus[0]);
+	spin_lock(&kvm->lock);
+	kvm_mmu_zap_all(&kvm->vcpus[0]);
+	spin_unlock(&kvm->lock);
+	vcpu_put(&kvm->vcpus[0]);
+
+	return 0;
+
+out:
+	return r;
+}
+
+static gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn)
+{
+	int i;
+	struct kvm_mem_alias *alias;
+
+	for (i = 0; i < kvm->naliases; ++i) {
+		alias = &kvm->aliases[i];
+		if (gfn >= alias->base_gfn
+		    && gfn < alias->base_gfn + alias->npages)
+			return alias->target_gfn + gfn - alias->base_gfn;
+	}
+	return gfn;
+}
+
+static struct kvm_memory_slot *__gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
 {
 	int i;
 
@@ -859,13 +925,19 @@ struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
 	}
 	return NULL;
 }
-EXPORT_SYMBOL_GPL(gfn_to_memslot);
+
+struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
+{
+	gfn = unalias_gfn(kvm, gfn);
+	return __gfn_to_memslot(kvm, gfn);
+}
 
 struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
 {
 	struct kvm_memory_slot *slot;
 
-	slot = gfn_to_memslot(kvm, gfn);
+	gfn = unalias_gfn(kvm, gfn);
+	slot = __gfn_to_memslot(kvm, gfn);
 	if (!slot)
 		return NULL;
 	return slot->phys_mem[gfn - slot->base_gfn];
@@ -2503,6 +2575,17 @@ static long kvm_vm_ioctl(struct file *filp,
 			goto out;
 		break;
 	}
+	case KVM_SET_MEMORY_ALIAS: {
+		struct kvm_memory_alias alias;
+
+		r = -EFAULT;
+		if (copy_from_user(&alias, argp, sizeof alias))
+			goto out;
+		r = kvm_vm_ioctl_set_memory_alias(kvm, &alias);
+		if (r)
+			goto out;
+		break;
+	}
 	default:
 		;
 	}
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 728b24c..da9b23f 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -11,7 +11,7 @@
 #include <asm/types.h>
 #include <linux/ioctl.h>
 
-#define KVM_API_VERSION 9
+#define KVM_API_VERSION 10
 
 /*
  * Architectural interrupt line count, and the size of the bitmap needed
@@ -33,6 +33,13 @@ struct kvm_memory_region {
 /* for kvm_memory_region::flags */
 #define KVM_MEM_LOG_DIRTY_PAGES  1UL
 
+struct kvm_memory_alias {
+	__u32 slot;  /* this has a different namespace than memory slots */
+	__u32 flags;
+	__u64 guest_phys_addr;
+	__u64 memory_size;
+	__u64 target_phys_addr;
+};
 
 enum kvm_exit_reason {
 	KVM_EXIT_UNKNOWN          = 0,
@@ -261,6 +268,7 @@ struct kvm_signal_mask {
  */
 #define KVM_CREATE_VCPU           _IO(KVMIO,  0x41)
 #define KVM_GET_DIRTY_LOG         _IOW(KVMIO, 0x42, struct kvm_dirty_log)
+#define KVM_SET_MEMORY_ALIAS      _IOW(KVMIO, 0x43, struct kvm_memory_alias)
 
 /*
  * ioctls for vcpu fds
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 40/41] KVM: Add fpu get/set operations
@ 2007-04-01 14:35                                                                                 ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Avi Kivity

These are really helpful when migrating an floating point app to another
machine.

Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/kvm_main.c |   86 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/kvm.h    |   17 +++++++++
 2 files changed, 103 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 56ddbc1..4473174 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -2389,6 +2389,67 @@ static int kvm_vcpu_ioctl_set_sigmask(struct kvm_vcpu *vcpu, sigset_t *sigset)
 	return 0;
 }
 
+/*
+ * fxsave fpu state.  Taken from x86_64/processor.h.  To be killed when
+ * we have asm/x86/processor.h
+ */
+struct fxsave {
+	u16	cwd;
+	u16	swd;
+	u16	twd;
+	u16	fop;
+	u64	rip;
+	u64	rdp;
+	u32	mxcsr;
+	u32	mxcsr_mask;
+	u32	st_space[32];	/* 8*16 bytes for each FP-reg = 128 bytes */
+#ifdef CONFIG_X86_64
+	u32	xmm_space[64];	/* 16*16 bytes for each XMM-reg = 256 bytes */
+#else
+	u32	xmm_space[32];	/* 8*16 bytes for each XMM-reg = 128 bytes */
+#endif
+};
+
+static int kvm_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
+{
+	struct fxsave *fxsave = (struct fxsave *)vcpu->guest_fx_image;
+
+	vcpu_load(vcpu);
+
+	memcpy(fpu->fpr, fxsave->st_space, 128);
+	fpu->fcw = fxsave->cwd;
+	fpu->fsw = fxsave->swd;
+	fpu->ftwx = fxsave->twd;
+	fpu->last_opcode = fxsave->fop;
+	fpu->last_ip = fxsave->rip;
+	fpu->last_dp = fxsave->rdp;
+	memcpy(fpu->xmm, fxsave->xmm_space, sizeof fxsave->xmm_space);
+
+	vcpu_put(vcpu);
+
+	return 0;
+}
+
+static int kvm_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
+{
+	struct fxsave *fxsave = (struct fxsave *)vcpu->guest_fx_image;
+
+	vcpu_load(vcpu);
+
+	memcpy(fxsave->st_space, fpu->fpr, 128);
+	fxsave->cwd = fpu->fcw;
+	fxsave->swd = fpu->fsw;
+	fxsave->twd = fpu->ftwx;
+	fxsave->fop = fpu->last_opcode;
+	fxsave->rip = fpu->last_ip;
+	fxsave->rdp = fpu->last_dp;
+	memcpy(fxsave->xmm_space, fpu->xmm, sizeof fxsave->xmm_space);
+
+	vcpu_put(vcpu);
+
+	return 0;
+}
+
 static long kvm_vcpu_ioctl(struct file *filp,
 			   unsigned int ioctl, unsigned long arg)
 {
@@ -2533,6 +2594,31 @@ static long kvm_vcpu_ioctl(struct file *filp,
 		r = kvm_vcpu_ioctl_set_sigmask(vcpu, &sigset);
 		break;
 	}
+	case KVM_GET_FPU: {
+		struct kvm_fpu fpu;
+
+		memset(&fpu, 0, sizeof fpu);
+		r = kvm_vcpu_ioctl_get_fpu(vcpu, &fpu);
+		if (r)
+			goto out;
+		r = -EFAULT;
+		if (copy_to_user(argp, &fpu, sizeof fpu))
+			goto out;
+		r = 0;
+		break;
+	}
+	case KVM_SET_FPU: {
+		struct kvm_fpu fpu;
+
+		r = -EFAULT;
+		if (copy_from_user(&fpu, argp, sizeof fpu))
+			goto out;
+		r = kvm_vcpu_ioctl_set_fpu(vcpu, &fpu);
+		if (r)
+			goto out;
+		r = 0;
+		break;
+	}
 	default:
 		;
 	}
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index da9b23f..07bf353 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -126,6 +126,21 @@ struct kvm_regs {
 	__u64 rip, rflags;
 };
 
+/* for KVM_GET_FPU and KVM_SET_FPU */
+struct kvm_fpu {
+	__u8  fpr[8][16];
+	__u16 fcw;
+	__u16 fsw;
+	__u8  ftwx;  /* in fxsave format */
+	__u8  pad1;
+	__u16 last_opcode;
+	__u64 last_ip;
+	__u64 last_dp;
+	__u8  xmm[16][16];
+	__u32 mxcsr;
+	__u32 pad2;
+};
+
 struct kvm_segment {
 	__u64 base;
 	__u32 limit;
@@ -285,5 +300,7 @@ struct kvm_signal_mask {
 #define KVM_SET_MSRS              _IOW(KVMIO,  0x89, struct kvm_msrs)
 #define KVM_SET_CPUID             _IOW(KVMIO,  0x8a, struct kvm_cpuid)
 #define KVM_SET_SIGNAL_MASK       _IOW(KVMIO,  0x8b, struct kvm_signal_mask)
+#define KVM_GET_FPU               _IOR(KVMIO,  0x8c, struct kvm_fpu)
+#define KVM_SET_FPU               _IOW(KVMIO,  0x8d, struct kvm_fpu)
 
 #endif
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 40/41] KVM: Add fpu get/set operations
@ 2007-04-01 14:35                                                                                 ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

These are really helpful when migrating an floating point app to another
machine.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm_main.c |   86 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/kvm.h    |   17 +++++++++
 2 files changed, 103 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 56ddbc1..4473174 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -2389,6 +2389,67 @@ static int kvm_vcpu_ioctl_set_sigmask(struct kvm_vcpu *vcpu, sigset_t *sigset)
 	return 0;
 }
 
+/*
+ * fxsave fpu state.  Taken from x86_64/processor.h.  To be killed when
+ * we have asm/x86/processor.h
+ */
+struct fxsave {
+	u16	cwd;
+	u16	swd;
+	u16	twd;
+	u16	fop;
+	u64	rip;
+	u64	rdp;
+	u32	mxcsr;
+	u32	mxcsr_mask;
+	u32	st_space[32];	/* 8*16 bytes for each FP-reg = 128 bytes */
+#ifdef CONFIG_X86_64
+	u32	xmm_space[64];	/* 16*16 bytes for each XMM-reg = 256 bytes */
+#else
+	u32	xmm_space[32];	/* 8*16 bytes for each XMM-reg = 128 bytes */
+#endif
+};
+
+static int kvm_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
+{
+	struct fxsave *fxsave = (struct fxsave *)vcpu->guest_fx_image;
+
+	vcpu_load(vcpu);
+
+	memcpy(fpu->fpr, fxsave->st_space, 128);
+	fpu->fcw = fxsave->cwd;
+	fpu->fsw = fxsave->swd;
+	fpu->ftwx = fxsave->twd;
+	fpu->last_opcode = fxsave->fop;
+	fpu->last_ip = fxsave->rip;
+	fpu->last_dp = fxsave->rdp;
+	memcpy(fpu->xmm, fxsave->xmm_space, sizeof fxsave->xmm_space);
+
+	vcpu_put(vcpu);
+
+	return 0;
+}
+
+static int kvm_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
+{
+	struct fxsave *fxsave = (struct fxsave *)vcpu->guest_fx_image;
+
+	vcpu_load(vcpu);
+
+	memcpy(fxsave->st_space, fpu->fpr, 128);
+	fxsave->cwd = fpu->fcw;
+	fxsave->swd = fpu->fsw;
+	fxsave->twd = fpu->ftwx;
+	fxsave->fop = fpu->last_opcode;
+	fxsave->rip = fpu->last_ip;
+	fxsave->rdp = fpu->last_dp;
+	memcpy(fxsave->xmm_space, fpu->xmm, sizeof fxsave->xmm_space);
+
+	vcpu_put(vcpu);
+
+	return 0;
+}
+
 static long kvm_vcpu_ioctl(struct file *filp,
 			   unsigned int ioctl, unsigned long arg)
 {
@@ -2533,6 +2594,31 @@ static long kvm_vcpu_ioctl(struct file *filp,
 		r = kvm_vcpu_ioctl_set_sigmask(vcpu, &sigset);
 		break;
 	}
+	case KVM_GET_FPU: {
+		struct kvm_fpu fpu;
+
+		memset(&fpu, 0, sizeof fpu);
+		r = kvm_vcpu_ioctl_get_fpu(vcpu, &fpu);
+		if (r)
+			goto out;
+		r = -EFAULT;
+		if (copy_to_user(argp, &fpu, sizeof fpu))
+			goto out;
+		r = 0;
+		break;
+	}
+	case KVM_SET_FPU: {
+		struct kvm_fpu fpu;
+
+		r = -EFAULT;
+		if (copy_from_user(&fpu, argp, sizeof fpu))
+			goto out;
+		r = kvm_vcpu_ioctl_set_fpu(vcpu, &fpu);
+		if (r)
+			goto out;
+		r = 0;
+		break;
+	}
 	default:
 		;
 	}
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index da9b23f..07bf353 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -126,6 +126,21 @@ struct kvm_regs {
 	__u64 rip, rflags;
 };
 
+/* for KVM_GET_FPU and KVM_SET_FPU */
+struct kvm_fpu {
+	__u8  fpr[8][16];
+	__u16 fcw;
+	__u16 fsw;
+	__u8  ftwx;  /* in fxsave format */
+	__u8  pad1;
+	__u16 last_opcode;
+	__u64 last_ip;
+	__u64 last_dp;
+	__u8  xmm[16][16];
+	__u32 mxcsr;
+	__u32 pad2;
+};
+
 struct kvm_segment {
 	__u64 base;
 	__u32 limit;
@@ -285,5 +300,7 @@ struct kvm_signal_mask {
 #define KVM_SET_MSRS              _IOW(KVMIO,  0x89, struct kvm_msrs)
 #define KVM_SET_CPUID             _IOW(KVMIO,  0x8a, struct kvm_cpuid)
 #define KVM_SET_SIGNAL_MASK       _IOW(KVMIO,  0x8b, struct kvm_signal_mask)
+#define KVM_GET_FPU               _IOR(KVMIO,  0x8c, struct kvm_fpu)
+#define KVM_SET_FPU               _IOW(KVMIO,  0x8d, struct kvm_fpu)
 
 #endif
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 41/41] KVM: SVM: enable LBRV virtualization if available
@ 2007-04-01 14:35                                                                                   ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel; +Cc: linux-kernel, Joerg Roedel, Avi Kivity

From: Joerg Roedel <joerg.roedel@amd.com>

This patch enables the virtualization of the last branch record MSRs on
SVM if this feature is available in hardware. It also introduces a small
and simple check feature for specific SVM extensions.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
---
 drivers/kvm/svm.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 303e959..b7e1410 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -44,6 +44,10 @@ MODULE_LICENSE("GPL");
 #define KVM_EFER_LMA (1 << 10)
 #define KVM_EFER_LME (1 << 8)
 
+#define SVM_FEATURE_NPT  (1 << 0)
+#define SVM_FEATURE_LBRV (1 << 1)
+#define SVM_DEATURE_SVML (1 << 2)
+
 unsigned long iopm_base;
 unsigned long msrpm_base;
 
@@ -68,6 +72,7 @@ struct svm_cpu_data {
 };
 
 static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
+static uint32_t svm_features;
 
 struct svm_init_data {
 	int cpu;
@@ -82,6 +87,11 @@ static u32 msrpm_ranges[] = {0, 0xc0000000, 0xc0010000};
 
 #define MAX_INST_SIZE 15
 
+static inline u32 svm_has(u32 feat)
+{
+	return svm_features & feat;
+}
+
 static unsigned get_addr_size(struct kvm_vcpu *vcpu)
 {
 	struct vmcb_save_area *sa = &vcpu->svm->vmcb->save;
@@ -302,6 +312,7 @@ static void svm_hardware_enable(void *garbage)
 	svm_data->asid_generation = 1;
 	svm_data->max_asid = cpuid_ebx(SVM_CPUID_FUNC) - 1;
 	svm_data->next_asid = svm_data->max_asid + 1;
+	svm_features = cpuid_edx(SVM_CPUID_FUNC);
 
 	asm volatile ( "sgdt %0" : "=m"(gdt_descr) );
 	gdt = (struct desc_struct *)gdt_descr.address;
@@ -511,6 +522,8 @@ static void init_vmcb(struct vmcb *vmcb)
 	control->msrpm_base_pa = msrpm_base;
 	control->tsc_offset = 0;
 	control->int_ctl = V_INTR_MASKING_MASK;
+	if (svm_has(SVM_FEATURE_LBRV))
+		control->lbr_ctl = 1ULL;
 
 	init_seg(&save->es);
 	init_seg(&save->ss);
-- 
1.5.0.5


^ permalink raw reply related	[flat|nested] 84+ messages in thread

* [PATCH 41/41] KVM: SVM: enable LBRV virtualization if available
@ 2007-04-01 14:35                                                                                   ` Avi Kivity
  0 siblings, 0 replies; 84+ messages in thread
From: Avi Kivity @ 2007-04-01 14:35 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

From: Joerg Roedel <joerg.roedel-5C7GfCeVMHo@public.gmane.org>

This patch enables the virtualization of the last branch record MSRs on
SVM if this feature is available in hardware. It also introduces a small
and simple check feature for specific SVM extensions.

Signed-off-by: Joerg Roedel <joerg.roedel-5C7GfCeVMHo@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/svm.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 303e959..b7e1410 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -44,6 +44,10 @@ MODULE_LICENSE("GPL");
 #define KVM_EFER_LMA (1 << 10)
 #define KVM_EFER_LME (1 << 8)
 
+#define SVM_FEATURE_NPT  (1 << 0)
+#define SVM_FEATURE_LBRV (1 << 1)
+#define SVM_DEATURE_SVML (1 << 2)
+
 unsigned long iopm_base;
 unsigned long msrpm_base;
 
@@ -68,6 +72,7 @@ struct svm_cpu_data {
 };
 
 static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
+static uint32_t svm_features;
 
 struct svm_init_data {
 	int cpu;
@@ -82,6 +87,11 @@ static u32 msrpm_ranges[] = {0, 0xc0000000, 0xc0010000};
 
 #define MAX_INST_SIZE 15
 
+static inline u32 svm_has(u32 feat)
+{
+	return svm_features & feat;
+}
+
 static unsigned get_addr_size(struct kvm_vcpu *vcpu)
 {
 	struct vmcb_save_area *sa = &vcpu->svm->vmcb->save;
@@ -302,6 +312,7 @@ static void svm_hardware_enable(void *garbage)
 	svm_data->asid_generation = 1;
 	svm_data->max_asid = cpuid_ebx(SVM_CPUID_FUNC) - 1;
 	svm_data->next_asid = svm_data->max_asid + 1;
+	svm_features = cpuid_edx(SVM_CPUID_FUNC);
 
 	asm volatile ( "sgdt %0" : "=m"(gdt_descr) );
 	gdt = (struct desc_struct *)gdt_descr.address;
@@ -511,6 +522,8 @@ static void init_vmcb(struct vmcb *vmcb)
 	control->msrpm_base_pa = msrpm_base;
 	control->tsc_offset = 0;
 	control->int_ctl = V_INTR_MASKING_MASK;
+	if (svm_has(SVM_FEATURE_LBRV))
+		control->lbr_ctl = 1ULL;
 
 	init_seg(&save->es);
 	init_seg(&save->ss);
-- 
1.5.0.5


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply related	[flat|nested] 84+ messages in thread

end of thread, other threads:[~2007-04-01 14:50 UTC | newest]

Thread overview: 84+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-04-01 14:34 [PATCH 00/41] kvm updates for 2.6.22 Avi Kivity
2007-04-01 14:34 ` Avi Kivity
2007-04-01 14:34 ` [PATCH 01/41] KVM: Fix guest register corruption on paravirt hypercall Avi Kivity
2007-04-01 14:34   ` Avi Kivity
2007-04-01 14:34   ` [PATCH 02/41] KVM: Use the generic skip_emulated_instruction() in hypercall code Avi Kivity
2007-04-01 14:34     ` Avi Kivity
2007-04-01 14:35     ` [PATCH 03/41] KVM: Use own minor number Avi Kivity
2007-04-01 14:35       ` Avi Kivity
2007-04-01 14:35       ` [PATCH 04/41] KVM: Export <linux/kvm.h> Avi Kivity
2007-04-01 14:35         ` Avi Kivity
2007-04-01 14:35         ` [PATCH 05/41] KVM: Fix bogus sign extension in mmu mapping audit Avi Kivity
2007-04-01 14:35           ` Avi Kivity
2007-04-01 14:35           ` [PATCH 06/41] KVM: Use a shared page for kernel/user communication when runing a vcpu Avi Kivity
2007-04-01 14:35             ` Avi Kivity
2007-04-01 14:35             ` [PATCH 07/41] KVM: Do not communicate to userspace through cpu registers during PIO Avi Kivity
2007-04-01 14:35               ` Avi Kivity
2007-04-01 14:35               ` [PATCH 08/41] KVM: Handle cpuid in the kernel instead of punting to userspace Avi Kivity
2007-04-01 14:35                 ` Avi Kivity
2007-04-01 14:35                 ` [PATCH 09/41] KVM: Remove the 'emulated' field from the userspace interface Avi Kivity
2007-04-01 14:35                   ` Avi Kivity
2007-04-01 14:35                   ` [PATCH 10/41] KVM: Remove minor wart from KVM_CREATE_VCPU ioctl Avi Kivity
2007-04-01 14:35                     ` Avi Kivity
2007-04-01 14:35                     ` [PATCH 11/41] KVM: Renumber ioctls Avi Kivity
2007-04-01 14:35                       ` Avi Kivity
2007-04-01 14:35                       ` [PATCH 12/41] KVM: Add method to check for backwards-compatible API extensions Avi Kivity
2007-04-01 14:35                         ` Avi Kivity
2007-04-01 14:35                         ` [PATCH 13/41] KVM: Allow userspace to process hypercalls which have no kernel handler Avi Kivity
2007-04-01 14:35                           ` Avi Kivity
2007-04-01 14:35                           ` [PATCH 14/41] KVM: Fold kvm_run::exit_type into kvm_run::exit_reason Avi Kivity
2007-04-01 14:35                             ` Avi Kivity
2007-04-01 14:35                             ` [PATCH 15/41] KVM: Add a special exit reason when exiting due to an interrupt Avi Kivity
2007-04-01 14:35                               ` Avi Kivity
2007-04-01 14:35                               ` [PATCH 16/41] KVM: Initialize the apic_base msr on svm too Avi Kivity
2007-04-01 14:35                                 ` Avi Kivity
2007-04-01 14:35                                 ` [PATCH 17/41] KVM: Add guest mode signal mask Avi Kivity
2007-04-01 14:35                                   ` Avi Kivity
2007-04-01 14:35                                   ` [PATCH 18/41] KVM: Allow kernel to select size of mmap() buffer Avi Kivity
2007-04-01 14:35                                     ` Avi Kivity
2007-04-01 14:35                                     ` [PATCH 19/41] KVM: Future-proof argument-less ioctls Avi Kivity
2007-04-01 14:35                                       ` Avi Kivity
2007-04-01 14:35                                       ` [PATCH 20/41] KVM: Avoid guest virtual addresses in string pio userspace interface Avi Kivity
2007-04-01 14:35                                         ` Avi Kivity
2007-04-01 14:35                                         ` [PATCH 21/41] KVM: MMU: Remove unnecessary check for pdptr access Avi Kivity
2007-04-01 14:35                                           ` Avi Kivity
2007-04-01 14:35                                           ` [PATCH 22/41] KVM: MMU: Remove global pte tracking Avi Kivity
2007-04-01 14:35                                             ` Avi Kivity
2007-04-01 14:35                                             ` [PATCH 23/41] KVM: Workaround vmx inability to virtualize the reset state Avi Kivity
2007-04-01 14:35                                               ` Avi Kivity
2007-04-01 14:35                                               ` [PATCH 24/41] KVM: Remove set_cr0_no_modeswitch() arch op Avi Kivity
2007-04-01 14:35                                                 ` Avi Kivity
2007-04-01 14:35                                                 ` [PATCH 25/41] KVM: Modify guest segments after potentially switching modes Avi Kivity
2007-04-01 14:35                                                   ` Avi Kivity
2007-04-01 14:35                                                   ` [PATCH 26/41] KVM: Hack real-mode segments on vmx from KVM_SET_SREGS Avi Kivity
2007-04-01 14:35                                                     ` Avi Kivity
2007-04-01 14:35                                                     ` [PATCH 27/41] KVM: Don't allow the guest to turn off the cpu cache Avi Kivity
2007-04-01 14:35                                                       ` Avi Kivity
2007-04-01 14:35                                                       ` [PATCH 28/41] KVM: Remove unused and write-only variables Avi Kivity
2007-04-01 14:35                                                         ` Avi Kivity
2007-04-01 14:35                                                         ` [PATCH 29/41] KVM: Handle writes to MCG_STATUS msr Avi Kivity
2007-04-01 14:35                                                           ` Avi Kivity
2007-04-01 14:35                                                           ` [PATCH 30/41] KVM: SVM: forbid guest to execute monitor/mwait Avi Kivity
2007-04-01 14:35                                                             ` Avi Kivity
2007-04-01 14:35                                                             ` [PATCH 31/41] KVM: MMU: Fix hugepage pdes mapping same physical address with different access Avi Kivity
2007-04-01 14:35                                                               ` Avi Kivity
2007-04-01 14:35                                                               ` [PATCH 32/41] KVM: SVM: Ensure timestamp counter monotonicity Avi Kivity
2007-04-01 14:35                                                                 ` Avi Kivity
2007-04-01 14:35                                                                 ` [PATCH 33/41] KVM: Remove unused function Avi Kivity
2007-04-01 14:35                                                                   ` Avi Kivity
2007-04-01 14:35                                                                   ` [PATCH 34/41] KVM: Use list_move() Avi Kivity
2007-04-01 14:35                                                                     ` Avi Kivity
2007-04-01 14:35                                                                     ` [PATCH 35/41] KVM: Remove debug message Avi Kivity
2007-04-01 14:35                                                                       ` Avi Kivity
2007-04-01 14:35                                                                       ` [PATCH 36/41] KVM: x86 emulator: fix bit string operations operand size Avi Kivity
2007-04-01 14:35                                                                         ` Avi Kivity
2007-04-01 14:35                                                                         ` [PATCH 37/41] KVM: Add mmu cache clear function Avi Kivity
2007-04-01 14:35                                                                           ` Avi Kivity
2007-04-01 14:35                                                                           ` [PATCH 38/41] KVM: Simply gfn_to_page() Avi Kivity
2007-04-01 14:35                                                                             ` Avi Kivity
2007-04-01 14:35                                                                             ` [PATCH 39/41] KVM: Add physical memory aliasing feature Avi Kivity
2007-04-01 14:35                                                                               ` Avi Kivity
2007-04-01 14:35                                                                               ` [PATCH 40/41] KVM: Add fpu get/set operations Avi Kivity
2007-04-01 14:35                                                                                 ` Avi Kivity
2007-04-01 14:35                                                                                 ` [PATCH 41/41] KVM: SVM: enable LBRV virtualization if available Avi Kivity
2007-04-01 14:35                                                                                   ` Avi Kivity

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.