* [PATCH v4 0/3] Export offsets of VMCS fields as note information for kdump
@ 2012-07-04  9:58 ` Yanfei Zhang
  0 siblings, 0 replies; 26+ messages in thread
From: Yanfei Zhang @ 2012-07-04  9:58 UTC (permalink / raw)
  To: Avi Kivity, mtosatti
  Cc: ebiederm, luto, Joerg Roedel, dzickus, paul.gortmaker,
	ludwig.nussel, linux-kernel, kvm, kexec, Greg KH

This patch set exports the offsets of VMCS fields as note information for
kdump. We call it VMCSINFO. The purpose of VMCSINFO is to retrieve the
runtime state of a guest machine image, such as registers, from the host
machine's crash dump, where it is stored in VMCS format. The problem is
that the internal layout of the VMCS is not disclosed in Intel's
specification. So we solve this problem by reverse engineering, which is
implemented in this patch set. The VMCSINFO is exported via sysfs
(/sys/devices/system/cpu/vmcs/) to kexec-tools.

Here are two use cases for the two features that we want.

1) Create guest machine's crash dumpfile from host machine's crash dumpfile

In general, we want to use this feature for failure analysis on systems
where processing depends on communication between the host and guest
machines, so that we can look into the system from both machines' viewpoints.

As a concrete situation, consider a heartbeat monitoring feature on the
guest machine's side, where we need to determine on which machine's side
the cause of a heartbeat stop lies. In our actual experiments, we
encountered such a situation and found that the cause of the bug was in
the host's process scheduler: the guest machine's vcpu stopped for a long
time, which then led to the heartbeat stop.

The module that judges heartbeat stop is on the guest machine, so we need
to debug the guest machine's data. But if the cause lies on the host
machine's side, we need to look into the host machine's crash dump.

Without this feature, we first create the guest machine's dump and then
create the host machine's, but the two dumps are taken at different times,
and even in the short interval between them the buggy situation is
unlikely to remain.

So we think the feature is useful for debugging both the guest machine's
and the host machine's sides at the same time, and we expect it to make
failure analysis more efficient.

Of course, we believe this feature is generally useful in situations where
the guest machine doesn't work well because of something on the host
machine's side.

2) Get offsets of VMCS information on the CPU running on the host machine

If kdump doesn't work well, we cannot use the kvm API to get the register
values of the guest machine, and they are still left in its vmcs region.
In that case, we use a crash dump mechanism running outside of the linux
kernel, such as sadump, a firmware-based crash dump. The VMCS information
is then necessary.

TODO:
  1. In kexec-tools, get VMCSINFO via sysfs and dump it as note information
     into vmcore.
  2. Dump VMCS region of each guest vcpu and VMCSINFO into qemu-process
     core file. To do this, we will modify kernel core dumper, gdb gcore
     and crash gcore.
  3. Dump guest image from the qemu-process core file into a vmcore.

Changelog from v3 to v4:
1. All the variables and functions are moved to vmcsinfo-intel module.
2. Add a new sysfs interface /sys/devices/system/cpu/vmcs_id to export
   the vmcs revision identifier. The original sysfs interface is moved
   from /sys/devices/cpu/vmcs to /sys/devices/system/cpu/vmcs. Thanks to
   Greg KH for his helpful comments about sysfs.
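
As a rough illustration of how a userspace tool such as kexec-tools might
locate one field under the v4 layout, the sketch below builds the sysfs path
for a given field encoding. The helper name vmcs_sysfs_path is hypothetical,
and the four-lowercase-hex-digit file name is an assumption inferred from the
"0800" example for GUEST_DS_SELECTOR. The content format of each file is
defined by the ABI entry in patch 3/3 and is not assumed here.

```c
#include <stddef.h>
#include <stdio.h>

/* Directory introduced in v4 of this patch set. */
#define VMCS_SYSFS_DIR "/sys/devices/system/cpu/vmcs"

/*
 * Hypothetical helper: write the sysfs path for a VMCS field encoding
 * into buf. File names are assumed to be the encoding printed as four
 * lowercase hex digits, as in the "0800" example.
 * Returns 0 on success, -1 if buf is too small.
 */
static int vmcs_sysfs_path(unsigned int field, char *buf, size_t len)
{
	int n = snprintf(buf, len, "%s/%04x", VMCS_SYSFS_DIR, field & 0xffffu);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would pass the encoding of the field it is interested in, for
example 0x0800, and then open and read the resulting path.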

Changelog from v2 to v3:
1. New VMCSINFO format.
   Now the VMCSINFO is mainly made up of an array that contains all vmcs
   fields' offsets. The offsets aren't encoded because we decode them in
   the module itself. If some field doesn't exist or its offset cannot be
   decoded correctly, the offset in the array is just set to zero.
2. New sysfs interface and Documentation/ABI entry. 
   We expose the actual fields in /sys/devices/cpu/vmcs instead of just
   exporting the address of VMCSINFO in /sys/kernel/vmcsinfo.
   For example, /sys/devices/cpu/vmcs/0800 contains the offset of
   GUEST_DS_SELECTOR. 0800 is the encoding of GUEST_DS_SELECTOR.
   Accordingly, ABI entry in Documentation is changed from sysfs-kernel-vmcsinfo
   to sysfs-devices-cpu-vmcs.

Changelog from v1 to v2:
1. The VMCSINFO now has a simple binary <field><encoded offset> format,
   as below:
     +-------------+--------------------------+
     | Byte offset | Contents                 |
     +-------------+--------------------------+
     | 0           | VMCS revision identifier |
     +-------------+--------------------------+
     | 4           | <field><encoded offset>  |
     +-------------+--------------------------+
     | 16          | <field><encoded offset>  |
     +-------------+--------------------------+
     ......
  
   The first 32 bits of VMCSINFO contain the VMCS revision identifier.
   The remainder of VMCSINFO is used for <field><encoded offset> sets.
   Each set takes 12 bytes: the field occupies 4 bytes and its corresponding
   encoded offset occupies 8 bytes.

   Encoded offsets are raw values read by vmcs_read{16, 64, 32, l}, and
   they are all zero-extended to 8 bytes so that each <field><encoded offset>
   set has the same size.
   We do not decode offsets here. The decoding work is deferred to userspace
   tools for more flexible handling.
   
   And here are two examples of the new VMCSINFO:
   Processor: Intel(R) Core(TM)2 Duo CPU     E7500  @ 2.93GHz
   VMCSINFO contains:
     <0000000d>                   --> VMCS revision id = 0xd
     <00004000><0000000001840180> --> OFFSET(PIN_BASED_VM_EXEC_CONTROL) = 0x01840180
     <00004002><0000000001940190> --> OFFSET(CPU_BASED_VM_EXEC_CONTROL) = 0x01940190
     <0000401e><000000000fe40fe0> --> OFFSET(SECONDARY_VM_EXEC_CONTROL) = 0x0fe40fe0
     <0000400c><0000000001e401e0> --> OFFSET(VM_EXIT_CONTROLS) = 0x01e401e0
     ......

   Processor: Intel(R) Xeon(R) CPU           E7540  @ 2.00GHz (24 cores)
   VMCSINFO contains:
     <0000000e>                   --> VMCS revision id = 0xe 
     <00004000><0000000005540550> --> OFFSET(PIN_BASED_VM_EXEC_CONTROL) = 0x05540550
     <00004002><0000000005440540> --> OFFSET(CPU_BASED_VM_EXEC_CONTROL) = 0x05440540
     <0000401e><00000000054c0548> --> OFFSET(SECONDARY_VM_EXEC_CONTROL) = 0x054c0548
     <0000400c><00000000057c0578> --> OFFSET(VM_EXIT_CONTROLS) = 0x057c0578
     ......

2. Add a new kernel module *vmcsinfo-intel* for filling VMCSINFO instead
   of putting it in module kvm-intel. The new module is auto-loaded
   when the vmx cpufeature is detected and it depends on module kvm-intel.
   *Loading and unloading this module will have no side effect on the
   running guests.*
3. The sysfs file vmcsinfo is split into 2 files:
   /sys/kernel/vmcsinfo: shows physical address of VMCSINFO note information.
   /sys/kernel/vmcsinfo_maxsize: shows max size of VMCSINFO.
4. A new Documentation/ABI entry is added for vmcsinfo and vmcsinfo_maxsize.
5. Do not update VMCSINFO note when the kernel has panicked.
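
The v2 binary layout described in item 1 above (a 4-byte revision identifier
followed by 12-byte <field><encoded offset> sets) could be walked by a
userspace tool roughly as sketched below. All names here are hypothetical,
and the sketch assumes the values are stored in host byte order, which the
description above does not state. Encoded offsets are returned raw, since
decoding them is outside what this changelog specifies.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VMCSINFO_HDR_SIZE 4	/* VMCS revision identifier */
#define VMCSINFO_SET_SIZE 12	/* 4-byte field + 8-byte encoded offset */

struct vmcsinfo_set {
	uint32_t field;			/* VMCS field encoding */
	uint64_t encoded_offset;	/* raw value, left undecoded */
};

/* Read the revision id. Returns 0 on success, -1 if buf is too short. */
static int vmcsinfo_revision(const unsigned char *buf, size_t len,
			     uint32_t *rev)
{
	if (len < VMCSINFO_HDR_SIZE)
		return -1;
	memcpy(rev, buf, sizeof(*rev));
	return 0;
}

/* Number of complete sets that follow the revision id. */
static size_t vmcsinfo_nr_sets(size_t len)
{
	if (len < VMCSINFO_HDR_SIZE)
		return 0;
	return (len - VMCSINFO_HDR_SIZE) / VMCSINFO_SET_SIZE;
}

/* Copy set number idx into out. Returns 0 on success, -1 if out of range. */
static int vmcsinfo_get_set(const unsigned char *buf, size_t len,
			    size_t idx, struct vmcsinfo_set *out)
{
	const unsigned char *p;

	if (idx >= vmcsinfo_nr_sets(len))
		return -1;
	p = buf + VMCSINFO_HDR_SIZE + idx * VMCSINFO_SET_SIZE;
	memcpy(&out->field, p, sizeof(out->field));
	memcpy(&out->encoded_offset, p + 4, sizeof(out->encoded_offset));
	return 0;
}
```

memcpy is used instead of pointer casts because the sets start at byte
offset 4 and are 12 bytes long, so the 8-byte encoded offset is not
naturally aligned.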

zhangyanfei (3):
  KVM: Export symbols for module vmcsinfo-intel
  KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
  Documentation: Add ABI entry for vmcs sysfs interface.

 Documentation/ABI/testing/sysfs-devices-system-cpu |   21 +
 arch/x86/include/asm/vmx.h                         |   73 +++
 arch/x86/kvm/Kconfig                               |   11 +
 arch/x86/kvm/Makefile                              |    3 +
 arch/x86/kvm/vmcsinfo.c                            |  586 ++++++++++++++++++++
 arch/x86/kvm/vmx.c                                 |   81 +---
 include/linux/kvm_host.h                           |    3 +
 virt/kvm/kvm_main.c                                |    8 +-
 8 files changed, 714 insertions(+), 72 deletions(-)
 create mode 100644 arch/x86/kvm/vmcsinfo.c


* [PATCH v4 1/3] KVM: Export symbols for module vmcsinfo-intel
  2012-07-04  9:58 ` Yanfei Zhang
@ 2012-07-04 10:01   ` Yanfei Zhang
  -1 siblings, 0 replies; 26+ messages in thread
From: Yanfei Zhang @ 2012-07-04 10:01 UTC (permalink / raw)
  To: Avi Kivity, mtosatti
  Cc: ebiederm, luto, Joerg Roedel, dzickus, paul.gortmaker,
	ludwig.nussel, linux-kernel, kvm, kexec, Greg KH

A new module named vmcsinfo-intel is used to fill VMCSINFO. This
module depends on the kvm-intel and kvm modules, so we export the
symbols of kvm-intel and kvm that are needed by vmcsinfo-intel.

Signed-off-by: zhangyanfei <zhangyanfei@cn.fujitsu.com>
---
 arch/x86/include/asm/vmx.h |   73 +++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx.c         |   81 +++++++-------------------------------------
 include/linux/kvm_host.h   |    3 ++
 virt/kvm/kvm_main.c        |    8 ++--
 4 files changed, 93 insertions(+), 72 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 31f180c..1044e2e 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -26,6 +26,7 @@
  */
 
 #include <linux/types.h>
+#include <linux/kvm_host.h>
 
 /*
  * Definitions of Primary Processor-Based VM-Execution Controls.
@@ -481,4 +482,76 @@ enum vm_instruction_error_number {
 	VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID = 28,
 };
 
+#define __ex(x) __kvm_handle_fault_on_reboot(x)
+#define __ex_clear(x, reg) \
+	____kvm_handle_fault_on_reboot(x, "xor " reg " , " reg)
+
+struct vmcs {
+	u32 revision_id;
+	u32 abort;
+	char data[0];
+};
+
+struct vmcs_config {
+	int size;
+	int order;
+	u32 revision_id;
+	u32 pin_based_exec_ctrl;
+	u32 cpu_based_exec_ctrl;
+	u32 cpu_based_2nd_exec_ctrl;
+	u32 vmexit_ctrl;
+	u32 vmentry_ctrl;
+};
+
+extern struct vmcs_config vmcs_config;
+
+DECLARE_PER_CPU(struct vmcs *, current_vmcs);
+
+enum vmcs_field_type {
+	VMCS_FIELD_TYPE_U16 = 0,
+	VMCS_FIELD_TYPE_U64 = 1,
+	VMCS_FIELD_TYPE_U32 = 2,
+	VMCS_FIELD_TYPE_NATURAL_WIDTH = 3
+};
+
+static inline int vmcs_field_type(unsigned long field)
+{
+	if (0x1 & field)	/* the *_HIGH fields are all 32 bit */
+		return VMCS_FIELD_TYPE_U32;
+	return (field >> 13) & 0x3 ;
+}
+
+static __always_inline unsigned long vmcs_readl(unsigned long field)
+{
+	unsigned long value;
+
+	asm volatile (__ex_clear(ASM_VMX_VMREAD_RDX_RAX, "%0")
+		      : "=a"(value) : "d"(field) : "cc");
+	return value;
+}
+
+static __always_inline u16 vmcs_read16(unsigned long field)
+{
+	return vmcs_readl(field);
+}
+
+static __always_inline u32 vmcs_read32(unsigned long field)
+{
+	return vmcs_readl(field);
+}
+
+static __always_inline u64 vmcs_read64(unsigned long field)
+{
+#ifdef CONFIG_X86_64
+	return vmcs_readl(field);
+#else
+	return vmcs_readl(field) | ((u64)vmcs_readl(field+1) << 32);
+#endif
+}
+
+struct vmcs *alloc_vmcs(void);
+void vmcs_load(struct vmcs *);
+void vmcs_clear(struct vmcs *);
+void free_vmcs(struct vmcs *);
+
 #endif
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 32eb588..43ceae7 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -20,7 +20,6 @@
 #include "mmu.h"
 #include "cpuid.h"
 
-#include <linux/kvm_host.h>
 #include <linux/module.h>
 #include <linux/kernel.h>
 #include <linux/mm.h>
@@ -45,10 +44,6 @@
 
 #include "trace.h"
 
-#define __ex(x) __kvm_handle_fault_on_reboot(x)
-#define __ex_clear(x, reg) \
-	____kvm_handle_fault_on_reboot(x, "xor " reg " , " reg)
-
 MODULE_AUTHOR("Qumranet");
 MODULE_LICENSE("GPL");
 
@@ -127,12 +122,6 @@ module_param(ple_window, int, S_IRUGO);
 #define NR_AUTOLOAD_MSRS 8
 #define VMCS02_POOL_SIZE 1
 
-struct vmcs {
-	u32 revision_id;
-	u32 abort;
-	char data[0];
-};
-
 /*
  * Track a VMCS that may be loaded on a certain CPU. If it is (cpu!=-1), also
  * remember whether it was VMLAUNCHed, and maintain a linked list of all VMCSs
@@ -617,7 +606,9 @@ static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3);
 static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr);
 
 static DEFINE_PER_CPU(struct vmcs *, vmxarea);
-static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
+DEFINE_PER_CPU(struct vmcs *, current_vmcs);
+EXPORT_SYMBOL_GPL(current_vmcs);
+
 /*
  * We maintain a per-CPU linked-list of VMCS loaded on that CPU. This is needed
  * when a CPU is brought down, and we need to VMCLEAR all VMCSs loaded on it.
@@ -636,16 +627,8 @@ static bool cpu_has_load_perf_global_ctrl;
 static DECLARE_BITMAP(vmx_vpid_bitmap, VMX_NR_VPIDS);
 static DEFINE_SPINLOCK(vmx_vpid_lock);
 
-static struct vmcs_config {
-	int size;
-	int order;
-	u32 revision_id;
-	u32 pin_based_exec_ctrl;
-	u32 cpu_based_exec_ctrl;
-	u32 cpu_based_2nd_exec_ctrl;
-	u32 vmexit_ctrl;
-	u32 vmentry_ctrl;
-} vmcs_config;
+struct vmcs_config vmcs_config;
+EXPORT_SYMBOL_GPL(vmcs_config);
 
 static struct vmx_capability {
 	u32 ept;
@@ -940,7 +923,7 @@ static struct shared_msr_entry *find_msr_entry(struct vcpu_vmx *vmx, u32 msr)
 	return NULL;
 }
 
-static void vmcs_clear(struct vmcs *vmcs)
+void vmcs_clear(struct vmcs *vmcs)
 {
 	u64 phys_addr = __pa(vmcs);
 	u8 error;
@@ -952,6 +935,7 @@ static void vmcs_clear(struct vmcs *vmcs)
 		printk(KERN_ERR "kvm: vmclear fail: %p/%llx\n",
 		       vmcs, phys_addr);
 }
+EXPORT_SYMBOL_GPL(vmcs_clear);
 
 static inline void loaded_vmcs_init(struct loaded_vmcs *loaded_vmcs)
 {
@@ -960,7 +944,7 @@ static inline void loaded_vmcs_init(struct loaded_vmcs *loaded_vmcs)
 	loaded_vmcs->launched = 0;
 }
 
-static void vmcs_load(struct vmcs *vmcs)
+void vmcs_load(struct vmcs *vmcs)
 {
 	u64 phys_addr = __pa(vmcs);
 	u8 error;
@@ -972,6 +956,7 @@ static void vmcs_load(struct vmcs *vmcs)
 		printk(KERN_ERR "kvm: vmptrld %p/%llx failed\n",
 		       vmcs, phys_addr);
 }
+EXPORT_SYMBOL_GPL(vmcs_load);
 
 static void __loaded_vmcs_clear(void *arg)
 {
@@ -1043,34 +1028,6 @@ static inline void ept_sync_individual_addr(u64 eptp, gpa_t gpa)
 	}
 }
 
-static __always_inline unsigned long vmcs_readl(unsigned long field)
-{
-	unsigned long value;
-
-	asm volatile (__ex_clear(ASM_VMX_VMREAD_RDX_RAX, "%0")
-		      : "=a"(value) : "d"(field) : "cc");
-	return value;
-}
-
-static __always_inline u16 vmcs_read16(unsigned long field)
-{
-	return vmcs_readl(field);
-}
-
-static __always_inline u32 vmcs_read32(unsigned long field)
-{
-	return vmcs_readl(field);
-}
-
-static __always_inline u64 vmcs_read64(unsigned long field)
-{
-#ifdef CONFIG_X86_64
-	return vmcs_readl(field);
-#else
-	return vmcs_readl(field) | ((u64)vmcs_readl(field+1) << 32);
-#endif
-}
-
 static noinline void vmwrite_error(unsigned long field, unsigned long value)
 {
 	printk(KERN_ERR "vmwrite error: reg %lx value %lx (err %d)\n",
@@ -2580,15 +2537,17 @@ static struct vmcs *alloc_vmcs_cpu(int cpu)
 	return vmcs;
 }
 
-static struct vmcs *alloc_vmcs(void)
+struct vmcs *alloc_vmcs(void)
 {
 	return alloc_vmcs_cpu(raw_smp_processor_id());
 }
+EXPORT_SYMBOL_GPL(alloc_vmcs);
 
-static void free_vmcs(struct vmcs *vmcs)
+void free_vmcs(struct vmcs *vmcs)
 {
 	free_pages((unsigned long)vmcs, vmcs_config.order);
 }
+EXPORT_SYMBOL_GPL(free_vmcs);
 
 /*
  * Free a VMCS, but before that VMCLEAR it on the CPU where it was last loaded
@@ -5314,20 +5273,6 @@ static int handle_vmresume(struct kvm_vcpu *vcpu)
 	return nested_vmx_run(vcpu, false);
 }
 
-enum vmcs_field_type {
-	VMCS_FIELD_TYPE_U16 = 0,
-	VMCS_FIELD_TYPE_U64 = 1,
-	VMCS_FIELD_TYPE_U32 = 2,
-	VMCS_FIELD_TYPE_NATURAL_WIDTH = 3
-};
-
-static inline int vmcs_field_type(unsigned long field)
-{
-	if (0x1 & field)	/* the *_HIGH fields are all 32 bit */
-		return VMCS_FIELD_TYPE_U32;
-	return (field >> 13) & 0x3 ;
-}
-
 static inline int vmcs_field_readonly(unsigned long field)
 {
 	return (((field >> 10) & 0x3) == 1);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index c446435..0930fd9 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -95,6 +95,9 @@ enum kvm_bus {
 	KVM_NR_BUSES
 };
 
+int hardware_enable_all(void);
+void hardware_disable_all(void);
+
 int kvm_io_bus_write(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
 		     int len, const void *val);
 int kvm_io_bus_read(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr, int len,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7e14068..26fd04d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -90,8 +90,6 @@ static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
 static long kvm_vcpu_compat_ioctl(struct file *file, unsigned int ioctl,
 				  unsigned long arg);
 #endif
-static int hardware_enable_all(void);
-static void hardware_disable_all(void);
 
 static void kvm_io_bus_destroy(struct kvm_io_bus *bus);
 
@@ -2330,14 +2328,15 @@ static void hardware_disable_all_nolock(void)
 		on_each_cpu(hardware_disable_nolock, NULL, 1);
 }
 
-static void hardware_disable_all(void)
+void hardware_disable_all(void)
 {
 	raw_spin_lock(&kvm_lock);
 	hardware_disable_all_nolock();
 	raw_spin_unlock(&kvm_lock);
 }
+EXPORT_SYMBOL_GPL(hardware_disable_all);
 
-static int hardware_enable_all(void)
+int hardware_enable_all(void)
 {
 	int r = 0;
 
@@ -2358,6 +2357,7 @@ static int hardware_enable_all(void)
 
 	return r;
 }
+EXPORT_SYMBOL_GPL(hardware_enable_all);
 
 static int kvm_cpu_hotplug(struct notifier_block *notifier, unsigned long val,
 			   void *v)
-- 
1.7.1
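
For readers following the vmcs_field_type() helper that this patch moves into
asm/vmx.h, here is a small userspace sketch of the same classification: bit 0
of the encoding marks a *_HIGH access, and bits 13-14 select the field width.
The enum and the function body mirror the patch. The vmcs_field_type_name()
helper is hypothetical and exists only for illustration.

```c
/* Same values as the enum moved into asm/vmx.h by this patch. */
enum vmcs_field_type {
	VMCS_FIELD_TYPE_U16 = 0,
	VMCS_FIELD_TYPE_U64 = 1,
	VMCS_FIELD_TYPE_U32 = 2,
	VMCS_FIELD_TYPE_NATURAL_WIDTH = 3
};

/* Userspace copy of the helper from the patch. */
static int vmcs_field_type(unsigned long field)
{
	if (0x1 & field)	/* the *_HIGH fields are all 32 bit */
		return VMCS_FIELD_TYPE_U32;
	return (field >> 13) & 0x3;
}

/* Hypothetical helper, not part of the patch: printable width name. */
static const char *vmcs_field_type_name(int type)
{
	switch (type) {
	case VMCS_FIELD_TYPE_U16:
		return "u16";
	case VMCS_FIELD_TYPE_U64:
		return "u64";
	case VMCS_FIELD_TYPE_U32:
		return "u32";
	default:
		return "natural width";
	}
}
```

With this, the encoding 0x0800 (GUEST_DS_SELECTOR in the cover letter)
classifies as a 16-bit field, and 0x4000 (PIN_BASED_VM_EXEC_CONTROL) as a
32-bit field.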

 {
 	int r = 0;
 
@@ -2358,6 +2357,7 @@ static int hardware_enable_all(void)
 
 	return r;
 }
+EXPORT_SYMBOL_GPL(hardware_enable_all);
 
 static int kvm_cpu_hotplug(struct notifier_block *notifier, unsigned long val,
 			   void *v)
-- 
1.7.1


* [PATCH v4 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
  2012-07-04  9:58 ` Yanfei Zhang
@ 2012-07-04 10:05   ` Yanfei Zhang
  -1 siblings, 0 replies; 26+ messages in thread
From: Yanfei Zhang @ 2012-07-04 10:05 UTC (permalink / raw)
  To: Avi Kivity, mtosatti
  Cc: ebiederm, luto, Joerg Roedel, dzickus, paul.gortmaker,
	ludwig.nussel, linux-kernel, kvm, kexec, Greg KH

This patch implements a new module named vmcsinfo-intel. The
module fills VMCSINFO with the VMCS revision identifier and
the offsets of VMCS fields.

Note that offsets of fields that are defined in the Intel specification
(Intel® 64 and IA-32 Architectures Software Developer’s Manual,
Volume 3C) but not defined in *vmcs_field* will not be filled in
VMCSINFO. Also, some fields may be unsupported on some machines;
on those machines the corresponding offsets will be zero.

In addition, this patch exports the VMCS revision identifier via
/sys/devices/system/cpu/vmcs_id and the offsets of fields via
/sys/devices/system/cpu/vmcs/.
Individual offsets are contained in subfiles named by the field's
encoding, e.g.: /sys/devices/system/cpu/vmcs/0800

Signed-off-by: zhangyanfei <zhangyanfei@cn.fujitsu.com>
---
 arch/x86/kvm/Kconfig    |   11 +
 arch/x86/kvm/Makefile   |    3 +
 arch/x86/kvm/vmcsinfo.c |  586 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 600 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/kvm/vmcsinfo.c

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index a28f338..1dd64b1 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -63,6 +63,17 @@ config KVM_INTEL
 	  To compile this as a module, choose M here: the module
 	  will be called kvm-intel.
 
+config VMCSINFO_INTEL
+	tristate "Export VMCSINFO for Intel processors"
+	depends on KVM_INTEL
+	---help---
+	  Provides support for exporting VMCSINFO on Intel processors equipped
+	  with the VT extensions. The VMCSINFO contains a VMCS revision
+	  identifier and offsets of VMCS fields.
+
+	  To compile this as a module, choose M here: the module
+	  will be called vmcsinfo-intel.
+
 config KVM_AMD
 	tristate "KVM for AMD processors support"
 	depends on KVM
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 4f579e8..12a1ef6 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -4,6 +4,7 @@ ccflags-y += -Ivirt/kvm -Iarch/x86/kvm
 CFLAGS_x86.o := -I.
 CFLAGS_svm.o := -I.
 CFLAGS_vmx.o := -I.
+CFLAGS_vmcsinfo.o := -I.
 
 kvm-y			+= $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o \
 				coalesced_mmio.o irq_comm.o eventfd.o \
@@ -15,7 +16,9 @@ kvm-y			+= x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
 			   i8254.o timer.o cpuid.o pmu.o
 kvm-intel-y		+= vmx.o
 kvm-amd-y		+= svm.o
+vmcsinfo-intel-y	+= vmcsinfo.o
 
 obj-$(CONFIG_KVM)	+= kvm.o
 obj-$(CONFIG_KVM_INTEL)	+= kvm-intel.o
 obj-$(CONFIG_KVM_AMD)	+= kvm-amd.o
+obj-$(CONFIG_VMCSINFO_INTEL)	+= vmcsinfo-intel.o
diff --git a/arch/x86/kvm/vmcsinfo.c b/arch/x86/kvm/vmcsinfo.c
new file mode 100644
index 0000000..bff6a1e
--- /dev/null
+++ b/arch/x86/kvm/vmcsinfo.c
@@ -0,0 +1,586 @@
+/*
+ * Kernel-based Virtual Machine driver for Linux
+ *
+ * This module enables machines with Intel VT-x extensions to export
+ * offsets of VMCS fields for guest debugging.
+ *
+ * Copyright (C) 2012 Fujitsu, Inc.
+ *
+ * Authors:
+ *   Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/mod_devicetable.h>
+#include <linux/device.h>
+#include <linux/swab.h>
+#include <linux/cpu.h>
+
+#include <asm/vmx.h>
+
+MODULE_AUTHOR("Fujitsu");
+MODULE_LICENSE("GPL");
+
+static const struct x86_cpu_id vmcsinfo_cpu_id[] = {
+	X86_FEATURE_MATCH(X86_FEATURE_VMX),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, vmcsinfo_cpu_id);
+
+/*
+ * vmcs field offsets.
+ */
+static struct vmcsinfo {
+	u32 vmcs_revision_id;
+	u16 vmcs_field_to_offset_table[HOST_RIP + 1];
+} vmcsinfo;
+
+#define VMCSINFO_MAX_FIELD \
+	ARRAY_SIZE(vmcsinfo.vmcs_field_to_offset_table)
+
+static inline void vmcsinfo_revision_id(u32 id)
+{
+	vmcsinfo.vmcs_revision_id = id;
+}
+
+static inline void vmcsinfo_field(unsigned long field, u16 offset)
+{
+	if (field < VMCSINFO_MAX_FIELD)
+		vmcsinfo.vmcs_field_to_offset_table[field] = offset;
+}
+
+static inline short get_vmcs_field_offset(unsigned long field)
+{
+	if (field >= VMCSINFO_MAX_FIELD ||
+	    vmcsinfo.vmcs_field_to_offset_table[field] == 0)
+		return -1;
+	return vmcsinfo.vmcs_field_to_offset_table[field];
+}
+
+static const char vmcs_group_name[] = "vmcs";
+
+#define BUILD_OFFSET_SHOW(field_code)                                         \
+static ssize_t _##field_code##_show(struct device *dev,                       \
+				    struct device_attribute *attr,            \
+				    char *buf)                                \
+{                                                                             \
+	return sprintf(buf, "%d\n",                                           \
+		       vmcsinfo.vmcs_field_to_offset_table[0x##field_code]);  \
+}                                                                             \
+static DEVICE_ATTR(field_code, 0444, _##field_code##_show, NULL);             \
+
+BUILD_OFFSET_SHOW(0000); /* VIRTUAL_PROCESSOR_ID             */
+BUILD_OFFSET_SHOW(0800); /* GUEST_ES_SELECTOR                */
+BUILD_OFFSET_SHOW(0802); /* GUEST_CS_SELECTOR                */
+BUILD_OFFSET_SHOW(0804); /* GUEST_SS_SELECTOR                */
+BUILD_OFFSET_SHOW(0806); /* GUEST_DS_SELECTOR                */
+BUILD_OFFSET_SHOW(0808); /* GUEST_FS_SELECTOR                */
+BUILD_OFFSET_SHOW(080a); /* GUEST_GS_SELECTOR                */
+BUILD_OFFSET_SHOW(080c); /* GUEST_LDTR_SELECTOR              */
+BUILD_OFFSET_SHOW(080e); /* GUEST_TR_SELECTOR                */
+BUILD_OFFSET_SHOW(0c00); /* HOST_ES_SELECTOR                 */
+BUILD_OFFSET_SHOW(0c02); /* HOST_CS_SELECTOR                 */
+BUILD_OFFSET_SHOW(0c04); /* HOST_SS_SELECTOR                 */
+BUILD_OFFSET_SHOW(0c06); /* HOST_DS_SELECTOR                 */
+BUILD_OFFSET_SHOW(0c08); /* HOST_FS_SELECTOR                 */
+BUILD_OFFSET_SHOW(0c0a); /* HOST_GS_SELECTOR                 */
+BUILD_OFFSET_SHOW(0c0c); /* HOST_TR_SELECTOR                 */
+BUILD_OFFSET_SHOW(2000); /* IO_BITMAP_A                      */
+BUILD_OFFSET_SHOW(2001); /* IO_BITMAP_A_HIGH                 */
+BUILD_OFFSET_SHOW(2002); /* IO_BITMAP_B                      */
+BUILD_OFFSET_SHOW(2003); /* IO_BITMAP_B_HIGH                 */
+BUILD_OFFSET_SHOW(2004); /* MSR_BITMAP                       */
+BUILD_OFFSET_SHOW(2005); /* MSR_BITMAP_HIGH                  */
+BUILD_OFFSET_SHOW(2006); /* VM_EXIT_MSR_STORE_ADDR           */
+BUILD_OFFSET_SHOW(2007); /* VM_EXIT_MSR_STORE_ADDR_HIGH      */
+BUILD_OFFSET_SHOW(2008); /* VM_EXIT_MSR_LOAD_ADDR            */
+BUILD_OFFSET_SHOW(2009); /* VM_EXIT_MSR_LOAD_ADDR_HIGH       */
+BUILD_OFFSET_SHOW(200a); /* VM_ENTRY_MSR_LOAD_ADDR           */
+BUILD_OFFSET_SHOW(200b); /* VM_ENTRY_MSR_LOAD_ADDR_HIGH      */
+BUILD_OFFSET_SHOW(2010); /* TSC_OFFSET                       */
+BUILD_OFFSET_SHOW(2011); /* TSC_OFFSET_HIGH                  */
+BUILD_OFFSET_SHOW(2012); /* VIRTUAL_APIC_PAGE_ADDR           */
+BUILD_OFFSET_SHOW(2013); /* VIRTUAL_APIC_PAGE_ADDR_HIGH      */
+BUILD_OFFSET_SHOW(2014); /* APIC_ACCESS_ADDR                 */
+BUILD_OFFSET_SHOW(2015); /* APIC_ACCESS_ADDR_HIGH            */
+BUILD_OFFSET_SHOW(201a); /* EPT_POINTER                      */
+BUILD_OFFSET_SHOW(201b); /* EPT_POINTER_HIGH                 */
+BUILD_OFFSET_SHOW(2400); /* GUEST_PHYSICAL_ADDRESS           */
+BUILD_OFFSET_SHOW(2401); /* GUEST_PHYSICAL_ADDRESS_HIGH      */
+BUILD_OFFSET_SHOW(2800); /* VMCS_LINK_POINTER                */
+BUILD_OFFSET_SHOW(2801); /* VMCS_LINK_POINTER_HIGH           */
+BUILD_OFFSET_SHOW(2802); /* GUEST_IA32_DEBUGCTL              */
+BUILD_OFFSET_SHOW(2803); /* GUEST_IA32_DEBUGCTL_HIGH         */
+BUILD_OFFSET_SHOW(2804); /* GUEST_IA32_PAT                   */
+BUILD_OFFSET_SHOW(2805); /* GUEST_IA32_PAT_HIGH              */
+BUILD_OFFSET_SHOW(2806); /* GUEST_IA32_EFER                  */
+BUILD_OFFSET_SHOW(2807); /* GUEST_IA32_EFER_HIGH             */
+BUILD_OFFSET_SHOW(2808); /* GUEST_IA32_PERF_GLOBAL_CTRL      */
+BUILD_OFFSET_SHOW(2809); /* GUEST_IA32_PERF_GLOBAL_CTRL_HIGH */
+BUILD_OFFSET_SHOW(280a); /* GUEST_PDPTR0                     */
+BUILD_OFFSET_SHOW(280b); /* GUEST_PDPTR0_HIGH                */
+BUILD_OFFSET_SHOW(280c); /* GUEST_PDPTR1                     */
+BUILD_OFFSET_SHOW(280d); /* GUEST_PDPTR1_HIGH                */
+BUILD_OFFSET_SHOW(280e); /* GUEST_PDPTR2                     */
+BUILD_OFFSET_SHOW(280f); /* GUEST_PDPTR2_HIGH                */
+BUILD_OFFSET_SHOW(2810); /* GUEST_PDPTR3                     */
+BUILD_OFFSET_SHOW(2811); /* GUEST_PDPTR3_HIGH                */
+BUILD_OFFSET_SHOW(2c00); /* HOST_IA32_PAT                    */
+BUILD_OFFSET_SHOW(2c01); /* HOST_IA32_PAT_HIGH               */
+BUILD_OFFSET_SHOW(2c02); /* HOST_IA32_EFER                   */
+BUILD_OFFSET_SHOW(2c03); /* HOST_IA32_EFER_HIGH              */
+BUILD_OFFSET_SHOW(2c04); /* HOST_IA32_PERF_GLOBAL_CTRL       */
+BUILD_OFFSET_SHOW(2c05); /* HOST_IA32_PERF_GLOBAL_CTRL_HIGH  */
+BUILD_OFFSET_SHOW(4000); /* PIN_BASED_VM_EXEC_CONTROL        */
+BUILD_OFFSET_SHOW(4002); /* CPU_BASED_VM_EXEC_CONTROL        */
+BUILD_OFFSET_SHOW(4004); /* EXCEPTION_BITMAP                 */
+BUILD_OFFSET_SHOW(4006); /* PAGE_FAULT_ERROR_CODE_MASK       */
+BUILD_OFFSET_SHOW(4008); /* PAGE_FAULT_ERROR_CODE_MATCH      */
+BUILD_OFFSET_SHOW(400a); /* CR3_TARGET_COUNT                 */
+BUILD_OFFSET_SHOW(400c); /* VM_EXIT_CONTROLS                 */
+BUILD_OFFSET_SHOW(400e); /* VM_EXIT_MSR_STORE_COUNT          */
+BUILD_OFFSET_SHOW(4010); /* VM_EXIT_MSR_LOAD_COUNT           */
+BUILD_OFFSET_SHOW(4012); /* VM_ENTRY_CONTROLS                */
+BUILD_OFFSET_SHOW(4014); /* VM_ENTRY_MSR_LOAD_COUNT          */
+BUILD_OFFSET_SHOW(4016); /* VM_ENTRY_INTR_INFO_FIELD         */
+BUILD_OFFSET_SHOW(4018); /* VM_ENTRY_EXCEPTION_ERROR_CODE    */
+BUILD_OFFSET_SHOW(401a); /* VM_ENTRY_INSTRUCTION_LEN         */
+BUILD_OFFSET_SHOW(401c); /* TPR_THRESHOLD                    */
+BUILD_OFFSET_SHOW(401e); /* SECONDARY_VM_EXEC_CONTROL        */
+BUILD_OFFSET_SHOW(4020); /* PLE_GAP                          */
+BUILD_OFFSET_SHOW(4022); /* PLE_WINDOW                       */
+BUILD_OFFSET_SHOW(4400); /* VM_INSTRUCTION_ERROR             */
+BUILD_OFFSET_SHOW(4402); /* VM_EXIT_REASON                   */
+BUILD_OFFSET_SHOW(4404); /* VM_EXIT_INTR_INFO                */
+BUILD_OFFSET_SHOW(4406); /* VM_EXIT_INTR_ERROR_CODE          */
+BUILD_OFFSET_SHOW(4408); /* IDT_VECTORING_INFO_FIELD         */
+BUILD_OFFSET_SHOW(440a); /* IDT_VECTORING_ERROR_CODE         */
+BUILD_OFFSET_SHOW(440c); /* VM_EXIT_INSTRUCTION_LEN          */
+BUILD_OFFSET_SHOW(440e); /* VMX_INSTRUCTION_INFO             */
+BUILD_OFFSET_SHOW(4800); /* GUEST_ES_LIMIT                   */
+BUILD_OFFSET_SHOW(4802); /* GUEST_CS_LIMIT                   */
+BUILD_OFFSET_SHOW(4804); /* GUEST_SS_LIMIT                   */
+BUILD_OFFSET_SHOW(4806); /* GUEST_DS_LIMIT                   */
+BUILD_OFFSET_SHOW(4808); /* GUEST_FS_LIMIT                   */
+BUILD_OFFSET_SHOW(480a); /* GUEST_GS_LIMIT                   */
+BUILD_OFFSET_SHOW(480c); /* GUEST_LDTR_LIMIT                 */
+BUILD_OFFSET_SHOW(480e); /* GUEST_TR_LIMIT                   */
+BUILD_OFFSET_SHOW(4810); /* GUEST_GDTR_LIMIT                 */
+BUILD_OFFSET_SHOW(4812); /* GUEST_IDTR_LIMIT                 */
+BUILD_OFFSET_SHOW(4814); /* GUEST_ES_AR_BYTES                */
+BUILD_OFFSET_SHOW(4816); /* GUEST_CS_AR_BYTES                */
+BUILD_OFFSET_SHOW(4818); /* GUEST_SS_AR_BYTES                */
+BUILD_OFFSET_SHOW(481a); /* GUEST_DS_AR_BYTES                */
+BUILD_OFFSET_SHOW(481c); /* GUEST_FS_AR_BYTES                */
+BUILD_OFFSET_SHOW(481e); /* GUEST_GS_AR_BYTES                */
+BUILD_OFFSET_SHOW(4820); /* GUEST_LDTR_AR_BYTES              */
+BUILD_OFFSET_SHOW(4822); /* GUEST_TR_AR_BYTES                */
+BUILD_OFFSET_SHOW(4824); /* GUEST_INTERRUPTIBILITY_INFO      */
+BUILD_OFFSET_SHOW(4826); /* GUEST_ACTIVITY_STATE             */
+BUILD_OFFSET_SHOW(482A); /* GUEST_SYSENTER_CS                */
+BUILD_OFFSET_SHOW(4c00); /* HOST_IA32_SYSENTER_CS            */
+BUILD_OFFSET_SHOW(6000); /* CR0_GUEST_HOST_MASK              */
+BUILD_OFFSET_SHOW(6002); /* CR4_GUEST_HOST_MASK              */
+BUILD_OFFSET_SHOW(6004); /* CR0_READ_SHADOW                  */
+BUILD_OFFSET_SHOW(6006); /* CR4_READ_SHADOW                  */
+BUILD_OFFSET_SHOW(6008); /* CR3_TARGET_VALUE0                */
+BUILD_OFFSET_SHOW(600a); /* CR3_TARGET_VALUE1                */
+BUILD_OFFSET_SHOW(600c); /* CR3_TARGET_VALUE2                */
+BUILD_OFFSET_SHOW(600e); /* CR3_TARGET_VALUE3                */
+BUILD_OFFSET_SHOW(6400); /* EXIT_QUALIFICATION               */
+BUILD_OFFSET_SHOW(640a); /* GUEST_LINEAR_ADDRESS             */
+BUILD_OFFSET_SHOW(6800); /* GUEST_CR0                        */
+BUILD_OFFSET_SHOW(6802); /* GUEST_CR3                        */
+BUILD_OFFSET_SHOW(6804); /* GUEST_CR4                        */
+BUILD_OFFSET_SHOW(6806); /* GUEST_ES_BASE                    */
+BUILD_OFFSET_SHOW(6808); /* GUEST_CS_BASE                    */
+BUILD_OFFSET_SHOW(680a); /* GUEST_SS_BASE                    */
+BUILD_OFFSET_SHOW(680c); /* GUEST_DS_BASE                    */
+BUILD_OFFSET_SHOW(680e); /* GUEST_FS_BASE                    */
+BUILD_OFFSET_SHOW(6810); /* GUEST_GS_BASE                    */
+BUILD_OFFSET_SHOW(6812); /* GUEST_LDTR_BASE                  */
+BUILD_OFFSET_SHOW(6814); /* GUEST_TR_BASE                    */
+BUILD_OFFSET_SHOW(6816); /* GUEST_GDTR_BASE                  */
+BUILD_OFFSET_SHOW(6818); /* GUEST_IDTR_BASE                  */
+BUILD_OFFSET_SHOW(681a); /* GUEST_DR7                        */
+BUILD_OFFSET_SHOW(681c); /* GUEST_RSP                        */
+BUILD_OFFSET_SHOW(681e); /* GUEST_RIP                        */
+BUILD_OFFSET_SHOW(6820); /* GUEST_RFLAGS                     */
+BUILD_OFFSET_SHOW(6822); /* GUEST_PENDING_DBG_EXCEPTIONS     */
+BUILD_OFFSET_SHOW(6824); /* GUEST_SYSENTER_ESP               */
+BUILD_OFFSET_SHOW(6826); /* GUEST_SYSENTER_EIP               */
+BUILD_OFFSET_SHOW(6c00); /* HOST_CR0                         */
+BUILD_OFFSET_SHOW(6c02); /* HOST_CR3                         */
+BUILD_OFFSET_SHOW(6c04); /* HOST_CR4                         */
+BUILD_OFFSET_SHOW(6c06); /* HOST_FS_BASE                     */
+BUILD_OFFSET_SHOW(6c08); /* HOST_GS_BASE                     */
+BUILD_OFFSET_SHOW(6c0a); /* HOST_TR_BASE                     */
+BUILD_OFFSET_SHOW(6c0c); /* HOST_GDTR_BASE                   */
+BUILD_OFFSET_SHOW(6c0e); /* HOST_IDTR_BASE                   */
+BUILD_OFFSET_SHOW(6c10); /* HOST_IA32_SYSENTER_ESP           */
+BUILD_OFFSET_SHOW(6c12); /* HOST_IA32_SYSENTER_EIP           */
+BUILD_OFFSET_SHOW(6c14); /* HOST_RSP                         */
+BUILD_OFFSET_SHOW(6c16); /* HOST_RIP                         */
+
+static struct attribute *vmcs_attrs[] = {
+	&dev_attr_0000.attr,
+	&dev_attr_0800.attr,
+	&dev_attr_0802.attr,
+	&dev_attr_0804.attr,
+	&dev_attr_0806.attr,
+	&dev_attr_0808.attr,
+	&dev_attr_080a.attr,
+	&dev_attr_080c.attr,
+	&dev_attr_080e.attr,
+	&dev_attr_0c00.attr,
+	&dev_attr_0c02.attr,
+	&dev_attr_0c04.attr,
+	&dev_attr_0c06.attr,
+	&dev_attr_0c08.attr,
+	&dev_attr_0c0a.attr,
+	&dev_attr_0c0c.attr,
+	&dev_attr_2000.attr,
+	&dev_attr_2001.attr,
+	&dev_attr_2002.attr,
+	&dev_attr_2003.attr,
+	&dev_attr_2004.attr,
+	&dev_attr_2005.attr,
+	&dev_attr_2006.attr,
+	&dev_attr_2007.attr,
+	&dev_attr_2008.attr,
+	&dev_attr_2009.attr,
+	&dev_attr_200a.attr,
+	&dev_attr_200b.attr,
+	&dev_attr_2010.attr,
+	&dev_attr_2011.attr,
+	&dev_attr_2012.attr,
+	&dev_attr_2013.attr,
+	&dev_attr_2014.attr,
+	&dev_attr_2015.attr,
+	&dev_attr_201a.attr,
+	&dev_attr_201b.attr,
+	&dev_attr_2400.attr,
+	&dev_attr_2401.attr,
+	&dev_attr_2800.attr,
+	&dev_attr_2801.attr,
+	&dev_attr_2802.attr,
+	&dev_attr_2803.attr,
+	&dev_attr_2804.attr,
+	&dev_attr_2805.attr,
+	&dev_attr_2806.attr,
+	&dev_attr_2807.attr,
+	&dev_attr_2808.attr,
+	&dev_attr_2809.attr,
+	&dev_attr_280a.attr,
+	&dev_attr_280b.attr,
+	&dev_attr_280c.attr,
+	&dev_attr_280d.attr,
+	&dev_attr_280e.attr,
+	&dev_attr_280f.attr,
+	&dev_attr_2810.attr,
+	&dev_attr_2811.attr,
+	&dev_attr_2c00.attr,
+	&dev_attr_2c01.attr,
+	&dev_attr_2c02.attr,
+	&dev_attr_2c03.attr,
+	&dev_attr_2c04.attr,
+	&dev_attr_2c05.attr,
+	&dev_attr_4000.attr,
+	&dev_attr_4002.attr,
+	&dev_attr_4004.attr,
+	&dev_attr_4006.attr,
+	&dev_attr_4008.attr,
+	&dev_attr_400a.attr,
+	&dev_attr_400c.attr,
+	&dev_attr_400e.attr,
+	&dev_attr_4010.attr,
+	&dev_attr_4012.attr,
+	&dev_attr_4014.attr,
+	&dev_attr_4016.attr,
+	&dev_attr_4018.attr,
+	&dev_attr_401a.attr,
+	&dev_attr_401c.attr,
+	&dev_attr_401e.attr,
+	&dev_attr_4020.attr,
+	&dev_attr_4022.attr,
+	&dev_attr_4400.attr,
+	&dev_attr_4402.attr,
+	&dev_attr_4404.attr,
+	&dev_attr_4406.attr,
+	&dev_attr_4408.attr,
+	&dev_attr_440a.attr,
+	&dev_attr_440c.attr,
+	&dev_attr_440e.attr,
+	&dev_attr_4800.attr,
+	&dev_attr_4802.attr,
+	&dev_attr_4804.attr,
+	&dev_attr_4806.attr,
+	&dev_attr_4808.attr,
+	&dev_attr_480a.attr,
+	&dev_attr_480c.attr,
+	&dev_attr_480e.attr,
+	&dev_attr_4810.attr,
+	&dev_attr_4812.attr,
+	&dev_attr_4814.attr,
+	&dev_attr_4816.attr,
+	&dev_attr_4818.attr,
+	&dev_attr_481a.attr,
+	&dev_attr_481c.attr,
+	&dev_attr_481e.attr,
+	&dev_attr_4820.attr,
+	&dev_attr_4822.attr,
+	&dev_attr_4824.attr,
+	&dev_attr_4826.attr,
+	&dev_attr_482A.attr,
+	&dev_attr_4c00.attr,
+	&dev_attr_6000.attr,
+	&dev_attr_6002.attr,
+	&dev_attr_6004.attr,
+	&dev_attr_6006.attr,
+	&dev_attr_6008.attr,
+	&dev_attr_600a.attr,
+	&dev_attr_600c.attr,
+	&dev_attr_600e.attr,
+	&dev_attr_6400.attr,
+	&dev_attr_640a.attr,
+	&dev_attr_6800.attr,
+	&dev_attr_6802.attr,
+	&dev_attr_6804.attr,
+	&dev_attr_6806.attr,
+	&dev_attr_6808.attr,
+	&dev_attr_680a.attr,
+	&dev_attr_680c.attr,
+	&dev_attr_680e.attr,
+	&dev_attr_6810.attr,
+	&dev_attr_6812.attr,
+	&dev_attr_6814.attr,
+	&dev_attr_6816.attr,
+	&dev_attr_6818.attr,
+	&dev_attr_681a.attr,
+	&dev_attr_681c.attr,
+	&dev_attr_681e.attr,
+	&dev_attr_6820.attr,
+	&dev_attr_6822.attr,
+	&dev_attr_6824.attr,
+	&dev_attr_6826.attr,
+	&dev_attr_6c00.attr,
+	&dev_attr_6c02.attr,
+	&dev_attr_6c04.attr,
+	&dev_attr_6c06.attr,
+	&dev_attr_6c08.attr,
+	&dev_attr_6c0a.attr,
+	&dev_attr_6c0c.attr,
+	&dev_attr_6c0e.attr,
+	&dev_attr_6c10.attr,
+	&dev_attr_6c12.attr,
+	&dev_attr_6c14.attr,
+	&dev_attr_6c16.attr,
+	NULL,
+};
+
+static struct attribute_group vmcs_attr_group = {
+	.name = vmcs_group_name,
+	.attrs = vmcs_attrs,
+};
+
+int vmcs_sysfs_add(struct device *dev)
+{
+	return sysfs_create_group(&dev->kobj, &vmcs_attr_group);
+}
+
+void vmcs_sysfs_remove(struct device *dev)
+{
+	sysfs_remove_group(&dev->kobj, &vmcs_attr_group);
+}
+
+static ssize_t vmcs_id_show(struct device *dev,
+			    struct device_attribute *attr,
+			    char *buf)
+{
+	return sprintf(buf, "%u\n", vmcsinfo.vmcs_revision_id);
+}
+
+static DEVICE_ATTR(vmcs_id, 0444, vmcs_id_show, NULL);
+
+int vmcs_id_sysfs_add(struct device *dev)
+{
+	return device_create_file(dev, &dev_attr_vmcs_id);
+}
+
+void vmcs_id_sysfs_remove(struct device *dev)
+{
+	device_remove_file(dev, &dev_attr_vmcs_id);
+}
+
+/*
+ * To calculate the offsets of fields in VMCS data, we fill every 16-bit
+ * word of the VMCS region with an index in the following format:
+ *         | --------- 16 bits ---------- |
+ *         +-------------+-+------------+-+
+ *         | high 7 bits |1| low 7 bits |0|
+ *         +-------------+-+------------+-+
+ * The lowest bit of the high byte is always 1 and the lowest bit of the
+ * low byte is always 0. These two marker bits let us detect an index
+ * that was read back from VMCS data in big-endian byte order.
+ * The remaining 14 bits of the index hold the real offset of the
+ * field. Because a VMCS region is at most 4 KBytes in size,
+ * 14 bits are enough to index the whole VMCS region.
+ *
+ * ENCODING_OFFSET: encode an offset into an index of this format.
+ * DECODING_OFFSET: decode an index of this format into the real offset.
+ */
+#define OFFSET_HIGH_SHIFT (7)
+#define OFFSET_LOW_MASK   ((1 << OFFSET_HIGH_SHIFT) - 1) /* 0x7f */
+#define OFFSET_HIGH_MASK  (OFFSET_LOW_MASK << OFFSET_HIGH_SHIFT) /* 0x3f80 */
+#define ENCODING_OFFSET(offset)                                     \
+	((((offset) & OFFSET_LOW_MASK) << 1) +                      \
+	((((offset) & OFFSET_HIGH_MASK) << 2) | 0x100))
+/*
+ * The index here must always be in little-endian byte order.
+ */
+#define DECODING_OFFSET_LE(index)                                   \
+	((((index) >> 1) & OFFSET_LOW_MASK) +                       \
+	(((index) >> 2) & OFFSET_HIGH_MASK))
+/*
+ * n is the width of index in bits. We first check whether index
+ * was read in big-endian byte order and, if so, swap it back.
+ */
+#define DECODING_OFFSET(index, n)                                   \
+	(((index) & 1) ? (DECODING_OFFSET_LE(__swab##n(index))) :   \
+	(DECODING_OFFSET_LE(index)))
+
+#define FIELD_OFFSET16(field, offset)                               \
+	vmcsinfo_field(field, DECODING_OFFSET(offset, 16))
+#define FIELD_OFFSET64(field, offset)                               \
+	vmcsinfo_field(field, DECODING_OFFSET(offset, 64))
+#define FIELD_OFFSET32(field, offset)                               \
+	vmcsinfo_field(field, DECODING_OFFSET(offset, 32))
+#define FIELD_OFFSETNW(field, offset)                               \
+do {                                                                \
+	if (sizeof(offset) == 8)                                    \
+		vmcsinfo_field(field, DECODING_OFFSET(offset, 64)); \
+	else                                                        \
+		vmcsinfo_field(field, DECODING_OFFSET(offset, 32)); \
+} while (0)
+
+#define VMCS_FIELD_CHECK(field, offset, type)                       \
+do {                                                                \
+	if (vmcs_read32(VM_INSTRUCTION_ERROR) !=                    \
+		VMXERR_UNSUPPORTED_VMCS_COMPONENT)                  \
+		FIELD_OFFSET##type(field, offset);                  \
+} while (0)
+
+static inline void vmcs_read_checking(unsigned long field)
+{
+	u16 offset16;
+	u64 offset64;
+	u32 offset32;
+	unsigned long offsetnw;
+
+	switch (vmcs_field_type(field)) {
+	case VMCS_FIELD_TYPE_U16:
+		offset16 = vmcs_read16(field);
+		VMCS_FIELD_CHECK(field, offset16, 16);
+		break;
+	case VMCS_FIELD_TYPE_U64:
+		offset64 = vmcs_read64(field);
+		VMCS_FIELD_CHECK(field, offset64, 64);
+		break;
+	case VMCS_FIELD_TYPE_U32:
+		offset32 = vmcs_read32(field);
+		VMCS_FIELD_CHECK(field, offset32, 32);
+		break;
+	case VMCS_FIELD_TYPE_NATURAL_WIDTH:
+		offsetnw = vmcs_readl(field);
+		VMCS_FIELD_CHECK(field, offsetnw, NW);
+		break;
+	}
+}
+
+/*
+ * Note, offsets of fields below will not be filled into
+ * VMCSINFO:
+ * 1. fields defined in Intel specification (Intel® 64 and
+ *    IA-32 Architectures Software Developer’s Manual, Volume
+ *    3C) but not defined in *vmcs_field*.
+ * 2. fields unsupported.
+ */
+static int __init alloc_vmcsinfo_init(void)
+{
+/*
+ * The first 8 bytes in vmcs region are for
+ *   VMCS revision identifier
+ *   VMX-abort indicator
+ */
+#define FIELD_START (8)
+
+	int r, offset;
+	struct vmcs *vmcs;
+	int cpu;
+	unsigned long field;
+
+	vmcs = alloc_vmcs();
+	if (!vmcs) {
+		return -ENOMEM;
+	}
+
+	r = hardware_enable_all();
+	if (r)
+		goto out;
+
+	/*
+	 * Write encoded offsets into VMCS data for later vmcs_read.
+	 */
+	for (offset = FIELD_START; offset < vmcs_config.size;
+	     offset += sizeof(u16))
+		*(u16 *)((char *)vmcs + offset) = ENCODING_OFFSET(offset);
+
+	cpu = get_cpu();
+	vmcs_clear(vmcs);
+	per_cpu(current_vmcs, cpu) = vmcs;
+	vmcs_load(vmcs);
+
+	vmcsinfo_revision_id(vmcs->revision_id);
+	vmcs_read_checking(VM_INSTRUCTION_ERROR);
+	offset = get_vmcs_field_offset(VM_INSTRUCTION_ERROR);
+	if (offset == -1)
+		goto out_clear;
+
+	for (field = 0; field < VMCSINFO_MAX_FIELD; ++field) {
+		if (field == VM_INSTRUCTION_ERROR)
+			continue;
+		/*
+		 * Zero the VM_INSTRUCTION_ERROR field before each read.
+		 */
+		*(u32 *)((char *)vmcs + offset) = 0;
+		vmcs_read_checking(field);
+	}
+
+	r = vmcs_id_sysfs_add(cpu_subsys.dev_root);
+	if (r)
+		goto out_clear;
+	r = vmcs_sysfs_add(cpu_subsys.dev_root);
+	if (r)
+		vmcs_id_sysfs_remove(cpu_subsys.dev_root);
+
+out_clear:
+	vmcs_clear(vmcs);
+	put_cpu();
+	hardware_disable_all();
+out:
+	free_vmcs(vmcs);
+	return r;
+}
+
+static void __exit alloc_vmcsinfo_exit(void)
+{
+	vmcs_sysfs_remove(cpu_subsys.dev_root);
+	vmcs_id_sysfs_remove(cpu_subsys.dev_root);
+}
+
+module_init(alloc_vmcsinfo_init);
+module_exit(alloc_vmcsinfo_exit);
-- 
1.7.1


* [PATCH v4 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
@ 2012-07-04 10:05   ` Yanfei Zhang
  0 siblings, 0 replies; 26+ messages in thread
From: Yanfei Zhang @ 2012-07-04 10:05 UTC (permalink / raw)
  To: Avi Kivity, mtosatti
  Cc: dzickus, luto, kvm, Joerg Roedel, kexec, linux-kernel,
	paul.gortmaker, ludwig.nussel, ebiederm, Greg KH

This patch implements a new module named vmcsinfo-intel. The
module fills VMCSINFO with the VMCS revision identifier and
the offsets of VMCS fields.

Note that offsets of fields that are defined in the Intel specification
(Intel® 64 and IA-32 Architectures Software Developer’s Manual,
Volume 3C) but not defined in *vmcs_field* will not be filled in
VMCSINFO. Also, some fields may be unsupported on some machines;
on those machines the corresponding offsets will be zero.

In addition, this patch exports the VMCS revision identifier via
/sys/devices/system/cpu/vmcs_id and the offsets of fields via
/sys/devices/system/cpu/vmcs/.
Individual offsets are contained in subfiles named by the field's
encoding, e.g.: /sys/devices/system/cpu/vmcs/0800

Signed-off-by: zhangyanfei <zhangyanfei@cn.fujitsu.com>
---
 arch/x86/kvm/Kconfig    |   11 +
 arch/x86/kvm/Makefile   |    3 +
 arch/x86/kvm/vmcsinfo.c |  586 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 600 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/kvm/vmcsinfo.c

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index a28f338..1dd64b1 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -63,6 +63,17 @@ config KVM_INTEL
 	  To compile this as a module, choose M here: the module
 	  will be called kvm-intel.
 
+config VMCSINFO_INTEL
+	tristate "Export VMCSINFO for Intel processors"
+	depends on KVM_INTEL
+	---help---
+	  Provides support for exporting VMCSINFO on Intel processors equipped
+	  with the VT extensions. The VMCSINFO contains a VMCS revision
+	  identifier and offsets of VMCS fields.
+
+	  To compile this as a module, choose M here: the module
+	  will be called vmcsinfo-intel.
+
 config KVM_AMD
 	tristate "KVM for AMD processors support"
 	depends on KVM
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 4f579e8..12a1ef6 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -4,6 +4,7 @@ ccflags-y += -Ivirt/kvm -Iarch/x86/kvm
 CFLAGS_x86.o := -I.
 CFLAGS_svm.o := -I.
 CFLAGS_vmx.o := -I.
+CFLAGS_vmcsinfo.o := -I.
 
 kvm-y			+= $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o \
 				coalesced_mmio.o irq_comm.o eventfd.o \
@@ -15,7 +16,9 @@ kvm-y			+= x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
 			   i8254.o timer.o cpuid.o pmu.o
 kvm-intel-y		+= vmx.o
 kvm-amd-y		+= svm.o
+vmcsinfo-intel-y	+= vmcsinfo.o
 
 obj-$(CONFIG_KVM)	+= kvm.o
 obj-$(CONFIG_KVM_INTEL)	+= kvm-intel.o
 obj-$(CONFIG_KVM_AMD)	+= kvm-amd.o
+obj-$(CONFIG_VMCSINFO_INTEL)	+= vmcsinfo-intel.o
diff --git a/arch/x86/kvm/vmcsinfo.c b/arch/x86/kvm/vmcsinfo.c
new file mode 100644
index 0000000..bff6a1e
--- /dev/null
+++ b/arch/x86/kvm/vmcsinfo.c
@@ -0,0 +1,586 @@
+/*
+ * Kernel-based Virtual Machine driver for Linux
+ *
+ * This module enables machines with Intel VT-x extensions to export
+ * offsets of VMCS fields for guest debugging.
+ *
+ * Copyright (C) 2012 Fujitsu, Inc.
+ *
+ * Authors:
+ *   Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/mod_devicetable.h>
+#include <linux/device.h>
+#include <linux/swab.h>
+#include <linux/cpu.h>
+
+#include <asm/vmx.h>
+
+MODULE_AUTHOR("Fujitsu");
+MODULE_LICENSE("GPL");
+
+static const struct x86_cpu_id vmcsinfo_cpu_id[] = {
+	X86_FEATURE_MATCH(X86_FEATURE_VMX),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, vmcsinfo_cpu_id);
+
+/*
+ * vmcs field offsets.
+ */
+static struct vmcsinfo {
+	u32 vmcs_revision_id;
+	u16 vmcs_field_to_offset_table[HOST_RIP + 1];
+} vmcsinfo;
+
+#define VMCSINFO_MAX_FIELD \
+	ARRAY_SIZE(vmcsinfo.vmcs_field_to_offset_table)
+
+static inline void vmcsinfo_revision_id(u32 id)
+{
+	vmcsinfo.vmcs_revision_id = id;
+}
+
+static inline void vmcsinfo_field(unsigned long field, u16 offset)
+{
+	if (field < VMCSINFO_MAX_FIELD)
+		vmcsinfo.vmcs_field_to_offset_table[field] = offset;
+}
+
+static inline short get_vmcs_field_offset(unsigned long field)
+{
+	if (field >= VMCSINFO_MAX_FIELD ||
+	    vmcsinfo.vmcs_field_to_offset_table[field] == 0)
+		return -1;
+	return vmcsinfo.vmcs_field_to_offset_table[field];
+}
+
+static const char vmcs_group_name[] = "vmcs";
+
+#define BUILD_OFFSET_SHOW(field_code)                                         \
+static ssize_t _##field_code##_show(struct device *dev,                       \
+				    struct device_attribute *attr,            \
+				    char *buf)                                \
+{                                                                             \
+	return sprintf(buf, "%d\n",                                           \
+		       vmcsinfo.vmcs_field_to_offset_table[0x##field_code]);  \
+}                                                                             \
+static DEVICE_ATTR(field_code, 0444, _##field_code##_show, NULL);             \
+
+BUILD_OFFSET_SHOW(0000); /* VIRTUAL_PROCESSOR_ID             */
+BUILD_OFFSET_SHOW(0800); /* GUEST_ES_SELECTOR                */
+BUILD_OFFSET_SHOW(0802); /* GUEST_CS_SELECTOR                */
+BUILD_OFFSET_SHOW(0804); /* GUEST_SS_SELECTOR                */
+BUILD_OFFSET_SHOW(0806); /* GUEST_DS_SELECTOR                */
+BUILD_OFFSET_SHOW(0808); /* GUEST_FS_SELECTOR                */
+BUILD_OFFSET_SHOW(080a); /* GUEST_GS_SELECTOR                */
+BUILD_OFFSET_SHOW(080c); /* GUEST_LDTR_SELECTOR              */
+BUILD_OFFSET_SHOW(080e); /* GUEST_TR_SELECTOR                */
+BUILD_OFFSET_SHOW(0c00); /* HOST_ES_SELECTOR                 */
+BUILD_OFFSET_SHOW(0c02); /* HOST_CS_SELECTOR                 */
+BUILD_OFFSET_SHOW(0c04); /* HOST_SS_SELECTOR                 */
+BUILD_OFFSET_SHOW(0c06); /* HOST_DS_SELECTOR                 */
+BUILD_OFFSET_SHOW(0c08); /* HOST_FS_SELECTOR                 */
+BUILD_OFFSET_SHOW(0c0a); /* HOST_GS_SELECTOR                 */
+BUILD_OFFSET_SHOW(0c0c); /* HOST_TR_SELECTOR                 */
+BUILD_OFFSET_SHOW(2000); /* IO_BITMAP_A                      */
+BUILD_OFFSET_SHOW(2001); /* IO_BITMAP_A_HIGH                 */
+BUILD_OFFSET_SHOW(2002); /* IO_BITMAP_B                      */
+BUILD_OFFSET_SHOW(2003); /* IO_BITMAP_B_HIGH                 */
+BUILD_OFFSET_SHOW(2004); /* MSR_BITMAP                       */
+BUILD_OFFSET_SHOW(2005); /* MSR_BITMAP_HIGH                  */
+BUILD_OFFSET_SHOW(2006); /* VM_EXIT_MSR_STORE_ADDR           */
+BUILD_OFFSET_SHOW(2007); /* VM_EXIT_MSR_STORE_ADDR_HIGH      */
+BUILD_OFFSET_SHOW(2008); /* VM_EXIT_MSR_LOAD_ADDR            */
+BUILD_OFFSET_SHOW(2009); /* VM_EXIT_MSR_LOAD_ADDR_HIGH       */
+BUILD_OFFSET_SHOW(200a); /* VM_ENTRY_MSR_LOAD_ADDR           */
+BUILD_OFFSET_SHOW(200b); /* VM_ENTRY_MSR_LOAD_ADDR_HIGH      */
+BUILD_OFFSET_SHOW(2010); /* TSC_OFFSET                       */
+BUILD_OFFSET_SHOW(2011); /* TSC_OFFSET_HIGH                  */
+BUILD_OFFSET_SHOW(2012); /* VIRTUAL_APIC_PAGE_ADDR           */
+BUILD_OFFSET_SHOW(2013); /* VIRTUAL_APIC_PAGE_ADDR_HIGH      */
+BUILD_OFFSET_SHOW(2014); /* APIC_ACCESS_ADDR                 */
+BUILD_OFFSET_SHOW(2015); /* APIC_ACCESS_ADDR_HIGH            */
+BUILD_OFFSET_SHOW(201a); /* EPT_POINTER                      */
+BUILD_OFFSET_SHOW(201b); /* EPT_POINTER_HIGH                 */
+BUILD_OFFSET_SHOW(2400); /* GUEST_PHYSICAL_ADDRESS           */
+BUILD_OFFSET_SHOW(2401); /* GUEST_PHYSICAL_ADDRESS_HIGH      */
+BUILD_OFFSET_SHOW(2800); /* VMCS_LINK_POINTER                */
+BUILD_OFFSET_SHOW(2801); /* VMCS_LINK_POINTER_HIGH           */
+BUILD_OFFSET_SHOW(2802); /* GUEST_IA32_DEBUGCTL              */
+BUILD_OFFSET_SHOW(2803); /* GUEST_IA32_DEBUGCTL_HIGH         */
+BUILD_OFFSET_SHOW(2804); /* GUEST_IA32_PAT                   */
+BUILD_OFFSET_SHOW(2805); /* GUEST_IA32_PAT_HIGH              */
+BUILD_OFFSET_SHOW(2806); /* GUEST_IA32_EFER                  */
+BUILD_OFFSET_SHOW(2807); /* GUEST_IA32_EFER_HIGH             */
+BUILD_OFFSET_SHOW(2808); /* GUEST_IA32_PERF_GLOBAL_CTRL      */
+BUILD_OFFSET_SHOW(2809); /* GUEST_IA32_PERF_GLOBAL_CTRL_HIGH */
+BUILD_OFFSET_SHOW(280a); /* GUEST_PDPTR0                     */
+BUILD_OFFSET_SHOW(280b); /* GUEST_PDPTR0_HIGH                */
+BUILD_OFFSET_SHOW(280c); /* GUEST_PDPTR1                     */
+BUILD_OFFSET_SHOW(280d); /* GUEST_PDPTR1_HIGH                */
+BUILD_OFFSET_SHOW(280e); /* GUEST_PDPTR2                     */
+BUILD_OFFSET_SHOW(280f); /* GUEST_PDPTR2_HIGH                */
+BUILD_OFFSET_SHOW(2810); /* GUEST_PDPTR3                     */
+BUILD_OFFSET_SHOW(2811); /* GUEST_PDPTR3_HIGH                */
+BUILD_OFFSET_SHOW(2c00); /* HOST_IA32_PAT                    */
+BUILD_OFFSET_SHOW(2c01); /* HOST_IA32_PAT_HIGH               */
+BUILD_OFFSET_SHOW(2c02); /* HOST_IA32_EFER                   */
+BUILD_OFFSET_SHOW(2c03); /* HOST_IA32_EFER_HIGH              */
+BUILD_OFFSET_SHOW(2c04); /* HOST_IA32_PERF_GLOBAL_CTRL       */
+BUILD_OFFSET_SHOW(2c05); /* HOST_IA32_PERF_GLOBAL_CTRL_HIGH  */
+BUILD_OFFSET_SHOW(4000); /* PIN_BASED_VM_EXEC_CONTROL        */
+BUILD_OFFSET_SHOW(4002); /* CPU_BASED_VM_EXEC_CONTROL        */
+BUILD_OFFSET_SHOW(4004); /* EXCEPTION_BITMAP                 */
+BUILD_OFFSET_SHOW(4006); /* PAGE_FAULT_ERROR_CODE_MASK       */
+BUILD_OFFSET_SHOW(4008); /* PAGE_FAULT_ERROR_CODE_MATCH      */
+BUILD_OFFSET_SHOW(400a); /* CR3_TARGET_COUNT                 */
+BUILD_OFFSET_SHOW(400c); /* VM_EXIT_CONTROLS                 */
+BUILD_OFFSET_SHOW(400e); /* VM_EXIT_MSR_STORE_COUNT          */
+BUILD_OFFSET_SHOW(4010); /* VM_EXIT_MSR_LOAD_COUNT           */
+BUILD_OFFSET_SHOW(4012); /* VM_ENTRY_CONTROLS                */
+BUILD_OFFSET_SHOW(4014); /* VM_ENTRY_MSR_LOAD_COUNT          */
+BUILD_OFFSET_SHOW(4016); /* VM_ENTRY_INTR_INFO_FIELD         */
+BUILD_OFFSET_SHOW(4018); /* VM_ENTRY_EXCEPTION_ERROR_CODE    */
+BUILD_OFFSET_SHOW(401a); /* VM_ENTRY_INSTRUCTION_LEN         */
+BUILD_OFFSET_SHOW(401c); /* TPR_THRESHOLD                    */
+BUILD_OFFSET_SHOW(401e); /* SECONDARY_VM_EXEC_CONTROL        */
+BUILD_OFFSET_SHOW(4020); /* PLE_GAP                          */
+BUILD_OFFSET_SHOW(4022); /* PLE_WINDOW                       */
+BUILD_OFFSET_SHOW(4400); /* VM_INSTRUCTION_ERROR             */
+BUILD_OFFSET_SHOW(4402); /* VM_EXIT_REASON                   */
+BUILD_OFFSET_SHOW(4404); /* VM_EXIT_INTR_INFO                */
+BUILD_OFFSET_SHOW(4406); /* VM_EXIT_INTR_ERROR_CODE          */
+BUILD_OFFSET_SHOW(4408); /* IDT_VECTORING_INFO_FIELD         */
+BUILD_OFFSET_SHOW(440a); /* IDT_VECTORING_ERROR_CODE         */
+BUILD_OFFSET_SHOW(440c); /* VM_EXIT_INSTRUCTION_LEN          */
+BUILD_OFFSET_SHOW(440e); /* VMX_INSTRUCTION_INFO             */
+BUILD_OFFSET_SHOW(4800); /* GUEST_ES_LIMIT                   */
+BUILD_OFFSET_SHOW(4802); /* GUEST_CS_LIMIT                   */
+BUILD_OFFSET_SHOW(4804); /* GUEST_SS_LIMIT                   */
+BUILD_OFFSET_SHOW(4806); /* GUEST_DS_LIMIT                   */
+BUILD_OFFSET_SHOW(4808); /* GUEST_FS_LIMIT                   */
+BUILD_OFFSET_SHOW(480a); /* GUEST_GS_LIMIT                   */
+BUILD_OFFSET_SHOW(480c); /* GUEST_LDTR_LIMIT                 */
+BUILD_OFFSET_SHOW(480e); /* GUEST_TR_LIMIT                   */
+BUILD_OFFSET_SHOW(4810); /* GUEST_GDTR_LIMIT                 */
+BUILD_OFFSET_SHOW(4812); /* GUEST_IDTR_LIMIT                 */
+BUILD_OFFSET_SHOW(4814); /* GUEST_ES_AR_BYTES                */
+BUILD_OFFSET_SHOW(4816); /* GUEST_CS_AR_BYTES                */
+BUILD_OFFSET_SHOW(4818); /* GUEST_SS_AR_BYTES                */
+BUILD_OFFSET_SHOW(481a); /* GUEST_DS_AR_BYTES                */
+BUILD_OFFSET_SHOW(481c); /* GUEST_FS_AR_BYTES                */
+BUILD_OFFSET_SHOW(481e); /* GUEST_GS_AR_BYTES                */
+BUILD_OFFSET_SHOW(4820); /* GUEST_LDTR_AR_BYTES              */
+BUILD_OFFSET_SHOW(4822); /* GUEST_TR_AR_BYTES                */
+BUILD_OFFSET_SHOW(4824); /* GUEST_INTERRUPTIBILITY_INFO      */
+BUILD_OFFSET_SHOW(4826); /* GUEST_ACTIVITY_STATE             */
+BUILD_OFFSET_SHOW(482A); /* GUEST_SYSENTER_CS                */
+BUILD_OFFSET_SHOW(4c00); /* HOST_IA32_SYSENTER_CS            */
+BUILD_OFFSET_SHOW(6000); /* CR0_GUEST_HOST_MASK              */
+BUILD_OFFSET_SHOW(6002); /* CR4_GUEST_HOST_MASK              */
+BUILD_OFFSET_SHOW(6004); /* CR0_READ_SHADOW                  */
+BUILD_OFFSET_SHOW(6006); /* CR4_READ_SHADOW                  */
+BUILD_OFFSET_SHOW(6008); /* CR3_TARGET_VALUE0                */
+BUILD_OFFSET_SHOW(600a); /* CR3_TARGET_VALUE1                */
+BUILD_OFFSET_SHOW(600c); /* CR3_TARGET_VALUE2                */
+BUILD_OFFSET_SHOW(600e); /* CR3_TARGET_VALUE3                */
+BUILD_OFFSET_SHOW(6400); /* EXIT_QUALIFICATION               */
+BUILD_OFFSET_SHOW(640a); /* GUEST_LINEAR_ADDRESS             */
+BUILD_OFFSET_SHOW(6800); /* GUEST_CR0                        */
+BUILD_OFFSET_SHOW(6802); /* GUEST_CR3                        */
+BUILD_OFFSET_SHOW(6804); /* GUEST_CR4                        */
+BUILD_OFFSET_SHOW(6806); /* GUEST_ES_BASE                    */
+BUILD_OFFSET_SHOW(6808); /* GUEST_CS_BASE                    */
+BUILD_OFFSET_SHOW(680a); /* GUEST_SS_BASE                    */
+BUILD_OFFSET_SHOW(680c); /* GUEST_DS_BASE                    */
+BUILD_OFFSET_SHOW(680e); /* GUEST_FS_BASE                    */
+BUILD_OFFSET_SHOW(6810); /* GUEST_GS_BASE                    */
+BUILD_OFFSET_SHOW(6812); /* GUEST_LDTR_BASE                  */
+BUILD_OFFSET_SHOW(6814); /* GUEST_TR_BASE                    */
+BUILD_OFFSET_SHOW(6816); /* GUEST_GDTR_BASE                  */
+BUILD_OFFSET_SHOW(6818); /* GUEST_IDTR_BASE                  */
+BUILD_OFFSET_SHOW(681a); /* GUEST_DR7                        */
+BUILD_OFFSET_SHOW(681c); /* GUEST_RSP                        */
+BUILD_OFFSET_SHOW(681e); /* GUEST_RIP                        */
+BUILD_OFFSET_SHOW(6820); /* GUEST_RFLAGS                     */
+BUILD_OFFSET_SHOW(6822); /* GUEST_PENDING_DBG_EXCEPTIONS     */
+BUILD_OFFSET_SHOW(6824); /* GUEST_SYSENTER_ESP               */
+BUILD_OFFSET_SHOW(6826); /* GUEST_SYSENTER_EIP               */
+BUILD_OFFSET_SHOW(6c00); /* HOST_CR0                         */
+BUILD_OFFSET_SHOW(6c02); /* HOST_CR3                         */
+BUILD_OFFSET_SHOW(6c04); /* HOST_CR4                         */
+BUILD_OFFSET_SHOW(6c06); /* HOST_FS_BASE                     */
+BUILD_OFFSET_SHOW(6c08); /* HOST_GS_BASE                     */
+BUILD_OFFSET_SHOW(6c0a); /* HOST_TR_BASE                     */
+BUILD_OFFSET_SHOW(6c0c); /* HOST_GDTR_BASE                   */
+BUILD_OFFSET_SHOW(6c0e); /* HOST_IDTR_BASE                   */
+BUILD_OFFSET_SHOW(6c10); /* HOST_IA32_SYSENTER_ESP           */
+BUILD_OFFSET_SHOW(6c12); /* HOST_IA32_SYSENTER_EIP           */
+BUILD_OFFSET_SHOW(6c14); /* HOST_RSP                         */
+BUILD_OFFSET_SHOW(6c16); /* HOST_RIP                         */
+
+static struct attribute *vmcs_attrs[] = {
+	&dev_attr_0000.attr,
+	&dev_attr_0800.attr,
+	&dev_attr_0802.attr,
+	&dev_attr_0804.attr,
+	&dev_attr_0806.attr,
+	&dev_attr_0808.attr,
+	&dev_attr_080a.attr,
+	&dev_attr_080c.attr,
+	&dev_attr_080e.attr,
+	&dev_attr_0c00.attr,
+	&dev_attr_0c02.attr,
+	&dev_attr_0c04.attr,
+	&dev_attr_0c06.attr,
+	&dev_attr_0c08.attr,
+	&dev_attr_0c0a.attr,
+	&dev_attr_0c0c.attr,
+	&dev_attr_2000.attr,
+	&dev_attr_2001.attr,
+	&dev_attr_2002.attr,
+	&dev_attr_2003.attr,
+	&dev_attr_2004.attr,
+	&dev_attr_2005.attr,
+	&dev_attr_2006.attr,
+	&dev_attr_2007.attr,
+	&dev_attr_2008.attr,
+	&dev_attr_2009.attr,
+	&dev_attr_200a.attr,
+	&dev_attr_200b.attr,
+	&dev_attr_2010.attr,
+	&dev_attr_2011.attr,
+	&dev_attr_2012.attr,
+	&dev_attr_2013.attr,
+	&dev_attr_2014.attr,
+	&dev_attr_2015.attr,
+	&dev_attr_201a.attr,
+	&dev_attr_201b.attr,
+	&dev_attr_2400.attr,
+	&dev_attr_2401.attr,
+	&dev_attr_2800.attr,
+	&dev_attr_2801.attr,
+	&dev_attr_2802.attr,
+	&dev_attr_2803.attr,
+	&dev_attr_2804.attr,
+	&dev_attr_2805.attr,
+	&dev_attr_2806.attr,
+	&dev_attr_2807.attr,
+	&dev_attr_2808.attr,
+	&dev_attr_2809.attr,
+	&dev_attr_280a.attr,
+	&dev_attr_280b.attr,
+	&dev_attr_280c.attr,
+	&dev_attr_280d.attr,
+	&dev_attr_280e.attr,
+	&dev_attr_280f.attr,
+	&dev_attr_2810.attr,
+	&dev_attr_2811.attr,
+	&dev_attr_2c00.attr,
+	&dev_attr_2c01.attr,
+	&dev_attr_2c02.attr,
+	&dev_attr_2c03.attr,
+	&dev_attr_2c04.attr,
+	&dev_attr_2c05.attr,
+	&dev_attr_4000.attr,
+	&dev_attr_4002.attr,
+	&dev_attr_4004.attr,
+	&dev_attr_4006.attr,
+	&dev_attr_4008.attr,
+	&dev_attr_400a.attr,
+	&dev_attr_400c.attr,
+	&dev_attr_400e.attr,
+	&dev_attr_4010.attr,
+	&dev_attr_4012.attr,
+	&dev_attr_4014.attr,
+	&dev_attr_4016.attr,
+	&dev_attr_4018.attr,
+	&dev_attr_401a.attr,
+	&dev_attr_401c.attr,
+	&dev_attr_401e.attr,
+	&dev_attr_4020.attr,
+	&dev_attr_4022.attr,
+	&dev_attr_4400.attr,
+	&dev_attr_4402.attr,
+	&dev_attr_4404.attr,
+	&dev_attr_4406.attr,
+	&dev_attr_4408.attr,
+	&dev_attr_440a.attr,
+	&dev_attr_440c.attr,
+	&dev_attr_440e.attr,
+	&dev_attr_4800.attr,
+	&dev_attr_4802.attr,
+	&dev_attr_4804.attr,
+	&dev_attr_4806.attr,
+	&dev_attr_4808.attr,
+	&dev_attr_480a.attr,
+	&dev_attr_480c.attr,
+	&dev_attr_480e.attr,
+	&dev_attr_4810.attr,
+	&dev_attr_4812.attr,
+	&dev_attr_4814.attr,
+	&dev_attr_4816.attr,
+	&dev_attr_4818.attr,
+	&dev_attr_481a.attr,
+	&dev_attr_481c.attr,
+	&dev_attr_481e.attr,
+	&dev_attr_4820.attr,
+	&dev_attr_4822.attr,
+	&dev_attr_4824.attr,
+	&dev_attr_4826.attr,
+	&dev_attr_482A.attr,
+	&dev_attr_4c00.attr,
+	&dev_attr_6000.attr,
+	&dev_attr_6002.attr,
+	&dev_attr_6004.attr,
+	&dev_attr_6006.attr,
+	&dev_attr_6008.attr,
+	&dev_attr_600a.attr,
+	&dev_attr_600c.attr,
+	&dev_attr_600e.attr,
+	&dev_attr_6400.attr,
+	&dev_attr_640a.attr,
+	&dev_attr_6800.attr,
+	&dev_attr_6802.attr,
+	&dev_attr_6804.attr,
+	&dev_attr_6806.attr,
+	&dev_attr_6808.attr,
+	&dev_attr_680a.attr,
+	&dev_attr_680c.attr,
+	&dev_attr_680e.attr,
+	&dev_attr_6810.attr,
+	&dev_attr_6812.attr,
+	&dev_attr_6814.attr,
+	&dev_attr_6816.attr,
+	&dev_attr_6818.attr,
+	&dev_attr_681a.attr,
+	&dev_attr_681c.attr,
+	&dev_attr_681e.attr,
+	&dev_attr_6820.attr,
+	&dev_attr_6822.attr,
+	&dev_attr_6824.attr,
+	&dev_attr_6826.attr,
+	&dev_attr_6c00.attr,
+	&dev_attr_6c02.attr,
+	&dev_attr_6c04.attr,
+	&dev_attr_6c06.attr,
+	&dev_attr_6c08.attr,
+	&dev_attr_6c0a.attr,
+	&dev_attr_6c0c.attr,
+	&dev_attr_6c0e.attr,
+	&dev_attr_6c10.attr,
+	&dev_attr_6c12.attr,
+	&dev_attr_6c14.attr,
+	&dev_attr_6c16.attr,
+	NULL,
+};
+
+static struct attribute_group vmcs_attr_group = {
+	.name = vmcs_group_name,
+	.attrs = vmcs_attrs,
+};
+
+int vmcs_sysfs_add(struct device *dev)
+{
+	return sysfs_create_group(&dev->kobj, &vmcs_attr_group);
+}
+
+void vmcs_sysfs_remove(struct device *dev)
+{
+	sysfs_remove_group(&dev->kobj, &vmcs_attr_group);
+}
+
+static ssize_t vmcs_id_show(struct device *dev,
+			    struct device_attribute *attr,
+			    char *buf)
+{
+	return sprintf(buf, "%u\n", vmcsinfo.vmcs_revision_id);
+}
+
+static DEVICE_ATTR(vmcs_id, 0444, vmcs_id_show, NULL);
+
+int vmcs_id_sysfs_add(struct device *dev)
+{
+	return device_create_file(dev, &dev_attr_vmcs_id);
+}
+
+void vmcs_id_sysfs_remove(struct device *dev)
+{
+	device_remove_file(dev, &dev_attr_vmcs_id);
+}
+
+/*
+ * To calculate the offsets of fields in VMCS data, we fill every 16-bit
+ * word of the region with an index in this format:
+ *         | --------- 16 bits ---------- |
+ *         +-------------+-+------------+-+
+ *         | high 7 bits |1| low 7 bits |0|
+ *         +-------------+-+------------+-+
+ * In the high byte, the lowest bit must be 1; in the low byte, the
+ * lowest bit must be 0. These two marker bits let us detect an index
+ * that was read back in big-endian byte order.
+ * The remaining 14 bits of the index hold the real offset of the
+ * field. A VMCS region is at most 4 KBytes, so 14 bits are enough
+ * to index the whole region.
+ *
+ * ENCODING_OFFSET: encode the offset into the index of this kind.
+ * DECODING_OFFSET: decode the index of this kind into real offset.
+ */
+#define OFFSET_HIGH_SHIFT (7)
+#define OFFSET_LOW_MASK   ((1 << OFFSET_HIGH_SHIFT) - 1) /* 0x7f */
+#define OFFSET_HIGH_MASK  (OFFSET_LOW_MASK << OFFSET_HIGH_SHIFT) /* 0x3f80 */
+#define ENCODING_OFFSET(offset)                                     \
+	((((offset) & OFFSET_LOW_MASK) << 1) +                      \
+	((((offset) & OFFSET_HIGH_MASK) << 2) | 0x100))
+/*
+ * index here must always be read in little-endian byte order.
+ */
+#define DECODING_OFFSET_LE(index)                                   \
+	((((index) >> 1) & OFFSET_LOW_MASK) +                       \
+	(((index) >> 2) & OFFSET_HIGH_MASK))
+/*
+ * n is the width of index in bits. We first check whether index
+ * was read in big-endian byte order.
+ */
+#define DECODING_OFFSET(index, n)                                   \
+	(((index) & 1) ? (DECODING_OFFSET_LE(__swab##n(index))) :   \
+	(DECODING_OFFSET_LE(index)))
+
+#define FIELD_OFFSET16(field, offset)                               \
+	vmcsinfo_field(field, DECODING_OFFSET(offset, 16))
+#define FIELD_OFFSET64(field, offset)                               \
+	vmcsinfo_field(field, DECODING_OFFSET(offset, 64))
+#define FIELD_OFFSET32(field, offset)                               \
+	vmcsinfo_field(field, DECODING_OFFSET(offset, 32))
+#define FIELD_OFFSETNW(field, offset)                               \
+do {                                                                \
+	if (sizeof(offset) == 8)                                    \
+		vmcsinfo_field(field, DECODING_OFFSET(offset, 64)); \
+	else                                                        \
+		vmcsinfo_field(field, DECODING_OFFSET(offset, 32)); \
+} while (0)
+
+#define VMCS_FIELD_CHECK(field, offset, type)                       \
+do {                                                                \
+	if (vmcs_read32(VM_INSTRUCTION_ERROR) !=                    \
+		VMXERR_UNSUPPORTED_VMCS_COMPONENT)                  \
+		FIELD_OFFSET##type(field, offset);                  \
+} while (0)
+
+static inline void vmcs_read_checking(unsigned long field)
+{
+	u16 offset16;
+	u64 offset64;
+	u32 offset32;
+	unsigned long offsetnw;
+
+	switch (vmcs_field_type(field)) {
+	case VMCS_FIELD_TYPE_U16:
+		offset16 = vmcs_read16(field);
+		VMCS_FIELD_CHECK(field, offset16, 16);
+		break;
+	case VMCS_FIELD_TYPE_U64:
+		offset64 = vmcs_read64(field);
+		VMCS_FIELD_CHECK(field, offset64, 64);
+		break;
+	case VMCS_FIELD_TYPE_U32:
+		offset32 = vmcs_read32(field);
+		VMCS_FIELD_CHECK(field, offset32, 32);
+		break;
+	case VMCS_FIELD_TYPE_NATURAL_WIDTH:
+		offsetnw = vmcs_readl(field);
+		VMCS_FIELD_CHECK(field, offsetnw, NW);
+		break;
+	}
+}
+
+/*
+ * Note: offsets of the following fields are not recorded in
+ * VMCSINFO:
+ * 1. fields defined in the Intel specification (Intel® 64 and
+ *    IA-32 Architectures Software Developer’s Manual, Volume
+ *    3C) but not defined in *vmcs_field*.
+ * 2. fields the processor does not support.
+ */
+static int __init alloc_vmcsinfo_init(void)
+{
+/*
+ * The first 8 bytes in vmcs region are for
+ *   VMCS revision identifier
+ *   VMX-abort indicator
+ */
+#define FIELD_START (8)
+
+	int r, offset;
+	struct vmcs *vmcs;
+	int cpu;
+	unsigned long field;
+
+	vmcs = alloc_vmcs();
+	if (!vmcs) {
+		return -ENOMEM;
+	}
+
+	r = hardware_enable_all();
+	if (r)
+		goto out;
+
+	/*
+	 * Write encoded offsets into VMCS data for later vmcs_read.
+	 */
+	for (offset = FIELD_START; offset < vmcs_config.size;
+	     offset += sizeof(u16))
+		*(u16 *)((char *)vmcs + offset) = ENCODING_OFFSET(offset);
+
+	cpu = get_cpu();
+	vmcs_clear(vmcs);
+	per_cpu(current_vmcs, cpu) = vmcs;
+	vmcs_load(vmcs);
+
+	vmcsinfo_revision_id(vmcs->revision_id);
+	vmcs_read_checking(VM_INSTRUCTION_ERROR);
+	offset = get_vmcs_field_offset(VM_INSTRUCTION_ERROR);
+	if (offset == -1)
+		goto out_clear;
+
+	for (field = 0; field < VMCSINFO_MAX_FIELD; ++field) {
+		if (field == VM_INSTRUCTION_ERROR)
+			continue;
+		/*
+		 * Zero the VM_INSTRUCTION_ERROR field before each read.
+		 */
+		*(u32 *)((char *)vmcs + offset) = 0;
+		vmcs_read_checking(field);
+	}
+
+	r = vmcs_id_sysfs_add(cpu_subsys.dev_root);
+	if (r)
+		goto out_clear;
+	r = vmcs_sysfs_add(cpu_subsys.dev_root);
+	if (r)
+		vmcs_id_sysfs_remove(cpu_subsys.dev_root);
+
+out_clear:
+	vmcs_clear(vmcs);
+	put_cpu();
+	hardware_disable_all();
+out:
+	free_vmcs(vmcs);
+	return r;
+}
+
+static void __exit alloc_vmcsinfo_exit(void)
+{
+	vmcs_sysfs_remove(cpu_subsys.dev_root);
+	vmcs_id_sysfs_remove(cpu_subsys.dev_root);
+}
+
+module_init(alloc_vmcsinfo_init);
+module_exit(alloc_vmcsinfo_exit);
-- 
1.7.1
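
To make the index format described in the comment above concrete, here
is a small user-space sketch of the encode/decode round trip. It is only
an illustration: the names encode_offset, decode_offset_le, swap16,
decode_offset16 and check_round_trip are made up for this example and
are not part of the patch. The arithmetic is meant to mirror the
ENCODING_OFFSET and DECODING_OFFSET macros, and only the 16-bit case is
shown.

```c
#include <assert.h>
#include <stdint.h>

#define OFFSET_HIGH_SHIFT 7
#define OFFSET_LOW_MASK   ((1u << OFFSET_HIGH_SHIFT) - 1)        /* 0x007f */
#define OFFSET_HIGH_MASK  (OFFSET_LOW_MASK << OFFSET_HIGH_SHIFT) /* 0x3f80 */

/*
 * Pack a 14-bit offset into a 16-bit index. The low 7 bits land in
 * bits 1..7, the high 7 bits in bits 9..15. Bit 0 stays 0 and bit 8
 * is forced to 1, so the three parts never overlap.
 */
static uint16_t encode_offset(uint16_t offset)
{
	return (uint16_t)(((offset & OFFSET_LOW_MASK) << 1) |
			  ((offset & OFFSET_HIGH_MASK) << 2) | 0x100u);
}

/* Undo encode_offset() for an index in little-endian byte order. */
static uint16_t decode_offset_le(uint16_t index)
{
	return (uint16_t)(((index >> 1) & OFFSET_LOW_MASK) |
			  ((index >> 2) & OFFSET_HIGH_MASK));
}

static uint16_t swap16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * A set bit 0 can only come from the marker in the high byte, so it
 * means the index was read byte-swapped and must be swapped back.
 */
static uint16_t decode_offset16(uint16_t index)
{
	if (index & 1)
		index = swap16(index);
	return decode_offset_le(index);
}

/* Round-trip every 16-bit-aligned offset of a 4 KByte region. */
int check_round_trip(void)
{
	int checked = 0;
	int off;

	for (off = 8; off < 4096; off += 2) {
		uint16_t index = encode_offset((uint16_t)off);

		assert((index & 0x0101) == 0x0100);  /* marker bits */
		assert(decode_offset16(index) == off);
		assert(decode_offset16(swap16(index)) == off);
		checked++;
	}
	return checked;
}
```

For example, offset 0x0abc encodes to 0x2b78. If that index is read back
byte-swapped as 0x782b, bit 0 is set, so it is swapped again before
decoding and still yields 0x0abc.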

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 3/3] Documentation: Add ABI entry for vmcs sysfs interface
  2012-07-04  9:58 ` Yanfei Zhang
@ 2012-07-04 10:06   ` Yanfei Zhang
  -1 siblings, 0 replies; 26+ messages in thread
From: Yanfei Zhang @ 2012-07-04 10:06 UTC (permalink / raw)
  To: Avi Kivity, mtosatti
  Cc: ebiederm, luto, Joerg Roedel, dzickus, paul.gortmaker,
	ludwig.nussel, linux-kernel, kvm, kexec, Greg KH

Signed-off-by: zhangyanfei <zhangyanfei@cn.fujitsu.com>
---
 Documentation/ABI/testing/sysfs-devices-system-cpu |   21 ++++++++++++++++++++
 1 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 5dab364..6efbd6c 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -9,6 +9,27 @@ Description:
 
 		/sys/devices/system/cpu/cpu#/
 
+What:		/sys/devices/system/cpu/vmcs_id
+Date:		June 2012
+KernelVersion:	3.5.0
+Contact:	Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
+Description:
+		VMCS revision identifier of the Intel CPU. The value enables
+		software to avoid using a VMCS region formatted for one
+		processor on a processor that uses a different format.
+
+What:		/sys/devices/system/cpu/vmcs/
+Date:		June 2012
+KernelVersion:	3.5.0
+Contact:	Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
+Description:
+		A collection of VMCS field offsets for the Intel CPU.
+
+		Individual offsets are contained in subfiles named by
+		the field's encoding, e.g.:
+
+		/sys/devices/system/cpu/vmcs/0800
+
 What:		/sys/devices/system/cpu/kernel_max
 		/sys/devices/system/cpu/offline
 		/sys/devices/system/cpu/online
-- 
1.7.1
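
As an illustration of how a user-space consumer such as kexec-tools
might read one of these files, here is a minimal sketch. The helper
names parse_vmcs_offset and read_vmcs_offset and the buffer sizes are
assumptions made for this example; they are not part of the patch or of
kexec-tools. The sketch relies only on each attribute printing a single
decimal value followed by a newline, as the show functions in patch 2/3
do.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse the decimal text of one attribute file, e.g. "1234\n". */
int parse_vmcs_offset(const char *text, long *out)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(text, &end, 10);
	if (errno || end == text || (*end != '\n' && *end != '\0'))
		return -1;
	*out = val;
	return 0;
}

/*
 * Read <dir>/<encoding> and store the offset in *out.
 * Returns 0 on success, -1 if the file is missing or malformed.
 */
int read_vmcs_offset(const char *dir, const char *encoding, long *out)
{
	char path[256];
	char buf[32];
	FILE *f;
	int n, ok;

	n = snprintf(path, sizeof(path), "%s/%s", dir, encoding);
	if (n < 0 || n >= (int)sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = fgets(buf, sizeof(buf), f) != NULL;
	fclose(f);

	return ok ? parse_vmcs_offset(buf, out) : -1;
}
```

A caller would pass "/sys/devices/system/cpu/vmcs" as dir and a field
encoding such as "0800" (GUEST_ES_SELECTOR). A result of 0 most likely
means that no offset was recorded for that field, since the offset table
starts out zeroed and get_vmcs_field_offset() in patch 2/3 treats 0 as
missing.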

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 3/3] Documentation: Add ABI entry for vmcs sysfs interface
  2012-07-04 10:06   ` Yanfei Zhang
@ 2012-07-04 14:49     ` Greg KH
  -1 siblings, 0 replies; 26+ messages in thread
From: Greg KH @ 2012-07-04 14:49 UTC (permalink / raw)
  To: Yanfei Zhang
  Cc: Avi Kivity, mtosatti, ebiederm, luto, Joerg Roedel, dzickus,
	paul.gortmaker, ludwig.nussel, linux-kernel, kvm, kexec

On Wed, Jul 04, 2012 at 06:06:28PM +0800, Yanfei Zhang wrote:
> Signed-off-by: zhangyanfei <zhangyanfei@cn.fujitsu.com>
> ---
>  Documentation/ABI/testing/sysfs-devices-system-cpu |   21 ++++++++++++++++++++
>  1 files changed, 21 insertions(+), 0 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
> index 5dab364..6efbd6c 100644
> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> @@ -9,6 +9,27 @@ Description:
>  
>  		/sys/devices/system/cpu/cpu#/
>  
> +What:		/sys/devices/system/cpu/vmcs_id
> +Date:		June 2012
> +KernelVersion:	3.5.0

3.5.0 will not have this feature in it, so you should probably change
these lines in this patch.

greg k-h

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
  2012-07-04 10:05   ` Yanfei Zhang
@ 2012-07-04 14:52     ` Greg KH
  -1 siblings, 0 replies; 26+ messages in thread
From: Greg KH @ 2012-07-04 14:52 UTC (permalink / raw)
  To: Yanfei Zhang
  Cc: Avi Kivity, mtosatti, ebiederm, luto, Joerg Roedel, dzickus,
	paul.gortmaker, ludwig.nussel, linux-kernel, kvm, kexec

On Wed, Jul 04, 2012 at 06:05:19PM +0800, Yanfei Zhang wrote:
> +int vmcs_sysfs_add(struct device *dev)
> +{
> +	return sysfs_create_group(&dev->kobj, &vmcs_attr_group);
> +}
> +
> +void vmcs_sysfs_remove(struct device *dev)
> +{
> +	sysfs_remove_group(&dev->kobj, &vmcs_attr_group);
> +}

Why are these "add" and "remove" functions here?  Shouldn't you just
write the lines out where you call them instead, as they are only called
once.

And does this race with adding new cpus to the system (is the uevent
being sent to userspace before the attributes are added?)  If so, please
fix that.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 3/3] Documentation: Add ABI entry for vmcs sysfs interface
  2012-07-04 10:06   ` Yanfei Zhang
@ 2012-07-04 14:53     ` Greg KH
  -1 siblings, 0 replies; 26+ messages in thread
From: Greg KH @ 2012-07-04 14:53 UTC (permalink / raw)
  To: Yanfei Zhang
  Cc: Avi Kivity, mtosatti, ebiederm, luto, Joerg Roedel, dzickus,
	paul.gortmaker, ludwig.nussel, linux-kernel, kvm, kexec

On Wed, Jul 04, 2012 at 06:06:28PM +0800, Yanfei Zhang wrote:
> Signed-off-by: zhangyanfei <zhangyanfei@cn.fujitsu.com>
> ---
>  Documentation/ABI/testing/sysfs-devices-system-cpu |   21 ++++++++++++++++++++
>  1 files changed, 21 insertions(+), 0 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
> index 5dab364..6efbd6c 100644
> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> @@ -9,6 +9,27 @@ Description:
>  
>  		/sys/devices/system/cpu/cpu#/
>  
> +What:		/sys/devices/system/cpu/vmcs_id

Wait, aren't these a per-cpu value?  You have them as a "global" value
for all cpus, is that really the case?  Shouldn't they be under
/sys/devices/system/cpu/cpu#/ instead?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
  2012-07-04 10:05   ` Yanfei Zhang
@ 2012-07-06  8:04     ` HATAYAMA Daisuke
  -1 siblings, 0 replies; 26+ messages in thread
From: HATAYAMA Daisuke @ 2012-07-06  8:04 UTC (permalink / raw)
  To: zhangyanfei
  Cc: avi, mtosatti, dzickus, luto, kvm, joerg.roedel, kexec,
	linux-kernel, paul.gortmaker, ludwig.nussel, ebiederm, gregkh

From: Yanfei Zhang <zhangyanfei@cn.fujitsu.com>
Subject: [PATCH v4 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
Date: Wed, 4 Jul 2012 18:05:19 +0800

> Besides, this patch also exports vmcs revision identifier via
> /sys/devices/system/cpu/vmcs_id and offsets of fields via
> /sys/devices/system/cpu/vmcs/.
> Individual offsets are contained in subfiles named by the field's
> encoding, e.g.: /sys/devices/cpu/vmcs/0800

According to the discussion starting from

http://lkml.indiana.edu/hypermail/linux/kernel/1105.3/00749.html

a system can be composed of CPUs with different steppings or different
microcode revisions. Because the VMCS layout is hidden in the
specification, I suspect it could change across different steppings or
microcode revisions. In that case, the interface needs to be made
per-cpu, like

   /sys/devices/cpu/cpu0/vmcs/0800
   /sys/devices/cpu/cpu1/vmcs/0800
   ...
   /sys/devices/cpu/cpuN/vmcs/0800

Also, vmcsinfo initialization needs to be done per cpu, and can be
triggered when a cpu is added rather than when the kvm module is loaded.
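
As a rough userspace illustration of the per-cpu layout proposed above,
the sketch below only builds the path for one field on one cpu. The
directory layout is the proposal from this mail, not an existing ABI,
and the helper name vmcs_field_path() is hypothetical.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build the per-cpu sysfs path for one VMCS field.
 * The layout (/sys/devices/cpu/cpuN/vmcs/<encoding>) is the one proposed
 * in this thread; it is an assumption, not a merged interface.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int vmcs_field_path(char *buf, size_t len, unsigned int cpu,
			   unsigned int encoding)
{
	int n;

	/* The field encoding is printed as four hex digits, e.g. 0800. */
	n = snprintf(buf, len, "/sys/devices/cpu/cpu%u/vmcs/%04x",
		     cpu, encoding);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A tool such as kexec-tools could call this once per online cpu and per
field encoding, assuming the per-cpu directories were actually created.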

Thanks.
HATAYAMA, Daisuke


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
@ 2012-07-06  8:25       ` Wen Congyang
  0 siblings, 0 replies; 26+ messages in thread
From: Wen Congyang @ 2012-07-06  8:25 UTC (permalink / raw)
  To: HATAYAMA Daisuke
  Cc: zhangyanfei, dzickus, luto, kvm, joerg.roedel, mtosatti, kexec,
	linux-kernel, paul.gortmaker, ludwig.nussel, avi, gregkh,
	ebiederm

At 07/06/2012 04:04 PM, HATAYAMA Daisuke Wrote:
> From: Yanfei Zhang <zhangyanfei@cn.fujitsu.com>
> Subject: [PATCH v4 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
> Date: Wed, 4 Jul 2012 18:05:19 +0800
> 
>> Besides, this patch also exports vmcs revision identifier via
>> /sys/devices/system/cpu/vmcs_id and offsets of fields via
>> /sys/devices/system/cpu/vmcs/.
>> Individual offsets are contained in subfiles named by the field's
>> encoding, e.g.: /sys/devices/cpu/vmcs/0800
> 
> According to the discussion starting from
> 
> http://lkml.indiana.edu/hypermail/linux/kernel/1105.3/00749.html

IIRC, kvm cannot work in such an environment. A vcpu can run on
different cpus. If the cpus' vmcs layouts differ, I don't know what
will happen. So do we need to support such an environment now?
I think that if kvm cannot work in such an environment, we should
not provide vmcs information for each physical cpu.

Thanks
Wen Congyang

> 
> a system can be composed of CPUs with different steppings or different
> microcode revisions. Because the VMCS layout is hidden in the
> specification, I suspect it could change across different steppings or
> microcode revisions. In that case, the interface needs to be made
> per-cpu, like
> 
>    /sys/devices/cpu/cpu0/vmcs/0800
>    /sys/devices/cpu/cpu1/vmcs/0800
>    ...
>    /sys/devices/cpu/cpuN/vmcs/0800
> 
> Also, vmcsinfo initialization needs to be done per cpu, and can be
> triggered when a cpu is added rather than when the kvm module is loaded.
> 
> Thanks.
> HATAYAMA, Daisuke
> 
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
@ 2012-07-06  8:38         ` HATAYAMA Daisuke
  0 siblings, 0 replies; 26+ messages in thread
From: HATAYAMA Daisuke @ 2012-07-06  8:38 UTC (permalink / raw)
  To: wency
  Cc: zhangyanfei, dzickus, luto, kvm, joerg.roedel, mtosatti, kexec,
	linux-kernel, paul.gortmaker, ludwig.nussel, avi, gregkh,
	ebiederm

From: Wen Congyang <wency@cn.fujitsu.com>
Subject: Re: [PATCH v4 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
Date: Fri, 6 Jul 2012 16:25:23 +0800

> At 07/06/2012 04:04 PM, HATAYAMA Daisuke Wrote:
>> From: Yanfei Zhang <zhangyanfei@cn.fujitsu.com>
>> Subject: [PATCH v4 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
>> Date: Wed, 4 Jul 2012 18:05:19 +0800
>> 
>>> Besides, this patch also exports vmcs revision identifier via
>>> /sys/devices/system/cpu/vmcs_id and offsets of fields via
>>> /sys/devices/system/cpu/vmcs/.
>>> Individual offsets are contained in subfiles named by the field's
>>> encoding, e.g.: /sys/devices/cpu/vmcs/0800
>> 
>> According to the discussion starting from
>> 
>> http://lkml.indiana.edu/hypermail/linux/kernel/1105.3/00749.html
> 
> IIRC, kvm cannot work in such an environment. A vcpu can run on
> different cpus. If the cpus' vmcs layouts differ, I don't know what
> will happen. So do we need to support such an environment now?
> I think that if kvm cannot work in such an environment, we should
> not provide vmcs information for each physical cpu.
> 

I think so too. The design can be kept very simple if kvm doesn't
support such a case, which would be good news. Is that true?

Thanks.
HATAYAMA, Daisuke


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
  2012-07-06  8:25       ` Wen Congyang
@ 2012-07-10  1:04         ` HATAYAMA Daisuke
  -1 siblings, 0 replies; 26+ messages in thread
From: HATAYAMA Daisuke @ 2012-07-10  1:04 UTC (permalink / raw)
  To: wency
  Cc: dzickus, luto, kvm, joerg.roedel, mtosatti, kexec, linux-kernel,
	paul.gortmaker, zhangyanfei, avi, gregkh, ludwig.nussel,
	ebiederm

From: Wen Congyang <wency@cn.fujitsu.com>
Subject: Re: [PATCH v4 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
Date: Fri, 6 Jul 2012 16:25:23 +0800

> At 07/06/2012 04:04 PM, HATAYAMA Daisuke Wrote:
>> From: Yanfei Zhang <zhangyanfei@cn.fujitsu.com>
>> Subject: [PATCH v4 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
>> Date: Wed, 4 Jul 2012 18:05:19 +0800
>> 
>>> Besides, this patch also exports vmcs revision identifier via
>>> /sys/devices/system/cpu/vmcs_id and offsets of fields via
>>> /sys/devices/system/cpu/vmcs/.
>>> Individual offsets are contained in subfiles named by the field's
>>> encoding, e.g.: /sys/devices/cpu/vmcs/0800
>> 
>> According to the discussion starting from
>> 
>> http://lkml.indiana.edu/hypermail/linux/kernel/1105.3/00749.html
> 
> IIRC, kvm cannot work in such an environment. A vcpu can run on
> different cpus. If the cpus' vmcs layouts differ, I don't know what
> will happen. So do we need to support such an environment now?
> I think that if kvm cannot work in such an environment, we should
> not provide vmcs information for each physical cpu.
> 

Ah, I noticed my basic confusion: if it were possible for a vcpu to run
on cpus with different VMCS revisions, this vmcsinfo would be
unnecessary, because it would mean either that a processor could read
VMCS data with a revision different from its own, or that some kind of
reverse engineering for converting between different VMCS data had
already been done.

I think kvm could probably work if only processors that have the same
VMCS revision are assigned to a single guest. But considering the
nature of the VMCS, processors with different revisions seem unlikely
to be used together on a host machine.
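
To make the "same VMCS revision on every processor" condition concrete,
here is a minimal sketch of such a check. It assumes the caller has
already collected one IA32_VMX_BASIC MSR value per cpu; how they are
collected is left out, and the function names are hypothetical. The
only hardware fact relied on is that bits 30:0 of IA32_VMX_BASIC hold
the VMCS revision identifier.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bits 30:0 of the IA32_VMX_BASIC MSR hold the VMCS revision identifier. */
#define VMCS_REVISION_MASK 0x7fffffffULL

/* Extract the VMCS revision identifier from a raw IA32_VMX_BASIC value. */
static uint32_t vmcs_revision_id(uint64_t vmx_basic)
{
	return (uint32_t)(vmx_basic & VMCS_REVISION_MASK);
}

/*
 * Return true when every cpu reports the same VMCS revision identifier.
 * vmx_basic[] is assumed to hold one IA32_VMX_BASIC value per cpu
 * (hypothetical input for this sketch). An empty or single-entry array
 * is trivially uniform.
 */
static bool vmcs_revision_uniform(const uint64_t *vmx_basic, size_t ncpus)
{
	size_t i;

	for (i = 1; i < ncpus; i++)
		if (vmcs_revision_id(vmx_basic[i]) !=
		    vmcs_revision_id(vmx_basic[0]))
			return false;
	return true;
}
```

If such a check held on a given host, a single global vmcsinfo would
describe every cpu, which is the simple design discussed above.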

Thanks.
HATAYAMA, Daisuke

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
  2012-07-04 10:05   ` Yanfei Zhang
@ 2012-07-10  1:28     ` HATAYAMA Daisuke
  -1 siblings, 0 replies; 26+ messages in thread
From: HATAYAMA Daisuke @ 2012-07-10  1:28 UTC (permalink / raw)
  To: zhangyanfei
  Cc: avi, mtosatti, dzickus, luto, kvm, joerg.roedel, kexec,
	linux-kernel, paul.gortmaker, ludwig.nussel, ebiederm, gregkh

From: Yanfei Zhang <zhangyanfei@cn.fujitsu.com>
Subject: [PATCH v4 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
Date: Wed, 4 Jul 2012 18:05:19 +0800

> 
> Besides, this patch also exports vmcs revision identifier via
> /sys/devices/system/cpu/vmcs_id and offsets of fields via
> /sys/devices/system/cpu/vmcs/.

I think /sys/devices/system/cpu/vmcs/id is more natural, since the
revision identifier also belongs to vmcs.

<cut>
> +/*
> + * vmcs field offsets.
> + */
> +static struct vmcsinfo {
> +	u32 vmcs_revision_id;
> +	u16 vmcs_field_to_offset_table[HOST_RIP + 1];
> +} vmcsinfo;

This is what I said previously. HOST_RIP is 0x00006c16 => 27670. This
means struct vmcsinfo needs 4 + 2 * (HOST_RIP + 1) => 55346 bytes,
about 54 kbytes, before any tail padding. But only 153 fields are
actually exported, so 4 + 2 * 153 => 310 bytes are enough.

How about getting the number of entries from the vmcs_attrs array? I
guess it is exactly the number of vmcs fields exported; here 153.
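
The arithmetic above can be checked with a small sketch. The first
struct mirrors the layout quoted from the patch; the second, and the
names NR_EXPORTED_FIELDS and vmcsinfo_payload_bytes(), are hypothetical
and only illustrate sizing the table by the number of exported fields
(153 is the count quoted in this mail).

```c
#include <stddef.h>
#include <stdint.h>

#define HOST_RIP           0x6c16	/* highest encoding used as an index */
#define NR_EXPORTED_FIELDS 153		/* count quoted in this thread */

/* Layout from the patch: one slot per possible encoding up to HOST_RIP. */
struct vmcsinfo_sparse {
	uint32_t vmcs_revision_id;
	uint16_t vmcs_field_to_offset_table[HOST_RIP + 1];
};

/* Hypothetical compact layout: one slot per exported field only. */
struct vmcsinfo_compact {
	uint32_t vmcs_revision_id;
	uint16_t vmcs_field_to_offset_table[NR_EXPORTED_FIELDS];
};

/*
 * Bytes needed for the revision id plus nr_slots 16-bit offsets,
 * ignoring any tail padding the compiler adds to the struct.
 */
static size_t vmcsinfo_payload_bytes(size_t nr_slots)
{
	return sizeof(uint32_t) + nr_slots * sizeof(uint16_t);
}
```

Here vmcsinfo_payload_bytes(HOST_RIP + 1) gives 55346 and
vmcsinfo_payload_bytes(NR_EXPORTED_FIELDS) gives 310, matching the
numbers above; sizeof() on either struct may be slightly larger because
of alignment padding. Note that a compact table can no longer be
indexed directly by field encoding, so it would also need some mapping
from encoding to slot.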

Thanks.
HATAYAMA, Daisuke

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2012-07-10  1:29 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-04  9:58 [PATCH v4 0/3] Export offsets of VMCS fields as note information for kdump Yanfei Zhang
2012-07-04  9:58 ` Yanfei Zhang
2012-07-04 10:01 ` [PATCH v4 1/3] KVM: Export symbols for module vmcsinfo-intel Yanfei Zhang
2012-07-04 10:01   ` Yanfei Zhang
2012-07-04 10:05 ` [PATCH v4 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO Yanfei Zhang
2012-07-04 10:05   ` Yanfei Zhang
2012-07-04 14:52   ` Greg KH
2012-07-04 14:52     ` Greg KH
2012-07-06  8:04   ` HATAYAMA Daisuke
2012-07-06  8:04     ` HATAYAMA Daisuke
2012-07-06  8:25     ` Wen Congyang
2012-07-06  8:25       ` Wen Congyang
2012-07-06  8:25       ` Wen Congyang
2012-07-06  8:38       ` HATAYAMA Daisuke
2012-07-06  8:38         ` HATAYAMA Daisuke
2012-07-06  8:38         ` HATAYAMA Daisuke
2012-07-10  1:04       ` HATAYAMA Daisuke
2012-07-10  1:04         ` HATAYAMA Daisuke
2012-07-10  1:28   ` HATAYAMA Daisuke
2012-07-10  1:28     ` HATAYAMA Daisuke
2012-07-04 10:06 ` [PATCH v4 3/3] Documentation: Add ABI entry for vmcs sysfs interface Yanfei Zhang
2012-07-04 10:06   ` Yanfei Zhang
2012-07-04 14:49   ` Greg KH
2012-07-04 14:49     ` Greg KH
2012-07-04 14:53   ` Greg KH
2012-07-04 14:53     ` Greg KH
