* [PATCH v5 0/3] Export offsets of VMCS fields as note information for kdump
From: Zhang Yanfei @ 2012-07-12  9:54 UTC
  To: Avi Kivity, mtosatti
  Cc: ebiederm, luto, Joerg Roedel, dzickus, paul.gortmaker,
	ludwig.nussel, linux-kernel, kvm, kexec, Greg KH

This patch set exports the offsets of VMCS fields as note information
for kdump. We call it VMCSINFO. The purpose of VMCSINFO is to retrieve
the runtime state of a guest machine image, such as registers, from the
host machine's crash dump, where it is stored in VMCS format. The
problem is that Intel does not document the internal layout of the VMCS
in its specification. So we solve this problem by reverse engineering
the layout, as implemented in this patch set. The VMCSINFO is exported
to kexec-tools via sysfs (/sys/devices/system/cpu/vmcs/).

Here are two use cases for the two features that we want.

1) Create guest machine's crash dumpfile from host machine's crash dumpfile

In general, we want to use this feature for failure analysis of systems
whose processing depends on communication between host and guest
machines, so that we can look into the system from both machines'
viewpoints.

As a concrete situation, consider a heartbeat monitoring feature on the
guest machine's side, where we need to determine on which machine's
side the cause of a heartbeat stop lies. In our actual experiments, we
encountered such a situation and found that the cause of the bug was in
the host's process scheduler: the guest machine's vcpu stopped for a
long time, which then led to the heartbeat stop.

The module that detects the heartbeat stop is on the guest machine, so
we need to debug the guest machine's data. But if the cause lies on the
host machine's side, we need to look into the host machine's crash dump.

Without this feature, we first create the guest machine's dump and then
create the host machine's. The two dumps are taken at different times,
and it is unlikely that the buggy situation still remains when the
second one is taken.

So, we think the feature is useful for debugging both the guest
machine's and the host machine's sides at the same time, and we expect
it to make failure analysis more efficient.

Of course, we believe this feature is generally useful in situations
where the guest machine doesn't work well because of a problem on the
host machine.

2) Get offsets of VMCS information on the CPU running on the host machine

If kdump doesn't work well, we cannot use the kvm API to get the
register values of the guest machine, and they are still left in its
vmcs region. In that case, we use a crash dump mechanism running outside
of the linux kernel, such as sadump, a firmware-based crash dump. The
VMCS information is then necessary.

TODO:
  1. In kexec-tools, get VMCSINFO via sysfs and dump it as note information
     into vmcore.
  2. Dump VMCS region of each guest vcpu and VMCSINFO into qemu-process
     core file. To do this, we will modify kernel core dumper, gdb gcore
     and crash gcore.
  3. Dump guest image from the qemu-process core file into a vmcore.

Changelog from v4 to v5:
1. The VMCSINFO is stored in a two-dimensional array filled with each
   field's encoding and corresponding offset. So the size of VMCSINFO
   is much smaller.
2. vmcs sysfs file /sys/devices/system/cpu/vmcs_id is moved to
   /sys/devices/system/cpu/vmcs/id.
3. Rewrite the ABI entry for vmcs interface and remove the KernelVersion
   line.
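
As a rough illustration of the v5 layout in item 1 above, the table can
be pictured as rows of {field encoding, offset}. The sketch below is
userspace C and is not taken from vmcsinfo.c; the 16-bit element type,
the name vmcsinfo_lookup() and the -1 return for a missing field are
assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed row layout: column 0 holds the VMCS field encoding,
 * column 1 holds the byte offset of that field in the VMCS region.
 */
enum { VMCSINFO_COL_ENCODING = 0, VMCSINFO_COL_OFFSET = 1 };

/*
 * Hypothetical helper: linear search for one field encoding.
 * Returns 0 and stores the offset on success, -1 if the encoding
 * is not present in the table (*offset is left untouched).
 */
static int vmcsinfo_lookup(const uint16_t (*table)[2], size_t rows,
			   uint16_t encoding, uint16_t *offset)
{
	size_t i;

	for (i = 0; i < rows; i++) {
		if (table[i][VMCSINFO_COL_ENCODING] == encoding) {
			*offset = table[i][VMCSINFO_COL_OFFSET];
			return 0;
		}
	}
	return -1;
}
```

Storing only the pairs that were actually found is one plausible reason
the v5 table is smaller than an array indexed by every possible
encoding, though the changelog does not spell this out.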

Changelog from v3 to v4:
1. All the variables and functions are moved to vmcsinfo-intel module.
2. Add a new sysfs interface /sys/devices/system/cpu/vmcs_id to export
   the vmcs revision identifier. The original sysfs interface is moved
   from /sys/devices/cpu/vmcs to /sys/devices/system/cpu/vmcs. Thanks
   to Greg KH for his helpful comments about sysfs.

Changelog from v2 to v3:
1. New VMCSINFO format.
   Now the VMCSINFO is mainly made up of an array that contains all vmcs
   fields' offsets. The offsets aren't encoded because we decode them in
   the module itself. If some field doesn't exist or its offset cannot be
   decoded correctly, the offset in the array is just set to zero.
2. New sysfs interface and Documentation/ABI entry. 
   We expose the actual fields in /sys/devices/cpu/vmcs instead of just
   exporting the address of VMCSINFO in /sys/kernel/vmcsinfo.
   For example, /sys/devices/cpu/vmcs/0800 contains the offset of
   GUEST_DS_SELECTOR. 0800 is the encoding of GUEST_DS_SELECTOR.
   Accordingly, ABI entry in Documentation is changed from sysfs-kernel-vmcsinfo
   to sysfs-devices-cpu-vmcs.
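
The per-field naming described in the v3 changelog (one file per field
encoding, e.g. 0800 for GUEST_DS_SELECTOR) could be consumed from
userspace roughly as below. This is only a sketch: the directory is the
one named in the v4/v5 changelog entries, while the helper names and the
assumption that each file holds a single hexadecimal number are mine and
should be checked against the ABI entry in patch 3/3.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Directory taken from the v4/v5 changelog entries. */
#define VMCS_SYSFS_DIR "/sys/devices/system/cpu/vmcs"

/*
 * Hypothetical helper: build the sysfs path for one field encoding,
 * e.g. 0x0800 -> ".../vmcs/0800". Returns 0, or -1 if buf is too small.
 */
static int vmcs_sysfs_path(uint16_t encoding, char *buf, size_t len)
{
	int n = snprintf(buf, len, "%s/%04x", VMCS_SYSFS_DIR,
			 (unsigned int)encoding);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: read the offset exported for one field.
 * Assumption: the file contains one hexadecimal number as text.
 * Returns 0 on success, -1 if the file is missing or unparsable.
 */
static int vmcs_read_offset(uint16_t encoding, unsigned long *offset)
{
	char path[64];
	char line[32];
	char *end;
	FILE *f;

	if (vmcs_sysfs_path(encoding, path, sizeof(path)) != 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(line, sizeof(line), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	*offset = strtoul(line, &end, 16);
	return end == line ? -1 : 0;
}
```

On a machine without the vmcsinfo-intel module loaded, the files do not
exist and vmcs_read_offset() simply reports failure.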

Changelog from v1 to v2:
1. The VMCSINFO now has a simple binary <field><encoded offset> format,
   as below:
     +-------------+--------------------------+
     | Byte offset | Contents                 |
     +-------------+--------------------------+
     | 0           | VMCS revision identifier |
     +-------------+--------------------------+
     | 4           | <field><encoded offset>  |
     +-------------+--------------------------+
     | 16          | <field><encoded offset>  |
     +-------------+--------------------------+
     ......
  
   The first 32 bits of VMCSINFO contain the VMCS revision identifier.
   The remainder of VMCSINFO is used for <field><encoded offset> sets.
   Each set takes 12 bytes: the field occupies 4 bytes and its
   corresponding encoded offset occupies 8 bytes.

   Encoded offsets are raw values read by vmcs_read{16, 64, 32, l}, and
   they are all zero-extended to 8 bytes so that each
   <field><encoded offset> set has the same size.
   We do not decode offsets here. The decoding work is deferred to
   userspace tools for more flexible handling.
   
   And here are two examples of the new VMCSINFO:
   Processor: Intel(R) Core(TM)2 Duo CPU     E7500  @ 2.93GHz
   VMCSINFO contains:
     <0000000d>                   --> VMCS revision id = 0xd
     <00004000><0000000001840180> --> OFFSET(PIN_BASED_VM_EXEC_CONTROL) = 0x01840180
     <00004002><0000000001940190> --> OFFSET(CPU_BASED_VM_EXEC_CONTROL) = 0x01940190
     <0000401e><000000000fe40fe0> --> OFFSET(SECONDARY_VM_EXEC_CONTROL) = 0x0fe40fe0
     <0000400c><0000000001e401e0> --> OFFSET(VM_EXIT_CONTROLS) = 0x01e401e0
     ......

   Processor: Intel(R) Xeon(R) CPU           E7540  @ 2.00GHz (24 cores)
   VMCSINFO contains:
     <0000000e>                   --> VMCS revision id = 0xe 
     <00004000><0000000005540550> --> OFFSET(PIN_BASED_VM_EXEC_CONTROL) = 0x05540550
     <00004002><0000000005440540> --> OFFSET(CPU_BASED_VM_EXEC_CONTROL) = 0x05440540
     <0000401e><00000000054c0548> --> OFFSET(SECONDARY_VM_EXEC_CONTROL) = 0x054c0548
     <0000400c><00000000057c0578> --> OFFSET(VM_EXIT_CONTROLS) = 0x057c0578
     ......

2. Add a new kernel module *vmcsinfo-intel* for filling VMCSINFO instead
   of putting it in module kvm-intel. The new module is auto-loaded
   when the vmx cpufeature is detected and it depends on module kvm-intel.
   *Loading and unloading this module will have no side effect on the
   running guests.*
3. The sysfs file vmcsinfo is split into 2 files:
   /sys/kernel/vmcsinfo: shows physical address of VMCSINFO note information.
   /sys/kernel/vmcsinfo_maxsize: shows max size of VMCSINFO.
4. A new Documentation/ABI entry is added for vmcsinfo and vmcsinfo_maxsize.
5. Do not update VMCSINFO note when the kernel has panicked.
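
For reference, the v2 binary layout from item 1 above (a 4-byte revision
identifier followed by 12-byte <field><encoded offset> sets) could be
walked as in the sketch below. The struct and function names are
hypothetical, and the code assumes the buffer is in the byte order of
the machine reading it; it copies the encoded offset out as-is and does
not attempt to decode it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VMCSINFO_HDR_SIZE   4	/* VMCS revision identifier */
#define VMCSINFO_ENTRY_SIZE 12	/* 4-byte field + 8-byte encoded offset */

/* Hypothetical in-memory view of one <field><encoded offset> set. */
struct vmcsinfo_entry {
	uint32_t field;
	uint64_t encoded_offset;
};

/* Copy out the revision identifier stored at byte offset 0. */
static int vmcsinfo_revision(const uint8_t *buf, size_t len, uint32_t *rev)
{
	if (len < VMCSINFO_HDR_SIZE)
		return -1;
	memcpy(rev, buf, sizeof(*rev));
	return 0;
}

/* Number of complete 12-byte sets that follow the header. */
static size_t vmcsinfo_entry_count(size_t len)
{
	if (len < VMCSINFO_HDR_SIZE)
		return 0;
	return (len - VMCSINFO_HDR_SIZE) / VMCSINFO_ENTRY_SIZE;
}

/* Copy out set number idx; sets start at byte offsets 4, 16, 28, ... */
static int vmcsinfo_entry_at(const uint8_t *buf, size_t len, size_t idx,
			     struct vmcsinfo_entry *out)
{
	size_t pos;

	if (idx >= vmcsinfo_entry_count(len))
		return -1;
	pos = VMCSINFO_HDR_SIZE + idx * VMCSINFO_ENTRY_SIZE;
	memcpy(&out->field, buf + pos, sizeof(out->field));
	memcpy(&out->encoded_offset, buf + pos + sizeof(out->field),
	       sizeof(out->encoded_offset));
	return 0;
}
```

memcpy() is used instead of casting the buffer to a struct pointer
because the 12-byte sets are not 8-byte aligned.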

zhangyanfei (3):
  KVM: Export symbols for module vmcsinfo-intel
  KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
  Documentation: Add ABI entry for vmcs sysfs interface.

 Documentation/ABI/testing/sysfs-devices-system-cpu |   20 +
 arch/x86/include/asm/vmx.h                         |   73 ++
 arch/x86/kvm/Kconfig                               |   11 +
 arch/x86/kvm/Makefile                              |    3 +
 arch/x86/kvm/vmcsinfo.c                            |  714 ++++++++++++++++++++
 arch/x86/kvm/vmx.c                                 |   81 +--
 include/linux/kvm_host.h                           |    3 +
 virt/kvm/kvm_main.c                                |    8 +-
 8 files changed, 841 insertions(+), 72 deletions(-)
 create mode 100644 arch/x86/kvm/vmcsinfo.c

* [PATCH v5 1/3] KVM: Export symbols for module vmcsinfo-intel
From: Zhang Yanfei @ 2012-07-12  9:56 UTC
  To: Avi Kivity, mtosatti
  Cc: ebiederm, luto, Joerg Roedel, dzickus, paul.gortmaker,
	ludwig.nussel, linux-kernel, kvm, kexec, Greg KH

A new module named vmcsinfo-intel is used to fill VMCSINFO. This
module depends on the kvm-intel and kvm modules, so we need to
export the symbols of kvm-intel and kvm that vmcsinfo-intel uses.

Signed-off-by: zhangyanfei <zhangyanfei@cn.fujitsu.com>
---
 arch/x86/include/asm/vmx.h |   73 +++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx.c         |   81 +++++++-------------------------------------
 include/linux/kvm_host.h   |    3 ++
 virt/kvm/kvm_main.c        |    8 ++--
 4 files changed, 93 insertions(+), 72 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 31f180c..1044e2e 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -26,6 +26,7 @@
  */
 
 #include <linux/types.h>
+#include <linux/kvm_host.h>
 
 /*
  * Definitions of Primary Processor-Based VM-Execution Controls.
@@ -481,4 +482,76 @@ enum vm_instruction_error_number {
 	VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID = 28,
 };
 
+#define __ex(x) __kvm_handle_fault_on_reboot(x)
+#define __ex_clear(x, reg) \
+	____kvm_handle_fault_on_reboot(x, "xor " reg " , " reg)
+
+struct vmcs {
+	u32 revision_id;
+	u32 abort;
+	char data[0];
+};
+
+struct vmcs_config {
+	int size;
+	int order;
+	u32 revision_id;
+	u32 pin_based_exec_ctrl;
+	u32 cpu_based_exec_ctrl;
+	u32 cpu_based_2nd_exec_ctrl;
+	u32 vmexit_ctrl;
+	u32 vmentry_ctrl;
+};
+
+extern struct vmcs_config vmcs_config;
+
+DECLARE_PER_CPU(struct vmcs *, current_vmcs);
+
+enum vmcs_field_type {
+	VMCS_FIELD_TYPE_U16 = 0,
+	VMCS_FIELD_TYPE_U64 = 1,
+	VMCS_FIELD_TYPE_U32 = 2,
+	VMCS_FIELD_TYPE_NATURAL_WIDTH = 3
+};
+
+static inline int vmcs_field_type(unsigned long field)
+{
+	if (0x1 & field)	/* the *_HIGH fields are all 32 bit */
+		return VMCS_FIELD_TYPE_U32;
+	return (field >> 13) & 0x3 ;
+}
+
+static __always_inline unsigned long vmcs_readl(unsigned long field)
+{
+	unsigned long value;
+
+	asm volatile (__ex_clear(ASM_VMX_VMREAD_RDX_RAX, "%0")
+		      : "=a"(value) : "d"(field) : "cc");
+	return value;
+}
+
+static __always_inline u16 vmcs_read16(unsigned long field)
+{
+	return vmcs_readl(field);
+}
+
+static __always_inline u32 vmcs_read32(unsigned long field)
+{
+	return vmcs_readl(field);
+}
+
+static __always_inline u64 vmcs_read64(unsigned long field)
+{
+#ifdef CONFIG_X86_64
+	return vmcs_readl(field);
+#else
+	return vmcs_readl(field) | ((u64)vmcs_readl(field+1) << 32);
+#endif
+}
+
+struct vmcs *alloc_vmcs(void);
+void vmcs_load(struct vmcs *);
+void vmcs_clear(struct vmcs *);
+void free_vmcs(struct vmcs *);
+
 #endif
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 32eb588..43ceae7 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -20,7 +20,6 @@
 #include "mmu.h"
 #include "cpuid.h"
 
-#include <linux/kvm_host.h>
 #include <linux/module.h>
 #include <linux/kernel.h>
 #include <linux/mm.h>
@@ -45,10 +44,6 @@
 
 #include "trace.h"
 
-#define __ex(x) __kvm_handle_fault_on_reboot(x)
-#define __ex_clear(x, reg) \
-	____kvm_handle_fault_on_reboot(x, "xor " reg " , " reg)
-
 MODULE_AUTHOR("Qumranet");
 MODULE_LICENSE("GPL");
 
@@ -127,12 +122,6 @@ module_param(ple_window, int, S_IRUGO);
 #define NR_AUTOLOAD_MSRS 8
 #define VMCS02_POOL_SIZE 1
 
-struct vmcs {
-	u32 revision_id;
-	u32 abort;
-	char data[0];
-};
-
 /*
  * Track a VMCS that may be loaded on a certain CPU. If it is (cpu!=-1), also
  * remember whether it was VMLAUNCHed, and maintain a linked list of all VMCSs
@@ -617,7 +606,9 @@ static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3);
 static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr);
 
 static DEFINE_PER_CPU(struct vmcs *, vmxarea);
-static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
+DEFINE_PER_CPU(struct vmcs *, current_vmcs);
+EXPORT_SYMBOL_GPL(current_vmcs);
+
 /*
  * We maintain a per-CPU linked-list of VMCS loaded on that CPU. This is needed
  * when a CPU is brought down, and we need to VMCLEAR all VMCSs loaded on it.
@@ -636,16 +627,8 @@ static bool cpu_has_load_perf_global_ctrl;
 static DECLARE_BITMAP(vmx_vpid_bitmap, VMX_NR_VPIDS);
 static DEFINE_SPINLOCK(vmx_vpid_lock);
 
-static struct vmcs_config {
-	int size;
-	int order;
-	u32 revision_id;
-	u32 pin_based_exec_ctrl;
-	u32 cpu_based_exec_ctrl;
-	u32 cpu_based_2nd_exec_ctrl;
-	u32 vmexit_ctrl;
-	u32 vmentry_ctrl;
-} vmcs_config;
+struct vmcs_config vmcs_config;
+EXPORT_SYMBOL_GPL(vmcs_config);
 
 static struct vmx_capability {
 	u32 ept;
@@ -940,7 +923,7 @@ static struct shared_msr_entry *find_msr_entry(struct vcpu_vmx *vmx, u32 msr)
 	return NULL;
 }
 
-static void vmcs_clear(struct vmcs *vmcs)
+void vmcs_clear(struct vmcs *vmcs)
 {
 	u64 phys_addr = __pa(vmcs);
 	u8 error;
@@ -952,6 +935,7 @@ static void vmcs_clear(struct vmcs *vmcs)
 		printk(KERN_ERR "kvm: vmclear fail: %p/%llx\n",
 		       vmcs, phys_addr);
 }
+EXPORT_SYMBOL_GPL(vmcs_clear);
 
 static inline void loaded_vmcs_init(struct loaded_vmcs *loaded_vmcs)
 {
@@ -960,7 +944,7 @@ static inline void loaded_vmcs_init(struct loaded_vmcs *loaded_vmcs)
 	loaded_vmcs->launched = 0;
 }
 
-static void vmcs_load(struct vmcs *vmcs)
+void vmcs_load(struct vmcs *vmcs)
 {
 	u64 phys_addr = __pa(vmcs);
 	u8 error;
@@ -972,6 +956,7 @@ static void vmcs_load(struct vmcs *vmcs)
 		printk(KERN_ERR "kvm: vmptrld %p/%llx failed\n",
 		       vmcs, phys_addr);
 }
+EXPORT_SYMBOL_GPL(vmcs_load);
 
 static void __loaded_vmcs_clear(void *arg)
 {
@@ -1043,34 +1028,6 @@ static inline void ept_sync_individual_addr(u64 eptp, gpa_t gpa)
 	}
 }
 
-static __always_inline unsigned long vmcs_readl(unsigned long field)
-{
-	unsigned long value;
-
-	asm volatile (__ex_clear(ASM_VMX_VMREAD_RDX_RAX, "%0")
-		      : "=a"(value) : "d"(field) : "cc");
-	return value;
-}
-
-static __always_inline u16 vmcs_read16(unsigned long field)
-{
-	return vmcs_readl(field);
-}
-
-static __always_inline u32 vmcs_read32(unsigned long field)
-{
-	return vmcs_readl(field);
-}
-
-static __always_inline u64 vmcs_read64(unsigned long field)
-{
-#ifdef CONFIG_X86_64
-	return vmcs_readl(field);
-#else
-	return vmcs_readl(field) | ((u64)vmcs_readl(field+1) << 32);
-#endif
-}
-
 static noinline void vmwrite_error(unsigned long field, unsigned long value)
 {
 	printk(KERN_ERR "vmwrite error: reg %lx value %lx (err %d)\n",
@@ -2580,15 +2537,17 @@ static struct vmcs *alloc_vmcs_cpu(int cpu)
 	return vmcs;
 }
 
-static struct vmcs *alloc_vmcs(void)
+struct vmcs *alloc_vmcs(void)
 {
 	return alloc_vmcs_cpu(raw_smp_processor_id());
 }
+EXPORT_SYMBOL_GPL(alloc_vmcs);
 
-static void free_vmcs(struct vmcs *vmcs)
+void free_vmcs(struct vmcs *vmcs)
 {
 	free_pages((unsigned long)vmcs, vmcs_config.order);
 }
+EXPORT_SYMBOL_GPL(free_vmcs);
 
 /*
  * Free a VMCS, but before that VMCLEAR it on the CPU where it was last loaded
@@ -5314,20 +5273,6 @@ static int handle_vmresume(struct kvm_vcpu *vcpu)
 	return nested_vmx_run(vcpu, false);
 }
 
-enum vmcs_field_type {
-	VMCS_FIELD_TYPE_U16 = 0,
-	VMCS_FIELD_TYPE_U64 = 1,
-	VMCS_FIELD_TYPE_U32 = 2,
-	VMCS_FIELD_TYPE_NATURAL_WIDTH = 3
-};
-
-static inline int vmcs_field_type(unsigned long field)
-{
-	if (0x1 & field)	/* the *_HIGH fields are all 32 bit */
-		return VMCS_FIELD_TYPE_U32;
-	return (field >> 13) & 0x3 ;
-}
-
 static inline int vmcs_field_readonly(unsigned long field)
 {
 	return (((field >> 10) & 0x3) == 1);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 96c158a..3038cd9 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -95,6 +95,9 @@ enum kvm_bus {
 	KVM_NR_BUSES
 };
 
+int hardware_enable_all(void);
+void hardware_disable_all(void);
+
 int kvm_io_bus_write(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
 		     int len, const void *val);
 int kvm_io_bus_read(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr, int len,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 44ee712..d92f15e 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -90,8 +90,6 @@ static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
 static long kvm_vcpu_compat_ioctl(struct file *file, unsigned int ioctl,
 				  unsigned long arg);
 #endif
-static int hardware_enable_all(void);
-static void hardware_disable_all(void);
 
 static void kvm_io_bus_destroy(struct kvm_io_bus *bus);
 
@@ -2330,14 +2328,15 @@ static void hardware_disable_all_nolock(void)
 		on_each_cpu(hardware_disable_nolock, NULL, 1);
 }
 
-static void hardware_disable_all(void)
+void hardware_disable_all(void)
 {
 	raw_spin_lock(&kvm_lock);
 	hardware_disable_all_nolock();
 	raw_spin_unlock(&kvm_lock);
 }
+EXPORT_SYMBOL_GPL(hardware_disable_all);
 
-static int hardware_enable_all(void)
+int hardware_enable_all(void)
 {
 	int r = 0;
 
@@ -2358,6 +2357,7 @@ static int hardware_enable_all(void)
 
 	return r;
 }
+EXPORT_SYMBOL_GPL(hardware_enable_all);
 
 static int kvm_cpu_hotplug(struct notifier_block *notifier, unsigned long val,
 			   void *v)
-- 
1.7.1
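
For readers of the vmcs_field_type() and vmcs_field_readonly() helpers
touched by this patch, the same bit tests can be tried in userspace. The
sketch below mirrors the logic shown in the diff (bit 0 marks a *_HIGH
access, bits 14:13 give the width, bits 11:10 equal to 1 mark a
read-only field); the function names and the natural_bytes parameter are
introduced here for illustration and are not part of the patch.

```c
#include <stdint.h>

/* Same numbering as enum vmcs_field_type in the patch. */
enum vmcs_width {
	VMCS_WIDTH_U16 = 0,
	VMCS_WIDTH_U64 = 1,
	VMCS_WIDTH_U32 = 2,
	VMCS_WIDTH_NATURAL = 3
};

/* Width class of a field encoding; *_HIGH accesses are always 32 bit. */
static enum vmcs_width vmcs_width(uint32_t field)
{
	if (field & 0x1)
		return VMCS_WIDTH_U32;
	return (enum vmcs_width)((field >> 13) & 0x3);
}

/* Non-zero if bits 11:10 of the encoding equal 1 (read-only field). */
static int vmcs_is_readonly(uint32_t field)
{
	return ((field >> 10) & 0x3) == 1;
}

/*
 * Hypothetical convenience: width in bytes. natural_bytes is the
 * natural width of the CPU that produced the data (4 or 8).
 */
static unsigned int vmcs_width_bytes(uint32_t field,
				     unsigned int natural_bytes)
{
	switch (vmcs_width(field)) {
	case VMCS_WIDTH_U16:
		return 2;
	case VMCS_WIDTH_U64:
		return 8;
	case VMCS_WIDTH_U32:
		return 4;
	default:
		return natural_bytes;
	}
}
```

With these helpers, encoding 0x0800 (GUEST_DS_SELECTOR) is classified as
16 bit and 0x4000 (PIN_BASED_VM_EXEC_CONTROL) as 32 bit.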

+++ b/virt/kvm/kvm_main.c
@@ -90,8 +90,6 @@ static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
 static long kvm_vcpu_compat_ioctl(struct file *file, unsigned int ioctl,
 				  unsigned long arg);
 #endif
-static int hardware_enable_all(void);
-static void hardware_disable_all(void);
 
 static void kvm_io_bus_destroy(struct kvm_io_bus *bus);
 
@@ -2330,14 +2328,15 @@ static void hardware_disable_all_nolock(void)
 		on_each_cpu(hardware_disable_nolock, NULL, 1);
 }
 
-static void hardware_disable_all(void)
+void hardware_disable_all(void)
 {
 	raw_spin_lock(&kvm_lock);
 	hardware_disable_all_nolock();
 	raw_spin_unlock(&kvm_lock);
 }
+EXPORT_SYMBOL_GPL(hardware_disable_all);
 
-static int hardware_enable_all(void)
+int hardware_enable_all(void)
 {
 	int r = 0;
 
@@ -2358,6 +2357,7 @@ static int hardware_enable_all(void)
 
 	return r;
 }
+EXPORT_SYMBOL_GPL(hardware_enable_all);
 
 static int kvm_cpu_hotplug(struct notifier_block *notifier, unsigned long val,
 			   void *v)
-- 
1.7.1

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v5 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
  2012-07-12  9:54 ` Zhang Yanfei
@ 2012-07-12  9:57   ` Zhang Yanfei
  -1 siblings, 0 replies; 10+ messages in thread
From: Zhang Yanfei @ 2012-07-12  9:57 UTC (permalink / raw)
  To: Avi Kivity, mtosatti
  Cc: ebiederm, luto, Joerg Roedel, dzickus, paul.gortmaker,
	ludwig.nussel, linux-kernel, kvm, kexec, Greg KH

This patch implements a new module named vmcsinfo-intel. The
module fills VMCSINFO with the VMCS revision identifier,
and offsets of VMCS fields.

Note: offsets of fields that are defined in the Intel specification
(Intel® 64 and IA-32 Architectures Software Developer’s Manual,
Volume 3C) but not defined in *vmcs_field* will not be filled in
VMCSINFO. Also, some fields may be unsupported on some machines;
on those machines, the corresponding offsets will be zero.

In addition, this patch exports the VMCS revision identifier via
/sys/devices/system/cpu/vmcs/id and the offsets of fields via
/sys/devices/system/cpu/vmcs/.
Individual offsets are contained in subfiles named by the field's
encoding, e.g.: /sys/devices/system/cpu/vmcs/0800
Signed-off-by: zhangyanfei <zhangyanfei@cn.fujitsu.com>
---
 arch/x86/kvm/Kconfig    |   11 +
 arch/x86/kvm/Makefile   |    3 +
 arch/x86/kvm/vmcsinfo.c |  714 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 728 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/kvm/vmcsinfo.c

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index a28f338..1dd64b1 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -63,6 +63,17 @@ config KVM_INTEL
 	  To compile this as a module, choose M here: the module
 	  will be called kvm-intel.
 
+config VMCSINFO_INTEL
+	tristate "Export VMCSINFO for Intel processors"
+	depends on KVM_INTEL
+	---help---
+	  Provides support for exporting VMCSINFO on Intel processors equipped
+	  with the VT extensions. The VMCSINFO contains a VMCS revision
+	  identifier and offsets of VMCS fields.
+
+	  To compile this as a module, choose M here: the module
+	  will be called vmcsinfo-intel.
+
 config KVM_AMD
 	tristate "KVM for AMD processors support"
 	depends on KVM
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 4f579e8..12a1ef6 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -4,6 +4,7 @@ ccflags-y += -Ivirt/kvm -Iarch/x86/kvm
 CFLAGS_x86.o := -I.
 CFLAGS_svm.o := -I.
 CFLAGS_vmx.o := -I.
+CFLAGS_vmcsinfo.o := -I.
 
 kvm-y			+= $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o \
 				coalesced_mmio.o irq_comm.o eventfd.o \
@@ -15,7 +16,9 @@ kvm-y			+= x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
 			   i8254.o timer.o cpuid.o pmu.o
 kvm-intel-y		+= vmx.o
 kvm-amd-y		+= svm.o
+vmcsinfo-intel-y	+= vmcsinfo.o
 
 obj-$(CONFIG_KVM)	+= kvm.o
 obj-$(CONFIG_KVM_INTEL)	+= kvm-intel.o
 obj-$(CONFIG_KVM_AMD)	+= kvm-amd.o
+obj-$(CONFIG_VMCSINFO_INTEL)	+= vmcsinfo-intel.o
diff --git a/arch/x86/kvm/vmcsinfo.c b/arch/x86/kvm/vmcsinfo.c
new file mode 100644
index 0000000..0730473
--- /dev/null
+++ b/arch/x86/kvm/vmcsinfo.c
@@ -0,0 +1,714 @@
+/*
+ * Kernel-based Virtual Machine driver for Linux
+ *
+ * This module enables machines with Intel VT-x extensions to export
+ * offsets of VMCS fields for guest debugging.
+ *
+ * Copyright (C) 2012 Fujitsu, Inc.
+ *
+ * Authors:
+ *   Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/mod_devicetable.h>
+#include <linux/device.h>
+#include <linux/swab.h>
+#include <linux/cpu.h>
+
+#include <asm/vmx.h>
+
+MODULE_AUTHOR("Fujitsu");
+MODULE_LICENSE("GPL");
+
+static const struct x86_cpu_id vmcsinfo_cpu_id[] = {
+	X86_FEATURE_MATCH(X86_FEATURE_VMX),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, vmcsinfo_cpu_id);
+
+/*
+ * vmcs field offsets.
+ */
+static struct vmcsinfo {
+	u32 vmcs_revision_id;
+	u16 field_offset[][2];
+} vmcsinfo = {
+	0,
+	{
+		{VIRTUAL_PROCESSOR_ID, 0},
+		{GUEST_ES_SELECTOR, 0},
+		{GUEST_CS_SELECTOR, 0},
+		{GUEST_SS_SELECTOR, 0},
+		{GUEST_DS_SELECTOR, 0},
+		{GUEST_FS_SELECTOR, 0},
+		{GUEST_GS_SELECTOR, 0},
+		{GUEST_LDTR_SELECTOR, 0},
+		{GUEST_TR_SELECTOR, 0},
+		{HOST_ES_SELECTOR, 0},
+		{HOST_CS_SELECTOR, 0},
+		{HOST_SS_SELECTOR, 0},
+		{HOST_DS_SELECTOR, 0},
+		{HOST_FS_SELECTOR, 0},
+		{HOST_GS_SELECTOR, 0},
+		{HOST_TR_SELECTOR, 0},
+		{IO_BITMAP_A, 0},
+		{IO_BITMAP_A_HIGH, 0},
+		{IO_BITMAP_B, 0},
+		{IO_BITMAP_B_HIGH, 0},
+		{MSR_BITMAP, 0},
+		{MSR_BITMAP_HIGH, 0},
+		{VM_EXIT_MSR_STORE_ADDR, 0},
+		{VM_EXIT_MSR_STORE_ADDR_HIGH, 0},
+		{VM_EXIT_MSR_LOAD_ADDR, 0},
+		{VM_EXIT_MSR_LOAD_ADDR_HIGH, 0},
+		{VM_ENTRY_MSR_LOAD_ADDR, 0},
+		{VM_ENTRY_MSR_LOAD_ADDR_HIGH, 0},
+		{TSC_OFFSET, 0},
+		{TSC_OFFSET_HIGH, 0},
+		{VIRTUAL_APIC_PAGE_ADDR, 0},
+		{VIRTUAL_APIC_PAGE_ADDR_HIGH, 0},
+		{APIC_ACCESS_ADDR, 0},
+		{APIC_ACCESS_ADDR_HIGH, 0},
+		{EPT_POINTER, 0},
+		{EPT_POINTER_HIGH, 0},
+		{GUEST_PHYSICAL_ADDRESS, 0},
+		{GUEST_PHYSICAL_ADDRESS_HIGH, 0},
+		{VMCS_LINK_POINTER, 0},
+		{VMCS_LINK_POINTER_HIGH, 0},
+		{GUEST_IA32_DEBUGCTL, 0},
+		{GUEST_IA32_DEBUGCTL_HIGH, 0},
+		{GUEST_IA32_PAT, 0},
+		{GUEST_IA32_PAT_HIGH, 0},
+		{GUEST_IA32_EFER, 0},
+		{GUEST_IA32_EFER_HIGH, 0},
+		{GUEST_IA32_PERF_GLOBAL_CTRL, 0},
+		{GUEST_IA32_PERF_GLOBAL_CTRL_HIGH, 0},
+		{GUEST_PDPTR0, 0},
+		{GUEST_PDPTR0_HIGH, 0},
+		{GUEST_PDPTR1, 0},
+		{GUEST_PDPTR1_HIGH, 0},
+		{GUEST_PDPTR2, 0},
+		{GUEST_PDPTR2_HIGH, 0},
+		{GUEST_PDPTR3, 0},
+		{GUEST_PDPTR3_HIGH, 0},
+		{HOST_IA32_PAT, 0},
+		{HOST_IA32_PAT_HIGH, 0},
+		{HOST_IA32_EFER, 0},
+		{HOST_IA32_EFER_HIGH, 0},
+		{HOST_IA32_PERF_GLOBAL_CTRL, 0},
+		{HOST_IA32_PERF_GLOBAL_CTRL_HIGH, 0},
+		{PIN_BASED_VM_EXEC_CONTROL, 0},
+		{CPU_BASED_VM_EXEC_CONTROL, 0},
+		{EXCEPTION_BITMAP, 0},
+		{PAGE_FAULT_ERROR_CODE_MASK, 0},
+		{PAGE_FAULT_ERROR_CODE_MATCH, 0},
+		{CR3_TARGET_COUNT, 0},
+		{VM_EXIT_CONTROLS, 0},
+		{VM_EXIT_MSR_STORE_COUNT, 0},
+		{VM_EXIT_MSR_LOAD_COUNT, 0},
+		{VM_ENTRY_CONTROLS, 0},
+		{VM_ENTRY_MSR_LOAD_COUNT, 0},
+		{VM_ENTRY_INTR_INFO_FIELD, 0},
+		{VM_ENTRY_EXCEPTION_ERROR_CODE, 0},
+		{VM_ENTRY_INSTRUCTION_LEN, 0},
+		{TPR_THRESHOLD, 0},
+		{SECONDARY_VM_EXEC_CONTROL, 0},
+		{PLE_GAP, 0},
+		{PLE_WINDOW, 0},
+		{VM_INSTRUCTION_ERROR, 0},
+		{VM_EXIT_REASON, 0},
+		{VM_EXIT_INTR_INFO, 0},
+		{VM_EXIT_INTR_ERROR_CODE, 0},
+		{IDT_VECTORING_INFO_FIELD, 0},
+		{IDT_VECTORING_ERROR_CODE, 0},
+		{VM_EXIT_INSTRUCTION_LEN, 0},
+		{VMX_INSTRUCTION_INFO, 0},
+		{GUEST_ES_LIMIT, 0},
+		{GUEST_CS_LIMIT, 0},
+		{GUEST_SS_LIMIT, 0},
+		{GUEST_DS_LIMIT, 0},
+		{GUEST_FS_LIMIT, 0},
+		{GUEST_GS_LIMIT, 0},
+		{GUEST_LDTR_LIMIT, 0},
+		{GUEST_TR_LIMIT, 0},
+		{GUEST_GDTR_LIMIT, 0},
+		{GUEST_IDTR_LIMIT, 0},
+		{GUEST_ES_AR_BYTES, 0},
+		{GUEST_CS_AR_BYTES, 0},
+		{GUEST_SS_AR_BYTES, 0},
+		{GUEST_DS_AR_BYTES, 0},
+		{GUEST_FS_AR_BYTES, 0},
+		{GUEST_GS_AR_BYTES, 0},
+		{GUEST_LDTR_AR_BYTES, 0},
+		{GUEST_TR_AR_BYTES, 0},
+		{GUEST_INTERRUPTIBILITY_INFO, 0},
+		{GUEST_ACTIVITY_STATE, 0},
+		{GUEST_SYSENTER_CS, 0},
+		{HOST_IA32_SYSENTER_CS, 0},
+		{CR0_GUEST_HOST_MASK, 0},
+		{CR4_GUEST_HOST_MASK, 0},
+		{CR0_READ_SHADOW, 0},
+		{CR4_READ_SHADOW, 0},
+		{CR3_TARGET_VALUE0, 0},
+		{CR3_TARGET_VALUE1, 0},
+		{CR3_TARGET_VALUE2, 0},
+		{CR3_TARGET_VALUE3, 0},
+		{EXIT_QUALIFICATION, 0},
+		{GUEST_LINEAR_ADDRESS, 0},
+		{GUEST_CR0, 0},
+		{GUEST_CR3, 0},
+		{GUEST_CR4, 0},
+		{GUEST_ES_BASE, 0},
+		{GUEST_CS_BASE, 0},
+		{GUEST_SS_BASE, 0},
+		{GUEST_DS_BASE, 0},
+		{GUEST_FS_BASE, 0},
+		{GUEST_GS_BASE, 0},
+		{GUEST_LDTR_BASE, 0},
+		{GUEST_TR_BASE, 0},
+		{GUEST_GDTR_BASE, 0},
+		{GUEST_IDTR_BASE, 0},
+		{GUEST_DR7, 0},
+		{GUEST_RSP, 0},
+		{GUEST_RIP, 0},
+		{GUEST_RFLAGS, 0},
+		{GUEST_PENDING_DBG_EXCEPTIONS, 0},
+		{GUEST_SYSENTER_ESP, 0},
+		{GUEST_SYSENTER_EIP, 0},
+		{HOST_CR0, 0},
+		{HOST_CR3, 0},
+		{HOST_CR4, 0},
+		{HOST_FS_BASE, 0},
+		{HOST_GS_BASE, 0},
+		{HOST_TR_BASE, 0},
+		{HOST_GDTR_BASE, 0},
+		{HOST_IDTR_BASE, 0},
+		{HOST_IA32_SYSENTER_ESP, 0},
+		{HOST_IA32_SYSENTER_EIP, 0},
+		{HOST_RSP, 0},
+		{HOST_RIP, 0}
+	}
+};
+
+const char vmcs_group_name[] = "vmcs";
+
+static ssize_t vmcs_id_show(struct device *dev,
+			    struct device_attribute *attr,
+			    char *buf)
+{
+	return sprintf(buf, "%d\n", vmcsinfo.vmcs_revision_id);
+}
+static DEVICE_ATTR(id, 0444, vmcs_id_show, NULL);
+
+#define BUILD_OFFSET_SHOW(i, field_code)                                      \
+static ssize_t _##field_code##_show(struct device *dev,                       \
+				    struct device_attribute *attr,            \
+				    char *buf)                                \
+{                                                                             \
+	return sprintf(buf, "%d\n", vmcsinfo.field_offset[i][1]);             \
+}                                                                             \
+static DEVICE_ATTR(field_code, 0444, _##field_code##_show, NULL);
+
+BUILD_OFFSET_SHOW(  0, 0000); /* VIRTUAL_PROCESSOR_ID             */
+BUILD_OFFSET_SHOW(  1, 0800); /* GUEST_ES_SELECTOR                */
+BUILD_OFFSET_SHOW(  2, 0802); /* GUEST_CS_SELECTOR                */
+BUILD_OFFSET_SHOW(  3, 0804); /* GUEST_SS_SELECTOR                */
+BUILD_OFFSET_SHOW(  4, 0806); /* GUEST_DS_SELECTOR                */
+BUILD_OFFSET_SHOW(  5, 0808); /* GUEST_FS_SELECTOR                */
+BUILD_OFFSET_SHOW(  6, 080a); /* GUEST_GS_SELECTOR                */
+BUILD_OFFSET_SHOW(  7, 080c); /* GUEST_LDTR_SELECTOR              */
+BUILD_OFFSET_SHOW(  8, 080e); /* GUEST_TR_SELECTOR                */
+BUILD_OFFSET_SHOW(  9, 0c00); /* HOST_ES_SELECTOR                 */
+BUILD_OFFSET_SHOW( 10, 0c02); /* HOST_CS_SELECTOR                 */
+BUILD_OFFSET_SHOW( 11, 0c04); /* HOST_SS_SELECTOR                 */
+BUILD_OFFSET_SHOW( 12, 0c06); /* HOST_DS_SELECTOR                 */
+BUILD_OFFSET_SHOW( 13, 0c08); /* HOST_FS_SELECTOR                 */
+BUILD_OFFSET_SHOW( 14, 0c0a); /* HOST_GS_SELECTOR                 */
+BUILD_OFFSET_SHOW( 15, 0c0c); /* HOST_TR_SELECTOR                 */
+BUILD_OFFSET_SHOW( 16, 2000); /* IO_BITMAP_A                      */
+BUILD_OFFSET_SHOW( 17, 2001); /* IO_BITMAP_A_HIGH                 */
+BUILD_OFFSET_SHOW( 18, 2002); /* IO_BITMAP_B                      */
+BUILD_OFFSET_SHOW( 19, 2003); /* IO_BITMAP_B_HIGH                 */
+BUILD_OFFSET_SHOW( 20, 2004); /* MSR_BITMAP                       */
+BUILD_OFFSET_SHOW( 21, 2005); /* MSR_BITMAP_HIGH                  */
+BUILD_OFFSET_SHOW( 22, 2006); /* VM_EXIT_MSR_STORE_ADDR           */
+BUILD_OFFSET_SHOW( 23, 2007); /* VM_EXIT_MSR_STORE_ADDR_HIGH      */
+BUILD_OFFSET_SHOW( 24, 2008); /* VM_EXIT_MSR_LOAD_ADDR            */
+BUILD_OFFSET_SHOW( 25, 2009); /* VM_EXIT_MSR_LOAD_ADDR_HIGH       */
+BUILD_OFFSET_SHOW( 26, 200a); /* VM_ENTRY_MSR_LOAD_ADDR           */
+BUILD_OFFSET_SHOW( 27, 200b); /* VM_ENTRY_MSR_LOAD_ADDR_HIGH      */
+BUILD_OFFSET_SHOW( 28, 2010); /* TSC_OFFSET                       */
+BUILD_OFFSET_SHOW( 29, 2011); /* TSC_OFFSET_HIGH                  */
+BUILD_OFFSET_SHOW( 30, 2012); /* VIRTUAL_APIC_PAGE_ADDR           */
+BUILD_OFFSET_SHOW( 31, 2013); /* VIRTUAL_APIC_PAGE_ADDR_HIGH      */
+BUILD_OFFSET_SHOW( 32, 2014); /* APIC_ACCESS_ADDR                 */
+BUILD_OFFSET_SHOW( 33, 2015); /* APIC_ACCESS_ADDR_HIGH            */
+BUILD_OFFSET_SHOW( 34, 201a); /* EPT_POINTER                      */
+BUILD_OFFSET_SHOW( 35, 201b); /* EPT_POINTER_HIGH                 */
+BUILD_OFFSET_SHOW( 36, 2400); /* GUEST_PHYSICAL_ADDRESS           */
+BUILD_OFFSET_SHOW( 37, 2401); /* GUEST_PHYSICAL_ADDRESS_HIGH      */
+BUILD_OFFSET_SHOW( 38, 2800); /* VMCS_LINK_POINTER                */
+BUILD_OFFSET_SHOW( 39, 2801); /* VMCS_LINK_POINTER_HIGH           */
+BUILD_OFFSET_SHOW( 40, 2802); /* GUEST_IA32_DEBUGCTL              */
+BUILD_OFFSET_SHOW( 41, 2803); /* GUEST_IA32_DEBUGCTL_HIGH         */
+BUILD_OFFSET_SHOW( 42, 2804); /* GUEST_IA32_PAT                   */
+BUILD_OFFSET_SHOW( 43, 2805); /* GUEST_IA32_PAT_HIGH              */
+BUILD_OFFSET_SHOW( 44, 2806); /* GUEST_IA32_EFER                  */
+BUILD_OFFSET_SHOW( 45, 2807); /* GUEST_IA32_EFER_HIGH             */
+BUILD_OFFSET_SHOW( 46, 2808); /* GUEST_IA32_PERF_GLOBAL_CTRL      */
+BUILD_OFFSET_SHOW( 47, 2809); /* GUEST_IA32_PERF_GLOBAL_CTRL_HIGH */
+BUILD_OFFSET_SHOW( 48, 280a); /* GUEST_PDPTR0                     */
+BUILD_OFFSET_SHOW( 49, 280b); /* GUEST_PDPTR0_HIGH                */
+BUILD_OFFSET_SHOW( 50, 280c); /* GUEST_PDPTR1                     */
+BUILD_OFFSET_SHOW( 51, 280d); /* GUEST_PDPTR1_HIGH                */
+BUILD_OFFSET_SHOW( 52, 280e); /* GUEST_PDPTR2                     */
+BUILD_OFFSET_SHOW( 53, 280f); /* GUEST_PDPTR2_HIGH                */
+BUILD_OFFSET_SHOW( 54, 2810); /* GUEST_PDPTR3                     */
+BUILD_OFFSET_SHOW( 55, 2811); /* GUEST_PDPTR3_HIGH                */
+BUILD_OFFSET_SHOW( 56, 2c00); /* HOST_IA32_PAT                    */
+BUILD_OFFSET_SHOW( 57, 2c01); /* HOST_IA32_PAT_HIGH               */
+BUILD_OFFSET_SHOW( 58, 2c02); /* HOST_IA32_EFER                   */
+BUILD_OFFSET_SHOW( 59, 2c03); /* HOST_IA32_EFER_HIGH              */
+BUILD_OFFSET_SHOW( 60, 2c04); /* HOST_IA32_PERF_GLOBAL_CTRL       */
+BUILD_OFFSET_SHOW( 61, 2c05); /* HOST_IA32_PERF_GLOBAL_CTRL_HIGH  */
+BUILD_OFFSET_SHOW( 62, 4000); /* PIN_BASED_VM_EXEC_CONTROL        */
+BUILD_OFFSET_SHOW( 63, 4002); /* CPU_BASED_VM_EXEC_CONTROL        */
+BUILD_OFFSET_SHOW( 64, 4004); /* EXCEPTION_BITMAP                 */
+BUILD_OFFSET_SHOW( 65, 4006); /* PAGE_FAULT_ERROR_CODE_MASK       */
+BUILD_OFFSET_SHOW( 66, 4008); /* PAGE_FAULT_ERROR_CODE_MATCH      */
+BUILD_OFFSET_SHOW( 67, 400a); /* CR3_TARGET_COUNT                 */
+BUILD_OFFSET_SHOW( 68, 400c); /* VM_EXIT_CONTROLS                 */
+BUILD_OFFSET_SHOW( 69, 400e); /* VM_EXIT_MSR_STORE_COUNT          */
+BUILD_OFFSET_SHOW( 70, 4010); /* VM_EXIT_MSR_LOAD_COUNT           */
+BUILD_OFFSET_SHOW( 71, 4012); /* VM_ENTRY_CONTROLS                */
+BUILD_OFFSET_SHOW( 72, 4014); /* VM_ENTRY_MSR_LOAD_COUNT          */
+BUILD_OFFSET_SHOW( 73, 4016); /* VM_ENTRY_INTR_INFO_FIELD         */
+BUILD_OFFSET_SHOW( 74, 4018); /* VM_ENTRY_EXCEPTION_ERROR_CODE    */
+BUILD_OFFSET_SHOW( 75, 401a); /* VM_ENTRY_INSTRUCTION_LEN         */
+BUILD_OFFSET_SHOW( 76, 401c); /* TPR_THRESHOLD                    */
+BUILD_OFFSET_SHOW( 77, 401e); /* SECONDARY_VM_EXEC_CONTROL        */
+BUILD_OFFSET_SHOW( 78, 4020); /* PLE_GAP                          */
+BUILD_OFFSET_SHOW( 79, 4022); /* PLE_WINDOW                       */
+BUILD_OFFSET_SHOW( 80, 4400); /* VM_INSTRUCTION_ERROR             */
+BUILD_OFFSET_SHOW( 81, 4402); /* VM_EXIT_REASON                   */
+BUILD_OFFSET_SHOW( 82, 4404); /* VM_EXIT_INTR_INFO                */
+BUILD_OFFSET_SHOW( 83, 4406); /* VM_EXIT_INTR_ERROR_CODE          */
+BUILD_OFFSET_SHOW( 84, 4408); /* IDT_VECTORING_INFO_FIELD         */
+BUILD_OFFSET_SHOW( 85, 440a); /* IDT_VECTORING_ERROR_CODE         */
+BUILD_OFFSET_SHOW( 86, 440c); /* VM_EXIT_INSTRUCTION_LEN          */
+BUILD_OFFSET_SHOW( 87, 440e); /* VMX_INSTRUCTION_INFO             */
+BUILD_OFFSET_SHOW( 88, 4800); /* GUEST_ES_LIMIT                   */
+BUILD_OFFSET_SHOW( 89, 4802); /* GUEST_CS_LIMIT                   */
+BUILD_OFFSET_SHOW( 90, 4804); /* GUEST_SS_LIMIT                   */
+BUILD_OFFSET_SHOW( 91, 4806); /* GUEST_DS_LIMIT                   */
+BUILD_OFFSET_SHOW( 92, 4808); /* GUEST_FS_LIMIT                   */
+BUILD_OFFSET_SHOW( 93, 480a); /* GUEST_GS_LIMIT                   */
+BUILD_OFFSET_SHOW( 94, 480c); /* GUEST_LDTR_LIMIT                 */
+BUILD_OFFSET_SHOW( 95, 480e); /* GUEST_TR_LIMIT                   */
+BUILD_OFFSET_SHOW( 96, 4810); /* GUEST_GDTR_LIMIT                 */
+BUILD_OFFSET_SHOW( 97, 4812); /* GUEST_IDTR_LIMIT                 */
+BUILD_OFFSET_SHOW( 98, 4814); /* GUEST_ES_AR_BYTES                */
+BUILD_OFFSET_SHOW( 99, 4816); /* GUEST_CS_AR_BYTES                */
+BUILD_OFFSET_SHOW(100, 4818); /* GUEST_SS_AR_BYTES                */
+BUILD_OFFSET_SHOW(101, 481a); /* GUEST_DS_AR_BYTES                */
+BUILD_OFFSET_SHOW(102, 481c); /* GUEST_FS_AR_BYTES                */
+BUILD_OFFSET_SHOW(103, 481e); /* GUEST_GS_AR_BYTES                */
+BUILD_OFFSET_SHOW(104, 4820); /* GUEST_LDTR_AR_BYTES              */
+BUILD_OFFSET_SHOW(105, 4822); /* GUEST_TR_AR_BYTES                */
+BUILD_OFFSET_SHOW(106, 4824); /* GUEST_INTERRUPTIBILITY_INFO      */
+BUILD_OFFSET_SHOW(107, 4826); /* GUEST_ACTIVITY_STATE             */
+BUILD_OFFSET_SHOW(108, 482A); /* GUEST_SYSENTER_CS                */
+BUILD_OFFSET_SHOW(109, 4c00); /* HOST_IA32_SYSENTER_CS            */
+BUILD_OFFSET_SHOW(110, 6000); /* CR0_GUEST_HOST_MASK              */
+BUILD_OFFSET_SHOW(111, 6002); /* CR4_GUEST_HOST_MASK              */
+BUILD_OFFSET_SHOW(112, 6004); /* CR0_READ_SHADOW                  */
+BUILD_OFFSET_SHOW(113, 6006); /* CR4_READ_SHADOW                  */
+BUILD_OFFSET_SHOW(114, 6008); /* CR3_TARGET_VALUE0                */
+BUILD_OFFSET_SHOW(115, 600a); /* CR3_TARGET_VALUE1                */
+BUILD_OFFSET_SHOW(116, 600c); /* CR3_TARGET_VALUE2                */
+BUILD_OFFSET_SHOW(117, 600e); /* CR3_TARGET_VALUE3                */
+BUILD_OFFSET_SHOW(118, 6400); /* EXIT_QUALIFICATION               */
+BUILD_OFFSET_SHOW(119, 640a); /* GUEST_LINEAR_ADDRESS             */
+BUILD_OFFSET_SHOW(120, 6800); /* GUEST_CR0                        */
+BUILD_OFFSET_SHOW(121, 6802); /* GUEST_CR3                        */
+BUILD_OFFSET_SHOW(122, 6804); /* GUEST_CR4                        */
+BUILD_OFFSET_SHOW(123, 6806); /* GUEST_ES_BASE                    */
+BUILD_OFFSET_SHOW(124, 6808); /* GUEST_CS_BASE                    */
+BUILD_OFFSET_SHOW(125, 680a); /* GUEST_SS_BASE                    */
+BUILD_OFFSET_SHOW(126, 680c); /* GUEST_DS_BASE                    */
+BUILD_OFFSET_SHOW(127, 680e); /* GUEST_FS_BASE                    */
+BUILD_OFFSET_SHOW(128, 6810); /* GUEST_GS_BASE                    */
+BUILD_OFFSET_SHOW(129, 6812); /* GUEST_LDTR_BASE                  */
+BUILD_OFFSET_SHOW(130, 6814); /* GUEST_TR_BASE                    */
+BUILD_OFFSET_SHOW(131, 6816); /* GUEST_GDTR_BASE                  */
+BUILD_OFFSET_SHOW(132, 6818); /* GUEST_IDTR_BASE                  */
+BUILD_OFFSET_SHOW(133, 681a); /* GUEST_DR7                        */
+BUILD_OFFSET_SHOW(134, 681c); /* GUEST_RSP                        */
+BUILD_OFFSET_SHOW(135, 681e); /* GUEST_RIP                        */
+BUILD_OFFSET_SHOW(136, 6820); /* GUEST_RFLAGS                     */
+BUILD_OFFSET_SHOW(137, 6822); /* GUEST_PENDING_DBG_EXCEPTIONS     */
+BUILD_OFFSET_SHOW(138, 6824); /* GUEST_SYSENTER_ESP               */
+BUILD_OFFSET_SHOW(139, 6826); /* GUEST_SYSENTER_EIP               */
+BUILD_OFFSET_SHOW(140, 6c00); /* HOST_CR0                         */
+BUILD_OFFSET_SHOW(141, 6c02); /* HOST_CR3                         */
+BUILD_OFFSET_SHOW(142, 6c04); /* HOST_CR4                         */
+BUILD_OFFSET_SHOW(143, 6c06); /* HOST_FS_BASE                     */
+BUILD_OFFSET_SHOW(144, 6c08); /* HOST_GS_BASE                     */
+BUILD_OFFSET_SHOW(145, 6c0a); /* HOST_TR_BASE                     */
+BUILD_OFFSET_SHOW(146, 6c0c); /* HOST_GDTR_BASE                   */
+BUILD_OFFSET_SHOW(147, 6c0e); /* HOST_IDTR_BASE                   */
+BUILD_OFFSET_SHOW(148, 6c10); /* HOST_IA32_SYSENTER_ESP           */
+BUILD_OFFSET_SHOW(149, 6c12); /* HOST_IA32_SYSENTER_EIP           */
+BUILD_OFFSET_SHOW(150, 6c14); /* HOST_RSP                         */
+BUILD_OFFSET_SHOW(151, 6c16); /* HOST_RIP                         */
+
+static struct attribute *vmcs_attrs[] = {
+	&dev_attr_id.attr,
+	&dev_attr_0000.attr,
+	&dev_attr_0800.attr,
+	&dev_attr_0802.attr,
+	&dev_attr_0804.attr,
+	&dev_attr_0806.attr,
+	&dev_attr_0808.attr,
+	&dev_attr_080a.attr,
+	&dev_attr_080c.attr,
+	&dev_attr_080e.attr,
+	&dev_attr_0c00.attr,
+	&dev_attr_0c02.attr,
+	&dev_attr_0c04.attr,
+	&dev_attr_0c06.attr,
+	&dev_attr_0c08.attr,
+	&dev_attr_0c0a.attr,
+	&dev_attr_0c0c.attr,
+	&dev_attr_2000.attr,
+	&dev_attr_2001.attr,
+	&dev_attr_2002.attr,
+	&dev_attr_2003.attr,
+	&dev_attr_2004.attr,
+	&dev_attr_2005.attr,
+	&dev_attr_2006.attr,
+	&dev_attr_2007.attr,
+	&dev_attr_2008.attr,
+	&dev_attr_2009.attr,
+	&dev_attr_200a.attr,
+	&dev_attr_200b.attr,
+	&dev_attr_2010.attr,
+	&dev_attr_2011.attr,
+	&dev_attr_2012.attr,
+	&dev_attr_2013.attr,
+	&dev_attr_2014.attr,
+	&dev_attr_2015.attr,
+	&dev_attr_201a.attr,
+	&dev_attr_201b.attr,
+	&dev_attr_2400.attr,
+	&dev_attr_2401.attr,
+	&dev_attr_2800.attr,
+	&dev_attr_2801.attr,
+	&dev_attr_2802.attr,
+	&dev_attr_2803.attr,
+	&dev_attr_2804.attr,
+	&dev_attr_2805.attr,
+	&dev_attr_2806.attr,
+	&dev_attr_2807.attr,
+	&dev_attr_2808.attr,
+	&dev_attr_2809.attr,
+	&dev_attr_280a.attr,
+	&dev_attr_280b.attr,
+	&dev_attr_280c.attr,
+	&dev_attr_280d.attr,
+	&dev_attr_280e.attr,
+	&dev_attr_280f.attr,
+	&dev_attr_2810.attr,
+	&dev_attr_2811.attr,
+	&dev_attr_2c00.attr,
+	&dev_attr_2c01.attr,
+	&dev_attr_2c02.attr,
+	&dev_attr_2c03.attr,
+	&dev_attr_2c04.attr,
+	&dev_attr_2c05.attr,
+	&dev_attr_4000.attr,
+	&dev_attr_4002.attr,
+	&dev_attr_4004.attr,
+	&dev_attr_4006.attr,
+	&dev_attr_4008.attr,
+	&dev_attr_400a.attr,
+	&dev_attr_400c.attr,
+	&dev_attr_400e.attr,
+	&dev_attr_4010.attr,
+	&dev_attr_4012.attr,
+	&dev_attr_4014.attr,
+	&dev_attr_4016.attr,
+	&dev_attr_4018.attr,
+	&dev_attr_401a.attr,
+	&dev_attr_401c.attr,
+	&dev_attr_401e.attr,
+	&dev_attr_4020.attr,
+	&dev_attr_4022.attr,
+	&dev_attr_4400.attr,
+	&dev_attr_4402.attr,
+	&dev_attr_4404.attr,
+	&dev_attr_4406.attr,
+	&dev_attr_4408.attr,
+	&dev_attr_440a.attr,
+	&dev_attr_440c.attr,
+	&dev_attr_440e.attr,
+	&dev_attr_4800.attr,
+	&dev_attr_4802.attr,
+	&dev_attr_4804.attr,
+	&dev_attr_4806.attr,
+	&dev_attr_4808.attr,
+	&dev_attr_480a.attr,
+	&dev_attr_480c.attr,
+	&dev_attr_480e.attr,
+	&dev_attr_4810.attr,
+	&dev_attr_4812.attr,
+	&dev_attr_4814.attr,
+	&dev_attr_4816.attr,
+	&dev_attr_4818.attr,
+	&dev_attr_481a.attr,
+	&dev_attr_481c.attr,
+	&dev_attr_481e.attr,
+	&dev_attr_4820.attr,
+	&dev_attr_4822.attr,
+	&dev_attr_4824.attr,
+	&dev_attr_4826.attr,
+	&dev_attr_482A.attr,
+	&dev_attr_4c00.attr,
+	&dev_attr_6000.attr,
+	&dev_attr_6002.attr,
+	&dev_attr_6004.attr,
+	&dev_attr_6006.attr,
+	&dev_attr_6008.attr,
+	&dev_attr_600a.attr,
+	&dev_attr_600c.attr,
+	&dev_attr_600e.attr,
+	&dev_attr_6400.attr,
+	&dev_attr_640a.attr,
+	&dev_attr_6800.attr,
+	&dev_attr_6802.attr,
+	&dev_attr_6804.attr,
+	&dev_attr_6806.attr,
+	&dev_attr_6808.attr,
+	&dev_attr_680a.attr,
+	&dev_attr_680c.attr,
+	&dev_attr_680e.attr,
+	&dev_attr_6810.attr,
+	&dev_attr_6812.attr,
+	&dev_attr_6814.attr,
+	&dev_attr_6816.attr,
+	&dev_attr_6818.attr,
+	&dev_attr_681a.attr,
+	&dev_attr_681c.attr,
+	&dev_attr_681e.attr,
+	&dev_attr_6820.attr,
+	&dev_attr_6822.attr,
+	&dev_attr_6824.attr,
+	&dev_attr_6826.attr,
+	&dev_attr_6c00.attr,
+	&dev_attr_6c02.attr,
+	&dev_attr_6c04.attr,
+	&dev_attr_6c06.attr,
+	&dev_attr_6c08.attr,
+	&dev_attr_6c0a.attr,
+	&dev_attr_6c0c.attr,
+	&dev_attr_6c0e.attr,
+	&dev_attr_6c10.attr,
+	&dev_attr_6c12.attr,
+	&dev_attr_6c14.attr,
+	&dev_attr_6c16.attr,
+	NULL,
+};
+
+static struct attribute_group vmcs_attr_group = {
+	.name = vmcs_group_name,
+	.attrs = vmcs_attrs,
+};
+
+#define VMCSINFO_MAX_FIELD (ARRAY_SIZE(vmcs_attrs) - 1)
+
+static inline void vmcsinfo_revision_id(u32 id)
+{
+	vmcsinfo.vmcs_revision_id = id;
+}
+
+static inline void vmcsinfo_field(int i, u16 offset)
+{
+	if (i < VMCSINFO_MAX_FIELD)
+		vmcsinfo.field_offset[i][1] = offset;
+}
+
+/*
+ * To calculate the offsets of fields in VMCS data, we index every 16-bit
+ * field by this kind of format:
+ *         | --------- 16 bits ---------- |
+ *         +-------------+-+------------+-+
+ *         | high 7 bits |1| low 7 bits |0|
+ *         +-------------+-+------------+-+
+ * In the high byte, the lowest bit must be 1; in the low byte, the lowest
+ * bit must be 0. The two marker bits let us detect an index that was
+ * read back from the VMCS data in big-endian byte order.
+ * The remaining 14 bits of the index hold the real offset of the
+ * field. Because a VMCS region is at most 4 KBytes in size,
+ * 14 bits are enough to index the whole VMCS region.
+ *
+ * ENCODING_OFFSET: encode the offset into the index of this kind.
+ * DECODING_OFFSET: decode the index of this kind into real offset.
+ */
+#define OFFSET_HIGH_SHIFT (7)
+#define OFFSET_LOW_MASK   ((1 << OFFSET_HIGH_SHIFT) - 1) /* 0x7f */
+#define OFFSET_HIGH_MASK  (OFFSET_LOW_MASK << OFFSET_HIGH_SHIFT) /* 0x3f80 */
+#define ENCODING_OFFSET(offset)                                     \
+	((((offset) & OFFSET_LOW_MASK) << 1) +                      \
+	((((offset) & OFFSET_HIGH_MASK) << 2) | 0x100))
+/*
+ * index here should be always read in little endian mode.
+ */
+#define DECODING_OFFSET_LE(index)                                   \
+	((((index) >> 1) & OFFSET_LOW_MASK) +                       \
+	(((index) >> 2) & OFFSET_HIGH_MASK))
+/*
+ * n indicates the bits of index. We first check if index
+ * is read in big endian mode.
+ */
+#define DECODING_OFFSET(index, n)                                   \
+	((index & 1) ? (DECODING_OFFSET_LE(__swab##n(index))) :     \
+	(DECODING_OFFSET_LE(index)))
+
+#define FIELD_OFFSET16(i, offset)                                   \
+	vmcsinfo_field(i, DECODING_OFFSET(offset, 16))
+#define FIELD_OFFSET64(i, offset)                                   \
+	vmcsinfo_field(i, DECODING_OFFSET(offset, 64))
+#define FIELD_OFFSET32(i, offset)                                   \
+	vmcsinfo_field(i, DECODING_OFFSET(offset, 32))
+#define FIELD_OFFSETNW(i, offset)                                   \
+do {                                                                \
+	if (sizeof(offset) == 8)                                    \
+		vmcsinfo_field(i, DECODING_OFFSET(offset, 64));     \
+	else                                                        \
+		vmcsinfo_field(i, DECODING_OFFSET(offset, 32));     \
+} while (0)
+
+#define VMCS_FIELD_CHECK(i, offset, type)                           \
+do {                                                                \
+	if (vmcs_read32(VM_INSTRUCTION_ERROR) !=                    \
+		VMXERR_UNSUPPORTED_VMCS_COMPONENT)                  \
+		FIELD_OFFSET##type(i, offset);                      \
+} while (0)
+
+static inline void vmcs_read_checking(int i)
+{
+	unsigned long field;
+	u16 offset16;
+	u64 offset64;
+	u32 offset32;
+	unsigned long offsetnw;
+
+	field = vmcsinfo.field_offset[i][0];
+	switch (vmcs_field_type(field)) {
+	case VMCS_FIELD_TYPE_U16:
+		offset16 = vmcs_read16(field);
+		VMCS_FIELD_CHECK(i, offset16, 16);
+		break;
+	case VMCS_FIELD_TYPE_U64:
+		offset64 = vmcs_read64(field);
+		VMCS_FIELD_CHECK(i, offset64, 64);
+		break;
+	case VMCS_FIELD_TYPE_U32:
+		offset32 = vmcs_read32(field);
+		VMCS_FIELD_CHECK(i, offset32, 32);
+		break;
+	case VMCS_FIELD_TYPE_NATURAL_WIDTH:
+		offsetnw = vmcs_readl(field);
+		VMCS_FIELD_CHECK(i, offsetnw, NW);
+		break;
+	}
+}
+
+/*
+ * Note: offsets of fields that are defined in the Intel specification
+ * (Intel® 64 and IA-32 Architectures Software Developer’s Manual,
+ * Volume 3C) but not defined in *vmcs_field* will not be filled in
+ * VMCSINFO. Also, some fields may be unsupported on some machines;
+ * on those machines, the corresponding offsets will be zero.
+ */
+static int __init alloc_vmcsinfo_init(void)
+{
+/*
+ * The first 8 bytes in vmcs region are for
+ *   VMCS revision identifier
+ *   VMX-abort indicator
+ */
+#define FIELD_START (8)
+
+	int r, offset, i;
+	struct vmcs *vmcs;
+	int cpu;
+
+	vmcs = alloc_vmcs();
+	if (!vmcs) {
+		return -ENOMEM;
+	}
+
+	r = hardware_enable_all();
+	if (r)
+		goto out;
+
+	/*
+	 * Write encoded offsets into VMCS data for later vmcs_read.
+	 */
+	for (offset = FIELD_START; offset < vmcs_config.size;
+	     offset += sizeof(u16))
+		*(u16 *)((char *)vmcs + offset) = ENCODING_OFFSET(offset);
+
+	cpu = get_cpu();
+	vmcs_clear(vmcs);
+	per_cpu(current_vmcs, cpu) = vmcs;
+	vmcs_load(vmcs);
+
+	vmcsinfo_revision_id(vmcs->revision_id);
+
+	for (i = 0; i < VMCSINFO_MAX_FIELD; ++i) {
+		if (vmcsinfo.field_offset[i][0] != VM_INSTRUCTION_ERROR)
+			continue;
+		vmcs_read_checking(i);
+		if (vmcsinfo.field_offset[i][1] == 0) {
+			goto out_clear;
+		} else {
+			offset = vmcsinfo.field_offset[i][1];
+			break;
+		}
+	}
+
+	for (i = 0; i < VMCSINFO_MAX_FIELD; ++i) {
+		if (vmcsinfo.field_offset[i][0] == VM_INSTRUCTION_ERROR)
+			continue;
+		/*
+		 * Before each read, zero the VM_INSTRUCTION_ERROR field.
+		 */
+		*(u32 *)((char *)vmcs + offset) = 0;
+		vmcs_read_checking(i);
+	}
+
+	r = sysfs_create_group(&cpu_subsys.dev_root->kobj, &vmcs_attr_group);
+
+out_clear:
+	vmcs_clear(vmcs);
+	put_cpu();
+	hardware_disable_all();
+out:
+	free_vmcs(vmcs);
+	return r;
+}
+
+static void __exit alloc_vmcsinfo_exit(void)
+{
+	sysfs_remove_group(&cpu_subsys.dev_root->kobj, &vmcs_attr_group);
+}
+
+module_init(alloc_vmcsinfo_init);
+module_exit(alloc_vmcsinfo_exit);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v5 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
@ 2012-07-12  9:57   ` Zhang Yanfei
  0 siblings, 0 replies; 10+ messages in thread
From: Zhang Yanfei @ 2012-07-12  9:57 UTC (permalink / raw)
  To: Avi Kivity, mtosatti
  Cc: dzickus, luto, kvm, Joerg Roedel, kexec, linux-kernel,
	paul.gortmaker, ludwig.nussel, ebiederm, Greg KH

This patch implements a new module named vmcsinfo-intel. The
module fills VMCSINFO with the VMCS revision identifier and the
offsets of VMCS fields.

Note: offsets of fields that are defined in the Intel specification
(Intel® 64 and IA-32 Architectures Software Developer’s Manual,
Volume 3C) but not in *vmcs_field* are not recorded in VMCSINFO.
Some fields may be unsupported on some machines; on those machines
the corresponding offsets are zero.

This patch also exports the VMCS revision identifier via
/sys/devices/system/cpu/vmcs/id and the offsets of fields via
/sys/devices/system/cpu/vmcs/. Individual offsets are contained
in subfiles named by the field's encoding, e.g.:
/sys/devices/system/cpu/vmcs/0800

Signed-off-by: zhangyanfei <zhangyanfei@cn.fujitsu.com>
---
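A user-space sketch of the offset-probing idea used in vmcsinfo.c may
help reviewers. It re-implements the ENCODING_OFFSET/DECODING_OFFSET
arithmetic for the 16-bit case only, fills a plain buffer the way
alloc_vmcsinfo_init() fills the VMCS region, and recovers a hidden
offset from the value read back. The names encode_offset,
decode_offset, mock_vmread16 and probe_offset are hypothetical, and
mock_vmread16 is only a stand-in for a real VMREAD; nothing here
touches hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VMCS_REGION_SIZE  4096	/* assumed upper bound of a VMCS region */
#define FIELD_START       8	/* skip revision id and VMX-abort indicator */
#define OFFSET_HIGH_SHIFT 7
#define OFFSET_LOW_MASK   ((1u << OFFSET_HIGH_SHIFT) - 1)		/* 0x7f */
#define OFFSET_HIGH_MASK  (OFFSET_LOW_MASK << OFFSET_HIGH_SHIFT)	/* 0x3f80 */

/* Pack an offset so that bit 0 of the low byte is 0 and bit 0 of the
 * high byte is 1 (same arithmetic as ENCODING_OFFSET). */
static uint16_t encode_offset(unsigned int offset)
{
	return (uint16_t)(((offset & OFFSET_LOW_MASK) << 1) +
			  (((offset & OFFSET_HIGH_MASK) << 2) | 0x100));
}

/* Unpack an index that is known to be in little-endian order. */
static unsigned int decode_offset_le(uint16_t index)
{
	return ((index >> 1) & OFFSET_LOW_MASK) +
	       ((index >> 2) & OFFSET_HIGH_MASK);
}

/* A set bit 0 means the two bytes arrived swapped; undo that first. */
static unsigned int decode_offset(uint16_t index)
{
	if (index & 1)
		index = (uint16_t)((index << 8) | (index >> 8));
	return decode_offset_le(index);
}

/* Stand-in for VMREAD of a 16-bit field stored at hidden_off. */
static uint16_t mock_vmread16(const uint8_t *region, size_t hidden_off)
{
	uint16_t value;

	memcpy(&value, region + hidden_off, sizeof(value));
	return value;
}

/* Fill the region with encoded offsets, then recover hidden_off. */
static unsigned int probe_offset(size_t hidden_off)
{
	static uint8_t region[VMCS_REGION_SIZE];
	size_t off;

	assert(hidden_off >= FIELD_START && hidden_off % 2 == 0);
	assert(hidden_off + sizeof(uint16_t) <= sizeof(region));

	for (off = FIELD_START; off < sizeof(region); off += sizeof(uint16_t)) {
		uint16_t index = encode_offset((unsigned int)off);

		memcpy(region + off, &index, sizeof(index));
	}
	return decode_offset(mock_vmread16(region, hidden_off));
}
```

Calling probe_offset(0x2d8) returns 0x2d8 in this mock-up. The 32-bit,
64-bit and natural-width cases in the patch apply the same decoding
after a __swab32/__swab64 check and are not covered by this sketch.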
 arch/x86/kvm/Kconfig    |   11 +
 arch/x86/kvm/Makefile   |    3 +
 arch/x86/kvm/vmcsinfo.c |  714 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 728 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/kvm/vmcsinfo.c

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index a28f338..1dd64b1 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -63,6 +63,17 @@ config KVM_INTEL
 	  To compile this as a module, choose M here: the module
 	  will be called kvm-intel.
 
+config VMCSINFO_INTEL
+	tristate "Export VMCSINFO for Intel processors"
+	depends on KVM_INTEL
+	---help---
+	  Provides support for exporting VMCSINFO on Intel processors equipped
+	  with the VT extensions. The VMCSINFO contains a VMCS revision
+	  identifier and offsets of VMCS fields.
+
+	  To compile this as a module, choose M here: the module
+	  will be called vmcsinfo-intel.
+
 config KVM_AMD
 	tristate "KVM for AMD processors support"
 	depends on KVM
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 4f579e8..12a1ef6 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -4,6 +4,7 @@ ccflags-y += -Ivirt/kvm -Iarch/x86/kvm
 CFLAGS_x86.o := -I.
 CFLAGS_svm.o := -I.
 CFLAGS_vmx.o := -I.
+CFLAGS_vmcsinfo.o := -I.
 
 kvm-y			+= $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o \
 				coalesced_mmio.o irq_comm.o eventfd.o \
@@ -15,7 +16,9 @@ kvm-y			+= x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
 			   i8254.o timer.o cpuid.o pmu.o
 kvm-intel-y		+= vmx.o
 kvm-amd-y		+= svm.o
+vmcsinfo-intel-y	+= vmcsinfo.o
 
 obj-$(CONFIG_KVM)	+= kvm.o
 obj-$(CONFIG_KVM_INTEL)	+= kvm-intel.o
 obj-$(CONFIG_KVM_AMD)	+= kvm-amd.o
+obj-$(CONFIG_VMCSINFO_INTEL)	+= vmcsinfo-intel.o
diff --git a/arch/x86/kvm/vmcsinfo.c b/arch/x86/kvm/vmcsinfo.c
new file mode 100644
index 0000000..0730473
--- /dev/null
+++ b/arch/x86/kvm/vmcsinfo.c
@@ -0,0 +1,714 @@
+/*
+ * Kernel-based Virtual Machine driver for Linux
+ *
+ * This module enables machines with Intel VT-x extensions to export
+ * offsets of VMCS fields for guest debugging.
+ *
+ * Copyright (C) 2012 Fujitsu, Inc.
+ *
+ * Authors:
+ *   Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/mod_devicetable.h>
+#include <linux/device.h>
+#include <linux/swab.h>
+#include <linux/cpu.h>
+
+#include <asm/vmx.h>
+
+MODULE_AUTHOR("Fujitsu");
+MODULE_LICENSE("GPL");
+
+static const struct x86_cpu_id vmcsinfo_cpu_id[] = {
+	X86_FEATURE_MATCH(X86_FEATURE_VMX),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, vmcsinfo_cpu_id);
+
+/*
+ * vmcs field offsets.
+ */
+static struct vmcsinfo {
+	u32 vmcs_revision_id;
+	u16 field_offset[][2];
+} vmcsinfo = {
+	0,
+	{
+		{VIRTUAL_PROCESSOR_ID, 0},
+		{GUEST_ES_SELECTOR, 0},
+		{GUEST_CS_SELECTOR, 0},
+		{GUEST_SS_SELECTOR, 0},
+		{GUEST_DS_SELECTOR, 0},
+		{GUEST_FS_SELECTOR, 0},
+		{GUEST_GS_SELECTOR, 0},
+		{GUEST_LDTR_SELECTOR, 0},
+		{GUEST_TR_SELECTOR, 0},
+		{HOST_ES_SELECTOR, 0},
+		{HOST_CS_SELECTOR, 0},
+		{HOST_SS_SELECTOR, 0},
+		{HOST_DS_SELECTOR, 0},
+		{HOST_FS_SELECTOR, 0},
+		{HOST_GS_SELECTOR, 0},
+		{HOST_TR_SELECTOR, 0},
+		{IO_BITMAP_A, 0},
+		{IO_BITMAP_A_HIGH, 0},
+		{IO_BITMAP_B, 0},
+		{IO_BITMAP_B_HIGH, 0},
+		{MSR_BITMAP, 0},
+		{MSR_BITMAP_HIGH, 0},
+		{VM_EXIT_MSR_STORE_ADDR, 0},
+		{VM_EXIT_MSR_STORE_ADDR_HIGH, 0},
+		{VM_EXIT_MSR_LOAD_ADDR, 0},
+		{VM_EXIT_MSR_LOAD_ADDR_HIGH, 0},
+		{VM_ENTRY_MSR_LOAD_ADDR, 0},
+		{VM_ENTRY_MSR_LOAD_ADDR_HIGH, 0},
+		{TSC_OFFSET, 0},
+		{TSC_OFFSET_HIGH, 0},
+		{VIRTUAL_APIC_PAGE_ADDR, 0},
+		{VIRTUAL_APIC_PAGE_ADDR_HIGH, 0},
+		{APIC_ACCESS_ADDR, 0},
+		{APIC_ACCESS_ADDR_HIGH, 0},
+		{EPT_POINTER, 0},
+		{EPT_POINTER_HIGH, 0},
+		{GUEST_PHYSICAL_ADDRESS, 0},
+		{GUEST_PHYSICAL_ADDRESS_HIGH, 0},
+		{VMCS_LINK_POINTER, 0},
+		{VMCS_LINK_POINTER_HIGH, 0},
+		{GUEST_IA32_DEBUGCTL, 0},
+		{GUEST_IA32_DEBUGCTL_HIGH, 0},
+		{GUEST_IA32_PAT, 0},
+		{GUEST_IA32_PAT_HIGH, 0},
+		{GUEST_IA32_EFER, 0},
+		{GUEST_IA32_EFER_HIGH, 0},
+		{GUEST_IA32_PERF_GLOBAL_CTRL, 0},
+		{GUEST_IA32_PERF_GLOBAL_CTRL_HIGH, 0},
+		{GUEST_PDPTR0, 0},
+		{GUEST_PDPTR0_HIGH, 0},
+		{GUEST_PDPTR1, 0},
+		{GUEST_PDPTR1_HIGH, 0},
+		{GUEST_PDPTR2, 0},
+		{GUEST_PDPTR2_HIGH, 0},
+		{GUEST_PDPTR3, 0},
+		{GUEST_PDPTR3_HIGH, 0},
+		{HOST_IA32_PAT, 0},
+		{HOST_IA32_PAT_HIGH, 0},
+		{HOST_IA32_EFER, 0},
+		{HOST_IA32_EFER_HIGH, 0},
+		{HOST_IA32_PERF_GLOBAL_CTRL, 0},
+		{HOST_IA32_PERF_GLOBAL_CTRL_HIGH, 0},
+		{PIN_BASED_VM_EXEC_CONTROL, 0},
+		{CPU_BASED_VM_EXEC_CONTROL, 0},
+		{EXCEPTION_BITMAP, 0},
+		{PAGE_FAULT_ERROR_CODE_MASK, 0},
+		{PAGE_FAULT_ERROR_CODE_MATCH, 0},
+		{CR3_TARGET_COUNT, 0},
+		{VM_EXIT_CONTROLS, 0},
+		{VM_EXIT_MSR_STORE_COUNT, 0},
+		{VM_EXIT_MSR_LOAD_COUNT, 0},
+		{VM_ENTRY_CONTROLS, 0},
+		{VM_ENTRY_MSR_LOAD_COUNT, 0},
+		{VM_ENTRY_INTR_INFO_FIELD, 0},
+		{VM_ENTRY_EXCEPTION_ERROR_CODE, 0},
+		{VM_ENTRY_INSTRUCTION_LEN, 0},
+		{TPR_THRESHOLD, 0},
+		{SECONDARY_VM_EXEC_CONTROL, 0},
+		{PLE_GAP, 0},
+		{PLE_WINDOW, 0},
+		{VM_INSTRUCTION_ERROR, 0},
+		{VM_EXIT_REASON, 0},
+		{VM_EXIT_INTR_INFO, 0},
+		{VM_EXIT_INTR_ERROR_CODE, 0},
+		{IDT_VECTORING_INFO_FIELD, 0},
+		{IDT_VECTORING_ERROR_CODE, 0},
+		{VM_EXIT_INSTRUCTION_LEN, 0},
+		{VMX_INSTRUCTION_INFO, 0},
+		{GUEST_ES_LIMIT, 0},
+		{GUEST_CS_LIMIT, 0},
+		{GUEST_SS_LIMIT, 0},
+		{GUEST_DS_LIMIT, 0},
+		{GUEST_FS_LIMIT, 0},
+		{GUEST_GS_LIMIT, 0},
+		{GUEST_LDTR_LIMIT, 0},
+		{GUEST_TR_LIMIT, 0},
+		{GUEST_GDTR_LIMIT, 0},
+		{GUEST_IDTR_LIMIT, 0},
+		{GUEST_ES_AR_BYTES, 0},
+		{GUEST_CS_AR_BYTES, 0},
+		{GUEST_SS_AR_BYTES, 0},
+		{GUEST_DS_AR_BYTES, 0},
+		{GUEST_FS_AR_BYTES, 0},
+		{GUEST_GS_AR_BYTES, 0},
+		{GUEST_LDTR_AR_BYTES, 0},
+		{GUEST_TR_AR_BYTES, 0},
+		{GUEST_INTERRUPTIBILITY_INFO, 0},
+		{GUEST_ACTIVITY_STATE, 0},
+		{GUEST_SYSENTER_CS, 0},
+		{HOST_IA32_SYSENTER_CS, 0},
+		{CR0_GUEST_HOST_MASK, 0},
+		{CR4_GUEST_HOST_MASK, 0},
+		{CR0_READ_SHADOW, 0},
+		{CR4_READ_SHADOW, 0},
+		{CR3_TARGET_VALUE0, 0},
+		{CR3_TARGET_VALUE1, 0},
+		{CR3_TARGET_VALUE2, 0},
+		{CR3_TARGET_VALUE3, 0},
+		{EXIT_QUALIFICATION, 0},
+		{GUEST_LINEAR_ADDRESS, 0},
+		{GUEST_CR0, 0},
+		{GUEST_CR3, 0},
+		{GUEST_CR4, 0},
+		{GUEST_ES_BASE, 0},
+		{GUEST_CS_BASE, 0},
+		{GUEST_SS_BASE, 0},
+		{GUEST_DS_BASE, 0},
+		{GUEST_FS_BASE, 0},
+		{GUEST_GS_BASE, 0},
+		{GUEST_LDTR_BASE, 0},
+		{GUEST_TR_BASE, 0},
+		{GUEST_GDTR_BASE, 0},
+		{GUEST_IDTR_BASE, 0},
+		{GUEST_DR7, 0},
+		{GUEST_RSP, 0},
+		{GUEST_RIP, 0},
+		{GUEST_RFLAGS, 0},
+		{GUEST_PENDING_DBG_EXCEPTIONS, 0},
+		{GUEST_SYSENTER_ESP, 0},
+		{GUEST_SYSENTER_EIP, 0},
+		{HOST_CR0, 0},
+		{HOST_CR3, 0},
+		{HOST_CR4, 0},
+		{HOST_FS_BASE, 0},
+		{HOST_GS_BASE, 0},
+		{HOST_TR_BASE, 0},
+		{HOST_GDTR_BASE, 0},
+		{HOST_IDTR_BASE, 0},
+		{HOST_IA32_SYSENTER_ESP, 0},
+		{HOST_IA32_SYSENTER_EIP, 0},
+		{HOST_RSP, 0},
+		{HOST_RIP, 0}
+	}
+};
+
+const char vmcs_group_name[] = "vmcs";
+
+static ssize_t vmcs_id_show(struct device *dev,
+			    struct device_attribute *attr,
+			    char *buf)
+{
+	return sprintf(buf, "%u\n", vmcsinfo.vmcs_revision_id);
+}
+static DEVICE_ATTR(id, 0444, vmcs_id_show, NULL);
+
+#define BUILD_OFFSET_SHOW(i, field_code)                                      \
+static ssize_t _##field_code##_show(struct device *dev,                       \
+				    struct device_attribute *attr,            \
+				    char *buf)                                \
+{                                                                             \
+	return sprintf(buf, "%d\n", vmcsinfo.field_offset[i][1]);             \
+}                                                                             \
+static DEVICE_ATTR(field_code, 0444, _##field_code##_show, NULL);
+
+BUILD_OFFSET_SHOW(  0, 0000); /* VIRTUAL_PROCESSOR_ID             */
+BUILD_OFFSET_SHOW(  1, 0800); /* GUEST_ES_SELECTOR                */
+BUILD_OFFSET_SHOW(  2, 0802); /* GUEST_CS_SELECTOR                */
+BUILD_OFFSET_SHOW(  3, 0804); /* GUEST_SS_SELECTOR                */
+BUILD_OFFSET_SHOW(  4, 0806); /* GUEST_DS_SELECTOR                */
+BUILD_OFFSET_SHOW(  5, 0808); /* GUEST_FS_SELECTOR                */
+BUILD_OFFSET_SHOW(  6, 080a); /* GUEST_GS_SELECTOR                */
+BUILD_OFFSET_SHOW(  7, 080c); /* GUEST_LDTR_SELECTOR              */
+BUILD_OFFSET_SHOW(  8, 080e); /* GUEST_TR_SELECTOR                */
+BUILD_OFFSET_SHOW(  9, 0c00); /* HOST_ES_SELECTOR                 */
+BUILD_OFFSET_SHOW( 10, 0c02); /* HOST_CS_SELECTOR                 */
+BUILD_OFFSET_SHOW( 11, 0c04); /* HOST_SS_SELECTOR                 */
+BUILD_OFFSET_SHOW( 12, 0c06); /* HOST_DS_SELECTOR                 */
+BUILD_OFFSET_SHOW( 13, 0c08); /* HOST_FS_SELECTOR                 */
+BUILD_OFFSET_SHOW( 14, 0c0a); /* HOST_GS_SELECTOR                 */
+BUILD_OFFSET_SHOW( 15, 0c0c); /* HOST_TR_SELECTOR                 */
+BUILD_OFFSET_SHOW( 16, 2000); /* IO_BITMAP_A                      */
+BUILD_OFFSET_SHOW( 17, 2001); /* IO_BITMAP_A_HIGH                 */
+BUILD_OFFSET_SHOW( 18, 2002); /* IO_BITMAP_B                      */
+BUILD_OFFSET_SHOW( 19, 2003); /* IO_BITMAP_B_HIGH                 */
+BUILD_OFFSET_SHOW( 20, 2004); /* MSR_BITMAP                       */
+BUILD_OFFSET_SHOW( 21, 2005); /* MSR_BITMAP_HIGH                  */
+BUILD_OFFSET_SHOW( 22, 2006); /* VM_EXIT_MSR_STORE_ADDR           */
+BUILD_OFFSET_SHOW( 23, 2007); /* VM_EXIT_MSR_STORE_ADDR_HIGH      */
+BUILD_OFFSET_SHOW( 24, 2008); /* VM_EXIT_MSR_LOAD_ADDR            */
+BUILD_OFFSET_SHOW( 25, 2009); /* VM_EXIT_MSR_LOAD_ADDR_HIGH       */
+BUILD_OFFSET_SHOW( 26, 200a); /* VM_ENTRY_MSR_LOAD_ADDR           */
+BUILD_OFFSET_SHOW( 27, 200b); /* VM_ENTRY_MSR_LOAD_ADDR_HIGH      */
+BUILD_OFFSET_SHOW( 28, 2010); /* TSC_OFFSET                       */
+BUILD_OFFSET_SHOW( 29, 2011); /* TSC_OFFSET_HIGH                  */
+BUILD_OFFSET_SHOW( 30, 2012); /* VIRTUAL_APIC_PAGE_ADDR           */
+BUILD_OFFSET_SHOW( 31, 2013); /* VIRTUAL_APIC_PAGE_ADDR_HIGH      */
+BUILD_OFFSET_SHOW( 32, 2014); /* APIC_ACCESS_ADDR                 */
+BUILD_OFFSET_SHOW( 33, 2015); /* APIC_ACCESS_ADDR_HIGH            */
+BUILD_OFFSET_SHOW( 34, 201a); /* EPT_POINTER                      */
+BUILD_OFFSET_SHOW( 35, 201b); /* EPT_POINTER_HIGH                 */
+BUILD_OFFSET_SHOW( 36, 2400); /* GUEST_PHYSICAL_ADDRESS           */
+BUILD_OFFSET_SHOW( 37, 2401); /* GUEST_PHYSICAL_ADDRESS_HIGH      */
+BUILD_OFFSET_SHOW( 38, 2800); /* VMCS_LINK_POINTER                */
+BUILD_OFFSET_SHOW( 39, 2801); /* VMCS_LINK_POINTER_HIGH           */
+BUILD_OFFSET_SHOW( 40, 2802); /* GUEST_IA32_DEBUGCTL              */
+BUILD_OFFSET_SHOW( 41, 2803); /* GUEST_IA32_DEBUGCTL_HIGH         */
+BUILD_OFFSET_SHOW( 42, 2804); /* GUEST_IA32_PAT                   */
+BUILD_OFFSET_SHOW( 43, 2805); /* GUEST_IA32_PAT_HIGH              */
+BUILD_OFFSET_SHOW( 44, 2806); /* GUEST_IA32_EFER                  */
+BUILD_OFFSET_SHOW( 45, 2807); /* GUEST_IA32_EFER_HIGH             */
+BUILD_OFFSET_SHOW( 46, 2808); /* GUEST_IA32_PERF_GLOBAL_CTRL      */
+BUILD_OFFSET_SHOW( 47, 2809); /* GUEST_IA32_PERF_GLOBAL_CTRL_HIGH */
+BUILD_OFFSET_SHOW( 48, 280a); /* GUEST_PDPTR0                     */
+BUILD_OFFSET_SHOW( 49, 280b); /* GUEST_PDPTR0_HIGH                */
+BUILD_OFFSET_SHOW( 50, 280c); /* GUEST_PDPTR1                     */
+BUILD_OFFSET_SHOW( 51, 280d); /* GUEST_PDPTR1_HIGH                */
+BUILD_OFFSET_SHOW( 52, 280e); /* GUEST_PDPTR2                     */
+BUILD_OFFSET_SHOW( 53, 280f); /* GUEST_PDPTR2_HIGH                */
+BUILD_OFFSET_SHOW( 54, 2810); /* GUEST_PDPTR3                     */
+BUILD_OFFSET_SHOW( 55, 2811); /* GUEST_PDPTR3_HIGH                */
+BUILD_OFFSET_SHOW( 56, 2c00); /* HOST_IA32_PAT                    */
+BUILD_OFFSET_SHOW( 57, 2c01); /* HOST_IA32_PAT_HIGH               */
+BUILD_OFFSET_SHOW( 58, 2c02); /* HOST_IA32_EFER                   */
+BUILD_OFFSET_SHOW( 59, 2c03); /* HOST_IA32_EFER_HIGH              */
+BUILD_OFFSET_SHOW( 60, 2c04); /* HOST_IA32_PERF_GLOBAL_CTRL       */
+BUILD_OFFSET_SHOW( 61, 2c05); /* HOST_IA32_PERF_GLOBAL_CTRL_HIGH  */
+BUILD_OFFSET_SHOW( 62, 4000); /* PIN_BASED_VM_EXEC_CONTROL        */
+BUILD_OFFSET_SHOW( 63, 4002); /* CPU_BASED_VM_EXEC_CONTROL        */
+BUILD_OFFSET_SHOW( 64, 4004); /* EXCEPTION_BITMAP                 */
+BUILD_OFFSET_SHOW( 65, 4006); /* PAGE_FAULT_ERROR_CODE_MASK       */
+BUILD_OFFSET_SHOW( 66, 4008); /* PAGE_FAULT_ERROR_CODE_MATCH      */
+BUILD_OFFSET_SHOW( 67, 400a); /* CR3_TARGET_COUNT                 */
+BUILD_OFFSET_SHOW( 68, 400c); /* VM_EXIT_CONTROLS                 */
+BUILD_OFFSET_SHOW( 69, 400e); /* VM_EXIT_MSR_STORE_COUNT          */
+BUILD_OFFSET_SHOW( 70, 4010); /* VM_EXIT_MSR_LOAD_COUNT           */
+BUILD_OFFSET_SHOW( 71, 4012); /* VM_ENTRY_CONTROLS                */
+BUILD_OFFSET_SHOW( 72, 4014); /* VM_ENTRY_MSR_LOAD_COUNT          */
+BUILD_OFFSET_SHOW( 73, 4016); /* VM_ENTRY_INTR_INFO_FIELD         */
+BUILD_OFFSET_SHOW( 74, 4018); /* VM_ENTRY_EXCEPTION_ERROR_CODE    */
+BUILD_OFFSET_SHOW( 75, 401a); /* VM_ENTRY_INSTRUCTION_LEN         */
+BUILD_OFFSET_SHOW( 76, 401c); /* TPR_THRESHOLD                    */
+BUILD_OFFSET_SHOW( 77, 401e); /* SECONDARY_VM_EXEC_CONTROL        */
+BUILD_OFFSET_SHOW( 78, 4020); /* PLE_GAP                          */
+BUILD_OFFSET_SHOW( 79, 4022); /* PLE_WINDOW                       */
+BUILD_OFFSET_SHOW( 80, 4400); /* VM_INSTRUCTION_ERROR             */
+BUILD_OFFSET_SHOW( 81, 4402); /* VM_EXIT_REASON                   */
+BUILD_OFFSET_SHOW( 82, 4404); /* VM_EXIT_INTR_INFO                */
+BUILD_OFFSET_SHOW( 83, 4406); /* VM_EXIT_INTR_ERROR_CODE          */
+BUILD_OFFSET_SHOW( 84, 4408); /* IDT_VECTORING_INFO_FIELD         */
+BUILD_OFFSET_SHOW( 85, 440a); /* IDT_VECTORING_ERROR_CODE         */
+BUILD_OFFSET_SHOW( 86, 440c); /* VM_EXIT_INSTRUCTION_LEN          */
+BUILD_OFFSET_SHOW( 87, 440e); /* VMX_INSTRUCTION_INFO             */
+BUILD_OFFSET_SHOW( 88, 4800); /* GUEST_ES_LIMIT                   */
+BUILD_OFFSET_SHOW( 89, 4802); /* GUEST_CS_LIMIT                   */
+BUILD_OFFSET_SHOW( 90, 4804); /* GUEST_SS_LIMIT                   */
+BUILD_OFFSET_SHOW( 91, 4806); /* GUEST_DS_LIMIT                   */
+BUILD_OFFSET_SHOW( 92, 4808); /* GUEST_FS_LIMIT                   */
+BUILD_OFFSET_SHOW( 93, 480a); /* GUEST_GS_LIMIT                   */
+BUILD_OFFSET_SHOW( 94, 480c); /* GUEST_LDTR_LIMIT                 */
+BUILD_OFFSET_SHOW( 95, 480e); /* GUEST_TR_LIMIT                   */
+BUILD_OFFSET_SHOW( 96, 4810); /* GUEST_GDTR_LIMIT                 */
+BUILD_OFFSET_SHOW( 97, 4812); /* GUEST_IDTR_LIMIT                 */
+BUILD_OFFSET_SHOW( 98, 4814); /* GUEST_ES_AR_BYTES                */
+BUILD_OFFSET_SHOW( 99, 4816); /* GUEST_CS_AR_BYTES                */
+BUILD_OFFSET_SHOW(100, 4818); /* GUEST_SS_AR_BYTES                */
+BUILD_OFFSET_SHOW(101, 481a); /* GUEST_DS_AR_BYTES                */
+BUILD_OFFSET_SHOW(102, 481c); /* GUEST_FS_AR_BYTES                */
+BUILD_OFFSET_SHOW(103, 481e); /* GUEST_GS_AR_BYTES                */
+BUILD_OFFSET_SHOW(104, 4820); /* GUEST_LDTR_AR_BYTES              */
+BUILD_OFFSET_SHOW(105, 4822); /* GUEST_TR_AR_BYTES                */
+BUILD_OFFSET_SHOW(106, 4824); /* GUEST_INTERRUPTIBILITY_INFO      */
+BUILD_OFFSET_SHOW(107, 4826); /* GUEST_ACTIVITY_STATE             */
+BUILD_OFFSET_SHOW(108, 482A); /* GUEST_SYSENTER_CS                */
+BUILD_OFFSET_SHOW(109, 4c00); /* HOST_IA32_SYSENTER_CS            */
+BUILD_OFFSET_SHOW(110, 6000); /* CR0_GUEST_HOST_MASK              */
+BUILD_OFFSET_SHOW(111, 6002); /* CR4_GUEST_HOST_MASK              */
+BUILD_OFFSET_SHOW(112, 6004); /* CR0_READ_SHADOW                  */
+BUILD_OFFSET_SHOW(113, 6006); /* CR4_READ_SHADOW                  */
+BUILD_OFFSET_SHOW(114, 6008); /* CR3_TARGET_VALUE0                */
+BUILD_OFFSET_SHOW(115, 600a); /* CR3_TARGET_VALUE1                */
+BUILD_OFFSET_SHOW(116, 600c); /* CR3_TARGET_VALUE2                */
+BUILD_OFFSET_SHOW(117, 600e); /* CR3_TARGET_VALUE3                */
+BUILD_OFFSET_SHOW(118, 6400); /* EXIT_QUALIFICATION               */
+BUILD_OFFSET_SHOW(119, 640a); /* GUEST_LINEAR_ADDRESS             */
+BUILD_OFFSET_SHOW(120, 6800); /* GUEST_CR0                        */
+BUILD_OFFSET_SHOW(121, 6802); /* GUEST_CR3                        */
+BUILD_OFFSET_SHOW(122, 6804); /* GUEST_CR4                        */
+BUILD_OFFSET_SHOW(123, 6806); /* GUEST_ES_BASE                    */
+BUILD_OFFSET_SHOW(124, 6808); /* GUEST_CS_BASE                    */
+BUILD_OFFSET_SHOW(125, 680a); /* GUEST_SS_BASE                    */
+BUILD_OFFSET_SHOW(126, 680c); /* GUEST_DS_BASE                    */
+BUILD_OFFSET_SHOW(127, 680e); /* GUEST_FS_BASE                    */
+BUILD_OFFSET_SHOW(128, 6810); /* GUEST_GS_BASE                    */
+BUILD_OFFSET_SHOW(129, 6812); /* GUEST_LDTR_BASE                  */
+BUILD_OFFSET_SHOW(130, 6814); /* GUEST_TR_BASE                    */
+BUILD_OFFSET_SHOW(131, 6816); /* GUEST_GDTR_BASE                  */
+BUILD_OFFSET_SHOW(132, 6818); /* GUEST_IDTR_BASE                  */
+BUILD_OFFSET_SHOW(133, 681a); /* GUEST_DR7                        */
+BUILD_OFFSET_SHOW(134, 681c); /* GUEST_RSP                        */
+BUILD_OFFSET_SHOW(135, 681e); /* GUEST_RIP                        */
+BUILD_OFFSET_SHOW(136, 6820); /* GUEST_RFLAGS                     */
+BUILD_OFFSET_SHOW(137, 6822); /* GUEST_PENDING_DBG_EXCEPTIONS     */
+BUILD_OFFSET_SHOW(138, 6824); /* GUEST_SYSENTER_ESP               */
+BUILD_OFFSET_SHOW(139, 6826); /* GUEST_SYSENTER_EIP               */
+BUILD_OFFSET_SHOW(140, 6c00); /* HOST_CR0                         */
+BUILD_OFFSET_SHOW(141, 6c02); /* HOST_CR3                         */
+BUILD_OFFSET_SHOW(142, 6c04); /* HOST_CR4                         */
+BUILD_OFFSET_SHOW(143, 6c06); /* HOST_FS_BASE                     */
+BUILD_OFFSET_SHOW(144, 6c08); /* HOST_GS_BASE                     */
+BUILD_OFFSET_SHOW(145, 6c0a); /* HOST_TR_BASE                     */
+BUILD_OFFSET_SHOW(146, 6c0c); /* HOST_GDTR_BASE                   */
+BUILD_OFFSET_SHOW(147, 6c0e); /* HOST_IDTR_BASE                   */
+BUILD_OFFSET_SHOW(148, 6c10); /* HOST_IA32_SYSENTER_ESP           */
+BUILD_OFFSET_SHOW(149, 6c12); /* HOST_IA32_SYSENTER_EIP           */
+BUILD_OFFSET_SHOW(150, 6c14); /* HOST_RSP                         */
+BUILD_OFFSET_SHOW(151, 6c16); /* HOST_RIP                         */
+
+static struct attribute *vmcs_attrs[] = {
+	&dev_attr_id.attr,
+	&dev_attr_0000.attr,
+	&dev_attr_0800.attr,
+	&dev_attr_0802.attr,
+	&dev_attr_0804.attr,
+	&dev_attr_0806.attr,
+	&dev_attr_0808.attr,
+	&dev_attr_080a.attr,
+	&dev_attr_080c.attr,
+	&dev_attr_080e.attr,
+	&dev_attr_0c00.attr,
+	&dev_attr_0c02.attr,
+	&dev_attr_0c04.attr,
+	&dev_attr_0c06.attr,
+	&dev_attr_0c08.attr,
+	&dev_attr_0c0a.attr,
+	&dev_attr_0c0c.attr,
+	&dev_attr_2000.attr,
+	&dev_attr_2001.attr,
+	&dev_attr_2002.attr,
+	&dev_attr_2003.attr,
+	&dev_attr_2004.attr,
+	&dev_attr_2005.attr,
+	&dev_attr_2006.attr,
+	&dev_attr_2007.attr,
+	&dev_attr_2008.attr,
+	&dev_attr_2009.attr,
+	&dev_attr_200a.attr,
+	&dev_attr_200b.attr,
+	&dev_attr_2010.attr,
+	&dev_attr_2011.attr,
+	&dev_attr_2012.attr,
+	&dev_attr_2013.attr,
+	&dev_attr_2014.attr,
+	&dev_attr_2015.attr,
+	&dev_attr_201a.attr,
+	&dev_attr_201b.attr,
+	&dev_attr_2400.attr,
+	&dev_attr_2401.attr,
+	&dev_attr_2800.attr,
+	&dev_attr_2801.attr,
+	&dev_attr_2802.attr,
+	&dev_attr_2803.attr,
+	&dev_attr_2804.attr,
+	&dev_attr_2805.attr,
+	&dev_attr_2806.attr,
+	&dev_attr_2807.attr,
+	&dev_attr_2808.attr,
+	&dev_attr_2809.attr,
+	&dev_attr_280a.attr,
+	&dev_attr_280b.attr,
+	&dev_attr_280c.attr,
+	&dev_attr_280d.attr,
+	&dev_attr_280e.attr,
+	&dev_attr_280f.attr,
+	&dev_attr_2810.attr,
+	&dev_attr_2811.attr,
+	&dev_attr_2c00.attr,
+	&dev_attr_2c01.attr,
+	&dev_attr_2c02.attr,
+	&dev_attr_2c03.attr,
+	&dev_attr_2c04.attr,
+	&dev_attr_2c05.attr,
+	&dev_attr_4000.attr,
+	&dev_attr_4002.attr,
+	&dev_attr_4004.attr,
+	&dev_attr_4006.attr,
+	&dev_attr_4008.attr,
+	&dev_attr_400a.attr,
+	&dev_attr_400c.attr,
+	&dev_attr_400e.attr,
+	&dev_attr_4010.attr,
+	&dev_attr_4012.attr,
+	&dev_attr_4014.attr,
+	&dev_attr_4016.attr,
+	&dev_attr_4018.attr,
+	&dev_attr_401a.attr,
+	&dev_attr_401c.attr,
+	&dev_attr_401e.attr,
+	&dev_attr_4020.attr,
+	&dev_attr_4022.attr,
+	&dev_attr_4400.attr,
+	&dev_attr_4402.attr,
+	&dev_attr_4404.attr,
+	&dev_attr_4406.attr,
+	&dev_attr_4408.attr,
+	&dev_attr_440a.attr,
+	&dev_attr_440c.attr,
+	&dev_attr_440e.attr,
+	&dev_attr_4800.attr,
+	&dev_attr_4802.attr,
+	&dev_attr_4804.attr,
+	&dev_attr_4806.attr,
+	&dev_attr_4808.attr,
+	&dev_attr_480a.attr,
+	&dev_attr_480c.attr,
+	&dev_attr_480e.attr,
+	&dev_attr_4810.attr,
+	&dev_attr_4812.attr,
+	&dev_attr_4814.attr,
+	&dev_attr_4816.attr,
+	&dev_attr_4818.attr,
+	&dev_attr_481a.attr,
+	&dev_attr_481c.attr,
+	&dev_attr_481e.attr,
+	&dev_attr_4820.attr,
+	&dev_attr_4822.attr,
+	&dev_attr_4824.attr,
+	&dev_attr_4826.attr,
+	&dev_attr_482A.attr,
+	&dev_attr_4c00.attr,
+	&dev_attr_6000.attr,
+	&dev_attr_6002.attr,
+	&dev_attr_6004.attr,
+	&dev_attr_6006.attr,
+	&dev_attr_6008.attr,
+	&dev_attr_600a.attr,
+	&dev_attr_600c.attr,
+	&dev_attr_600e.attr,
+	&dev_attr_6400.attr,
+	&dev_attr_640a.attr,
+	&dev_attr_6800.attr,
+	&dev_attr_6802.attr,
+	&dev_attr_6804.attr,
+	&dev_attr_6806.attr,
+	&dev_attr_6808.attr,
+	&dev_attr_680a.attr,
+	&dev_attr_680c.attr,
+	&dev_attr_680e.attr,
+	&dev_attr_6810.attr,
+	&dev_attr_6812.attr,
+	&dev_attr_6814.attr,
+	&dev_attr_6816.attr,
+	&dev_attr_6818.attr,
+	&dev_attr_681a.attr,
+	&dev_attr_681c.attr,
+	&dev_attr_681e.attr,
+	&dev_attr_6820.attr,
+	&dev_attr_6822.attr,
+	&dev_attr_6824.attr,
+	&dev_attr_6826.attr,
+	&dev_attr_6c00.attr,
+	&dev_attr_6c02.attr,
+	&dev_attr_6c04.attr,
+	&dev_attr_6c06.attr,
+	&dev_attr_6c08.attr,
+	&dev_attr_6c0a.attr,
+	&dev_attr_6c0c.attr,
+	&dev_attr_6c0e.attr,
+	&dev_attr_6c10.attr,
+	&dev_attr_6c12.attr,
+	&dev_attr_6c14.attr,
+	&dev_attr_6c16.attr,
+	NULL,
+};
+
+static struct attribute_group vmcs_attr_group = {
+	.name = vmcs_group_name,
+	.attrs = vmcs_attrs,
+};
+
+#define VMCSINFO_MAX_FIELD (ARRAY_SIZE(vmcs_attrs) - 2) /* skip id, NULL */
+
+static inline void vmcsinfo_revision_id(u32 id)
+{
+	vmcsinfo.vmcs_revision_id = id;
+}
+
+static inline void vmcsinfo_field(int i, u16 offset)
+{
+	if (i < VMCSINFO_MAX_FIELD)
+		vmcsinfo.field_offset[i][1] = offset;
+}
+
+/*
+ * For calculating offsets of fields in VMCS data, we index every 16-bit
+ * field by this kind of format:
+ *         | --------- 16 bits ---------- |
+ *         +-------------+-+------------+-+
+ *         | high 7 bits |1| low 7 bits |0|
+ *         +-------------+-+------------+-+
+ * In the high byte, the lowest bit must be 1; in the low byte, the
+ * lowest bit must be 0. The two bits are set this way so that we can
+ * detect indexes in VMCS data that are read back in big-endian order.
+ * The remaining 14 bits of the index hold the real offset of the
+ * field. Because a VMCS region is at most 4 KBytes in size, 14 bits
+ * are enough to index the whole VMCS region.
+ *
+ * ENCODING_OFFSET: encode the offset into an index of this kind.
+ * DECODING_OFFSET: decode an index of this kind into the real offset.
+ */
+#define OFFSET_HIGH_SHIFT (7)
+#define OFFSET_LOW_MASK   ((1 << OFFSET_HIGH_SHIFT) - 1) /* 0x7f */
+#define OFFSET_HIGH_MASK  (OFFSET_LOW_MASK << OFFSET_HIGH_SHIFT) /* 0x3f80 */
+#define ENCODING_OFFSET(offset)                                     \
+	((((offset) & OFFSET_LOW_MASK) << 1) +                      \
+	((((offset) & OFFSET_HIGH_MASK) << 2) | 0x100))
+/*
+ * The index here must always be in little-endian order.
+ */
+#define DECODING_OFFSET_LE(index)                                   \
+	((((index) >> 1) & OFFSET_LOW_MASK) +                       \
+	(((index) >> 2) & OFFSET_HIGH_MASK))
+/*
+ * n is the width of index in bits. We first check whether index
+ * was read in big-endian order.
+ */
+#define DECODING_OFFSET(index, n)                                   \
+	(((index) & 1) ? (DECODING_OFFSET_LE(__swab##n(index))) :   \
+	(DECODING_OFFSET_LE(index)))
+
+#define FIELD_OFFSET16(i, offset)                                   \
+	vmcsinfo_field(i, DECODING_OFFSET(offset, 16))
+#define FIELD_OFFSET64(i, offset)                                   \
+	vmcsinfo_field(i, DECODING_OFFSET(offset, 64))
+#define FIELD_OFFSET32(i, offset)                                   \
+	vmcsinfo_field(i, DECODING_OFFSET(offset, 32))
+#define FIELD_OFFSETNW(i, offset)                                   \
+do {                                                                \
+	if (sizeof(offset) == 8)                                    \
+		vmcsinfo_field(i, DECODING_OFFSET(offset, 64));     \
+	else                                                        \
+		vmcsinfo_field(i, DECODING_OFFSET(offset, 32));     \
+} while (0)
+
+#define VMCS_FIELD_CHECK(i, offset, type)                           \
+do {                                                                \
+	if (vmcs_read32(VM_INSTRUCTION_ERROR) !=                    \
+		VMXERR_UNSUPPORTED_VMCS_COMPONENT)                  \
+		FIELD_OFFSET##type(i, offset);                      \
+} while (0)
+
+static inline void vmcs_read_checking(int i)
+{
+	unsigned long field;
+	u16 offset16;
+	u64 offset64;
+	u32 offset32;
+	unsigned long offsetnw;
+
+	field = vmcsinfo.field_offset[i][0];
+	switch (vmcs_field_type(field)) {
+	case VMCS_FIELD_TYPE_U16:
+		offset16 = vmcs_read16(field);
+		VMCS_FIELD_CHECK(i, offset16, 16);
+		break;
+	case VMCS_FIELD_TYPE_U64:
+		offset64 = vmcs_read64(field);
+		VMCS_FIELD_CHECK(i, offset64, 64);
+		break;
+	case VMCS_FIELD_TYPE_U32:
+		offset32 = vmcs_read32(field);
+		VMCS_FIELD_CHECK(i, offset32, 32);
+		break;
+	case VMCS_FIELD_TYPE_NATURAL_WIDTH:
+		offsetnw = vmcs_readl(field);
+		VMCS_FIELD_CHECK(i, offsetnw, NW);
+		break;
+	}
+}
+
+/*
+ * Note: offsets of fields that are defined in the Intel specification
+ * (Intel® 64 and IA-32 Architectures Software Developer’s Manual,
+ * Volume 3C) but not in *vmcs_field* are not recorded in VMCSINFO.
+ * Some fields may be unsupported on some machines; on those
+ * machines the corresponding offsets are zero.
+ */
+static int __init alloc_vmcsinfo_init(void)
+{
+/*
+ * The first 8 bytes in vmcs region are for
+ *   VMCS revision identifier
+ *   VMX-abort indicator
+ */
+#define FIELD_START (8)
+
+	int r, offset, i;
+	struct vmcs *vmcs;
+	int cpu;
+
+	vmcs = alloc_vmcs();
+	if (!vmcs) {
+		return -ENOMEM;
+	}
+
+	r = hardware_enable_all();
+	if (r)
+		goto out;
+
+	/*
+	 * Write encoded offsets into VMCS data for later vmcs_read.
+	 */
+	for (offset = FIELD_START; offset < vmcs_config.size;
+	     offset += sizeof(u16))
+		*(u16 *)((char *)vmcs + offset) = ENCODING_OFFSET(offset);
+
+	cpu = get_cpu();
+	vmcs_clear(vmcs);
+	per_cpu(current_vmcs, cpu) = vmcs;
+	vmcs_load(vmcs);
+
+	vmcsinfo_revision_id(vmcs->revision_id);
+
+	for (i = 0; i < VMCSINFO_MAX_FIELD; ++i) {
+		if (vmcsinfo.field_offset[i][0] != VM_INSTRUCTION_ERROR)
+			continue;
+		vmcs_read_checking(i);
+		if (vmcsinfo.field_offset[i][1] == 0) {
+			r = -ENODEV;	/* VM_INSTRUCTION_ERROR not located */
+			goto out_clear;
+		}
+		offset = vmcsinfo.field_offset[i][1];
+		break;
+	}
+
+	for (i = 0; i < VMCSINFO_MAX_FIELD; ++i) {
+		if (vmcsinfo.field_offset[i][0] == VM_INSTRUCTION_ERROR)
+			continue;
+		/*
+		 * Zero the VM_INSTRUCTION_ERROR field before each read.
+		 */
+		*(u32 *)((char *)vmcs + offset) = 0;
+		vmcs_read_checking(i);
+	}
+
+	r = sysfs_create_group(&cpu_subsys.dev_root->kobj, &vmcs_attr_group);
+
+out_clear:
+	vmcs_clear(vmcs);
+	put_cpu();
+	hardware_disable_all();
+out:
+	free_vmcs(vmcs);
+	return r;
+}
+
+static void __exit alloc_vmcsinfo_exit(void)
+{
+	sysfs_remove_group(&cpu_subsys.dev_root->kobj, &vmcs_attr_group);
+}
+
+module_init(alloc_vmcsinfo_init);
+module_exit(alloc_vmcsinfo_exit);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v5 3/3] Documentation: Add ABI entry for vmcs sysfs interface
  2012-07-12  9:54 ` Zhang Yanfei
@ 2012-07-12  9:59   ` Zhang Yanfei
  -1 siblings, 0 replies; 10+ messages in thread
From: Zhang Yanfei @ 2012-07-12  9:59 UTC (permalink / raw)
  To: Avi Kivity, mtosatti
  Cc: ebiederm, luto, Joerg Roedel, dzickus, paul.gortmaker,
	ludwig.nussel, linux-kernel, kvm, kexec, Greg KH

Signed-off-by: zhangyanfei <zhangyanfei@cn.fujitsu.com>
---
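As an illustration of how a user-space consumer such as kexec-tools
might read this interface, here is a minimal sketch. It assumes only
what the ABI entry in this patch states: each attribute under
/sys/devices/system/cpu/vmcs/ holds one decimal number followed by a
newline. The helper names parse_vmcs_attr and read_vmcs_attr are
hypothetical and are not part of any existing tool.

```c
#include <assert.h>
#include <stdio.h>

/* Parse the text of one vmcs attribute (a decimal number and a newline).
 * Returns 0 on success, -1 if no number could be parsed. */
static int parse_vmcs_attr(const char *text, unsigned int *value)
{
	assert(text && value);
	return sscanf(text, "%u", value) == 1 ? 0 : -1;
}

/* Read <dir>/<name>, e.g. dir = "/sys/devices/system/cpu/vmcs" and
 * name = "0800" or "id". Returns 0 on success, -1 on any failure. */
static int read_vmcs_attr(const char *dir, const char *name,
			  unsigned int *value)
{
	char path[256];
	char line[32];
	FILE *fp;
	int n, ret;

	assert(dir && name && value);

	n = snprintf(path, sizeof(path), "%s/%s", dir, name);
	if (n < 0 || n >= (int)sizeof(path))
		return -1;	/* path did not fit */

	fp = fopen(path, "r");
	if (!fp)
		return -1;	/* module not loaded or field not exported */

	ret = fgets(line, sizeof(line), fp) ? parse_vmcs_attr(line, value) : -1;
	fclose(fp);
	return ret;
}
```

A caller would pass the field encoding as the file name, for example
read_vmcs_attr("/sys/devices/system/cpu/vmcs", "0800", &off). Per the
description in patch 2/3, an offset of zero would indicate a field that
the machine does not support.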
 Documentation/ABI/testing/sysfs-devices-system-cpu |   20 ++++++++++++++++++++
 1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 5dab364..2d0bc7f 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -9,6 +9,26 @@ Description:
 
 		/sys/devices/system/cpu/cpu#/
 
+What:		/sys/devices/system/cpu/vmcs/
+Date:		June 2012
+Contact:	Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
+Description:
+		A collection of the offsets of VMCS fields on Intel CPUs,
+		together with the VMCS revision identifier.
+
+		Individual offsets are contained in subfiles named by
+		the field's encoding, e.g.:
+
+		/sys/devices/system/cpu/vmcs/0800
+
+What:		/sys/devices/system/cpu/vmcs/id
+Date:		June 2012
+Contact:	Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
+Description:
+		VMCS revision identifier of the Intel CPU. The value enables
+		software to avoid using a VMCS region formatted for one
+		processor on a processor that uses a different format.
+
 What:		/sys/devices/system/cpu/kernel_max
 		/sys/devices/system/cpu/offline
 		/sys/devices/system/cpu/online
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v5 0/3] Export offsets of VMCS fields as note information for kdump
  2012-07-12  9:54 ` Zhang Yanfei
@ 2012-07-30  2:53   ` Zhang Yanfei
  -1 siblings, 0 replies; 10+ messages in thread
From: Zhang Yanfei @ 2012-07-30  2:53 UTC (permalink / raw)
  To: Avi Kivity
  Cc: mtosatti, ebiederm, luto, Joerg Roedel, dzickus, paul.gortmaker,
	ludwig.nussel, linux-kernel, kvm, kexec, Greg KH

Hello Avi,

Do you have any comments about this version of the patch set?

On 2012-07-12 17:54, Zhang Yanfei wrote:
> This patch set exports the offsets of VMCS fields as note information for
> kdump. We call it VMCSINFO. The purpose of VMCSINFO is to retrieve the
> runtime state of a guest machine image, such as registers, from the host
> machine's crash dump in VMCS format. The problem is that the internal VMCS
> layout is not disclosed in Intel's specification. So we solve this problem
> by the reverse engineering implemented in this patch set. The VMCSINFO
> is exported via sysfs (/sys/devices/system/cpu/vmcs/) to kexec-tools.
> 
> Here are two use cases for the two features that we want.
> 
> 1) Create guest machine's crash dumpfile from host machine's crash dumpfile
> 
> In general, we want to use this feature for failure analysis of systems
> whose processing depends on communication between the host and guest
> machines, so that we can look into the system from both machines' viewpoints.
> 
> As a concrete situation, consider a heartbeat monitoring feature on the
> guest machine's side, where we need to determine on which machine's side
> the cause of a heartbeat stop lies. In our actual experiments, we
> encountered such a situation and found that the cause of the bug was in
> the host's process scheduler: the guest machine's vcpu stopped for a long
> time, which led to the heartbeat stop.
> 
> The module that judges the heartbeat stop is on the guest machine, so we
> need to debug the guest machine's data. But if the cause lies on the host
> machine's side, we need to look into the host machine's crash dump.
> 
> Without this feature, we first create the guest machine's dump and then
> the host machine's, but there is a short interval between the two
> operations, and it is unlikely that the buggy situation still remains by then.
> 
> So we think the feature is useful for debugging both the guest machine's
> and the host machine's sides at the same time, and we expect it to make
> failure analysis more efficient.
> 
> Of course, we believe this feature is generally useful in situations
> where the guest machine doesn't work well because of something on the host.
> 
> 2) Get offsets of VMCS information on the CPU running on the host machine
> 
> If kdump doesn't work well, we cannot use the kvm API to get the
> register values of the guest machine, and they are still left in its vmcs
> region. In that case, we use a crash dump mechanism running outside of
> the linux kernel, such as sadump, a firmware-based crash dump. The VMCS
> information is then necessary.
> 
> TODO:
>   1. In kexec-tools, get VMCSINFO via sysfs and dump it as note information
>      into vmcore.
>   2. Dump VMCS region of each guest vcpu and VMCSINFO into qemu-process
>      core file. To do this, we will modify kernel core dumper, gdb gcore
>      and crash gcore.
>   3. Dump guest image from the qemu-process core file into a vmcore.
> 
> Changelog from v4 to v5:
> 1. The VMCSINFO is stored in a two-dimensional array filled with each
>    field's encoding and corresponding offset. So the size of VMCSINFO
>    is much smaller.
> 2. vmcs sysfs file /sys/devices/system/cpu/vmcs_id is moved to
>    /sys/devices/system/cpu/vmcs/id.
> 3. Rewrite the ABI entry for vmcs interface and remove the KernelVersion
>    line.
> 
> Changelog from v3 to v4:
> 1. All the variables and functions are moved to vmcsinfo-intel module.
> 2. Add a new sysfs interface /sys/devices/system/cpu/vmcs_id to export
>    the vmcs revision identifier. The original sysfs interface is changed
>    from /sys/devices/cpu/vmcs to /sys/devices/system/cpu/vmcs. Thanks to
>    Greg KH for his helpful comments about sysfs.
> 
> Changelog from v2 to v3:
> 1. New VMCSINFO format.
>    Now the VMCSINFO is mainly made up of an array that contains all vmcs
>    fields' offsets. The offsets aren't encoded because we decode them in
>    the module itself. If some field doesn't exist or its offset cannot be
>    decoded correctly, the offset in the array is just set to zero.
> 2. New sysfs interface and Documentation/ABI entry. 
>    We expose the actual fields in /sys/devices/cpu/vmcs instead of just
>    exporting the address of VMCSINFO in /sys/kernel/vmcsinfo.
>    For example, /sys/devices/cpu/vmcs/0800 contains the offset of
>    GUEST_DS_SELECTOR. 0800 is the encoding of GUEST_DS_SELECTOR.
>    Accordingly, ABI entry in Documentation is changed from sysfs-kernel-vmcsinfo
>    to sysfs-devices-cpu-vmcs.
> 
> Changelog from v1 to v2:
> 1. The VMCSINFO now has a simple binary <field><encoded offset> format,
>    as below:
>      +-------------+--------------------------+
>      | Byte offset | Contents                 |
>      +-------------+--------------------------+
>      | 0           | VMCS revision identifier |
>      +-------------+--------------------------+
>      | 4           | <field><encoded offset>  |
>      +-------------+--------------------------+
>      | 16          | <field><encoded offset>  |
>      +-------------+--------------------------+
>      ......
>   
>    The first 32 bits of VMCSINFO contain the VMCS revision identifier.
>    The remainder of VMCSINFO is used for <field><encoded offset> sets.
>    Each set takes 12 bytes: the field occupies 4 bytes and its corresponding
>    encoded offset occupies 8 bytes.
> 
>    Encoded offsets are raw values read by vmcs_read{16, 64, 32, l}, and
>    they are all zero-extended to 8 bytes so that each <field><encoded offset>
>    set has the same size.
>    We do not decode offsets here. The decoding work is deferred to userspace
>    tools for more flexible handling.
>    
>    And here are two examples of the new VMCSINFO:
>    Processor: Intel(R) Core(TM)2 Duo CPU     E7500  @ 2.93GHz
>    VMCSINFO contains:
>      <0000000d>                   --> VMCS revision id = 0xd
>      <00004000><0000000001840180> --> OFFSET(PIN_BASED_VM_EXEC_CONTROL) = 0x01840180
>      <00004002><0000000001940190> --> OFFSET(CPU_BASED_VM_EXEC_CONTROL) = 0x01940190
>      <0000401e><000000000fe40fe0> --> OFFSET(SECONDARY_VM_EXEC_CONTROL) = 0x0fe40fe0
>      <0000400c><0000000001e401e0> --> OFFSET(VM_EXIT_CONTROLS) = 0x01e401e0
>      ......
> 
>    Processor: Intel(R) Xeon(R) CPU           E7540  @ 2.00GHz (24 cores)
>    VMCSINFO contains:
>      <0000000e>                   --> VMCS revision id = 0xe 
>      <00004000><0000000005540550> --> OFFSET(PIN_BASED_VM_EXEC_CONTROL) = 0x05540550
>      <00004002><0000000005440540> --> OFFSET(CPU_BASED_VM_EXEC_CONTROL) = 0x05440540
>      <0000401e><00000000054c0548> --> OFFSET(SECONDARY_VM_EXEC_CONTROL) = 0x054c0548
>      <0000400c><00000000057c0578> --> OFFSET(VM_EXIT_CONTROLS) = 0x057c0578
>      ......
> 
> 2. Add a new kernel module *vmcsinfo-intel* for filling VMCSINFO instead
>    of putting it in module kvm-intel. The new module is auto-loaded
>    when the vmx cpufeature is detected and it depends on module kvm-intel.
>    *Loading and unloading this module will have no side effect on the
>    running guests.*
> 3. The sysfs file vmcsinfo is split into 2 files:
>    /sys/kernel/vmcsinfo: shows physical address of VMCSINFO note information.
>    /sys/kernel/vmcsinfo_maxsize: shows max size of VMCSINFO.
> 4. A new Documentation/ABI entry is added for vmcsinfo and vmcsinfo_maxsize.
> 5. Do not update the VMCSINFO note after the kernel has panicked.
> 
> zhangyanfei (3):
>   KVM: Export symbols for module vmcsinfo-intel
>   KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO
>   Documentation: Add ABI entry for vmcs sysfs interface.
> 
>  Documentation/ABI/testing/sysfs-devices-system-cpu |   20 +
>  arch/x86/include/asm/vmx.h                         |   73 ++
>  arch/x86/kvm/Kconfig                               |   11 +
>  arch/x86/kvm/Makefile                              |    3 +
>  arch/x86/kvm/vmcsinfo.c                            |  714 ++++++++++++++++++++
>  arch/x86/kvm/vmx.c                                 |   81 +--
>  include/linux/kvm_host.h                           |    3 +
>  virt/kvm/kvm_main.c                                |    8 +-
>  8 files changed, 841 insertions(+), 72 deletions(-)
>  create mode 100644 arch/x86/kvm/vmcsinfo.c
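
For reference, the binary <field><encoded offset> layout described in the
quoted v1 to v2 changelog (a 4-byte revision identifier followed by 12-byte
sets) could be walked roughly as below. This is only a sketch of that
historical layout, which later versions replaced with decoded offsets
exported through sysfs. The struct and function names are hypothetical, and
the byte order is assumed to match the host (little-endian on x86), which
the changelog does not state.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VMCSINFO_HDR_SIZE	4	/* VMCS revision identifier */
#define VMCSINFO_SET_SIZE	12	/* 4-byte field + 8-byte encoded offset */

/* Hypothetical in-memory form of one <field><encoded offset> set. */
struct vmcsinfo_entry {
	uint32_t field;			/* VMCS field encoding, e.g. 0x4000 */
	uint64_t encoded_offset;	/* raw value, not decoded here */
};

/* Walk a v2-style VMCSINFO buffer.
 * ASSUMPTION: values are stored in host byte order.
 * Stores the revision identifier in *rev, fills out[] with at most max
 * entries, and returns the number of entries, or -1 if the buffer is
 * too short to hold the header. A trailing partial set is ignored. */
static int vmcsinfo_parse(const unsigned char *buf, size_t len,
			  uint32_t *rev, struct vmcsinfo_entry *out,
			  size_t max)
{
	size_t pos = VMCSINFO_HDR_SIZE;
	size_t n = 0;

	if (len < VMCSINFO_HDR_SIZE)
		return -1;
	memcpy(rev, buf, 4);

	while (n < max && pos + VMCSINFO_SET_SIZE <= len) {
		memcpy(&out[n].field, buf + pos, 4);
		memcpy(&out[n].encoded_offset, buf + pos + 4, 8);
		pos += VMCSINFO_SET_SIZE;
		n++;
	}
	return (int)n;
}
```

With the Core 2 Duo example quoted above, the first entry would come out as
field 0x4000 with encoded offset 0x01840180. Using memcpy rather than
pointer casts avoids unaligned accesses, since the 12-byte sets do not keep
the 8-byte value naturally aligned.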


end of thread, other threads:[~2012-07-30  2:54 UTC | newest]

Thread overview: 10+ messages
2012-07-12  9:54 [PATCH v5 0/3] Export offsets of VMCS fields as note information for kdump Zhang Yanfei
2012-07-12  9:54 ` Zhang Yanfei
2012-07-12  9:56 ` [PATCH v5 1/3] KVM: Export symbols for module vmcsinfo-intel Zhang Yanfei
2012-07-12  9:56   ` Zhang Yanfei
2012-07-12  9:57 ` [PATCH v5 2/3] KVM-INTEL: Add new module vmcsinfo-intel to fill VMCSINFO Zhang Yanfei
2012-07-12  9:57   ` Zhang Yanfei
2012-07-12  9:59 ` [PATCH v5 3/3] Documentation: Add ABI entry for vmcs sysfs interface Zhang Yanfei
2012-07-12  9:59   ` Zhang Yanfei
2012-07-30  2:53 ` [PATCH v5 0/3] Export offsets of VMCS fields as note information for kdump Zhang Yanfei
2012-07-30  2:53   ` Zhang Yanfei
