All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/10] Intel Processor Trace virtulization enabling
@ 2018-05-30 13:27 Luwei Kang
  2018-05-30 13:27 ` [PATCH v2 01/10] x86: add an flag to enable Intel Processor Trace in guest Luwei Kang
                   ` (10 more replies)
  0 siblings, 11 replies; 50+ messages in thread
From: Luwei Kang @ 2018-05-30 13:27 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, sstabellini, wei.liu2, jun.nakajima, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, julien.grall, jbeulich,
	Luwei Kang

Hi All,

Here is a patch-series which adding Processor Trace enabling in XEN guest. You can get It's software developer manuals from:
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
In Chapter 4 INTEL PROCESSOR TRACE: VMX IMPROVEMENTS.

Introduction:
Intel Processor Trace (Intel PT) is an extension of Intel Architecture that captures information about software execution using dedicated hardware facilities that cause only minimal performance perturbation to the software being traced. Details on the Intel PT infrastructure and trace capabilities can be found in the Intel 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C.

The suite of architecture changes serve to simplify the process of virtualizing Intel PT for use by a guest software. There are two primary elements to this new architecture support for VMX support improvements made for Intel PT.
1. Addition of a new guest IA32_RTIT_CTL value field to the VMCS.
  — This serves to speed and simplify the process of disabling trace on VM exit, and restoring it on VM entry.
2. Enabling use of EPT to redirect PT output.
  — This enables the VMM to elect to virtualize the PT output buffer using EPT. In this mode, the CPU will treat PT output addresses as Guest Physical Addresses (GPAs) and translate them using EPT. This means that Intel PT output reads (of the ToPA table) and writes (of trace output) can cause EPT violations, and other output events.


From v1:
 1. Rename the parameter to "ipt";
 2. Dynamic alloc the intel pt descriptor;
 3. Disable Intel PT when part of hardware enhancement not supported;
 4. Add some safty check before MSRs read/write;
 5. Add new VM-exit qualification;
 6. Not include introspection handler in this version.

Luwei Kang (10):
  x86: add an flag to enable Intel Processor Trace in guest
  x86: Configure VMCS for Intel Processor Trace virtualization
  x86: Add Intel Processor Trace support for cpuid
  x86: Add Intel Processor Trace MSRs and bit definitions
  x86: Implement Intel Processor Trace context switch
  x86: Introduce a new function to get capability of Intel PT
  x86: Add Intel Processor Trace MSRs read/write emulation
  x86: Introduce a function to check the value of RTIT_CTL
  x86: Disable Intel Processor Trace when VMXON in L1 guest
  x86: Handle new asynchronous exit qualification

 docs/misc/xen-command-line.markdown         |  10 +
 tools/libxc/xc_cpuid_x86.c                  |  12 +-
 xen/arch/x86/cpu/Makefile                   |   1 +
 xen/arch/x86/cpu/ipt.c                      | 424 ++++++++++++++++++++++++++++
 xen/arch/x86/cpuid.c                        |  22 ++
 xen/arch/x86/domctl.c                       |   5 +
 xen/arch/x86/hvm/hvm.c                      |   8 +-
 xen/arch/x86/hvm/vmx/vmcs.c                 |  37 ++-
 xen/arch/x86/hvm/vmx/vmx.c                  |  33 +++
 xen/arch/x86/hvm/vmx/vvmx.c                 |  13 +-
 xen/include/asm-x86/cpufeature.h            |   1 +
 xen/include/asm-x86/cpuid.h                 |  12 +-
 xen/include/asm-x86/hvm/vmx/vmcs.h          |  10 +
 xen/include/asm-x86/hvm/vmx/vmx.h           |   8 +-
 xen/include/asm-x86/ipt.h                   |  76 +++++
 xen/include/asm-x86/msr-index.h             |  37 +++
 xen/include/public/arch-x86/cpufeatureset.h |   1 +
 xen/include/xen/mm.h                        |   1 +
 18 files changed, 696 insertions(+), 15 deletions(-)
 create mode 100644 xen/arch/x86/cpu/ipt.c
 create mode 100644 xen/include/asm-x86/ipt.h

-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v2 01/10] x86: add an flag to enable Intel Processor Trace in guest
  2018-05-30 13:27 [PATCH v2 00/10] Intel Processor Trace virtulization enabling Luwei Kang
@ 2018-05-30 13:27 ` Luwei Kang
  2018-06-28 14:11   ` Jan Beulich
  2018-05-30 13:27 ` [PATCH v2 02/10] x86: Configure VMCS for Intel Processor Trace virtualization Luwei Kang
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 50+ messages in thread
From: Luwei Kang @ 2018-05-30 13:27 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, sstabellini, wei.liu2, jun.nakajima, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, julien.grall, jbeulich,
	Luwei Kang

This patch add a new flag to enable Intel Processor Trace
in HVM guest by add parameter 'ipt = guest' in XEN
command line. Intel Processor Trace is disabled
in default.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 docs/misc/xen-command-line.markdown | 10 +++++++++
 xen/arch/x86/cpu/Makefile           |  1 +
 xen/arch/x86/cpu/ipt.c              | 42 +++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/ipt.h           | 29 +++++++++++++++++++++++++
 4 files changed, 82 insertions(+)
 create mode 100644 xen/arch/x86/cpu/ipt.c
 create mode 100644 xen/include/asm-x86/ipt.h

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 8712a83..bf8f89a 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1215,6 +1215,16 @@ Rather than only mapping RAM pages for IOMMU accesses for Dom0, with this
 option all pages not marked as unusable in the E820 table will get a mapping
 established.
 
+### ipt
+> `= guest`
+
+> Default: `off`
+
+This option is use for switch on the Intel Processor Trace feature
+in HVM guest when 'ipt=guest'. By default, this feature is disabled
+in guest. Intel Processor Trace virtualization depend on
+EPT, so it can only enabled in HVM guest at present.
+
 ### irq\_ratelimit (x86)
 > `= <integer>`
 
diff --git a/xen/arch/x86/cpu/Makefile b/xen/arch/x86/cpu/Makefile
index 74f23ae..af60277 100644
--- a/xen/arch/x86/cpu/Makefile
+++ b/xen/arch/x86/cpu/Makefile
@@ -6,5 +6,6 @@ obj-y += centaur.o
 obj-y += common.o
 obj-y += intel.o
 obj-y += intel_cacheinfo.o
+obj-y += ipt.o
 obj-y += mwait-idle.o
 obj-y += vpmu.o vpmu_amd.o vpmu_intel.o
diff --git a/xen/arch/x86/cpu/ipt.c b/xen/arch/x86/cpu/ipt.c
new file mode 100644
index 0000000..1fd7f51
--- /dev/null
+++ b/xen/arch/x86/cpu/ipt.c
@@ -0,0 +1,42 @@
+/*
+ * ipt.c: Support for Intel Processor Trace Virtualization.
+ *
+ * Copyright (c) 2018, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Luwei Kang <luwei.kang@intel.com>
+ */
+#include <xen/errno.h>
+#include <xen/init.h>
+#include <xen/lib.h>
+#include <xen/string.h>
+#include <asm/ipt.h>
+
+/* ipt: Flag to enable Intel Processor Trace (default off). */
+unsigned int __read_mostly ipt_mode = IPT_MODE_OFF;
+static int parse_ipt_params(const char *str);
+custom_param("ipt", parse_ipt_params);
+
+static int __init parse_ipt_params(const char *str)
+{
+    if ( !strcmp("guest", str) )
+        ipt_mode = IPT_MODE_GUEST;
+    else if ( str )
+    {
+        printk("Unknown Intel Processor Trace mode specified: '%s'\n", str);
+        return -EINVAL;
+    }
+
+    return 0;
+}
diff --git a/xen/include/asm-x86/ipt.h b/xen/include/asm-x86/ipt.h
new file mode 100644
index 0000000..c46b9fc
--- /dev/null
+++ b/xen/include/asm-x86/ipt.h
@@ -0,0 +1,29 @@
+/*
+ * ipt.h: Intel Processor Trace virtualization for HVM domain.
+ *
+ * Copyright (c) 2018, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Luwei Kang <luwei.kang@intel.com>
+ */
+
+#ifndef __ASM_X86_HVM_IPT_H_
+#define __ASM_X86_HVM_IPT_H_
+
+#define IPT_MODE_OFF        0
+#define IPT_MODE_GUEST      (1<<0)
+
+extern unsigned int ipt_mode;
+
+#endif /* __ASM_X86_HVM_IPT_H_ */
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v2 02/10] x86: Configure VMCS for Intel Processor Trace virtualization
  2018-05-30 13:27 [PATCH v2 00/10] Intel Processor Trace virtulization enabling Luwei Kang
  2018-05-30 13:27 ` [PATCH v2 01/10] x86: add an flag to enable Intel Processor Trace in guest Luwei Kang
@ 2018-05-30 13:27 ` Luwei Kang
  2018-05-30 13:27 ` [PATCH v2 03/10] x86: Add Intel Processor Trace support for cpuid Luwei Kang
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 50+ messages in thread
From: Luwei Kang @ 2018-05-30 13:27 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, sstabellini, wei.liu2, jun.nakajima, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, julien.grall, jbeulich,
	Luwei Kang

This patch configure VMCS to make Intel Processor Trace
output address can be treat as guest physical address and
translated by EPT when ipt option is in guest mode.

Intel Processor Trace will be disabled in guest and the VMCS
configuration will be clear when part of the required
features is available in hardware or this feature is
disabled by user.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 xen/arch/x86/hvm/vmx/vmcs.c        | 37 ++++++++++++++++++++++++++++++++-----
 xen/include/asm-x86/hvm/vmx/vmcs.h |  7 +++++++
 2 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 70c2fb7..1a2ee60 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -40,6 +40,7 @@
 #include <asm/shadow.h>
 #include <asm/tboot.h>
 #include <asm/apic.h>
+#include <asm/ipt.h>
 
 static bool_t __read_mostly opt_vpid_enabled = 1;
 boolean_param("vpid", opt_vpid_enabled);
@@ -242,6 +243,9 @@ static int vmx_init_vmcs_config(void)
         rdmsrl(MSR_IA32_VMX_MISC, _vmx_misc_cap);
         if ( _vmx_misc_cap & VMX_MISC_VMWRITE_ALL )
             opt |= SECONDARY_EXEC_ENABLE_VMCS_SHADOWING;
+        if ( _vmx_misc_cap & VMX_MISC_PT_ENABLE )
+            opt |= SECONDARY_EXEC_PT_USE_GPA |
+                   SECONDARY_EXEC_CONCEAL_PT_PIP;
         if ( opt_vpid_enabled )
             opt |= SECONDARY_EXEC_ENABLE_VPID;
         if ( opt_unrestricted_guest_enabled )
@@ -343,7 +347,8 @@ static int vmx_init_vmcs_config(void)
 
     min = VM_EXIT_ACK_INTR_ON_EXIT;
     opt = VM_EXIT_SAVE_GUEST_PAT | VM_EXIT_LOAD_HOST_PAT |
-          VM_EXIT_CLEAR_BNDCFGS;
+          VM_EXIT_CLEAR_BNDCFGS | VM_EXIT_CONCEAL_PT_PIP |
+          VM_EXIT_CLEAR_IA32_RTIT_CTL;
     min |= VM_EXIT_IA32E_MODE;
     _vmx_vmexit_control = adjust_vmx_controls(
         "VMExit Control", min, opt, MSR_IA32_VMX_EXIT_CTLS, &mismatch);
@@ -383,13 +388,29 @@ static int vmx_init_vmcs_config(void)
         _vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
 
     min = 0;
-    opt = VM_ENTRY_LOAD_GUEST_PAT | VM_ENTRY_LOAD_BNDCFGS;
+    opt = VM_ENTRY_LOAD_GUEST_PAT | VM_ENTRY_LOAD_BNDCFGS |
+          VM_ENTRY_CONCEAL_PT_PIP | VM_ENTRY_LOAD_IA32_RTIT_CTL;
     _vmx_vmentry_control = adjust_vmx_controls(
         "VMEntry Control", min, opt, MSR_IA32_VMX_ENTRY_CTLS, &mismatch);
 
     if ( mismatch )
         return -EINVAL;
 
+    if ( !(_vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_EPT) ||
+         !(_vmx_secondary_exec_control & SECONDARY_EXEC_PT_USE_GPA) ||
+         !(_vmx_vmexit_control & VM_EXIT_CLEAR_IA32_RTIT_CTL) ||
+         !(_vmx_vmentry_control & VM_ENTRY_LOAD_IA32_RTIT_CTL) ||
+         (ipt_mode == IPT_MODE_OFF) )
+    {
+        _vmx_secondary_exec_control &= ~(SECONDARY_EXEC_PT_USE_GPA |
+                                         SECONDARY_EXEC_CONCEAL_PT_PIP);
+        _vmx_vmexit_control &= ~(VM_EXIT_CONCEAL_PT_PIP |
+                                 VM_EXIT_CLEAR_IA32_RTIT_CTL);
+        _vmx_vmentry_control &= ~(VM_ENTRY_CONCEAL_PT_PIP |
+                                  VM_ENTRY_LOAD_IA32_RTIT_CTL);
+        ipt_mode = IPT_MODE_OFF;
+    }
+
     if ( !vmx_pin_based_exec_control )
     {
         /* First time through. */
@@ -1029,10 +1050,16 @@ static int construct_vmcs(struct vcpu *v)
         v->arch.hvm_vmx.secondary_exec_control &= 
             ~(SECONDARY_EXEC_ENABLE_EPT | 
               SECONDARY_EXEC_UNRESTRICTED_GUEST |
-              SECONDARY_EXEC_ENABLE_INVPCID);
+              SECONDARY_EXEC_ENABLE_INVPCID |
+              SECONDARY_EXEC_PT_USE_GPA |
+              SECONDARY_EXEC_CONCEAL_PT_PIP);
         vmexit_ctl &= ~(VM_EXIT_SAVE_GUEST_PAT |
-                        VM_EXIT_LOAD_HOST_PAT);
-        vmentry_ctl &= ~VM_ENTRY_LOAD_GUEST_PAT;
+                        VM_EXIT_LOAD_HOST_PAT |
+                        VM_EXIT_CONCEAL_PT_PIP |
+                        VM_EXIT_CLEAR_IA32_RTIT_CTL);
+        vmentry_ctl &= ~(VM_ENTRY_LOAD_GUEST_PAT |
+                         VM_ENTRY_CONCEAL_PT_PIP |
+                         VM_ENTRY_LOAD_IA32_RTIT_CTL);
     }
 
     /* Disable Virtualize x2APIC mode by default. */
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index 06c3179..2990992 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -223,6 +223,8 @@ extern u32 vmx_pin_based_exec_control;
 #define VM_EXIT_LOAD_HOST_EFER          0x00200000
 #define VM_EXIT_SAVE_PREEMPT_TIMER      0x00400000
 #define VM_EXIT_CLEAR_BNDCFGS           0x00800000
+#define VM_EXIT_CONCEAL_PT_PIP          0x01000000
+#define VM_EXIT_CLEAR_IA32_RTIT_CTL     0x02000000
 extern u32 vmx_vmexit_control;
 
 #define VM_ENTRY_IA32E_MODE             0x00000200
@@ -232,6 +234,8 @@ extern u32 vmx_vmexit_control;
 #define VM_ENTRY_LOAD_GUEST_PAT         0x00004000
 #define VM_ENTRY_LOAD_GUEST_EFER        0x00008000
 #define VM_ENTRY_LOAD_BNDCFGS           0x00010000
+#define VM_ENTRY_CONCEAL_PT_PIP         0x00020000
+#define VM_ENTRY_LOAD_IA32_RTIT_CTL     0x00040000
 extern u32 vmx_vmentry_control;
 
 #define SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES 0x00000001
@@ -250,7 +254,9 @@ extern u32 vmx_vmentry_control;
 #define SECONDARY_EXEC_ENABLE_VMCS_SHADOWING    0x00004000
 #define SECONDARY_EXEC_ENABLE_PML               0x00020000
 #define SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS   0x00040000
+#define SECONDARY_EXEC_CONCEAL_PT_PIP           0x00080000
 #define SECONDARY_EXEC_XSAVES                   0x00100000
+#define SECONDARY_EXEC_PT_USE_GPA               0x01000000
 #define SECONDARY_EXEC_TSC_SCALING              0x02000000
 extern u32 vmx_secondary_exec_control;
 
@@ -271,6 +277,7 @@ extern u32 vmx_secondary_exec_control;
 #define VMX_VPID_INVVPID_SINGLE_CONTEXT_RETAINING_GLOBAL 0x80000000000ULL
 extern u64 vmx_ept_vpid_cap;
 
+#define VMX_MISC_PT_ENABLE                      0x00004000
 #define VMX_MISC_CR3_TARGET                     0x01ff0000
 #define VMX_MISC_VMWRITE_ALL                    0x20000000
 
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v2 03/10] x86: Add Intel Processor Trace support for cpuid
  2018-05-30 13:27 [PATCH v2 00/10] Intel Processor Trace virtulization enabling Luwei Kang
  2018-05-30 13:27 ` [PATCH v2 01/10] x86: add an flag to enable Intel Processor Trace in guest Luwei Kang
  2018-05-30 13:27 ` [PATCH v2 02/10] x86: Configure VMCS for Intel Processor Trace virtualization Luwei Kang
@ 2018-05-30 13:27 ` Luwei Kang
  2018-06-28 14:27   ` Jan Beulich
  2018-06-29 15:17   ` Jan Beulich
  2018-05-30 13:27 ` [PATCH v2 04/10] x86: Add Intel Processor Trace MSRs and bit definitions Luwei Kang
                   ` (7 subsequent siblings)
  10 siblings, 2 replies; 50+ messages in thread
From: Luwei Kang @ 2018-05-30 13:27 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, sstabellini, wei.liu2, jun.nakajima, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, julien.grall, jbeulich,
	Luwei Kang

Intel Processor Trace will be disabled in guest
when ipt_mode is off (IPT_MODE_OFF).

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 tools/libxc/xc_cpuid_x86.c                  | 12 ++++++++++--
 xen/arch/x86/cpuid.c                        | 22 ++++++++++++++++++++++
 xen/arch/x86/domctl.c                       |  5 +++++
 xen/include/asm-x86/cpufeature.h            |  1 +
 xen/include/asm-x86/cpuid.h                 | 12 +++++++++++-
 xen/include/asm-x86/ipt.h                   |  2 ++
 xen/include/public/arch-x86/cpufeatureset.h |  1 +
 7 files changed, 52 insertions(+), 3 deletions(-)

diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 9fa2f7c..f8f962a 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -38,7 +38,7 @@ enum {
 #define clear_feature(idx, dst) ((dst) &= ~bitmaskof(idx))
 #define set_feature(idx, dst)   ((dst) |=  bitmaskof(idx))
 
-#define DEF_MAX_BASE 0x0000000du
+#define DEF_MAX_BASE 0x00000014u
 #define DEF_MAX_INTELEXT  0x80000008u
 #define DEF_MAX_AMDEXT    0x8000001cu
 
@@ -473,6 +473,7 @@ static void xc_cpuid_hvm_policy(xc_interface *xch,
     case 0x00000002: /* Intel cache info (dumped by AMD policy) */
     case 0x00000004: /* Intel cache info (dumped by AMD policy) */
     case 0x0000000a: /* Architectural Performance Monitor Features */
+    case 0x00000014: /* Intel Processor Trace Features */
     case 0x80000002: /* Processor name string */
     case 0x80000003: /* ... continued         */
     case 0x80000004: /* ... continued         */
@@ -759,12 +760,19 @@ int xc_cpuid_apply_policy(xc_interface *xch, uint32_t domid,
                 continue;
         }
 
+        if ( input[0] == 0x14 )
+        {
+            input[1]++;
+            if ( input[1] == 1 )
+                continue;
+        }
+
         input[0]++;
         if ( !(input[0] & 0x80000000u) && (input[0] > base_max ) )
             input[0] = 0x80000000u;
 
         input[1] = XEN_CPUID_INPUT_UNUSED;
-        if ( (input[0] == 4) || (input[0] == 7) )
+        if ( (input[0] == 4) || (input[0] == 7) || (input[0] == 0x14) )
             input[1] = 0;
         else if ( input[0] == 0xd )
             input[1] = 1; /* Xen automatically calculates almost everything. */
diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 4b8d330..8f30f9e 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -6,6 +6,7 @@
 #include <asm/hvm/nestedhvm.h>
 #include <asm/hvm/svm/svm.h>
 #include <asm/hvm/vmx/vmcs.h>
+#include <asm/ipt.h>
 #include <asm/paging.h>
 #include <asm/processor.h>
 #include <asm/xstate.h>
@@ -583,7 +584,19 @@ void recalculate_cpuid_policy(struct domain *d)
             __clear_bit(X86_FEATURE_VMX, max_fs);
             __clear_bit(X86_FEATURE_SVM, max_fs);
         }
+
+        /*
+         * Hide Intel Processor trace feature when hardware not support
+         * PT-VMX or ipt option is off.
+         */
+        if ( ipt_mode == IPT_MODE_OFF )
+        {
+            __clear_bit(X86_FEATURE_IPT, max_fs);
+            zero_leaves(p->ipt.raw, 0, ARRAY_SIZE(p->ipt.raw) - 1);
+        }
     }
+    else
+        zero_leaves(p->ipt.raw, 0, ARRAY_SIZE(p->ipt.raw) - 1);
 
     /*
      * Allow the toolstack to set HTT, X2APIC and CMP_LEGACY.  These bits
@@ -738,6 +751,15 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
             *res = p->feat.raw[subleaf];
             break;
 
+        case IPT_CPUID:
+            ASSERT(p->ipt.max_subleaf < ARRAY_SIZE(p->ipt.raw));
+            if ( subleaf > min_t(uint32_t, p->ipt.max_subleaf,
+                                 ARRAY_SIZE(p->ipt.raw) - 1) )
+                return;
+
+            *res = p->ipt.raw[subleaf];
+            break;
+
         case XSTATE_CPUID:
             if ( !p->basic.xsave || subleaf >= ARRAY_SIZE(p->xstate.raw) )
                 return;
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 8fbbf3a..51743be 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -35,6 +35,7 @@
 #include <asm/debugger.h>
 #include <asm/psr.h>
 #include <asm/cpuid.h>
+#include <asm/ipt.h>
 
 static int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop)
 {
@@ -101,6 +102,10 @@ static int update_domain_cpuid_info(struct domain *d,
             p->feat.raw[ctl->input[1]] = leaf;
             break;
 
+        case IPT_CPUID:
+            p->ipt.raw[ctl->input[1]] = leaf;
+            break;
+
         case XSTATE_CPUID:
             p->xstate.raw[ctl->input[1]] = leaf;
             break;
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 2cf8f7e..97610d8 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -102,6 +102,7 @@
 #define cpu_has_mpx             boot_cpu_has(X86_FEATURE_MPX)
 #define cpu_has_rdseed          boot_cpu_has(X86_FEATURE_RDSEED)
 #define cpu_has_smap            boot_cpu_has(X86_FEATURE_SMAP)
+#define cpu_has_ipt             boot_cpu_has(X86_FEATURE_IPT)
 #define cpu_has_sha             boot_cpu_has(X86_FEATURE_SHA)
 
 /* CPUID level 0x80000007.edx */
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 4cce268..c19ef28 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -58,10 +58,11 @@ DECLARE_PER_CPU(struct cpuidmasks, cpuidmasks);
 /* Default masking MSR values, calculated at boot. */
 extern struct cpuidmasks cpuidmask_defaults;
 
-#define CPUID_GUEST_NR_BASIC      (0xdu + 1)
+#define CPUID_GUEST_NR_BASIC      (0x14u + 1)
 #define CPUID_GUEST_NR_FEAT       (0u + 1)
 #define CPUID_GUEST_NR_CACHE      (5u + 1)
 #define CPUID_GUEST_NR_XSTATE     (62u + 1)
+#define CPUID_GUEST_NR_IPT        (1u + 1)
 #define CPUID_GUEST_NR_EXTD_INTEL (0x8u + 1)
 #define CPUID_GUEST_NR_EXTD_AMD   (0x1cu + 1)
 #define CPUID_GUEST_NR_EXTD       MAX(CPUID_GUEST_NR_EXTD_INTEL, \
@@ -166,6 +167,15 @@ struct cpuid_policy
         } comp[CPUID_GUEST_NR_XSTATE];
     } xstate;
 
+    /* Structured feature leaf: 0x00000014[xx] */
+    union {
+        struct cpuid_leaf raw[CPUID_GUEST_NR_IPT];
+        struct {
+            /* Subleaf 0. */
+            uint32_t max_subleaf;
+        };
+    } ipt;
+
     /* Extended leaves: 0x800000xx */
     union {
         struct cpuid_leaf raw[CPUID_GUEST_NR_EXTD];
diff --git a/xen/include/asm-x86/ipt.h b/xen/include/asm-x86/ipt.h
index c46b9fc..65b064c 100644
--- a/xen/include/asm-x86/ipt.h
+++ b/xen/include/asm-x86/ipt.h
@@ -24,6 +24,8 @@
 #define IPT_MODE_OFF        0
 #define IPT_MODE_GUEST      (1<<0)
 
+#define IPT_CPUID           0x00000014
+
 extern unsigned int ipt_mode;
 
 #endif /* __ASM_X86_HVM_IPT_H_ */
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index c721c12..a1643a5 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -215,6 +215,7 @@ XEN_CPUFEATURE(SMAP,          5*32+20) /*S  Supervisor Mode Access Prevention */
 XEN_CPUFEATURE(AVX512IFMA,    5*32+21) /*A  AVX-512 Integer Fused Multiply Add */
 XEN_CPUFEATURE(CLFLUSHOPT,    5*32+23) /*A  CLFLUSHOPT instruction */
 XEN_CPUFEATURE(CLWB,          5*32+24) /*A  CLWB instruction */
+XEN_CPUFEATURE(IPT,           5*32+25) /*H  Intel Processor Trace */
 XEN_CPUFEATURE(AVX512PF,      5*32+26) /*A  AVX-512 Prefetch Instructions */
 XEN_CPUFEATURE(AVX512ER,      5*32+27) /*A  AVX-512 Exponent & Reciprocal Instrs */
 XEN_CPUFEATURE(AVX512CD,      5*32+28) /*A  AVX-512 Conflict Detection Instrs */
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v2 04/10] x86: Add Intel Processor Trace MSRs and bit definitions
  2018-05-30 13:27 [PATCH v2 00/10] Intel Processor Trace virtulization enabling Luwei Kang
                   ` (2 preceding siblings ...)
  2018-05-30 13:27 ` [PATCH v2 03/10] x86: Add Intel Processor Trace support for cpuid Luwei Kang
@ 2018-05-30 13:27 ` Luwei Kang
  2018-06-28 14:44   ` Jan Beulich
  2018-05-30 13:27 ` [PATCH v2 05/10] x86: Implement Intel Processor Trace context switch Luwei Kang
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 50+ messages in thread
From: Luwei Kang @ 2018-05-30 13:27 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, sstabellini, wei.liu2, jun.nakajima, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, julien.grall, jbeulich,
	Luwei Kang

Add Intel Processor Trace MSRs and bit definitions.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 xen/include/asm-x86/msr-index.h | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 8fbccc8..7c02653 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -548,4 +548,41 @@
 #define MSR_PKGC9_IRTL			0x00000634
 #define MSR_PKGC10_IRTL			0x00000635
 
+/* Intel PT MSRs */
+#define MSR_IA32_RTIT_CTL		0x00000570
+#define RTIT_CTL_TRACEEN		(1ULL << 0)
+#define RTIT_CTL_CYCEN			(1ULL << 1)
+#define RTIT_CTL_OS			(1ULL << 2)
+#define RTIT_CTL_USR			(1ULL << 3)
+#define RTIT_CTL_PWR_EVT_EN		(1ULL << 4)
+#define RTIT_CTL_FUP_ON_PTW		(1ULL << 5)
+#define RTIT_CTL_FABRIC_EN		(1ULL << 6)
+#define RTIT_CTL_CR3_FILTER		(1ULL << 7)
+#define RTIT_CTL_TOPA			(1ULL << 8)
+#define RTIT_CTL_MTC_EN			(1ULL << 9)
+#define RTIT_CTL_TSC_EN			(1ULL << 10)
+#define RTIT_CTL_DIS_RETC		(1ULL << 11)
+#define RTIT_CTL_PTW_EN			(1ULL << 12)
+#define RTIT_CTL_BRANCH_EN		(1ULL << 13)
+#define RTIT_CTL_MTC_FREQ_OFFSET	14
+#define RTIT_CTL_MTC_FREQ		(0x0fULL << RTIT_CTL_MTC_FREQ_OFFSET)
+#define RTIT_CTL_CYC_THRESH_OFFSET	19
+#define RTIT_CTL_CYC_THRESH		(0x0fULL << RTIT_CTL_CYC_THRESH_OFFSET)
+#define RTIT_CTL_PSB_FREQ_OFFSET	24
+#define RTIT_CTL_PSB_FREQ		(0x0fULL << RTIT_CTL_PSB_FREQ_OFFSET)
+#define RTIT_CTL_ADDR_OFFSET(n)		(32 + 4 * (n))
+#define RTIT_CTL_ADDR(n)		(0x0fULL << RTIT_CTL_ADDR_OFFSET(n))
+#define MSR_IA32_RTIT_STATUS		0x00000571
+#define RTIT_STATUS_FILTER_EN		(1ULL << 0)
+#define RTIT_STATUS_CONTEXT_EN		(1ULL << 1)
+#define RTIT_STATUS_TRIGGER_EN		(1ULL << 2)
+#define RTIT_STATUS_ERROR		(1ULL << 4)
+#define RTIT_STATUS_STOPPED		(1ULL << 5)
+#define RTIT_STATUS_BYTECNT		(0x1ffffULL << 32)
+#define MSR_IA32_RTIT_CR3_MATCH		0x00000572
+#define MSR_IA32_RTIT_OUTPUT_BASE	0x00000560
+#define MSR_IA32_RTIT_OUTPUT_MASK	0x00000561
+#define MSR_IA32_RTIT_ADDR_A(n)		(0x00000580 + (n) * 2)
+#define MSR_IA32_RTIT_ADDR_B(n)		(0x00000581 + (n) * 2)
+
 #endif /* __ASM_MSR_INDEX_H */
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v2 05/10] x86: Implement Intel Processor Trace context switch
  2018-05-30 13:27 [PATCH v2 00/10] Intel Processor Trace virtulization enabling Luwei Kang
                   ` (3 preceding siblings ...)
  2018-05-30 13:27 ` [PATCH v2 04/10] x86: Add Intel Processor Trace MSRs and bit definitions Luwei Kang
@ 2018-05-30 13:27 ` Luwei Kang
  2018-06-29 14:12   ` Jan Beulich
  2018-05-30 13:28 ` [PATCH v2 06/10] x86: Introduce a new function to get capability of Intel PT Luwei Kang
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 50+ messages in thread
From: Luwei Kang @ 2018-05-30 13:27 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, sstabellini, wei.liu2, jun.nakajima, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, julien.grall, jbeulich,
	Luwei Kang

Load/Restore Intel Processor Trace Register in context switch.
MSR IA32_RTIT_CTL is loaded/stored automatically from VMCS.
When Intel Processor Trace is supported in guest, we need
to load/restore MSRs only when this feature is enabled
in guest.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 xen/arch/x86/cpu/ipt.c             | 101 +++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/vmx/vmx.c         |  10 ++++
 xen/include/asm-x86/hvm/vmx/vmcs.h |   3 ++
 xen/include/asm-x86/ipt.h          |  23 +++++++++
 4 files changed, 137 insertions(+)

diff --git a/xen/arch/x86/cpu/ipt.c b/xen/arch/x86/cpu/ipt.c
index 1fd7f51..b81a155 100644
--- a/xen/arch/x86/cpu/ipt.c
+++ b/xen/arch/x86/cpu/ipt.c
@@ -21,7 +21,9 @@
 #include <xen/init.h>
 #include <xen/lib.h>
 #include <xen/string.h>
+#include <asm/hvm/vmx/vmx.h>
 #include <asm/ipt.h>
+#include <asm/msr.h>
 
 /* ipt: Flag to enable Intel Processor Trace (default off). */
 unsigned int __read_mostly ipt_mode = IPT_MODE_OFF;
@@ -40,3 +42,102 @@ static int __init parse_ipt_params(const char *str)
 
     return 0;
 }
+
+static inline void ipt_load_msr(const struct ipt_ctx *ctx,
+                       unsigned int addr_range)
+{
+    unsigned int i;
+
+    wrmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
+    wrmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
+    wrmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
+    wrmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
+    for ( i = 0; i < addr_range; i++ )
+    {
+        wrmsrl(MSR_IA32_RTIT_ADDR_A(i), ctx->addr[i * 2]);
+        wrmsrl(MSR_IA32_RTIT_ADDR_B(i), ctx->addr[i * 2 + 1]);
+    }
+}
+
+static inline void ipt_save_msr(struct ipt_ctx *ctx, unsigned int addr_range)
+{
+    unsigned int i;
+
+    rdmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
+    rdmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
+    rdmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
+    rdmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
+    for ( i = 0; i < addr_range; i++ )
+    {
+        rdmsrl(MSR_IA32_RTIT_ADDR_A(i), ctx->addr[i * 2]);
+        rdmsrl(MSR_IA32_RTIT_ADDR_B(i), ctx->addr[i * 2 + 1]);
+    }
+}
+
+void ipt_guest_enter(struct vcpu *v)
+{
+    struct ipt_desc *ipt = v->arch.hvm_vmx.ipt_desc;
+
+    if ( !ipt )
+        return;
+
+    /*
+     * Need re-initialize the guest state of IA32_RTIT_CTL
+     * When this vcpu be scheduled to another Physical CPU.
+     * TBD: Performance optimization. Add a new item in
+     * struct ipt_desc to record the last pcpu, and check
+     * if this vcpu is scheduled to another pcpu here (like vpmu).
+     */
+    vmx_vmcs_enter(v);
+    __vmwrite(GUEST_IA32_RTIT_CTL, ipt->ipt_guest.ctl);
+    vmx_vmcs_exit(v);
+
+    if ( ipt->ipt_guest.ctl & RTIT_CTL_TRACEEN )
+        ipt_load_msr(&ipt->ipt_guest, ipt->addr_range);
+}
+
+void ipt_guest_exit(struct vcpu *v)
+{
+    struct ipt_desc *ipt = v->arch.hvm_vmx.ipt_desc;
+
+    if ( !ipt )
+        return;
+
+    if ( ipt->ipt_guest.ctl & RTIT_CTL_TRACEEN )
+        ipt_save_msr(&ipt->ipt_guest, ipt->addr_range);
+}
+
+int ipt_initialize(struct vcpu *v)
+{
+    struct ipt_desc *ipt = NULL;
+    unsigned int eax, tmp, addr_range;
+
+    if ( !cpu_has_ipt || (ipt_mode == IPT_MODE_OFF) ||
+         !(v->arch.hvm_vmx.secondary_exec_control & SECONDARY_EXEC_PT_USE_GPA) )
+        return 0;
+
+    if ( cpuid_eax(IPT_CPUID) == 0 )
+        return -EINVAL;
+
+    cpuid_count(IPT_CPUID, 1, &eax, &tmp, &tmp, &tmp);
+    addr_range = eax & IPT_ADDR_RANGE_MASK;
+    ipt = _xzalloc(sizeof(struct ipt_desc) + sizeof(uint64_t) * addr_range * 2,
+			__alignof(*ipt));
+    if ( !ipt )
+        return -ENOMEM;
+
+    ipt->addr_range = addr_range;
+    ipt->ipt_guest.output_mask = RTIT_OUTPUT_MASK_DEFAULT;
+    v->arch.hvm_vmx.ipt_desc = ipt;
+
+    return 0;
+}
+
+void ipt_destroy(struct vcpu *v)
+{
+    if ( v->arch.hvm_vmx.ipt_desc )
+    {
+        xfree(v->arch.hvm_vmx.ipt_desc);
+        v->arch.hvm_vmx.ipt_desc = NULL;
+    }
+}
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 9707514..060ab65 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -55,6 +55,7 @@
 #include <asm/hvm/nestedhvm.h>
 #include <asm/altp2m.h>
 #include <asm/event.h>
+#include <asm/ipt.h>
 #include <asm/mce.h>
 #include <asm/monitor.h>
 #include <public/arch-x86/cpuid.h>
@@ -466,11 +467,16 @@ static int vmx_vcpu_initialise(struct vcpu *v)
     if ( v->vcpu_id == 0 )
         v->arch.user_regs.rax = 1;
 
+    rc = ipt_initialize(v);
+    if ( rc )
+        dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor Trace.\n", v);
+
     return 0;
 }
 
 static void vmx_vcpu_destroy(struct vcpu *v)
 {
+    ipt_destroy(v);
     /*
      * There are cases that domain still remains in log-dirty mode when it is
      * about to be destroyed (ex, user types 'xl destroy <dom>'), in which case
@@ -3508,6 +3514,8 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
     __vmread(GUEST_RSP,    &regs->rsp);
     __vmread(GUEST_RFLAGS, &regs->rflags);
 
+    ipt_guest_exit(v);
+
     hvm_invalidate_regs_fields(regs);
 
     if ( paging_mode_hap(v->domain) )
@@ -4281,6 +4289,8 @@ bool vmx_vmenter_helper(const struct cpu_user_regs *regs)
     }
 
  out:
+    ipt_guest_enter(curr);
+
     if ( unlikely(curr->arch.hvm_vmx.lbr_fixup_enabled) )
         lbr_fixup();
 
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index 2990992..2388e27 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -174,6 +174,8 @@ struct arch_vmx_struct {
      * pCPU and wakeup the related vCPU.
      */
     struct pi_blocking_vcpu pi_blocking;
+
+    struct ipt_desc      *ipt_desc;
 };
 
 int vmx_create_vmcs(struct vcpu *v);
@@ -421,6 +423,7 @@ enum vmcs_field {
     GUEST_PDPTE0                    = 0x0000280a,
 #define GUEST_PDPTE(n) (GUEST_PDPTE0 + (n) * 2) /* n = 0...3 */
     GUEST_BNDCFGS                   = 0x00002812,
+    GUEST_IA32_RTIT_CTL             = 0x00002814,
     HOST_PAT                        = 0x00002c00,
     HOST_EFER                       = 0x00002c02,
     HOST_PERF_GLOBAL_CTRL           = 0x00002c04,
diff --git a/xen/include/asm-x86/ipt.h b/xen/include/asm-x86/ipt.h
index 65b064c..a69f049 100644
--- a/xen/include/asm-x86/ipt.h
+++ b/xen/include/asm-x86/ipt.h
@@ -26,6 +26,29 @@
 
 #define IPT_CPUID           0x00000014
 
+#define IPT_ADDR_RANGE_MASK         0x00000007
+#define RTIT_OUTPUT_MASK_DEFAULT    0x0000007f
+
 extern unsigned int ipt_mode;
 
+struct ipt_ctx {
+    uint64_t ctl;
+    uint64_t status;
+    uint64_t output_base;
+    uint64_t output_mask;
+    uint64_t cr3_match;
+    uint64_t addr[0];
+};
+
+struct ipt_desc {
+    unsigned int addr_range;
+    struct ipt_ctx ipt_guest;
+};
+
+extern void ipt_guest_enter(struct vcpu *v);
+extern void ipt_guest_exit(struct vcpu *v);
+
+extern int ipt_initialize(struct vcpu *v);
+extern void ipt_destroy(struct vcpu *v);
+
 #endif /* __ASM_X86_HVM_IPT_H_ */
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v2 06/10] x86: Introduce a new function to get capability of Intel PT
  2018-05-30 13:27 [PATCH v2 00/10] Intel Processor Trace virtulization enabling Luwei Kang
                   ` (4 preceding siblings ...)
  2018-05-30 13:27 ` [PATCH v2 05/10] x86: Implement Intel Processor Trace context switch Luwei Kang
@ 2018-05-30 13:28 ` Luwei Kang
  2018-06-29 14:35   ` Jan Beulich
  2018-05-30 13:28 ` [PATCH v2 07/10] x86: Add Intel Processor Trace MSRs read/write emulation Luwei Kang
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 50+ messages in thread
From: Luwei Kang @ 2018-05-30 13:28 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, sstabellini, wei.liu2, jun.nakajima, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, julien.grall, jbeulich,
	Luwei Kang

Introduce a new function to check if a specific capability
of Intel Processor Trace is exists.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 xen/arch/x86/cpu/ipt.c    | 63 +++++++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/ipt.h | 19 ++++++++++++++
 2 files changed, 82 insertions(+)

diff --git a/xen/arch/x86/cpu/ipt.c b/xen/arch/x86/cpu/ipt.c
index b81a155..977a3d7 100644
--- a/xen/arch/x86/cpu/ipt.c
+++ b/xen/arch/x86/cpu/ipt.c
@@ -25,11 +25,74 @@
 #include <asm/ipt.h>
 #include <asm/msr.h>
 
+#define EAX 0
+#define ECX 1
+#define EDX 2
+#define EBX 3
+#define CPUID_REGS_NUM   4 /* number of regsters (eax, ebx, ecx, edx) */
+
+#define BIT(nr)                 (1UL << (nr))
+
 /* ipt: Flag to enable Intel Processor Trace (default off). */
 unsigned int __read_mostly ipt_mode = IPT_MODE_OFF;
 static int parse_ipt_params(const char *str);
 custom_param("ipt", parse_ipt_params);
 
+#define IPT_CAP(_n, _l, _r, _m)                               \
+    [IPT_CAP_ ## _n] = { .name = __stringify(_n), .leaf = _l, \
+        .reg = _r, .mask = _m }
+
+static struct ipt_cap_desc {
+    const char    *name;
+    unsigned int  leaf;
+    unsigned char reg;
+    unsigned int  mask;
+} ipt_caps[] = {
+    IPT_CAP(max_subleaf,            0, EAX, 0xffffffff),
+    IPT_CAP(cr3_filter,             0, EBX, BIT(0)),
+    IPT_CAP(psb_cyc,                0, EBX, BIT(1)),
+    IPT_CAP(ip_filter,              0, EBX, BIT(2)),
+    IPT_CAP(mtc,                    0, EBX, BIT(3)),
+    IPT_CAP(ptwrite,                0, EBX, BIT(4)),
+    IPT_CAP(power_event,            0, EBX, BIT(5)),
+    IPT_CAP(topa_output,            0, ECX, BIT(0)),
+    IPT_CAP(topa_multi_entry,       0, ECX, BIT(1)),
+    IPT_CAP(single_range_output,    0, ECX, BIT(2)),
+    IPT_CAP(output_subsys,          0, ECX, BIT(3)),
+    IPT_CAP(payloads_lip,           0, ECX, BIT(31)),
+    IPT_CAP(addr_range,             1, EAX, 0x7),
+    IPT_CAP(mtc_period,             1, EAX, 0xffff0000),
+    IPT_CAP(cycle_threshold,        1, EBX, 0xffff),
+    IPT_CAP(psb_freq,               1, EBX, 0xffff0000),
+};
+
+static unsigned int ipt_cap(const struct cpuid_leaf *cpuid_ipt, enum ipt_cap cap)
+{
+    const struct ipt_cap_desc *cd = &ipt_caps[cap];
+    unsigned int shift = ffs(cd->mask) - 1;
+    unsigned int val = 0;
+
+    cpuid_ipt += cd->leaf;
+
+    switch ( cd->reg )
+    {
+    case EAX:
+        val = cpuid_ipt->a;
+        break;
+    case EBX:
+        val = cpuid_ipt->b;
+        break;
+    case ECX:
+        val = cpuid_ipt->c;
+        break;
+    case EDX:
+        val = cpuid_ipt->d;
+        break;
+    }
+
+    return (val & cd->mask) >> shift;
+}
+
 static int __init parse_ipt_params(const char *str)
 {
     if ( !strcmp("guest", str) )
diff --git a/xen/include/asm-x86/ipt.h b/xen/include/asm-x86/ipt.h
index a69f049..422f46a 100644
--- a/xen/include/asm-x86/ipt.h
+++ b/xen/include/asm-x86/ipt.h
@@ -31,6 +31,25 @@
 
 extern unsigned int ipt_mode;
 
+enum ipt_cap {
+    IPT_CAP_max_subleaf = 0,
+    IPT_CAP_cr3_filter,
+    IPT_CAP_psb_cyc,
+    IPT_CAP_ip_filter,
+    IPT_CAP_mtc,
+    IPT_CAP_ptwrite,
+    IPT_CAP_power_event,
+    IPT_CAP_topa_output,
+    IPT_CAP_topa_multi_entry,
+    IPT_CAP_single_range_output,
+    IPT_CAP_output_subsys,
+    IPT_CAP_payloads_lip,
+    IPT_CAP_addr_range,
+    IPT_CAP_mtc_period,
+    IPT_CAP_cycle_threshold,
+    IPT_CAP_psb_freq,
+};
+
 struct ipt_ctx {
     uint64_t ctl;
     uint64_t status;
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v2 07/10] x86: Add Intel Processor Trace MSRs read/write emulation
  2018-05-30 13:27 [PATCH v2 00/10] Intel Processor Trace virtulization enabling Luwei Kang
                   ` (5 preceding siblings ...)
  2018-05-30 13:28 ` [PATCH v2 06/10] x86: Introduce a new function to get capability of Intel PT Luwei Kang
@ 2018-05-30 13:28 ` Luwei Kang
  2018-06-29 14:46   ` Jan Beulich
  2018-05-30 13:28 ` [PATCH v2 08/10] x86: Introduce a function to check the value of RTIT_CTL Luwei Kang
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 50+ messages in thread
From: Luwei Kang @ 2018-05-30 13:28 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, sstabellini, wei.liu2, jun.nakajima, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, julien.grall, jbeulich,
	Luwei Kang

Add Intel Processor Trace MSRs read/write emulation.
If Intel Processor Trace is not supported in guest,
access not supported MSRs or access reserved bits,
a #GP will be injected to guest.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 xen/arch/x86/cpu/ipt.c     | 108 +++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/vmx/vmx.c |  18 ++++++++
 xen/include/asm-x86/ipt.h  |   3 ++
 3 files changed, 129 insertions(+)

diff --git a/xen/arch/x86/cpu/ipt.c b/xen/arch/x86/cpu/ipt.c
index 977a3d7..dcb7a8d 100644
--- a/xen/arch/x86/cpu/ipt.c
+++ b/xen/arch/x86/cpu/ipt.c
@@ -33,6 +33,14 @@
 
 #define BIT(nr)                 (1UL << (nr))
 
+#define MSR_IA32_RTIT_STATUS_MASK (~(RTIT_STATUS_FILTER_EN | \
+               RTIT_STATUS_CONTEXT_EN | RTIT_STATUS_TRIGGER_EN | \
+               RTIT_STATUS_ERROR | RTIT_STATUS_STOPPED | \
+               RTIT_STATUS_BYTECNT))
+
+#define MSR_IA32_RTIT_OUTPUT_BASE_MASK(maxphyaddr) \
+               (~((1UL << (maxphyaddr)) - 1) | 0x7f)
+
 /* ipt: Flag to enable Intel Processor Trace (default off). */
 unsigned int __read_mostly ipt_mode = IPT_MODE_OFF;
 static int parse_ipt_params(const char *str);
@@ -106,6 +114,105 @@ static int __init parse_ipt_params(const char *str)
     return 0;
 }
 
+int ipt_do_rdmsr(unsigned int msr, uint64_t *msr_content)
+{
+    const struct ipt_desc *ipt_desc = current->arch.hvm_vmx.ipt_desc;
+    const struct cpuid_policy *p = current->domain->arch.cpuid;
+    unsigned int index;
+
+    if ( !ipt_desc )
+        return 1;
+
+    switch ( msr )
+    {
+    case MSR_IA32_RTIT_CTL:
+        *msr_content = ipt_desc->ipt_guest.ctl;
+        break;
+    case MSR_IA32_RTIT_STATUS:
+        *msr_content = ipt_desc->ipt_guest.status;
+        break;
+    case MSR_IA32_RTIT_OUTPUT_BASE:
+        if ( !ipt_cap(p->ipt.raw, IPT_CAP_single_range_output) &&
+             !ipt_cap(p->ipt.raw, IPT_CAP_topa_output) )
+            return 1;
+        *msr_content = ipt_desc->ipt_guest.output_base;
+        break;
+    case MSR_IA32_RTIT_OUTPUT_MASK:
+        if ( !ipt_cap(p->ipt.raw, IPT_CAP_single_range_output) &&
+             !ipt_cap(p->ipt.raw, IPT_CAP_topa_output) )
+            return 1;
+        *msr_content = ipt_desc->ipt_guest.output_mask |
+                                    RTIT_OUTPUT_MASK_DEFAULT;
+        break;
+    case MSR_IA32_RTIT_CR3_MATCH:
+        if ( !ipt_cap(p->ipt.raw, IPT_CAP_cr3_filter) )
+            return 1;
+        *msr_content = ipt_desc->ipt_guest.cr3_match;
+        break;
+    default:
+	index = msr - MSR_IA32_RTIT_ADDR_A(0);
+        if ( index >= ipt_cap(p->ipt.raw, IPT_CAP_addr_range) * 2 )
+            return 1;
+        *msr_content = ipt_desc->ipt_guest.addr[index];
+    }
+
+    return 0;
+}
+
+int ipt_do_wrmsr(unsigned int msr, uint64_t msr_content)
+{
+    struct ipt_desc *ipt_desc = current->arch.hvm_vmx.ipt_desc;
+    const struct cpuid_policy *p = current->domain->arch.cpuid;
+    unsigned int index;
+
+    if ( !ipt_desc )
+        return 1;
+
+    switch ( msr )
+    {
+    case MSR_IA32_RTIT_CTL:
+        ipt_desc->ipt_guest.ctl = msr_content;
+        __vmwrite(GUEST_IA32_RTIT_CTL, msr_content);
+        break;
+    case MSR_IA32_RTIT_STATUS:
+        if ( (ipt_desc->ipt_guest.ctl & RTIT_CTL_TRACEEN) ||
+             (msr_content & MSR_IA32_RTIT_STATUS_MASK) )
+            return 1;
+        ipt_desc->ipt_guest.status = msr_content;
+        break;
+    case MSR_IA32_RTIT_OUTPUT_BASE:
+        if ( (ipt_desc->ipt_guest.ctl & RTIT_CTL_TRACEEN) ||
+             (msr_content &
+                 MSR_IA32_RTIT_OUTPUT_BASE_MASK(p->extd.maxphysaddr)) ||
+             (!ipt_cap(p->ipt.raw, IPT_CAP_single_range_output) &&
+              !ipt_cap(p->ipt.raw, IPT_CAP_topa_output)) )
+            return 1;
+        ipt_desc->ipt_guest.output_base = msr_content;
+        break;
+    case MSR_IA32_RTIT_OUTPUT_MASK:
+        if ( (ipt_desc->ipt_guest.ctl & RTIT_CTL_TRACEEN) ||
+             (!ipt_cap(p->ipt.raw, IPT_CAP_single_range_output) &&
+              !ipt_cap(p->ipt.raw, IPT_CAP_topa_output)) )
+            return 1;
+        ipt_desc->ipt_guest.output_mask = msr_content |
+                                RTIT_OUTPUT_MASK_DEFAULT;
+        break;
+    case MSR_IA32_RTIT_CR3_MATCH:
+        if ( (ipt_desc->ipt_guest.ctl & RTIT_CTL_TRACEEN) ||
+             !ipt_cap(p->ipt.raw, IPT_CAP_cr3_filter) )
+            return 1;
+        ipt_desc->ipt_guest.cr3_match = msr_content;
+        break;
+    default:
+        index = msr - MSR_IA32_RTIT_ADDR_A(0);
+        if ( index >= ipt_cap(p->ipt.raw, IPT_CAP_addr_range) * 2 )
+            return 1;
+        ipt_desc->ipt_guest.addr[index] = msr_content;
+    }
+
+    return 0;
+}
+
 static inline void ipt_load_msr(const struct ipt_ctx *ctx,
                        unsigned int addr_range)
 {
@@ -204,3 +311,4 @@ void ipt_destroy(struct vcpu *v)
         v->arch.hvm_vmx.ipt_desc = NULL;
     }
 }
+
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 060ab65..fa1ca0c 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2898,6 +2898,15 @@ static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
         if ( vpmu_do_rdmsr(msr, msr_content) )
             goto gp_fault;
         break;
+    case MSR_IA32_RTIT_CTL:
+    case MSR_IA32_RTIT_STATUS:
+    case MSR_IA32_RTIT_OUTPUT_BASE:
+    case MSR_IA32_RTIT_OUTPUT_MASK:
+    case MSR_IA32_RTIT_CR3_MATCH:
+    case MSR_IA32_RTIT_ADDR_A(0) ... MSR_IA32_RTIT_ADDR_B(3):
+        if ( ipt_do_rdmsr(msr, msr_content) )
+            goto gp_fault;
+        break;
 
     default:
         if ( passive_domain_do_rdmsr(msr, msr_content) )
@@ -3148,6 +3157,15 @@ static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content)
          if ( vpmu_do_wrmsr(msr, msr_content, 0) )
             goto gp_fault;
         break;
+    case MSR_IA32_RTIT_CTL:
+    case MSR_IA32_RTIT_STATUS:
+    case MSR_IA32_RTIT_OUTPUT_BASE:
+    case MSR_IA32_RTIT_OUTPUT_MASK:
+    case MSR_IA32_RTIT_CR3_MATCH:
+    case MSR_IA32_RTIT_ADDR_A(0) ... MSR_IA32_RTIT_ADDR_B(3):
+        if ( ipt_do_wrmsr(msr, msr_content) )
+            goto gp_fault;
+        break;
 
     default:
         if ( passive_domain_do_wrmsr(msr, msr_content) )
diff --git a/xen/include/asm-x86/ipt.h b/xen/include/asm-x86/ipt.h
index 422f46a..961de0b 100644
--- a/xen/include/asm-x86/ipt.h
+++ b/xen/include/asm-x86/ipt.h
@@ -64,6 +64,9 @@ struct ipt_desc {
     struct ipt_ctx ipt_guest;
 };
 
+extern int ipt_do_rdmsr(unsigned int msr, uint64_t *pdata);
+extern int ipt_do_wrmsr(unsigned int msr, uint64_t data);
+
 extern void ipt_guest_enter(struct vcpu *v);
 extern void ipt_guest_exit(struct vcpu *v);
 
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v2 08/10] x86: Introduce a function to check the value of RTIT_CTL
  2018-05-30 13:27 [PATCH v2 00/10] Intel Processor Trace virtulization enabling Luwei Kang
                   ` (6 preceding siblings ...)
  2018-05-30 13:28 ` [PATCH v2 07/10] x86: Add Intel Processor Trace MSRs read/write emulation Luwei Kang
@ 2018-05-30 13:28 ` Luwei Kang
  2018-06-29 14:56   ` Jan Beulich
  2018-05-30 13:28 ` [PATCH v2 09/10] x86: Disable Intel Processor Trace when VMXON in L1 guest Luwei Kang
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 50+ messages in thread
From: Luwei Kang @ 2018-05-30 13:28 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, sstabellini, wei.liu2, jun.nakajima, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, julien.grall, jbeulich,
	Luwei Kang

Any attempt to modify IA32_RTIT_CTL while TraceEn is set will
result in a #GP unless the same write also clears TraceEn.
Writes to IA32_RTIT_CTL that do not modify any bits will not
cause a #GP, even if TraceEn remains set.
MSR write that attempts to change bits marked reserved, or
utilize encodings marked reserved, will cause a #GP fault.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 xen/arch/x86/cpu/ipt.c | 110 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 110 insertions(+)

diff --git a/xen/arch/x86/cpu/ipt.c b/xen/arch/x86/cpu/ipt.c
index dcb7a8d..fd75a01 100644
--- a/xen/arch/x86/cpu/ipt.c
+++ b/xen/arch/x86/cpu/ipt.c
@@ -114,6 +114,114 @@ static int __init parse_ipt_params(const char *str)
     return 0;
 }
 
+static int rtit_ctl_check(uint64_t new, uint64_t old)
+{
+    const struct cpuid_policy *p = current->domain->arch.cpuid;
+    const struct ipt_desc *ipt_desc = current->arch.hvm_vmx.ipt_desc;
+    uint64_t rtit_ctl_mask = ~((uint64_t)0);
+    unsigned int addr_range = ipt_cap(p->ipt.raw, IPT_CAP_addr_range);
+    unsigned int val, i;
+
+    if  ( new == old )
+        return 0;
+
+    /* Clear no dependency bits */
+    rtit_ctl_mask = ~(RTIT_CTL_TRACEEN | RTIT_CTL_OS |
+                RTIT_CTL_USR | RTIT_CTL_TSC_EN | RTIT_CTL_DIS_RETC);
+
+    /* If CPUID.(EAX=14H,ECX=0):EBX[0]=1 CR3Filter can be set */
+    if ( ipt_cap(p->ipt.raw, IPT_CAP_cr3_filter) )
+        rtit_ctl_mask &= ~RTIT_CTL_CR3_FILTER;
+
+    /*
+     * If CPUID.(EAX=14H,ECX=0):EBX[1]=1 CYCEn, CycThresh and
+     * PSBFreq can be set
+     */
+    if ( ipt_cap(p->ipt.raw, IPT_CAP_psb_cyc) )
+        rtit_ctl_mask &= ~(RTIT_CTL_CYCEN |
+                RTIT_CTL_CYC_THRESH | RTIT_CTL_PSB_FREQ);
+    /*
+     * If CPUID.(EAX=14H,ECX=0):EBX[3]=1 MTCEn BranchEn and
+     * MTCFreq can be set
+     */
+    if ( ipt_cap(p->ipt.raw, IPT_CAP_mtc) )
+        rtit_ctl_mask &= ~(RTIT_CTL_MTC_EN |
+                RTIT_CTL_BRANCH_EN | RTIT_CTL_MTC_FREQ);
+
+    /* If CPUID.(EAX=14H,ECX=0):EBX[4]=1 FUPonPTW and PTWEn can be set */
+    if ( ipt_cap(p->ipt.raw, IPT_CAP_ptwrite) )
+        rtit_ctl_mask &= ~(RTIT_CTL_FUP_ON_PTW |
+                                        RTIT_CTL_PTW_EN);
+
+    /* If CPUID.(EAX=14H,ECX=0):EBX[5]=1 PwrEvEn can be set */
+    if ( ipt_cap(p->ipt.raw, IPT_CAP_power_event) )
+        rtit_ctl_mask &= ~RTIT_CTL_PWR_EVT_EN;
+
+    /* If CPUID.(EAX=14H,ECX=0):ECX[0]=1 ToPA can be set */
+    if ( ipt_cap(p->ipt.raw, IPT_CAP_topa_output) )
+        rtit_ctl_mask &= ~RTIT_CTL_TOPA;
+    /* If CPUID.(EAX=14H,ECX=0):ECX[3]=1 FabircEn can be set */
+    if ( ipt_cap(p->ipt.raw, IPT_CAP_output_subsys))
+        rtit_ctl_mask &= ~RTIT_CTL_FABRIC_EN;
+    /* unmask address range configure area */
+    for (i = 0; i < addr_range; i++)
+        rtit_ctl_mask &= ~(0xf << (32 + i * 4));
+
+    /*
+     * Any MSR write that attempts to change bits marked reserved will
+     * case a #GP fault.
+     */
+    if ( new & rtit_ctl_mask )
+        return 1;
+
+    /*
+     * Any attempt to modify IA32_RTIT_CTL while TraceEn is set will
+     * result in a #GP unless the same write also clears TraceEn.
+     */
+    if ( (ipt_desc->ipt_guest.ctl & RTIT_CTL_TRACEEN) &&
+        ((ipt_desc->ipt_guest.ctl ^ new) & ~RTIT_CTL_TRACEEN) )
+        return 1;
+
+    /*
+     * WRMSR to IA32_RTIT_CTL that sets TraceEn but clears this bit
+     * and FabricEn would cause #GP, if
+     * CPUID.(EAX=14H, ECX=0):ECX.SNGLRGNOUT[bit 2] = 0
+     */
+   if ( (new & RTIT_CTL_TRACEEN) && !(new & RTIT_CTL_TOPA) &&
+        !(new & RTIT_CTL_FABRIC_EN) &&
+        !ipt_cap(p->ipt.raw, IPT_CAP_single_range_output) )
+        return 1;
+    /*
+     * MTCFreq, CycThresh and PSBFreq encodings check, any MSR write that
+     * utilize encodings marked reserved will casue a #GP fault.
+     */
+    val = ipt_cap(p->ipt.raw, IPT_CAP_mtc_period);
+    if ( ipt_cap(p->ipt.raw, IPT_CAP_mtc) &&
+                !test_bit((new & RTIT_CTL_MTC_FREQ) >>
+                RTIT_CTL_MTC_FREQ_OFFSET, &val) )
+        return 1;
+    val = ipt_cap(p->ipt.raw, IPT_CAP_cycle_threshold);
+    if ( ipt_cap(p->ipt.raw, IPT_CAP_psb_cyc) &&
+                !test_bit((new & RTIT_CTL_CYC_THRESH) >>
+                RTIT_CTL_CYC_THRESH_OFFSET, &val) )
+        return 1;
+    val = ipt_cap(p->ipt.raw, IPT_CAP_psb_freq);
+    if ( ipt_cap(p->ipt.raw, IPT_CAP_psb_cyc) &&
+                !test_bit((new & RTIT_CTL_PSB_FREQ) >>
+                RTIT_CTL_PSB_FREQ_OFFSET, &val) )
+        return 1;
+
+    /*
+     * If ADDRx_CFG is reserved or the encodings is >2 will
+     * cause a #GP fault.
+     */
+    for (i = 0; i < addr_range; i++)
+        if ( ((new & RTIT_CTL_ADDR(i)) >> RTIT_CTL_ADDR_OFFSET(i)) > 2 )
+            return 1;
+
+    return 0;
+}
+
 int ipt_do_rdmsr(unsigned int msr, uint64_t *msr_content)
 {
     const struct ipt_desc *ipt_desc = current->arch.hvm_vmx.ipt_desc;
@@ -171,6 +279,8 @@ int ipt_do_wrmsr(unsigned int msr, uint64_t msr_content)
     switch ( msr )
     {
     case MSR_IA32_RTIT_CTL:
+        if ( rtit_ctl_check(msr_content, ipt_desc->ipt_guest.ctl) )
+            return 1;
         ipt_desc->ipt_guest.ctl = msr_content;
         __vmwrite(GUEST_IA32_RTIT_CTL, msr_content);
         break;
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v2 09/10] x86: Disable Intel Processor Trace when VMXON in L1 guest
  2018-05-30 13:27 [PATCH v2 00/10] Intel Processor Trace virtulization enabling Luwei Kang
                   ` (7 preceding siblings ...)
  2018-05-30 13:28 ` [PATCH v2 08/10] x86: Introduce a function to check the value of RTIT_CTL Luwei Kang
@ 2018-05-30 13:28 ` Luwei Kang
  2018-06-29 15:14   ` Jan Beulich
  2018-05-30 13:28 ` [PATCH v2 10/10] x86: Handle new asynchronous exit qualification Luwei Kang
  2018-05-30 15:14 ` [PATCH v2 00/10] Intel Processor Trace virtulization enabling Julien Grall
  10 siblings, 1 reply; 50+ messages in thread
From: Luwei Kang @ 2018-05-30 13:28 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, sstabellini, wei.liu2, jun.nakajima, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, julien.grall, jbeulich,
	Luwei Kang

Disable Intel Processor Trace VMX operation(IA32_VMX_MISC[bit 14] is 0)
in L1 guest. As mentioned in SDM, on these type of processors, execution
of the VMXON instruction will  clears IA32_RTIT_CTL.TraceEn and any
attempt to write IA32_RTIT_CTL causes a general-protection xception (#GP).

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 xen/arch/x86/hvm/vmx/vvmx.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index e97db33..30c7876 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -26,6 +26,7 @@
 #include <asm/hvm/vmx/vmx.h>
 #include <asm/hvm/vmx/vvmx.h>
 #include <asm/hvm/nestedhvm.h>
+#include <asm/ipt.h>
 
 static DEFINE_PER_CPU(u64 *, vvmcs_buf);
 
@@ -1519,6 +1520,14 @@ int nvmx_handle_vmxon(struct cpu_user_regs *regs)
     v->arch.hvm_vmx.launched = 0;
     vmsucceed(regs);
 
+    if ( v->arch.hvm_vmx.ipt_desc )
+    {
+        v->arch.hvm_vmx.ipt_desc->ipt_guest.ctl = 0;
+        vmx_vmcs_enter(current);
+        __vmwrite(GUEST_IA32_RTIT_CTL, 0);
+        vmx_vmcs_exit(current);
+    }
+
     return X86EMUL_OKAY;
 }
 
@@ -2143,8 +2152,8 @@ int nvmx_msr_read_intercept(unsigned int msr, u64 *msr_content)
         data = hvm_cr4_guest_valid_bits(d, false);
         break;
     case MSR_IA32_VMX_MISC:
-        /* Do not support CR3-target feature now */
-        data = host_data & ~VMX_MISC_CR3_TARGET;
+        /* Do not support CR3-target and PT VMX feature now */
+        data = host_data & ~(VMX_MISC_CR3_TARGET | VMX_MISC_PT_ENABLE);
         break;
     case MSR_IA32_VMX_EPT_VPID_CAP:
         data = nept_get_ept_vpid_cap();
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v2 10/10] x86: Handle new asynchronous exit qualification
  2018-05-30 13:27 [PATCH v2 00/10] Intel Processor Trace virtulization enabling Luwei Kang
                   ` (8 preceding siblings ...)
  2018-05-30 13:28 ` [PATCH v2 09/10] x86: Disable Intel Processor Trace when VMXON in L1 guest Luwei Kang
@ 2018-05-30 13:28 ` Luwei Kang
  2018-06-29 15:22   ` Jan Beulich
  2018-05-30 15:14 ` [PATCH v2 00/10] Intel Processor Trace virtulization enabling Julien Grall
  10 siblings, 1 reply; 50+ messages in thread
From: Luwei Kang @ 2018-05-30 13:28 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, sstabellini, wei.liu2, jun.nakajima, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, julien.grall, jbeulich,
	Luwei Kang

Using EPT to translate PT output addresses introduces the possibility of
taking events on PT output reads and writes. Event possibilities include
EPT violations, EPT misconfigurations, PML log-full VM exits, and APIC
access VM exits.
EPT violations:
 a. Intel PT buffer is a MMIO address in guest. Actually, it can be a
    MMIO address (SDM 35.2.6.1), but in order do not affect other
    passthrough/emulate device in guest. Ferbid use MMIO addr at present.
 b. Intel PT buffer is a RAM non-writable address. Don't need emulate
    and inject a #GP to guest.
 c. EPT table entry write protect for Live Migration. Do nothing and
    handled as usual.
EPT misconfiguration:
 Nothing to do.
PML log-full VM exits:
 Intel PT trace output a new page, this behavior will be recorded to
 PML page may cause PML log-FULL VM-exit. Nothing to do.
APIC access VM exits:
 PT output region shouldn't have overlap with 4KB APIC MMIO region as
 defined by the IA32_APIC_BASE (SDM 35.2.6.4) but no error for this
 case in hardware. Crash guest in hypervisor.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 xen/arch/x86/hvm/hvm.c            | 8 ++++++--
 xen/arch/x86/hvm/vmx/vmx.c        | 5 +++++
 xen/include/asm-x86/hvm/vmx/vmx.h | 8 +++++---
 xen/include/xen/mm.h              | 1 +
 4 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c23983c..7782160 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1712,7 +1712,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
     struct vcpu *curr = current;
     struct domain *currd = curr->domain;
     struct p2m_domain *p2m, *hostp2m;
-    int rc, fall_through = 0, paged = 0;
+    int rc = 0, fall_through = 0, paged = 0;
     int sharing_enomem = 0;
     vm_event_request_t *req_ptr = NULL;
     bool_t ap2m_active, sync = 0;
@@ -1873,7 +1873,11 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
          (npfec.write_access &&
           (p2m_is_discard_write(p2mt) || (p2mt == p2m_ioreq_server))) )
     {
-        if ( !handle_mmio_with_translation(gla, gpa >> PAGE_SHIFT, npfec) )
+        /* Don't emulate and make guest crash when write to mmio address */
+        if ( npfec.async && (p2mt == p2m_mmio_dm) )
+            goto out_put_gfn;
+
+        if ( npfec.async || !handle_mmio_with_translation(gla, gpa >> PAGE_SHIFT, npfec) )
             hvm_inject_hw_exception(TRAP_gp_fault, 0);
         rc = 1;
         goto out_put_gfn;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index fa1ca0c..d0d00f8 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3253,6 +3253,7 @@ static void ept_handle_violation(ept_qual_t q, paddr_t gpa)
         .write_access = q.write,
         .insn_fetch = q.fetch,
         .present = q.eff_read || q.eff_write || q.eff_exec,
+        .async = q.async,
     };
 
     if ( tb_init_done )
@@ -4027,6 +4028,10 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
         break;
 
     case EXIT_REASON_APIC_ACCESS:
+        __vmread(EXIT_QUALIFICATION, &exit_qualification);
+        if ( exit_qualification & 0x10000 )
+            goto exit_and_crash;
+
         if ( !vmx_handle_eoi_write() && !handle_mmio() )
             hvm_inject_hw_exception(TRAP_gp_fault, 0);
         break;
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 89619e4..e7c5360 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -620,11 +620,13 @@ void vmx_pi_hooks_deassign(struct domain *d);
 typedef union ept_qual {
     unsigned long raw;
     struct {
-        bool read:1, write:1, fetch:1,
+        unsigned long read:1, write:1, fetch:1,
             eff_read:1, eff_write:1, eff_exec:1, /* eff_user_exec */:1,
             gla_valid:1,
-            gla_fault:1; /* Valid iff gla_valid. */
-        unsigned long /* pad */:55;
+            gla_fault:1, /* Valid iff gla_valid. */
+            :7,
+            async:1; /* Asynchronous to Instruction Execution (e.g. ipt) */
+        unsigned long /* pad */:47;
     };
 } __transparent__ ept_qual_t;
 
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index e928551..1546d4f 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -228,6 +228,7 @@ struct npfec {
     unsigned int present:1;
     unsigned int gla_valid:1;
     unsigned int kind:2;  /* npfec_kind_t */
+    unsigned int async:1;
 };
 
 /* memflags: */
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 00/10] Intel Processor Trace virtulization enabling
  2018-05-30 13:27 [PATCH v2 00/10] Intel Processor Trace virtulization enabling Luwei Kang
                   ` (9 preceding siblings ...)
  2018-05-30 13:28 ` [PATCH v2 10/10] x86: Handle new asynchronous exit qualification Luwei Kang
@ 2018-05-30 15:14 ` Julien Grall
  2018-05-30 23:29   ` Kang, Luwei
  10 siblings, 1 reply; 50+ messages in thread
From: Julien Grall @ 2018-05-30 15:14 UTC (permalink / raw)
  To: Luwei Kang, xen-devel
  Cc: kevin.tian, sstabellini, wei.liu2, jun.nakajima, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich

Hi,

Can you please avoid CC everyone on each patch? You can use 
scripts/get_maintainers.pl on each patch to see who is required to be CCed.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 00/10] Intel Processor Trace virtulization enabling
  2018-05-30 15:14 ` [PATCH v2 00/10] Intel Processor Trace virtulization enabling Julien Grall
@ 2018-05-30 23:29   ` Kang, Luwei
  2018-05-31  9:10     ` Julien Grall
  0 siblings, 1 reply; 50+ messages in thread
From: Kang, Luwei @ 2018-05-30 23:29 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Tian, Kevin, sstabellini, wei.liu2, Nakajima, Jun, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich

> -----Original Message-----
> From: Julien Grall [mailto:julien.grall@arm.com]
> Sent: Wednesday, May 30, 2018 11:15 PM
> To: Kang, Luwei <luwei.kang@intel.com>; xen-devel@lists.xen.org
> Cc: andrew.cooper3@citrix.com; George.Dunlap@eu.citrix.com; ian.jackson@eu.citrix.com; jbeulich@suse.com;
> konrad.wilk@oracle.com; sstabellini@kernel.org; tim@xen.org; wei.liu2@citrix.com; Nakajima, Jun <jun.nakajima@intel.com>; Tian,
> Kevin <kevin.tian@intel.com>
> Subject: Re: [PATCH v2 00/10] Intel Processor Trace virtulization enabling
> 
> Hi,
> 
> Can you please avoid CC everyone on each patch? You can use scripts/get_maintainers.pl on each patch to see who is required to be
> CCed.

OK, get it. I use script/get_maintainers.pl to get the people who need to be CC and indeed different patch may include different peoples. If somebody  just receive one patch of this patch set may feel a little strange and don't understand the context. So I CC all the peoples who is mentioned in this patch set.

> 
> Cheers,
> 
> --
> Julien Grall
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 00/10] Intel Processor Trace virtulization enabling
  2018-05-30 23:29   ` Kang, Luwei
@ 2018-05-31  9:10     ` Julien Grall
  2018-05-31  9:21       ` Kang, Luwei
  2018-06-01  7:49       ` Jan Beulich
  0 siblings, 2 replies; 50+ messages in thread
From: Julien Grall @ 2018-05-31  9:10 UTC (permalink / raw)
  To: Kang, Luwei, xen-devel
  Cc: Tian, Kevin, sstabellini, wei.liu2, Nakajima, Jun, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich

Hi,

On 31/05/18 00:29, Kang, Luwei wrote:
>> -----Original Message-----
>> From: Julien Grall [mailto:julien.grall@arm.com]
>> Sent: Wednesday, May 30, 2018 11:15 PM
>> To: Kang, Luwei <luwei.kang@intel.com>; xen-devel@lists.xen.org
>> Cc: andrew.cooper3@citrix.com; George.Dunlap@eu.citrix.com; ian.jackson@eu.citrix.com; jbeulich@suse.com;
>> konrad.wilk@oracle.com; sstabellini@kernel.org; tim@xen.org; wei.liu2@citrix.com; Nakajima, Jun <jun.nakajima@intel.com>; Tian,
>> Kevin <kevin.tian@intel.com>
>> Subject: Re: [PATCH v2 00/10] Intel Processor Trace virtulization enabling
>>
>> Hi,
>>
>> Can you please avoid CC everyone on each patch? You can use scripts/get_maintainers.pl on each patch to see who is required to be
>> CCed.
> 
> OK, get it. I use script/get_maintainers.pl to get the people who need to be CC and indeed different patch may include different peoples. If somebody  just receive one patch of this patch set may feel a little strange and don't understand the context. So I CC all the peoples who is mentioned in this patch set.

That's usually why I CC everyone on the cover letter. Then for each 
patch I CC only the necessary person.

This avoids maintainers to have to look for what they should review/ack.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 00/10] Intel Processor Trace virtulization enabling
  2018-05-31  9:10     ` Julien Grall
@ 2018-05-31  9:21       ` Kang, Luwei
  2018-06-01  7:49       ` Jan Beulich
  1 sibling, 0 replies; 50+ messages in thread
From: Kang, Luwei @ 2018-05-31  9:21 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Tian, Kevin, sstabellini, wei.liu2, Nakajima, Jun, George.Dunlap,
	andrew.cooper3, ian.jackson, tim, jbeulich

> >> -----Original Message-----
> >> From: Julien Grall [mailto:julien.grall@arm.com]
> >> Sent: Wednesday, May 30, 2018 11:15 PM
> >> To: Kang, Luwei <luwei.kang@intel.com>; xen-devel@lists.xen.org
> >> Cc: andrew.cooper3@citrix.com; George.Dunlap@eu.citrix.com;
> >> ian.jackson@eu.citrix.com; jbeulich@suse.com; konrad.wilk@oracle.com;
> >> sstabellini@kernel.org; tim@xen.org; wei.liu2@citrix.com; Nakajima,
> >> Jun <jun.nakajima@intel.com>; Tian, Kevin <kevin.tian@intel.com>
> >> Subject: Re: [PATCH v2 00/10] Intel Processor Trace virtulization
> >> enabling
> >>
> >> Hi,
> >>
> >> Can you please avoid CC everyone on each patch? You can use
> >> scripts/get_maintainers.pl on each patch to see who is required to be CCed.
> >
> > OK, get it. I use script/get_maintainers.pl to get the people who need to be CC and indeed different patch may include different
> peoples. If somebody  just receive one patch of this patch set may feel a little strange and don't understand the context. So I CC all
> the peoples who is mentioned in this patch set.
> 
> That's usually why I CC everyone on the cover letter. Then for each patch I CC only the necessary person.
> 
> This avoids maintainers to have to look for what they should review/ack.

Get it. This is a good way to me.  Thanks.

Luwei Kang
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 00/10] Intel Processor Trace virtulization enabling
  2018-05-31  9:10     ` Julien Grall
  2018-05-31  9:21       ` Kang, Luwei
@ 2018-06-01  7:49       ` Jan Beulich
  1 sibling, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2018-06-01  7:49 UTC (permalink / raw)
  To: Julien Grall, Luwei Kang
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Jun Nakajima

>>> On 31.05.18 at 11:10, <julien.grall@arm.com> wrote:
> Hi,
> 
> On 31/05/18 00:29, Kang, Luwei wrote:
>>> -----Original Message-----
>>> From: Julien Grall [mailto:julien.grall@arm.com]
>>> Sent: Wednesday, May 30, 2018 11:15 PM
>>> To: Kang, Luwei <luwei.kang@intel.com>; xen-devel@lists.xen.org 
>>> Cc: andrew.cooper3@citrix.com; George.Dunlap@eu.citrix.com; 
> ian.jackson@eu.citrix.com; jbeulich@suse.com;
>>> konrad.wilk@oracle.com; sstabellini@kernel.org; tim@xen.org; 
> wei.liu2@citrix.com; Nakajima, Jun <jun.nakajima@intel.com>; Tian,
>>> Kevin <kevin.tian@intel.com>
>>> Subject: Re: [PATCH v2 00/10] Intel Processor Trace virtulization enabling
>>>
>>> Hi,
>>>
>>> Can you please avoid CC everyone on each patch? You can use scripts/get_maintainers.pl on each patch to see who is required to be
>>> CCed.
>> 
>> OK, get it. I use script/get_maintainers.pl to get the people who need to be 
> CC and indeed different patch may include different peoples. If somebody  
> just receive one patch of this patch set may feel a little strange and don't 
> understand the context. So I CC all the peoples who is mentioned in this 
> patch set.
> 
> That's usually why I CC everyone on the cover letter. Then for each 
> patch I CC only the necessary person.
> 
> This avoids maintainers to have to look for what they should review/ack.

Indeed, plus most people are subscribed to xen-devel anyway, so get a
copy of the other patches via the list. The default really should be to
avoid spamming people; if there are people wanting to see full series
(like George has told me he prefers), that can be taken care of individually.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 01/10] x86: add an flag to enable Intel Processor Trace in guest
  2018-05-30 13:27 ` [PATCH v2 01/10] x86: add an flag to enable Intel Processor Trace in guest Luwei Kang
@ 2018-06-28 14:11   ` Jan Beulich
  2018-07-03 10:18     ` Kang, Luwei
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2018-06-28 14:11 UTC (permalink / raw)
  To: Luwei Kang
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Jun Nakajima

>>> On 30.05.18 at 15:27, <luwei.kang@intel.com> wrote:
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1215,6 +1215,16 @@ Rather than only mapping RAM pages for IOMMU accesses for Dom0, with this
>  option all pages not marked as unusable in the E820 table will get a mapping
>  established.
>  
> +### ipt
> +> `= guest`
> +
> +> Default: `off`
> +
> +This option is use for switch on the Intel Processor Trace feature
> +in HVM guest when 'ipt=guest'. By default, this feature is disabled
> +in guest. Intel Processor Trace virtualization depend on
> +EPT, so it can only enabled in HVM guest at present.
> +
>  ### irq\_ratelimit (x86)

Did you not notice the (x86) here when re-basing?

> --- /dev/null
> +++ b/xen/arch/x86/cpu/ipt.c
> @@ -0,0 +1,42 @@
> +/*
> + * ipt.c: Support for Intel Processor Trace Virtualization.
> + *
> + * Copyright (c) 2018, Intel Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + *
> + * Author: Luwei Kang <luwei.kang@intel.com>
> + */
> +#include <xen/errno.h>
> +#include <xen/init.h>
> +#include <xen/lib.h>
> +#include <xen/string.h>
> +#include <asm/ipt.h>
> +
> +/* ipt: Flag to enable Intel Processor Trace (default off). */
> +unsigned int __read_mostly ipt_mode = IPT_MODE_OFF;
> +static int parse_ipt_params(const char *str);

I think it was pointed out before that the forward declaration can be
avoided if you move ...

> +custom_param("ipt", parse_ipt_params);

... this line ...

> +static int __init parse_ipt_params(const char *str)
> +{
> +    if ( !strcmp("guest", str) )
> +        ipt_mode = IPT_MODE_GUEST;
> +    else if ( str )
> +    {
> +        printk("Unknown Intel Processor Trace mode specified: '%s'\n", str);
> +        return -EINVAL;
> +    }
> +
> +    return 0;
> +}

... here.

> --- /dev/null
> +++ b/xen/include/asm-x86/ipt.h
> @@ -0,0 +1,29 @@
> +/*
> + * ipt.h: Intel Processor Trace virtualization for HVM domain.
> + *
> + * Copyright (c) 2018, Intel Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + *
> + * Author: Luwei Kang <luwei.kang@intel.com>
> + */
> +
> +#ifndef __ASM_X86_HVM_IPT_H_
> +#define __ASM_X86_HVM_IPT_H_
> +
> +#define IPT_MODE_OFF        0
> +#define IPT_MODE_GUEST      (1<<0)
> +
> +extern unsigned int ipt_mode;

At this point I can't see why the variable can't be bool. With the
patch being placed first in the series it is also impossible (without
peeking into later patches) to judge whether its __read_mostly
attribute is actually appropriate.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 03/10] x86: Add Intel Processor Trace support for cpuid
  2018-05-30 13:27 ` [PATCH v2 03/10] x86: Add Intel Processor Trace support for cpuid Luwei Kang
@ 2018-06-28 14:27   ` Jan Beulich
  2018-07-12  7:21     ` Kang, Luwei
  2018-06-29 15:17   ` Jan Beulich
  1 sibling, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2018-06-28 14:27 UTC (permalink / raw)
  To: Luwei Kang
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Jun Nakajima

>>> On 30.05.18 at 15:27, <luwei.kang@intel.com> wrote:
> @@ -759,12 +760,19 @@ int xc_cpuid_apply_policy(xc_interface *xch, uint32_t domid,
>                  continue;
>          }
>  
> +        if ( input[0] == 0x14 )
> +        {
> +            input[1]++;
> +            if ( input[1] == 1 )
> +                continue;
> +        }

Together with what's there and what iirc Andrew's series puts here,
this should become a switch() imo.

> @@ -583,7 +584,19 @@ void recalculate_cpuid_policy(struct domain *d)
>              __clear_bit(X86_FEATURE_VMX, max_fs);
>              __clear_bit(X86_FEATURE_SVM, max_fs);
>          }
> +
> +        /*
> +         * Hide Intel Processor trace feature when hardware not support
> +         * PT-VMX or ipt option is off.
> +         */
> +        if ( ipt_mode == IPT_MODE_OFF )
> +        {
> +            __clear_bit(X86_FEATURE_IPT, max_fs);
> +            zero_leaves(p->ipt.raw, 0, ARRAY_SIZE(p->ipt.raw) - 1);
> +        }

The clearing of bits in max_fs further up is needed here because this
varies depending on domain config. You, otoh, put a conditional here
which is not going to change post boot. This instead belongs into
calculate_hvm_max_policy() I believe.

> @@ -101,6 +102,10 @@ static int update_domain_cpuid_info(struct domain *d,
>              p->feat.raw[ctl->input[1]] = leaf;
>              break;
>  
> +        case IPT_CPUID:
> +            p->ipt.raw[ctl->input[1]] = leaf;
> +            break;

This lacks a bounds check of ctl->input[1] (in the earlier switch()).

> --- a/xen/include/asm-x86/cpufeature.h
> +++ b/xen/include/asm-x86/cpufeature.h
> @@ -102,6 +102,7 @@
>  #define cpu_has_mpx             boot_cpu_has(X86_FEATURE_MPX)
>  #define cpu_has_rdseed          boot_cpu_has(X86_FEATURE_RDSEED)
>  #define cpu_has_smap            boot_cpu_has(X86_FEATURE_SMAP)
> +#define cpu_has_ipt             boot_cpu_has(X86_FEATURE_IPT)

This definition is unused.

> --- a/xen/include/asm-x86/cpuid.h
> +++ b/xen/include/asm-x86/cpuid.h
> @@ -58,10 +58,11 @@ DECLARE_PER_CPU(struct cpuidmasks, cpuidmasks);
>  /* Default masking MSR values, calculated at boot. */
>  extern struct cpuidmasks cpuidmask_defaults;
>  
> -#define CPUID_GUEST_NR_BASIC      (0xdu + 1)
> +#define CPUID_GUEST_NR_BASIC      (0x14u + 1)

Is there anything to convince me that the intermediate leaves don't
need any further handling added anywhere? Same question btw for
the libxc side bumping of DEF_MAX_BASE.

> @@ -166,6 +167,15 @@ struct cpuid_policy
>          } comp[CPUID_GUEST_NR_XSTATE];
>      } xstate;
>  
> +    /* Structured feature leaf: 0x00000014[xx] */
> +    union {
> +        struct cpuid_leaf raw[CPUID_GUEST_NR_IPT];
> +        struct {
> +            /* Subleaf 0. */
> +            uint32_t max_subleaf;
> +        };
> +    } ipt;

In particular this looks to be placed earlier than it should be (in
other words I'm getting the impression that you failed to insert
some padding for the skipped leaves).

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 04/10] x86: Add Intel Processor Trace MSRs and bit definitions
  2018-05-30 13:27 ` [PATCH v2 04/10] x86: Add Intel Processor Trace MSRs and bit definitions Luwei Kang
@ 2018-06-28 14:44   ` Jan Beulich
  2018-07-03 10:18     ` Kang, Luwei
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2018-06-28 14:44 UTC (permalink / raw)
  To: Luwei Kang
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Jun Nakajima

>>> On 30.05.18 at 15:27, <luwei.kang@intel.com> wrote:
> --- a/xen/include/asm-x86/msr-index.h
> +++ b/xen/include/asm-x86/msr-index.h
> @@ -548,4 +548,41 @@
>  #define MSR_PKGC9_IRTL			0x00000634
>  #define MSR_PKGC10_IRTL			0x00000635
>  
> +/* Intel PT MSRs */
> +#define MSR_IA32_RTIT_CTL		0x00000570
> +#define RTIT_CTL_TRACEEN		(1ULL << 0)
> +#define RTIT_CTL_CYCEN			(1ULL << 1)
> +#define RTIT_CTL_OS			(1ULL << 2)
> +#define RTIT_CTL_USR			(1ULL << 3)
> +#define RTIT_CTL_PWR_EVT_EN		(1ULL << 4)
> +#define RTIT_CTL_FUP_ON_PTW		(1ULL << 5)
> +#define RTIT_CTL_FABRIC_EN		(1ULL << 6)
> +#define RTIT_CTL_CR3_FILTER		(1ULL << 7)
> +#define RTIT_CTL_TOPA			(1ULL << 8)
> +#define RTIT_CTL_MTC_EN			(1ULL << 9)
> +#define RTIT_CTL_TSC_EN			(1ULL << 10)
> +#define RTIT_CTL_DIS_RETC		(1ULL << 11)
> +#define RTIT_CTL_PTW_EN			(1ULL << 12)
> +#define RTIT_CTL_BRANCH_EN		(1ULL << 13)
> +#define RTIT_CTL_MTC_FREQ_OFFSET	14
> +#define RTIT_CTL_MTC_FREQ		(0x0fULL << RTIT_CTL_MTC_FREQ_OFFSET)

No duplicates like these please - with MASK_EXTR() / MASK_INSR()
having just the mask (and no offset/shift value) is sufficient.

> +#define RTIT_CTL_ADDR(n)		(0x0fULL << RTIT_CTL_ADDR_OFFSET(n))
> +#define MSR_IA32_RTIT_STATUS		0x00000571

Please add blank lines between one MSR (and its bits) and the next one.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 05/10] x86: Implement Intel Processor Trace context switch
  2018-05-30 13:27 ` [PATCH v2 05/10] x86: Implement Intel Processor Trace context switch Luwei Kang
@ 2018-06-29 14:12   ` Jan Beulich
  2018-07-03 10:18     ` Kang, Luwei
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2018-06-29 14:12 UTC (permalink / raw)
  To: Luwei Kang
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Jun Nakajima

>>> On 30.05.18 at 15:27, <luwei.kang@intel.com> wrote:
> @@ -40,3 +42,102 @@ static int __init parse_ipt_params(const char *str)
>  
>      return 0;
>  }
> +
> +static inline void ipt_load_msr(const struct ipt_ctx *ctx,
> +                       unsigned int addr_range)

Please let the compiler decide whether to inline such functions. The keyword
should only be used (with very few exceptions) in header files.

> +{
> +    unsigned int i;
> +
> +    wrmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
> +    wrmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
> +    wrmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
> +    wrmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
> +    for ( i = 0; i < addr_range; i++ )

Wouldn't "nr" or "nr_addr" be a better parameter name?

> +    {
> +        wrmsrl(MSR_IA32_RTIT_ADDR_A(i), ctx->addr[i * 2]);
> +        wrmsrl(MSR_IA32_RTIT_ADDR_B(i), ctx->addr[i * 2 + 1]);
> +    }
> +}
> +
> +static inline void ipt_save_msr(struct ipt_ctx *ctx, unsigned int addr_range)
> +{
> +    unsigned int i;
> +
> +    rdmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
> +    rdmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
> +    rdmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
> +    rdmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
> +    for ( i = 0; i < addr_range; i++ )
> +    {
> +        rdmsrl(MSR_IA32_RTIT_ADDR_A(i), ctx->addr[i * 2]);
> +        rdmsrl(MSR_IA32_RTIT_ADDR_B(i), ctx->addr[i * 2 + 1]);
> +    }
> +}

So you save/restore them not at context switch, but at VM entry/exit
time. This means the title is misleading. But it raises efficiency questions:
Is it really necessary to do it this often? In patch 7 you handle reads
and writes to the MSRs, but you don't disable the MSR intercepts (and
judging from their titles no other patch is a candidate where you might
do that). If all writes are seen by Xen, why would you need to read all
the MSRs here, when the majority is - afaict - not modified by hardware?

> +void ipt_guest_enter(struct vcpu *v)
> +{
> +    struct ipt_desc *ipt = v->arch.hvm_vmx.ipt_desc;

const

> +    if ( !ipt )
> +        return;

Would seem better to put the check outside the call, so no call would
be made at all in the common case.

> +    /*
> +     * Need re-initialize the guest state of IA32_RTIT_CTL
> +     * When this vcpu be scheduled to another Physical CPU.
> +     * TBD: Performance optimization. Add a new item in
> +     * struct ipt_desc to record the last pcpu, and check
> +     * if this vcpu is scheduled to another pcpu here (like vpmu).
> +     */
> +    vmx_vmcs_enter(v);
> +    __vmwrite(GUEST_IA32_RTIT_CTL, ipt->ipt_guest.ctl);
> +    vmx_vmcs_exit(v);

With the sole caller being vmx_vmenter_helper() there's no need
to vmx_vmcs_{enter,exit}() here afaict.

> +int ipt_initialize(struct vcpu *v)
> +{
> +    struct ipt_desc *ipt = NULL;

Pointless initializer.

> +    unsigned int eax, tmp, addr_range;
> +
> +    if ( !cpu_has_ipt || (ipt_mode == IPT_MODE_OFF) ||
> +         !(v->arch.hvm_vmx.secondary_exec_control & SECONDARY_EXEC_PT_USE_GPA) )

ipt_mode == IPT_MODE_OFF implies
!(v->arch.hvm_vmx.secondary_exec_control & SECONDARY_EXEC_PT_USE_GPA)
as per patch 2, so no need for the separate check here (an ASSERT() inside
the if() body would be fine). The same should perhaps, if not already the
case, be made true for !cpu_has_ipt.

> +        return 0;
> +
> +    if ( cpuid_eax(IPT_CPUID) == 0 )
> +        return -EINVAL;
> +
> +    cpuid_count(IPT_CPUID, 1, &eax, &tmp, &tmp, &tmp);
> +    addr_range = eax & IPT_ADDR_RANGE_MASK;

As per my remark further up - the use of "addr_range" should perhaps
be revisited throughout the patch/series.

> +    ipt = _xzalloc(sizeof(struct ipt_desc) + sizeof(uint64_t) * addr_range * 2,
> +			__alignof(*ipt));

Please don't effectively open-code xzalloc_bytes(). Also please use the
type of variables or expressions in scope instead of type names. And
please get indentation right (won't be visible below anymore). IOW

    ipt = xzalloc_bytes(sizeof(*ipt) + sizeof(ipt->addr_range[0]) * addr_range * 2);

> +void ipt_destroy(struct vcpu *v)
> +{
> +    if ( v->arch.hvm_vmx.ipt_desc )
> +    {
> +        xfree(v->arch.hvm_vmx.ipt_desc);
> +        v->arch.hvm_vmx.ipt_desc = NULL;
> +    }

Pointless if().

> @@ -466,11 +467,16 @@ static int vmx_vcpu_initialise(struct vcpu *v)
>      if ( v->vcpu_id == 0 )
>          v->arch.user_regs.rax = 1;
>  
> +    rc = ipt_initialize(v);
> +    if ( rc )
> +        dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor Trace.\n", v);

For such a message to be helpful, please also log rc. And no full stop in
log messages please (again with very few exceptions).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 06/10] x86: Introduce a new function to get capability of Intel PT
  2018-05-30 13:28 ` [PATCH v2 06/10] x86: Introduce a new function to get capability of Intel PT Luwei Kang
@ 2018-06-29 14:35   ` Jan Beulich
  2018-07-03 10:18     ` Kang, Luwei
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2018-06-29 14:35 UTC (permalink / raw)
  To: Luwei Kang
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Jun Nakajima

>>> On 30.05.18 at 15:28, <luwei.kang@intel.com> wrote:
> --- a/xen/arch/x86/cpu/ipt.c
> +++ b/xen/arch/x86/cpu/ipt.c
> @@ -25,11 +25,74 @@
>  #include <asm/ipt.h>
>  #include <asm/msr.h>
>  
> +#define EAX 0
> +#define ECX 1
> +#define EDX 2
> +#define EBX 3
> +#define CPUID_REGS_NUM   4 /* number of regsters (eax, ebx, ecx, edx) */
> +
> +#define BIT(nr)                 (1UL << (nr))

I don't particularly like any such pretty generic things to be added to individual
files, but I also have nothing better to suggest. But please add the missing i to
the comment.

> +#define IPT_CAP(_n, _l, _r, _m)                               \
> +    [IPT_CAP_ ## _n] = { .name = __stringify(_n), .leaf = _l, \
> +        .reg = _r, .mask = _m }
> +
> +static struct ipt_cap_desc {
> +    const char    *name;
> +    unsigned int  leaf;
> +    unsigned char reg;

I don't think leaf needs to be full 32 bits wide? Once shrunk by at least
two bits, the size of the overall structure could go down from 24 to 16
bytes.

> +    unsigned int  mask;
> +} ipt_caps[] = {
> +    IPT_CAP(max_subleaf,            0, EAX, 0xffffffff),
> +    IPT_CAP(cr3_filter,             0, EBX, BIT(0)),
> +    IPT_CAP(psb_cyc,                0, EBX, BIT(1)),
> +    IPT_CAP(ip_filter,              0, EBX, BIT(2)),
> +    IPT_CAP(mtc,                    0, EBX, BIT(3)),
> +    IPT_CAP(ptwrite,                0, EBX, BIT(4)),
> +    IPT_CAP(power_event,            0, EBX, BIT(5)),
> +    IPT_CAP(topa_output,            0, ECX, BIT(0)),
> +    IPT_CAP(topa_multi_entry,       0, ECX, BIT(1)),
> +    IPT_CAP(single_range_output,    0, ECX, BIT(2)),
> +    IPT_CAP(output_subsys,          0, ECX, BIT(3)),
> +    IPT_CAP(payloads_lip,           0, ECX, BIT(31)),
> +    IPT_CAP(addr_range,             1, EAX, 0x7),
> +    IPT_CAP(mtc_period,             1, EAX, 0xffff0000),
> +    IPT_CAP(cycle_threshold,        1, EBX, 0xffff),
> +    IPT_CAP(psb_freq,               1, EBX, 0xffff0000),
> +};

const?

> +static unsigned int ipt_cap(const struct cpuid_leaf *cpuid_ipt, enum ipt_cap cap)
> +{
> +    const struct ipt_cap_desc *cd = &ipt_caps[cap];
> +    unsigned int shift = ffs(cd->mask) - 1;

Do you really need this?

> +    unsigned int val = 0;
> +
> +    cpuid_ipt += cd->leaf;
> +
> +    switch ( cd->reg )
> +    {
> +    case EAX:
> +        val = cpuid_ipt->a;
> +        break;
> +    case EBX:
> +        val = cpuid_ipt->b;
> +        break;
> +    case ECX:
> +        val = cpuid_ipt->c;
> +        break;
> +    case EDX:
> +        val = cpuid_ipt->d;
> +        break;
> +    }
> +
> +    return (val & cd->mask) >> shift;

If all masks are indeed contiguous series of set bits, MASK_EXTR() can
be used here afaict.

> +}
> +
>  static int __init parse_ipt_params(const char *str)
>  {
>      if ( !strcmp("guest", str) )

So this is the end of the changes to this file, and the function you
introduce is static. I'm pretty sure compilers will warn about the unused
static, and hence the build will fail at this point of the series (due to
-Werror). I think you want to introduce the function together with its
first user.

> diff --git a/xen/include/asm-x86/ipt.h b/xen/include/asm-x86/ipt.h
> index a69f049..422f46a 100644
> --- a/xen/include/asm-x86/ipt.h
> +++ b/xen/include/asm-x86/ipt.h
> @@ -31,6 +31,25 @@
>  
>  extern unsigned int ipt_mode;
>  
> +enum ipt_cap {
> +    IPT_CAP_max_subleaf = 0,

Pointless value specification.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 07/10] x86: Add Intel Processor Trace MSRs read/write emulation
  2018-05-30 13:28 ` [PATCH v2 07/10] x86: Add Intel Processor Trace MSRs read/write emulation Luwei Kang
@ 2018-06-29 14:46   ` Jan Beulich
  2018-07-03 10:18     ` Kang, Luwei
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2018-06-29 14:46 UTC (permalink / raw)
  To: Luwei Kang
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Jun Nakajima

>>> On 30.05.18 at 15:28, <luwei.kang@intel.com> wrote:
> @@ -106,6 +114,105 @@ static int __init parse_ipt_params(const char *str)
>      return 0;
>  }
>  
> +int ipt_do_rdmsr(unsigned int msr, uint64_t *msr_content)
> +{
> +    const struct ipt_desc *ipt_desc = current->arch.hvm_vmx.ipt_desc;
> +    const struct cpuid_policy *p = current->domain->arch.cpuid;
> +    unsigned int index;
> +
> +    if ( !ipt_desc )
> +        return 1;
> +
> +    switch ( msr )
> +    {
> +    case MSR_IA32_RTIT_CTL:
> +        *msr_content = ipt_desc->ipt_guest.ctl;
> +        break;
> +    case MSR_IA32_RTIT_STATUS:
> +        *msr_content = ipt_desc->ipt_guest.status;
> +        break;
> +    case MSR_IA32_RTIT_OUTPUT_BASE:
> +        if ( !ipt_cap(p->ipt.raw, IPT_CAP_single_range_output) &&
> +             !ipt_cap(p->ipt.raw, IPT_CAP_topa_output) )
> +            return 1;
> +        *msr_content = ipt_desc->ipt_guest.output_base;
> +        break;
> +    case MSR_IA32_RTIT_OUTPUT_MASK:
> +        if ( !ipt_cap(p->ipt.raw, IPT_CAP_single_range_output) &&
> +             !ipt_cap(p->ipt.raw, IPT_CAP_topa_output) )
> +            return 1;
> +        *msr_content = ipt_desc->ipt_guest.output_mask |
> +                                    RTIT_OUTPUT_MASK_DEFAULT;
> +        break;
> +    case MSR_IA32_RTIT_CR3_MATCH:
> +        if ( !ipt_cap(p->ipt.raw, IPT_CAP_cr3_filter) )
> +            return 1;
> +        *msr_content = ipt_desc->ipt_guest.cr3_match;
> +        break;
> +    default:
> +	index = msr - MSR_IA32_RTIT_ADDR_A(0);

Hard tab. Also throughout both functions' switch() statements please
add blank lines between case blocks.

> +int ipt_do_wrmsr(unsigned int msr, uint64_t msr_content)
> +{
> +    struct ipt_desc *ipt_desc = current->arch.hvm_vmx.ipt_desc;
> +    const struct cpuid_policy *p = current->domain->arch.cpuid;
> +    unsigned int index;
> +
> +    if ( !ipt_desc )
> +        return 1;
> +
> +    switch ( msr )
> +    {
> +    case MSR_IA32_RTIT_CTL:
> +        ipt_desc->ipt_guest.ctl = msr_content;
> +        __vmwrite(GUEST_IA32_RTIT_CTL, msr_content);
> +        break;
> +    case MSR_IA32_RTIT_STATUS:
> +        if ( (ipt_desc->ipt_guest.ctl & RTIT_CTL_TRACEEN) ||
> +             (msr_content & MSR_IA32_RTIT_STATUS_MASK) )
> +            return 1;
> +        ipt_desc->ipt_guest.status = msr_content;
> +        break;
> +    case MSR_IA32_RTIT_OUTPUT_BASE:
> +        if ( (ipt_desc->ipt_guest.ctl & RTIT_CTL_TRACEEN) ||
> +             (msr_content &
> +                 MSR_IA32_RTIT_OUTPUT_BASE_MASK(p->extd.maxphysaddr)) ||

Indentation.

> +             (!ipt_cap(p->ipt.raw, IPT_CAP_single_range_output) &&
> +              !ipt_cap(p->ipt.raw, IPT_CAP_topa_output)) )
> +            return 1;
> +        ipt_desc->ipt_guest.output_base = msr_content;
> +        break;
> +    case MSR_IA32_RTIT_OUTPUT_MASK:
> +        if ( (ipt_desc->ipt_guest.ctl & RTIT_CTL_TRACEEN) ||
> +             (!ipt_cap(p->ipt.raw, IPT_CAP_single_range_output) &&
> +              !ipt_cap(p->ipt.raw, IPT_CAP_topa_output)) )
> +            return 1;
> +        ipt_desc->ipt_guest.output_mask = msr_content |
> +                                RTIT_OUTPUT_MASK_DEFAULT;

Again.

> +        break;
> +    case MSR_IA32_RTIT_CR3_MATCH:
> +        if ( (ipt_desc->ipt_guest.ctl & RTIT_CTL_TRACEEN) ||
> +             !ipt_cap(p->ipt.raw, IPT_CAP_cr3_filter) )
> +            return 1;
> +        ipt_desc->ipt_guest.cr3_match = msr_content;
> +        break;
> +    default:
> +        index = msr - MSR_IA32_RTIT_ADDR_A(0);
> +        if ( index >= ipt_cap(p->ipt.raw, IPT_CAP_addr_range) * 2 )
> +            return 1;
> +        ipt_desc->ipt_guest.addr[index] = msr_content;
> +    }

Please don't omit the "break;" above here (same in the other
function).

> +    return 0;
> +}

Both functions only ever return 0 or 1 - did you mean their return type
to be bool? And the perhaps better use true for success and false for
failure?

> @@ -204,3 +311,4 @@ void ipt_destroy(struct vcpu *v)
>          v->arch.hvm_vmx.ipt_desc = NULL;
>      }
>  }
> +

Stray addition of a blank line.

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2898,6 +2898,15 @@ static int vmx_msr_read_intercept(unsigned int msr, 
> uint64_t *msr_content)
>          if ( vpmu_do_rdmsr(msr, msr_content) )
>              goto gp_fault;
>          break;
> +    case MSR_IA32_RTIT_CTL:
> +    case MSR_IA32_RTIT_STATUS:
> +    case MSR_IA32_RTIT_OUTPUT_BASE:
> +    case MSR_IA32_RTIT_OUTPUT_MASK:
> +    case MSR_IA32_RTIT_CR3_MATCH:
> +    case MSR_IA32_RTIT_ADDR_A(0) ... MSR_IA32_RTIT_ADDR_B(3):

Is the 3 here an architectural limit? Otherwise you want to use a higher
number here and rely on the callee to do the full checking.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 08/10] x86: Introduce a function to check the value of RTIT_CTL
  2018-05-30 13:28 ` [PATCH v2 08/10] x86: Introduce a function to check the value of RTIT_CTL Luwei Kang
@ 2018-06-29 14:56   ` Jan Beulich
  2018-07-03 10:19     ` Kang, Luwei
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2018-06-29 14:56 UTC (permalink / raw)
  To: Luwei Kang
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Jun Nakajima

>>> On 30.05.18 at 15:28, <luwei.kang@intel.com> wrote:
> Any attempt to modify IA32_RTIT_CTL while TraceEn is set will
> result in a #GP unless the same write also clears TraceEn.
> Writes to IA32_RTIT_CTL that do not modify any bits will not
> cause a #GP, even if TraceEn remains set.
> MSR write that attempts to change bits marked reserved, or
> utilize encodings marked reserved, will cause a #GP fault.

May I ask that you also add a similar code comment, perhaps ahead
of the function definition?

> --- a/xen/arch/x86/cpu/ipt.c
> +++ b/xen/arch/x86/cpu/ipt.c
> @@ -114,6 +114,114 @@ static int __init parse_ipt_params(const char *str)
>      return 0;
>  }
>  
> +static int rtit_ctl_check(uint64_t new, uint64_t old)

It looks as if again you mean the function to return boolean, so please
have it have bool return type.

> +{
> +    const struct cpuid_policy *p = current->domain->arch.cpuid;
> +    const struct ipt_desc *ipt_desc = current->arch.hvm_vmx.ipt_desc;
> +    uint64_t rtit_ctl_mask = ~((uint64_t)0);

Too many parentheses.

> +    unsigned int addr_range = ipt_cap(p->ipt.raw, IPT_CAP_addr_range);
> +    unsigned int val, i;
> +
> +    if  ( new == old )
> +        return 0;
> +
> +    /* Clear no dependency bits */
> +    rtit_ctl_mask = ~(RTIT_CTL_TRACEEN | RTIT_CTL_OS |
> +                RTIT_CTL_USR | RTIT_CTL_TSC_EN | RTIT_CTL_DIS_RETC);
> +
> +    /* If CPUID.(EAX=14H,ECX=0):EBX[0]=1 CR3Filter can be set */
> +    if ( ipt_cap(p->ipt.raw, IPT_CAP_cr3_filter) )
> +        rtit_ctl_mask &= ~RTIT_CTL_CR3_FILTER;
> +
> +    /*
> +     * If CPUID.(EAX=14H,ECX=0):EBX[1]=1 CYCEn, CycThresh and
> +     * PSBFreq can be set
> +     */
> +    if ( ipt_cap(p->ipt.raw, IPT_CAP_psb_cyc) )
> +        rtit_ctl_mask &= ~(RTIT_CTL_CYCEN |
> +                RTIT_CTL_CYC_THRESH | RTIT_CTL_PSB_FREQ);
> +    /*
> +     * If CPUID.(EAX=14H,ECX=0):EBX[3]=1 MTCEn BranchEn and
> +     * MTCFreq can be set
> +     */
> +    if ( ipt_cap(p->ipt.raw, IPT_CAP_mtc) )
> +        rtit_ctl_mask &= ~(RTIT_CTL_MTC_EN |
> +                RTIT_CTL_BRANCH_EN | RTIT_CTL_MTC_FREQ);
> +
> +    /* If CPUID.(EAX=14H,ECX=0):EBX[4]=1 FUPonPTW and PTWEn can be set */
> +    if ( ipt_cap(p->ipt.raw, IPT_CAP_ptwrite) )
> +        rtit_ctl_mask &= ~(RTIT_CTL_FUP_ON_PTW |
> +                                        RTIT_CTL_PTW_EN);
> +
> +    /* If CPUID.(EAX=14H,ECX=0):EBX[5]=1 PwrEvEn can be set */
> +    if ( ipt_cap(p->ipt.raw, IPT_CAP_power_event) )
> +        rtit_ctl_mask &= ~RTIT_CTL_PWR_EVT_EN;
> +
> +    /* If CPUID.(EAX=14H,ECX=0):ECX[0]=1 ToPA can be set */
> +    if ( ipt_cap(p->ipt.raw, IPT_CAP_topa_output) )
> +        rtit_ctl_mask &= ~RTIT_CTL_TOPA;
> +    /* If CPUID.(EAX=14H,ECX=0):ECX[3]=1 FabircEn can be set */
> +    if ( ipt_cap(p->ipt.raw, IPT_CAP_output_subsys))
> +        rtit_ctl_mask &= ~RTIT_CTL_FABRIC_EN;
> +    /* unmask address range configure area */
> +    for (i = 0; i < addr_range; i++)

Style.

> +        rtit_ctl_mask &= ~(0xf << (32 + i * 4));
> +
> +    /*
> +     * Any MSR write that attempts to change bits marked reserved will
> +     * case a #GP fault.

cause

> +     */
> +    if ( new & rtit_ctl_mask )
> +        return 1;
> +
> +    /*
> +     * Any attempt to modify IA32_RTIT_CTL while TraceEn is set will
> +     * result in a #GP unless the same write also clears TraceEn.
> +     */
> +    if ( (ipt_desc->ipt_guest.ctl & RTIT_CTL_TRACEEN) &&
> +        ((ipt_desc->ipt_guest.ctl ^ new) & ~RTIT_CTL_TRACEEN) )

Why the ^ ? You only need to check whether new has the bit clear.
Also please use "old" wherever possible, if you already have it passed
into the function. This way it'll become obvious that the "nothing
changed" case is already handled by the very first if().

> +        return 1;
> +
> +    /*
> +     * WRMSR to IA32_RTIT_CTL that sets TraceEn but clears this bit
> +     * and FabricEn would cause #GP, if
> +     * CPUID.(EAX=14H, ECX=0):ECX.SNGLRGNOUT[bit 2] = 0
> +     */
> +   if ( (new & RTIT_CTL_TRACEEN) && !(new & RTIT_CTL_TOPA) &&
> +        !(new & RTIT_CTL_FABRIC_EN) &&
> +        !ipt_cap(p->ipt.raw, IPT_CAP_single_range_output) )
> +        return 1;
> +    /*
> +     * MTCFreq, CycThresh and PSBFreq encodings check, any MSR write that
> +     * utilize encodings marked reserved will casue a #GP fault.
> +     */
> +    val = ipt_cap(p->ipt.raw, IPT_CAP_mtc_period);
> +    if ( ipt_cap(p->ipt.raw, IPT_CAP_mtc) &&
> +                !test_bit((new & RTIT_CTL_MTC_FREQ) >>
> +                RTIT_CTL_MTC_FREQ_OFFSET, &val) )

Indentation.

> @@ -171,6 +279,8 @@ int ipt_do_wrmsr(unsigned int msr, uint64_t msr_content)
>      switch ( msr )
>      {
>      case MSR_IA32_RTIT_CTL:
> +        if ( rtit_ctl_check(msr_content, ipt_desc->ipt_guest.ctl) )
> +            return 1;
>          ipt_desc->ipt_guest.ctl = msr_content;
>          __vmwrite(GUEST_IA32_RTIT_CTL, msr_content);
>          break;

Without this I don't see how the previous patch is complete.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 09/10] x86: Disable Intel Processor Trace when VMXON in L1 guest
  2018-05-30 13:28 ` [PATCH v2 09/10] x86: Disable Intel Processor Trace when VMXON in L1 guest Luwei Kang
@ 2018-06-29 15:14   ` Jan Beulich
  2018-07-03 10:19     ` Kang, Luwei
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2018-06-29 15:14 UTC (permalink / raw)
  To: Luwei Kang
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Jun Nakajima

>>> On 30.05.18 at 15:28, <luwei.kang@intel.com> wrote:
> @@ -1519,6 +1520,14 @@ int nvmx_handle_vmxon(struct cpu_user_regs *regs)
>      v->arch.hvm_vmx.launched = 0;
>      vmsucceed(regs);
>  
> +    if ( v->arch.hvm_vmx.ipt_desc )
> +    {
> +        v->arch.hvm_vmx.ipt_desc->ipt_guest.ctl = 0;
> +        vmx_vmcs_enter(current);
> +        __vmwrite(GUEST_IA32_RTIT_CTL, 0);
> +        vmx_vmcs_exit(current);
> +    }

I again don't understand why enter and exit the VMCS here.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 03/10] x86: Add Intel Processor Trace support for cpuid
  2018-05-30 13:27 ` [PATCH v2 03/10] x86: Add Intel Processor Trace support for cpuid Luwei Kang
  2018-06-28 14:27   ` Jan Beulich
@ 2018-06-29 15:17   ` Jan Beulich
  2018-07-03 10:19     ` Kang, Luwei
  1 sibling, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2018-06-29 15:17 UTC (permalink / raw)
  To: Luwei Kang, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall,
	Jun Nakajima

>>> On 30.05.18 at 15:27, <luwei.kang@intel.com> wrote:
> --- a/xen/include/public/arch-x86/cpufeatureset.h
> +++ b/xen/include/public/arch-x86/cpufeatureset.h
> @@ -215,6 +215,7 @@ XEN_CPUFEATURE(SMAP,          5*32+20) /*S  Supervisor Mode Access Prevention */
>  XEN_CPUFEATURE(AVX512IFMA,    5*32+21) /*A  AVX-512 Integer Fused Multiply Add */
>  XEN_CPUFEATURE(CLFLUSHOPT,    5*32+23) /*A  CLFLUSHOPT instruction */
>  XEN_CPUFEATURE(CLWB,          5*32+24) /*A  CLWB instruction */
> +XEN_CPUFEATURE(IPT,           5*32+25) /*H  Intel Processor Trace */

Btw - introducing the feature flag here is certainly fine, but I think you
should add the H annotation only once functionality is complete. That
would then e.g. also reduce the impact of patch 8 adding functionality
otherwise strictly necessary already in patch 7.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 10/10] x86: Handle new asynchronous exit qualification
  2018-05-30 13:28 ` [PATCH v2 10/10] x86: Handle new asynchronous exit qualification Luwei Kang
@ 2018-06-29 15:22   ` Jan Beulich
  2018-06-29 15:29     ` Andrew Cooper
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2018-06-29 15:22 UTC (permalink / raw)
  To: Luwei Kang
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Jun Nakajima

>>> On 30.05.18 at 15:28, <luwei.kang@intel.com> wrote:
> Using EPT to translate PT output addresses introduces the possibility of
> taking events on PT output reads and writes. Event possibilities include
> EPT violations, EPT misconfigurations, PML log-full VM exits, and APIC
> access VM exits.
> EPT violations:
>  a. Intel PT buffer is a MMIO address in guest. Actually, it can be a
>     MMIO address (SDM 35.2.6.1), but in order do not affect other
>     passthrough/emulate device in guest. Ferbid use MMIO addr at present.
>  b. Intel PT buffer is a RAM non-writable address. Don't need emulate
>     and inject a #GP to guest.

Is such #GP injection architectural behavior? We've got a few bad
examples where we inject exceptions which are architecturally
impossible - let's please not add any further instances.

> @@ -4027,6 +4028,10 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>          break;
>  
>      case EXIT_REASON_APIC_ACCESS:
> +        __vmread(EXIT_QUALIFICATION, &exit_qualification);
> +        if ( exit_qualification & 0x10000 )

Please no use of literal numbers like this.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 10/10] x86: Handle new asynchronous exit qualification
  2018-06-29 15:22   ` Jan Beulich
@ 2018-06-29 15:29     ` Andrew Cooper
  0 siblings, 0 replies; 50+ messages in thread
From: Andrew Cooper @ 2018-06-29 15:29 UTC (permalink / raw)
  To: Jan Beulich, Luwei Kang
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Razvan Cojocaru,
	George Dunlap, Tim Deegan, Ian Jackson, xen-devel, Julien Grall,
	Tamas K Lengyel, Jun Nakajima

On 29/06/18 16:22, Jan Beulich wrote:
>>>> On 30.05.18 at 15:28, <luwei.kang@intel.com> wrote:
>> Using EPT to translate PT output addresses introduces the possibility of
>> taking events on PT output reads and writes. Event possibilities include
>> EPT violations, EPT misconfigurations, PML log-full VM exits, and APIC
>> access VM exits.
>> EPT violations:
>>  a. Intel PT buffer is a MMIO address in guest. Actually, it can be a
>>     MMIO address (SDM 35.2.6.1), but in order do not affect other
>>     passthrough/emulate device in guest. Ferbid use MMIO addr at present.
>>  b. Intel PT buffer is a RAM non-writable address. Don't need emulate
>>     and inject a #GP to guest.
> Is such #GP injection architectural behavior? We've got a few bad
> examples where we inject exceptions which are architecturally
> impossible - let's please not add any further instances.

We discussed this IRL, and this point was altered.

We need to run the ept_violation handler (e.g. to account for logdirty
tracking) and it also gives an introspection agent a chance to intervene
(e.g. remove protection on the page if its use as a PT buffer is
legitimate).

However, if the violation isn't fixed (making the frame writeable in
EPT, or killing PT), the guest cannot continue executing crashing is the
only remaining option.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 01/10] x86: add an flag to enable Intel Processor Trace in guest
  2018-06-28 14:11   ` Jan Beulich
@ 2018-07-03 10:18     ` Kang, Luwei
  2018-07-03 11:58       ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Kang, Luwei @ 2018-07-03 10:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Tian, Kevin, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Nakajima, Jun

> > --- a/docs/misc/xen-command-line.markdown
> > +++ b/docs/misc/xen-command-line.markdown
> > @@ -1215,6 +1215,16 @@ Rather than only mapping RAM pages for IOMMU
> > accesses for Dom0, with this  option all pages not marked as unusable
> > in the E820 table will get a mapping  established.
> >
> > +### ipt
> > +> `= guest`
> > +
> > +> Default: `off`
> > +
> > +This option is use for switch on the Intel Processor Trace feature in
> > +HVM guest when 'ipt=guest'. By default, this feature is disabled in
> > +guest. Intel Processor Trace virtualization depend on EPT, so it can
> > +only enabled in HVM guest at present.
> > +
> >  ### irq\_ratelimit (x86)
> 
> Did you not notice the (x86) here when re-basing?

So, the option should be "### ipt (x86)" ?

> 
> > --- /dev/null
> > +++ b/xen/arch/x86/cpu/ipt.c
> > @@ -0,0 +1,42 @@
> > +/*
> > + * ipt.c: Support for Intel Processor Trace Virtualization.
> > + *
> > + * Copyright (c) 2018, Intel Corporation.
> > + *
> > + * This program is free software; you can redistribute it and/or
> > +modify it
> > + * under the terms and conditions of the GNU General Public License,
> > + * version 2, as published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope it will be useful, but
> > +WITHOUT
> > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> > +or
> > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> > +License for
> > + * more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > +along with
> > + * this program; If not, see <http://www.gnu.org/licenses/>.
> > + *
> > + * Author: Luwei Kang <luwei.kang@intel.com>  */ #include
> > +<xen/errno.h> #include <xen/init.h> #include <xen/lib.h> #include
> > +<xen/string.h> #include <asm/ipt.h>
> > +
> > +/* ipt: Flag to enable Intel Processor Trace (default off). */
> > +unsigned int __read_mostly ipt_mode = IPT_MODE_OFF; static int
> > +parse_ipt_params(const char *str);
> 
> I think it was pointed out before that the forward declaration can be avoided if you move ...
> 
> > +custom_param("ipt", parse_ipt_params);
> 
> ... this line ...
> 
> > +static int __init parse_ipt_params(const char *str) {
> > +    if ( !strcmp("guest", str) )
> > +        ipt_mode = IPT_MODE_GUEST;
> > +    else if ( str )
> > +    {
> > +        printk("Unknown Intel Processor Trace mode specified: '%s'\n", str);
> > +        return -EINVAL;
> > +    }
> > +
> > +    return 0;
> > +}
> 
> ... here.

Get it. Will fix it.

> 
> > --- /dev/null
> > +++ b/xen/include/asm-x86/ipt.h
> > @@ -0,0 +1,29 @@
> > +/*
> > + * ipt.h: Intel Processor Trace virtualization for HVM domain.
> > + *
> > + * Copyright (c) 2018, Intel Corporation.
> > + *
> > + * This program is free software; you can redistribute it and/or
> > +modify it
> > + * under the terms and conditions of the GNU General Public License,
> > + * version 2, as published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope it will be useful, but
> > +WITHOUT
> > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> > +or
> > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> > +License for
> > + * more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > +along with
> > + * this program; If not, see <http://www.gnu.org/licenses/>.
> > + *
> > + * Author: Luwei Kang <luwei.kang@intel.com>  */
> > +
> > +#ifndef __ASM_X86_HVM_IPT_H_
> > +#define __ASM_X86_HVM_IPT_H_
> > +
> > +#define IPT_MODE_OFF        0
> > +#define IPT_MODE_GUEST      (1<<0)
> > +
> > +extern unsigned int ipt_mode;
> 
> At this point I can't see why the variable can't be bool. With the patch being placed first in the series it is also impossible (without
> peeking into later patches) to judge whether its __read_mostly attribute is actually appropriate.

OK, will change it to bool. About __read_mostly attribute,  I will remove it if not read frequency.

Thanks,
Luwei Kang

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 04/10] x86: Add Intel Processor Trace MSRs and bit definitions
  2018-06-28 14:44   ` Jan Beulich
@ 2018-07-03 10:18     ` Kang, Luwei
  2018-07-03 12:00       ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Kang, Luwei @ 2018-07-03 10:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Tian, Kevin, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Nakajima, Jun

> > --- a/xen/include/asm-x86/msr-index.h
> > +++ b/xen/include/asm-x86/msr-index.h
> > @@ -548,4 +548,41 @@
> >  #define MSR_PKGC9_IRTL			0x00000634
> >  #define MSR_PKGC10_IRTL			0x00000635
> >
> > +/* Intel PT MSRs */
> > +#define MSR_IA32_RTIT_CTL		0x00000570
> > +#define RTIT_CTL_TRACEEN		(1ULL << 0)
> > +#define RTIT_CTL_CYCEN			(1ULL << 1)
> > +#define RTIT_CTL_OS			(1ULL << 2)
> > +#define RTIT_CTL_USR			(1ULL << 3)
> > +#define RTIT_CTL_PWR_EVT_EN		(1ULL << 4)
> > +#define RTIT_CTL_FUP_ON_PTW		(1ULL << 5)
> > +#define RTIT_CTL_FABRIC_EN		(1ULL << 6)
> > +#define RTIT_CTL_CR3_FILTER		(1ULL << 7)
> > +#define RTIT_CTL_TOPA			(1ULL << 8)
> > +#define RTIT_CTL_MTC_EN			(1ULL << 9)
> > +#define RTIT_CTL_TSC_EN			(1ULL << 10)
> > +#define RTIT_CTL_DIS_RETC		(1ULL << 11)
> > +#define RTIT_CTL_PTW_EN			(1ULL << 12)
> > +#define RTIT_CTL_BRANCH_EN		(1ULL << 13)
> > +#define RTIT_CTL_MTC_FREQ_OFFSET	14
> > +#define RTIT_CTL_MTC_FREQ		(0x0fULL << RTIT_CTL_MTC_FREQ_OFFSET)
> 
> No duplicates like these please - with MASK_EXTR() / MASK_INSR() having just the mask (and no offset/shift value) is sufficient.

OK, get it. I just need to define
+#define RTIT_CTL_MTC_FREQ		(0x0fULL << RTIT_CTL_MTC_FREQ_OFFSET)
If I want to get the value of this filed (bit17:14) just call "MASK_EXTR(v, RTIT_CTL_MTC_FREQ)" 

> 
> > +#define RTIT_CTL_ADDR(n)		(0x0fULL << RTIT_CTL_ADDR_OFFSET(n))
> > +#define MSR_IA32_RTIT_STATUS		0x00000571
> 
> Please add blank lines between one MSR (and its bits) and the next one.

Will fix it.

Thanks,
Luwei Kang

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 05/10] x86: Implement Intel Processor Trace context switch
  2018-06-29 14:12   ` Jan Beulich
@ 2018-07-03 10:18     ` Kang, Luwei
  2018-07-03 12:04       ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Kang, Luwei @ 2018-07-03 10:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Tian, Kevin, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Nakajima, Jun

> > @@ -40,3 +42,102 @@ static int __init parse_ipt_params(const char
> > *str)
> >
> >      return 0;
> >  }
> > +
> > +static inline void ipt_load_msr(const struct ipt_ctx *ctx,
> > +                       unsigned int addr_range)
> 
> Please let the compiler decide whether to inline such functions. The keyword should only be used (with very few exceptions) in
> header files.

OK, get it.

> 
> > +{
> > +    unsigned int i;
> > +
> > +    wrmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
> > +    wrmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
> > +    wrmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
> > +    wrmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
> > +    for ( i = 0; i < addr_range; i++ )
> 
> Wouldn't "nr" or "nr_addr" be a better parameter name?

Looks good to me.

> 
> > +    {
> > +        wrmsrl(MSR_IA32_RTIT_ADDR_A(i), ctx->addr[i * 2]);
> > +        wrmsrl(MSR_IA32_RTIT_ADDR_B(i), ctx->addr[i * 2 + 1]);
> > +    }
> > +}
> > +
> > +static inline void ipt_save_msr(struct ipt_ctx *ctx, unsigned int
> > +addr_range) {
> > +    unsigned int i;
> > +
> > +    rdmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
> > +    rdmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
> > +    rdmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
> > +    rdmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
> > +    for ( i = 0; i < addr_range; i++ )
> > +    {
> > +        rdmsrl(MSR_IA32_RTIT_ADDR_A(i), ctx->addr[i * 2]);
> > +        rdmsrl(MSR_IA32_RTIT_ADDR_B(i), ctx->addr[i * 2 + 1]);
> > +    }
> > +}
> 
> So you save/restore them not at context switch, but at VM entry/exit time. This means the title is misleading. But it raises efficiency
> questions:
> Is it really necessary to do it this often? In patch 7 you handle reads and writes to the MSRs, but you don't disable the MSR
> intercepts (and judging from their titles no other patch is a candidate where you might do that). If all writes are seen by Xen, why
> would you need to read all the MSRs here, when the majority is - afaict - not modified by hardware?

when PT in disabled in guest (guest have capability to enable PT but RTIT_CTL.EN is 0), all the PT msrs read/write are intercepted and we don't need to save or restore during vm-exit/entry. When PT is enabled in guest, we need to save or restore the guest stat when vm-exit/entry.
What about add a flag to log the value of MSRs' changes so that we don't need save/restore the MSRs when guest not change these values?
> 
> > +void ipt_guest_enter(struct vcpu *v)
> > +{
> > +    struct ipt_desc *ipt = v->arch.hvm_vmx.ipt_desc;
> 
> const
> 
> > +    if ( !ipt )
> > +        return;
> 
> Would seem better to put the check outside the call, so no call would be made at all in the common case.

OK, looks good to me. Will fix it.

> 
> > +    /*
> > +     * Need re-initialize the guest state of IA32_RTIT_CTL
> > +     * When this vcpu be scheduled to another Physical CPU.
> > +     * TBD: Performance optimization. Add a new item in
> > +     * struct ipt_desc to record the last pcpu, and check
> > +     * if this vcpu is scheduled to another pcpu here (like vpmu).
> > +     */
> > +    vmx_vmcs_enter(v);
> > +    __vmwrite(GUEST_IA32_RTIT_CTL, ipt->ipt_guest.ctl);
> > +    vmx_vmcs_exit(v);
> 
> With the sole caller being vmx_vmenter_helper() there's no need to vmx_vmcs_{enter,exit}() here afaict.

Get it.

> 
> > +int ipt_initialize(struct vcpu *v)
> > +{
> > +    struct ipt_desc *ipt = NULL;
> 
> Pointless initializer.
> 
> > +    unsigned int eax, tmp, addr_range;
> > +
> > +    if ( !cpu_has_ipt || (ipt_mode == IPT_MODE_OFF) ||
> > +         !(v->arch.hvm_vmx.secondary_exec_control &
> > + SECONDARY_EXEC_PT_USE_GPA) )
> 
> ipt_mode == IPT_MODE_OFF implies
> !(v->arch.hvm_vmx.secondary_exec_control & SECONDARY_EXEC_PT_USE_GPA) as per patch 2, so no need for the separate check
> here (an ASSERT() inside the if() body would be fine). The same should perhaps, if not already the case, be made true
> for !cpu_has_ipt.

Will fix it.

> 
> > +        return 0;
> > +
> > +    if ( cpuid_eax(IPT_CPUID) == 0 )
> > +        return -EINVAL;
> > +
> > +    cpuid_count(IPT_CPUID, 1, &eax, &tmp, &tmp, &tmp);
> > +    addr_range = eax & IPT_ADDR_RANGE_MASK;
> 
> As per my remark further up - the use of "addr_range" should perhaps be revisited throughout the patch/series.
> 
> > +    ipt = _xzalloc(sizeof(struct ipt_desc) + sizeof(uint64_t) * addr_range * 2,
> > +			__alignof(*ipt));
> 
> Please don't effectively open-code xzalloc_bytes(). Also please use the type of variables or expressions in scope instead of type
> names. And please get indentation right (won't be visible below anymore). IOW
> 
>     ipt = xzalloc_bytes(sizeof(*ipt) + sizeof(ipt->addr_range[0]) * addr_range * 2);

Will fix it.

> 
> > +void ipt_destroy(struct vcpu *v)
> > +{
> > +    if ( v->arch.hvm_vmx.ipt_desc )
> > +    {
> > +        xfree(v->arch.hvm_vmx.ipt_desc);
> > +        v->arch.hvm_vmx.ipt_desc = NULL;
> > +    }
> 
> Pointless if().

Will remove it.

> 
> > @@ -466,11 +467,16 @@ static int vmx_vcpu_initialise(struct vcpu *v)
> >      if ( v->vcpu_id == 0 )
> >          v->arch.user_regs.rax = 1;
> >
> > +    rc = ipt_initialize(v);
> > +    if ( rc )
> > +        dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor
> > + Trace.\n", v);
> 
> For such a message to be helpful, please also log rc. And no full stop in log messages please (again with very few exceptions).

Not full understand here. What is the " no full stop in log messages " mean?

Thanks,
Luwei Kang

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 06/10] x86: Introduce a new function to get capability of Intel PT
  2018-06-29 14:35   ` Jan Beulich
@ 2018-07-03 10:18     ` Kang, Luwei
  2018-07-03 12:09       ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Kang, Luwei @ 2018-07-03 10:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Tian, Kevin, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Nakajima, Jun

> > --- a/xen/arch/x86/cpu/ipt.c
> > +++ b/xen/arch/x86/cpu/ipt.c
> > @@ -25,11 +25,74 @@
> >  #include <asm/ipt.h>
> >  #include <asm/msr.h>
> >
> > +#define EAX 0
> > +#define ECX 1
> > +#define EDX 2
> > +#define EBX 3
> > +#define CPUID_REGS_NUM   4 /* number of regsters (eax, ebx, ecx, edx) */
> > +
> > +#define BIT(nr)                 (1UL << (nr))
> 
> I don't particularly like any such pretty generic things to be added to individual files, but I also have nothing better to suggest. But
> please add the missing i to the comment.
> 
> > +#define IPT_CAP(_n, _l, _r, _m)                               \
> > +    [IPT_CAP_ ## _n] = { .name = __stringify(_n), .leaf = _l, \
> > +        .reg = _r, .mask = _m }
> > +
> > +static struct ipt_cap_desc {
> > +    const char    *name;
> > +    unsigned int  leaf;
> > +    unsigned char reg;
> 
> I don't think leaf needs to be full 32 bits wide? Once shrunk by at least two bits, the size of the overall structure could go down
> from 24 to 16 bytes.

OK, will change it from " unsigned int  " to "unsinged char".

> 
> > +    unsigned int  mask;
> > +} ipt_caps[] = {
> > +    IPT_CAP(max_subleaf,            0, EAX, 0xffffffff),
> > +    IPT_CAP(cr3_filter,             0, EBX, BIT(0)),
> > +    IPT_CAP(psb_cyc,                0, EBX, BIT(1)),
> > +    IPT_CAP(ip_filter,              0, EBX, BIT(2)),
> > +    IPT_CAP(mtc,                    0, EBX, BIT(3)),
> > +    IPT_CAP(ptwrite,                0, EBX, BIT(4)),
> > +    IPT_CAP(power_event,            0, EBX, BIT(5)),
> > +    IPT_CAP(topa_output,            0, ECX, BIT(0)),
> > +    IPT_CAP(topa_multi_entry,       0, ECX, BIT(1)),
> > +    IPT_CAP(single_range_output,    0, ECX, BIT(2)),
> > +    IPT_CAP(output_subsys,          0, ECX, BIT(3)),
> > +    IPT_CAP(payloads_lip,           0, ECX, BIT(31)),
> > +    IPT_CAP(addr_range,             1, EAX, 0x7),
> > +    IPT_CAP(mtc_period,             1, EAX, 0xffff0000),
> > +    IPT_CAP(cycle_threshold,        1, EBX, 0xffff),
> > +    IPT_CAP(psb_freq,               1, EBX, 0xffff0000),
> > +};
> 
> const?

Oh, will add it.

> 
> > +static unsigned int ipt_cap(const struct cpuid_leaf *cpuid_ipt, enum
> > +ipt_cap cap) {
> > +    const struct ipt_cap_desc *cd = &ipt_caps[cap];
> > +    unsigned int shift = ffs(cd->mask) - 1;
> 
> Do you really need this?
> 
> > +    unsigned int val = 0;
> > +
> > +    cpuid_ipt += cd->leaf;
> > +
> > +    switch ( cd->reg )
> > +    {
> > +    case EAX:
> > +        val = cpuid_ipt->a;
> > +        break;
> > +    case EBX:
> > +        val = cpuid_ipt->b;
> > +        break;
> > +    case ECX:
> > +        val = cpuid_ipt->c;
> > +        break;
> > +    case EDX:
> > +        val = cpuid_ipt->d;
> > +        break;
> > +    }
> > +
> > +    return (val & cd->mask) >> shift;
> 
> If all masks are indeed contiguous series of set bits, MASK_EXTR() can be used here afaict.

Yes, it is a good define to me. Will fix it.

> 
> > +}
> > +
> >  static int __init parse_ipt_params(const char *str)  {
> >      if ( !strcmp("guest", str) )
> 
> So this is the end of the changes to this file, and the function you introduce is static. I'm pretty sure compilers will warn about the
> unused static, and hence the build will fail at this point of the series (due to -Werror). I think you want to introduce the function
> together with its first user.

I can't reproduce this issue by:
#./configure
# make build-xen         (-Werror has been included during build)
Could you tell me how to?

Thanks,
Luwei Kang


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 07/10] x86: Add Intel Processor Trace MSRs read/write emulation
  2018-06-29 14:46   ` Jan Beulich
@ 2018-07-03 10:18     ` Kang, Luwei
  2018-07-03 12:10       ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Kang, Luwei @ 2018-07-03 10:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Tian, Kevin, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Nakajima, Jun

> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -2898,6 +2898,15 @@ static int vmx_msr_read_intercept(unsigned int
> > msr, uint64_t *msr_content)
> >          if ( vpmu_do_rdmsr(msr, msr_content) )
> >              goto gp_fault;
> >          break;
> > +    case MSR_IA32_RTIT_CTL:
> > +    case MSR_IA32_RTIT_STATUS:
> > +    case MSR_IA32_RTIT_OUTPUT_BASE:
> > +    case MSR_IA32_RTIT_OUTPUT_MASK:
> > +    case MSR_IA32_RTIT_CR3_MATCH:
> > +    case MSR_IA32_RTIT_ADDR_A(0) ... MSR_IA32_RTIT_ADDR_B(3):
> 
> Is the 3 here an architectural limit? Otherwise you want to use a higher number here and rely on the callee to do the full checking.
> 

"3" is the max number of address ranges which I can find in spec. The number of address ranges is get from CPUID info. Will fix it.

Thanks,
Luwei Kang

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 08/10] x86: Introduce a function to check the value of RTIT_CTL
  2018-06-29 14:56   ` Jan Beulich
@ 2018-07-03 10:19     ` Kang, Luwei
  2018-07-03 12:15       ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Kang, Luwei @ 2018-07-03 10:19 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Tian, Kevin, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Nakajima, Jun

> > Any attempt to modify IA32_RTIT_CTL while TraceEn is set will result
> > in a #GP unless the same write also clears TraceEn.
> > Writes to IA32_RTIT_CTL that do not modify any bits will not cause a
> > #GP, even if TraceEn remains set.
> > MSR write that attempts to change bits marked reserved, or utilize
> > encodings marked reserved, will cause a #GP fault.
> 
> May I ask that you also add a similar code comment, perhaps ahead of the function definition?

Get it. Will do that.

> 
> > --- a/xen/arch/x86/cpu/ipt.c
> > +++ b/xen/arch/x86/cpu/ipt.c
> > @@ -114,6 +114,114 @@ static int __init parse_ipt_params(const char *str)
> >      return 0;
> >  }
> >
> > +static int rtit_ctl_check(uint64_t new, uint64_t old)
> 
> It looks as if again you mean the function to return boolean, so please have it have bool return type.

Get it.

> 
> > +{
> > +    const struct cpuid_policy *p = current->domain->arch.cpuid;
> > +    const struct ipt_desc *ipt_desc = current->arch.hvm_vmx.ipt_desc;
> > +    uint64_t rtit_ctl_mask = ~((uint64_t)0);
> 
> Too many parentheses.

Here is a pointless initializer. Will fix it.

> > +     */
> > +    if ( new & rtit_ctl_mask )
> > +        return 1;
> > +
> > +    /*
> > +     * Any attempt to modify IA32_RTIT_CTL while TraceEn is set will
> > +     * result in a #GP unless the same write also clears TraceEn.
> > +     */
> > +    if ( (ipt_desc->ipt_guest.ctl & RTIT_CTL_TRACEEN) &&
> > +        ((ipt_desc->ipt_guest.ctl ^ new) & ~RTIT_CTL_TRACEEN) )
> 
> Why the ^ ? You only need to check whether new has the bit clear.

I mean if change any bits (set or clear, not include RTIT_CTL_TRACEEN) with PT is enabled (RTIT_CTL_TRACEEN is set) will inject an #GP.

> Also please use "old" wherever possible, if you already have it passed into the function. This way it'll become obvious that the
> "nothing changed" case is already handled by the very first if().

I think also can remove "old" (decrease a function parameter) and all use ipt_desc->ipt_guest.ctl instead.

> 
> > +        return 1;
> > +
> > +    /*
> > +     * WRMSR to IA32_RTIT_CTL that sets TraceEn but clears this bit
> > +     * and FabricEn would cause #GP, if
> > +     * CPUID.(EAX=14H, ECX=0):ECX.SNGLRGNOUT[bit 2] = 0
> > +     */
> > +   if ( (new & RTIT_CTL_TRACEEN) && !(new & RTIT_CTL_TOPA) &&
> > +        !(new & RTIT_CTL_FABRIC_EN) &&
> > +        !ipt_cap(p->ipt.raw, IPT_CAP_single_range_output) )
> > +        return 1;
> > +    /*
> > +     * MTCFreq, CycThresh and PSBFreq encodings check, any MSR write that
> > +     * utilize encodings marked reserved will casue a #GP fault.
> > +     */
> > +    val = ipt_cap(p->ipt.raw, IPT_CAP_mtc_period);
> > +    if ( ipt_cap(p->ipt.raw, IPT_CAP_mtc) &&
> > +                !test_bit((new & RTIT_CTL_MTC_FREQ) >>
> > +                RTIT_CTL_MTC_FREQ_OFFSET, &val) )
> 
> Indentation.
> 
> > @@ -171,6 +279,8 @@ int ipt_do_wrmsr(unsigned int msr, uint64_t msr_content)
> >      switch ( msr )
> >      {
> >      case MSR_IA32_RTIT_CTL:
> > +        if ( rtit_ctl_check(msr_content, ipt_desc->ipt_guest.ctl) )
> > +            return 1;
> >          ipt_desc->ipt_guest.ctl = msr_content;
> >          __vmwrite(GUEST_IA32_RTIT_CTL, msr_content);
> >          break;
> 
> Without this I don't see how the previous patch is complete.

What about merge this patch with patch 7 ?

Thanks,
Luwei Kang

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 09/10] x86: Disable Intel Processor Trace when VMXON in L1 guest
  2018-06-29 15:14   ` Jan Beulich
@ 2018-07-03 10:19     ` Kang, Luwei
  0 siblings, 0 replies; 50+ messages in thread
From: Kang, Luwei @ 2018-07-03 10:19 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Tian, Kevin, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Nakajima, Jun

> > @@ -1519,6 +1520,14 @@ int nvmx_handle_vmxon(struct cpu_user_regs *regs)
> >      v->arch.hvm_vmx.launched = 0;
> >      vmsucceed(regs);
> >
> > +    if ( v->arch.hvm_vmx.ipt_desc )
> > +    {
> > +        v->arch.hvm_vmx.ipt_desc->ipt_guest.ctl = 0;
> > +        vmx_vmcs_enter(current);
> > +        __vmwrite(GUEST_IA32_RTIT_CTL, 0);
> > +        vmx_vmcs_exit(current);
> > +    }
> 
> I again don't understand why enter and exit the VMCS here.
> 
Will remove this because it in vmx_vmenter_helper(). 

Thanks,
Luwei Kang

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 03/10] x86: Add Intel Processor Trace support for cpuid
  2018-06-29 15:17   ` Jan Beulich
@ 2018-07-03 10:19     ` Kang, Luwei
  2018-07-03 10:25       ` Andrew Cooper
  0 siblings, 1 reply; 50+ messages in thread
From: Kang, Luwei @ 2018-07-03 10:19 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Tian, Kevin, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall, Nakajima,
	Jun

> > --- a/xen/include/public/arch-x86/cpufeatureset.h
> > +++ b/xen/include/public/arch-x86/cpufeatureset.h
> > @@ -215,6 +215,7 @@ XEN_CPUFEATURE(SMAP,          5*32+20) /*S  Supervisor Mode Access Prevention */
> >  XEN_CPUFEATURE(AVX512IFMA,    5*32+21) /*A  AVX-512 Integer Fused Multiply Add */
> >  XEN_CPUFEATURE(CLFLUSHOPT,    5*32+23) /*A  CLFLUSHOPT instruction */
> >  XEN_CPUFEATURE(CLWB,          5*32+24) /*A  CLWB instruction */
> > +XEN_CPUFEATURE(IPT,           5*32+25) /*H  Intel Processor Trace */
> 
> Btw - introducing the feature flag here is certainly fine, but I think you should add the H annotation only once functionality is
> complete. That would then e.g. also reduce the impact of patch 8 adding functionality otherwise strictly necessary already in patch
> 7.

Sorry, not full understand here. Do you mean I need expose this feature to guest after all is ready? That is move this patch after patch 7,8 ?

Thanks,
Luwei Kang

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 03/10] x86: Add Intel Processor Trace support for cpuid
  2018-07-03 10:19     ` Kang, Luwei
@ 2018-07-03 10:25       ` Andrew Cooper
  0 siblings, 0 replies; 50+ messages in thread
From: Andrew Cooper @ 2018-07-03 10:25 UTC (permalink / raw)
  To: Kang, Luwei, Jan Beulich, xen-devel
  Cc: Tian, Kevin, Stefano Stabellini, Wei Liu, George Dunlap,
	Tim Deegan, Ian Jackson, Julien Grall, Nakajima, Jun

On 03/07/18 11:19, Kang, Luwei wrote:
>>> --- a/xen/include/public/arch-x86/cpufeatureset.h
>>> +++ b/xen/include/public/arch-x86/cpufeatureset.h
>>> @@ -215,6 +215,7 @@ XEN_CPUFEATURE(SMAP,          5*32+20) /*S  Supervisor Mode Access Prevention */
>>>  XEN_CPUFEATURE(AVX512IFMA,    5*32+21) /*A  AVX-512 Integer Fused Multiply Add */
>>>  XEN_CPUFEATURE(CLFLUSHOPT,    5*32+23) /*A  CLFLUSHOPT instruction */
>>>  XEN_CPUFEATURE(CLWB,          5*32+24) /*A  CLWB instruction */
>>> +XEN_CPUFEATURE(IPT,           5*32+25) /*H  Intel Processor Trace */
>> Btw - introducing the feature flag here is certainly fine, but I think you should add the H annotation only once functionality is
>> complete. That would then e.g. also reduce the impact of patch 8 adding functionality otherwise strictly necessary already in patch
>> 7.
> Sorry, not full understand here. Do you mean I need expose this feature to guest after all is ready? That is move this patch after patch 7,8 ?

As an example, take my series adding SSBD support to Xen.

First,
https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=9df52a25e0e95a0b9971aa2fc26c5c6a5cbdf4ef
which adds the new feature define so Xen can gain some code.

Later
https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=9df52a25e0e95a0b9971aa2fc26c5c6a5cbdf4ef
the feature is offered to guests when all the Xen infrastructure is in
place.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 01/10] x86: add an flag to enable Intel Processor Trace in guest
  2018-07-03 10:18     ` Kang, Luwei
@ 2018-07-03 11:58       ` Jan Beulich
  0 siblings, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2018-07-03 11:58 UTC (permalink / raw)
  To: Luwei Kang
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Jun Nakajima

>>> On 03.07.18 at 12:18, <luwei.kang@intel.com> wrote:
>> > --- a/docs/misc/xen-command-line.markdown
>> > +++ b/docs/misc/xen-command-line.markdown
>> > @@ -1215,6 +1215,16 @@ Rather than only mapping RAM pages for IOMMU
>> > accesses for Dom0, with this  option all pages not marked as unusable
>> > in the E820 table will get a mapping  established.
>> >
>> > +### ipt
>> > +> `= guest`
>> > +
>> > +> Default: `off`
>> > +
>> > +This option is use for switch on the Intel Processor Trace feature in
>> > +HVM guest when 'ipt=guest'. By default, this feature is disabled in
>> > +guest. Intel Processor Trace virtualization depend on EPT, so it can
>> > +only enabled in HVM guest at present.
>> > +
>> >  ### irq\_ratelimit (x86)
>> 
>> Did you not notice the (x86) here when re-basing?
> 
> So, the option should be "### ipt (x86)" ?

Yes.

>> > +#ifndef __ASM_X86_HVM_IPT_H_
>> > +#define __ASM_X86_HVM_IPT_H_
>> > +
>> > +#define IPT_MODE_OFF        0
>> > +#define IPT_MODE_GUEST      (1<<0)
>> > +
>> > +extern unsigned int ipt_mode;
>> 
>> At this point I can't see why the variable can't be bool. With the patch being placed first in the series it is also impossible (without
>> peeking into later patches) to judge whether its __read_mostly attribute is actually appropriate.
> 
> OK, will change it to bool. About __read_mostly attribute,  I will remove it 
> if not read frequency.

No, the point wasn't to remove it, but whether instead it could be
__initdata.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 04/10] x86: Add Intel Processor Trace MSRs and bit definitions
  2018-07-03 10:18     ` Kang, Luwei
@ 2018-07-03 12:00       ` Jan Beulich
  0 siblings, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2018-07-03 12:00 UTC (permalink / raw)
  To: Luwei Kang
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Jun Nakajima

>>> On 03.07.18 at 12:18, <luwei.kang@intel.com> wrote:
>> > --- a/xen/include/asm-x86/msr-index.h
>> > +++ b/xen/include/asm-x86/msr-index.h
>> > @@ -548,4 +548,41 @@
>> >  #define MSR_PKGC9_IRTL			0x00000634
>> >  #define MSR_PKGC10_IRTL			0x00000635
>> >
>> > +/* Intel PT MSRs */
>> > +#define MSR_IA32_RTIT_CTL		0x00000570
>> > +#define RTIT_CTL_TRACEEN		(1ULL << 0)
>> > +#define RTIT_CTL_CYCEN			(1ULL << 1)
>> > +#define RTIT_CTL_OS			(1ULL << 2)
>> > +#define RTIT_CTL_USR			(1ULL << 3)
>> > +#define RTIT_CTL_PWR_EVT_EN		(1ULL << 4)
>> > +#define RTIT_CTL_FUP_ON_PTW		(1ULL << 5)
>> > +#define RTIT_CTL_FABRIC_EN		(1ULL << 6)
>> > +#define RTIT_CTL_CR3_FILTER		(1ULL << 7)
>> > +#define RTIT_CTL_TOPA			(1ULL << 8)
>> > +#define RTIT_CTL_MTC_EN			(1ULL << 9)
>> > +#define RTIT_CTL_TSC_EN			(1ULL << 10)
>> > +#define RTIT_CTL_DIS_RETC		(1ULL << 11)
>> > +#define RTIT_CTL_PTW_EN			(1ULL << 12)
>> > +#define RTIT_CTL_BRANCH_EN		(1ULL << 13)
>> > +#define RTIT_CTL_MTC_FREQ_OFFSET	14
>> > +#define RTIT_CTL_MTC_FREQ		(0x0fULL << RTIT_CTL_MTC_FREQ_OFFSET)
>> 
>> No duplicates like these please - with MASK_EXTR() / MASK_INSR() having just 
> the mask (and no offset/shift value) is sufficient.
> 
> OK, get it. I just need to define
> +#define RTIT_CTL_MTC_FREQ		(0x0fULL << RTIT_CTL_MTC_FREQ_OFFSET)

#define RTIT_CTL_MTC_FREQ		(0x0fULL << 14)

you mean, I assume?

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 05/10] x86: Implement Intel Processor Trace context switch
  2018-07-03 10:18     ` Kang, Luwei
@ 2018-07-03 12:04       ` Jan Beulich
  2018-07-04  8:48         ` Kang, Luwei
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2018-07-03 12:04 UTC (permalink / raw)
  To: Luwei Kang
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Jun Nakajima

>>> On 03.07.18 at 12:18, <luwei.kang@intel.com> wrote:
>> > @@ -40,3 +42,102 @@ static int __init parse_ipt_params(const char
>> > +static inline void ipt_save_msr(struct ipt_ctx *ctx, unsigned int
>> > +addr_range) {
>> > +    unsigned int i;
>> > +
>> > +    rdmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
>> > +    rdmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
>> > +    rdmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
>> > +    rdmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
>> > +    for ( i = 0; i < addr_range; i++ )
>> > +    {
>> > +        rdmsrl(MSR_IA32_RTIT_ADDR_A(i), ctx->addr[i * 2]);
>> > +        rdmsrl(MSR_IA32_RTIT_ADDR_B(i), ctx->addr[i * 2 + 1]);
>> > +    }
>> > +}
>> 
>> So you save/restore them not at context switch, but at VM entry/exit time. This means the title is misleading. But it raises efficiency
>> questions:
>> Is it really necessary to do it this often? In patch 7 you handle reads and writes to the MSRs, but you don't disable the MSR
>> intercepts (and judging from their titles no other patch is a candidate where you might do that). If all writes are seen by Xen, why
>> would you need to read all the MSRs here, when the majority is - afaict - not modified by hardware?
> 
> when PT in disabled in guest (guest have capability to enable PT but 
> RTIT_CTL.EN is 0), all the PT msrs read/write are intercepted and we don't 
> need to save or restore during vm-exit/entry. When PT is enabled in guest, we 
> need to save or restore the guest stat when vm-exit/entry.

Why for MSRs which don't get changed by hardware?

> What about add a flag to log the value of MSRs' changes so that we don't 
> need save/restore the MSRs when guest not change these values?

I'm afraid it's not clear to me what "log the value" is supposed to mean here.

>> > @@ -466,11 +467,16 @@ static int vmx_vcpu_initialise(struct vcpu *v)
>> >      if ( v->vcpu_id == 0 )
>> >          v->arch.user_regs.rax = 1;
>> >
>> > +    rc = ipt_initialize(v);
>> > +    if ( rc )
>> > +        dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor
>> > + Trace.\n", v);
>> 
>> For such a message to be helpful, please also log rc. And no full stop in log messages please (again with very few exceptions).
> 
> Not full understand here. What is the " no full stop in log messages " mean?

"full stop" is the final period in a sentence. I.e. you want

        dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor Trace\n", v);

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 06/10] x86: Introduce a new function to get capability of Intel PT
  2018-07-03 10:18     ` Kang, Luwei
@ 2018-07-03 12:09       ` Jan Beulich
  2018-07-04  8:48         ` Kang, Luwei
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2018-07-03 12:09 UTC (permalink / raw)
  To: Luwei Kang
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Jun Nakajima

>>> On 03.07.18 at 12:18, <luwei.kang@intel.com> wrote:
>> > +#define IPT_CAP(_n, _l, _r, _m)                               \
>> > +    [IPT_CAP_ ## _n] = { .name = __stringify(_n), .leaf = _l, \
>> > +        .reg = _r, .mask = _m }
>> > +
>> > +static struct ipt_cap_desc {
>> > +    const char    *name;
>> > +    unsigned int  leaf;
>> > +    unsigned char reg;
>> 
>> I don't think leaf needs to be full 32 bits wide? Once shrunk by at least two bits, the size of the overall structure could go down
>> from 24 to 16 bytes.
> 
> OK, will change it from " unsigned int  " to "unsinged char".

I'd prefer if you used bit fields, as was meant to be implied by my
reply.

>> > +static unsigned int ipt_cap(const struct cpuid_leaf *cpuid_ipt, enum
>> > +ipt_cap cap) {
>> > +    const struct ipt_cap_desc *cd = &ipt_caps[cap];
>> > +    unsigned int shift = ffs(cd->mask) - 1;
>> 
>> Do you really need this?
>> 
>> > +    unsigned int val = 0;
>> > +
>> > +    cpuid_ipt += cd->leaf;
>> > +
>> > +    switch ( cd->reg )
>> > +    {
>> > +    case EAX:
>> > +        val = cpuid_ipt->a;
>> > +        break;
>> > +    case EBX:
>> > +        val = cpuid_ipt->b;
>> > +        break;
>> > +    case ECX:
>> > +        val = cpuid_ipt->c;
>> > +        break;
>> > +    case EDX:
>> > +        val = cpuid_ipt->d;
>> > +        break;
>> > +    }
>> > +
>> > +    return (val & cd->mask) >> shift;
>> 
>> If all masks are indeed contiguous series of set bits, MASK_EXTR() can be 
> used here afaict.
> 
> Yes, it is a good define to me. Will fix it.
> 
>> 
>> > +}
>> > +
>> >  static int __init parse_ipt_params(const char *str)  {
>> >      if ( !strcmp("guest", str) )
>> 
>> So this is the end of the changes to this file, and the function you introduce is static. I'm pretty sure compilers will warn about the
>> unused static, and hence the build will fail at this point of the series (due to -Werror). I think you want to introduce the function
>> together with its first user.
> 
> I can't reproduce this issue by:
> #./configure
> # make build-xen         (-Werror has been included during build)
> Could you tell me how to?

There is certainly the possibility that gcc versions differ in this regard.
But I'm sure it's clear to you that the code should build fine with all
supported versions. There's also the possibility that I'm overlooking
something.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 07/10] x86: Add Intel Processor Trace MSRs read/write emulation
  2018-07-03 10:18     ` Kang, Luwei
@ 2018-07-03 12:10       ` Jan Beulich
  0 siblings, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2018-07-03 12:10 UTC (permalink / raw)
  To: Luwei Kang
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Jun Nakajima

>>> On 03.07.18 at 12:18, <luwei.kang@intel.com> wrote:
>> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> > @@ -2898,6 +2898,15 @@ static int vmx_msr_read_intercept(unsigned int
>> > msr, uint64_t *msr_content)
>> >          if ( vpmu_do_rdmsr(msr, msr_content) )
>> >              goto gp_fault;
>> >          break;
>> > +    case MSR_IA32_RTIT_CTL:
>> > +    case MSR_IA32_RTIT_STATUS:
>> > +    case MSR_IA32_RTIT_OUTPUT_BASE:
>> > +    case MSR_IA32_RTIT_OUTPUT_MASK:
>> > +    case MSR_IA32_RTIT_CR3_MATCH:
>> > +    case MSR_IA32_RTIT_ADDR_A(0) ... MSR_IA32_RTIT_ADDR_B(3):
>> 
>> Is the 3 here an architectural limit? Otherwise you want to use a higher 
> number here and rely on the callee to do the full checking.
>> 
> 
> "3" is the max number of address ranges which I can find in spec. The number 
> of address ranges is get from CPUID info. Will fix it.

Question is what the largest number is that the designated CPUID output
field can hold.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 08/10] x86: Introduce a function to check the value of RTIT_CTL
  2018-07-03 10:19     ` Kang, Luwei
@ 2018-07-03 12:15       ` Jan Beulich
  0 siblings, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2018-07-03 12:15 UTC (permalink / raw)
  To: Luwei Kang
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Jun Nakajima

>>> On 03.07.18 at 12:19, <luwei.kang@intel.com> wrote:
>> > +    /*
>> > +     * Any attempt to modify IA32_RTIT_CTL while TraceEn is set will
>> > +     * result in a #GP unless the same write also clears TraceEn.
>> > +     */
>> > +    if ( (ipt_desc->ipt_guest.ctl & RTIT_CTL_TRACEEN) &&
>> > +        ((ipt_desc->ipt_guest.ctl ^ new) & ~RTIT_CTL_TRACEEN) )
>> 
>> Why the ^ ? You only need to check whether new has the bit clear.
> 
> I mean if change any bits (set or clear, not include RTIT_CTL_TRACEEN) with 
> PT is enabled (RTIT_CTL_TRACEEN is set) will inject an #GP.

Hmm, maybe I was wrongly thinking that the earlier new != old check
renders redundant the check here.

>> > @@ -171,6 +279,8 @@ int ipt_do_wrmsr(unsigned int msr, uint64_t msr_content)
>> >      switch ( msr )
>> >      {
>> >      case MSR_IA32_RTIT_CTL:
>> > +        if ( rtit_ctl_check(msr_content, ipt_desc->ipt_guest.ctl) )
>> > +            return 1;
>> >          ipt_desc->ipt_guest.ctl = msr_content;
>> >          __vmwrite(GUEST_IA32_RTIT_CTL, msr_content);
>> >          break;
>> 
>> Without this I don't see how the previous patch is complete.
> 
> What about merge this patch with patch 7 ?

That was the implied suggestion.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 05/10] x86: Implement Intel Processor Trace context switch
  2018-07-03 12:04       ` Jan Beulich
@ 2018-07-04  8:48         ` Kang, Luwei
  2018-07-04  9:05           ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Kang, Luwei @ 2018-07-04  8:48 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Tian, Kevin, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Nakajima, Jun

> >> > @@ -40,3 +42,102 @@ static int __init parse_ipt_params(const char
> >> > +static inline void ipt_save_msr(struct ipt_ctx *ctx, unsigned int
> >> > +addr_range) {
> >> > +    unsigned int i;
> >> > +
> >> > +    rdmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
> >> > +    rdmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
> >> > +    rdmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
> >> > +    rdmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
> >> > +    for ( i = 0; i < addr_range; i++ )
> >> > +    {
> >> > +        rdmsrl(MSR_IA32_RTIT_ADDR_A(i), ctx->addr[i * 2]);
> >> > +        rdmsrl(MSR_IA32_RTIT_ADDR_B(i), ctx->addr[i * 2 + 1]);
> >> > +    }
> >> > +}
> >>
> >> So you save/restore them not at context switch, but at VM entry/exit
> >> time. This means the title is misleading. But it raises efficiency
> >> questions:
> >> Is it really necessary to do it this often? In patch 7 you handle
> >> reads and writes to the MSRs, but you don't disable the MSR
> >> intercepts (and judging from their titles no other patch is a candidate where you might do that). If all writes are seen by Xen, why
> would you need to read all the MSRs here, when the majority is - afaict - not modified by hardware?
> >
> > when PT in disabled in guest (guest have capability to enable PT but
> > RTIT_CTL.EN is 0), all the PT msrs read/write are intercepted and we
> > don't need to save or restore during vm-exit/entry. When PT is enabled
> > in guest, we need to save or restore the guest stat when vm-exit/entry.
> 
> Why for MSRs which don't get changed by hardware?
> 
> > What about add a flag to log the value of MSRs' changes so that we
> > don't need save/restore the MSRs when guest not change these values?
> 
> I'm afraid it's not clear to me what "log the value" is supposed to mean here.

I mean add a new flag to mark if the value of Intel PT MSRs is changed by guest. If guest don't have any change that we don't need to save/restore the guest PT MSRs value to real hardware when VM-exit/entry.

> 
> >> > @@ -466,11 +467,16 @@ static int vmx_vcpu_initialise(struct vcpu *v)
> >> >      if ( v->vcpu_id == 0 )
> >> >          v->arch.user_regs.rax = 1;
> >> >
> >> > +    rc = ipt_initialize(v);
> >> > +    if ( rc )
> >> > +        dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor
> >> > + Trace.\n", v);
> >>
> >> For such a message to be helpful, please also log rc. And no full stop in log messages please (again with very few exceptions).
> >
> > Not full understand here. What is the " no full stop in log messages " mean?
> 
> "full stop" is the final period in a sentence. I.e. you want
> 
>         dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor Trace\n", v);

Change like this ?
dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor Trace: err=%d.\n", v, rc);

Thanks,
Luwei Kang

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 06/10] x86: Introduce a new function to get capability of Intel PT
  2018-07-03 12:09       ` Jan Beulich
@ 2018-07-04  8:48         ` Kang, Luwei
  2018-07-04  9:09           ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Kang, Luwei @ 2018-07-04  8:48 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Tian, Kevin, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Nakajima, Jun

> >> > +#define IPT_CAP(_n, _l, _r, _m)                               \
> >> > +    [IPT_CAP_ ## _n] = { .name = __stringify(_n), .leaf = _l, \
> >> > +        .reg = _r, .mask = _m }
> >> > +
> >> > +static struct ipt_cap_desc {
> >> > +    const char    *name;
> >> > +    unsigned int  leaf;
> >> > +    unsigned char reg;
> >>
> >> I don't think leaf needs to be full 32 bits wide? Once shrunk by at
> >> least two bits, the size of the overall structure could go down from 24 to 16 bytes.
> >
> > OK, will change it from " unsigned int  " to "unsinged char".
> 
> I'd prefer if you used bit fields, as was meant to be implied by my reply.

like this? If two bits is too few for "leaf"?

static const struct ipt_cap_desc {
    const char    *name;
    unsigned char leaf:2;
    unsigned char reg:2;
    unsinged int mask;
}

> 
> >> > +static unsigned int ipt_cap(const struct cpuid_leaf *cpuid_ipt,
> >> > +enum ipt_cap cap) {
> >> > +    const struct ipt_cap_desc *cd = &ipt_caps[cap];
> >> > +    unsigned int shift = ffs(cd->mask) - 1;
> >>
> >> Do you really need this?
> >>
> >> > +    unsigned int val = 0;
> >> > +
> >> > +    cpuid_ipt += cd->leaf;
> >> > +
> >> > +    switch ( cd->reg )
> >> > +    {
> >> > +    case EAX:
> >> > +        val = cpuid_ipt->a;
> >> > +        break;
> >> > +    case EBX:
> >> > +        val = cpuid_ipt->b;
> >> > +        break;
> >> > +    case ECX:
> >> > +        val = cpuid_ipt->c;
> >> > +        break;
> >> > +    case EDX:
> >> > +        val = cpuid_ipt->d;
> >> > +        break;
> >> > +    }
> >> > +
> >> > +    return (val & cd->mask) >> shift;
> >>
> >> If all masks are indeed contiguous series of set bits, MASK_EXTR()
> >> can be
> > used here afaict.
> >
> > Yes, it is a good define to me. Will fix it.
> >
> >>
> >> > +}
> >> > +
> >> >  static int __init parse_ipt_params(const char *str)  {
> >> >      if ( !strcmp("guest", str) )
> >>
> >> So this is the end of the changes to this file, and the function you
> >> introduce is static. I'm pretty sure compilers will warn about the
> >> unused static, and hence the build will fail at this point of the series (due to -Werror). I think you want to introduce the function
> together with its first user.
> >
> > I can't reproduce this issue by:
> > #./configure
> > # make build-xen         (-Werror has been included during build)
> > Could you tell me how to?
> 
> There is certainly the possibility that gcc versions differ in this regard.
> But I'm sure it's clear to you that the code should build fine with all supported versions. There's also the possibility that I'm
> overlooking something.


Get it. Will test it.

Thanks,
Luwei Kang

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 05/10] x86: Implement Intel Processor Trace context switch
  2018-07-04  8:48         ` Kang, Luwei
@ 2018-07-04  9:05           ` Jan Beulich
  2018-07-04  9:41             ` Kang, Luwei
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2018-07-04  9:05 UTC (permalink / raw)
  To: Luwei Kang
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Jun Nakajima

>>> On 04.07.18 at 10:48, <luwei.kang@intel.com> wrote:
>> >> > @@ -40,3 +42,102 @@ static int __init parse_ipt_params(const char
>> >> > +static inline void ipt_save_msr(struct ipt_ctx *ctx, unsigned int
>> >> > +addr_range) {
>> >> > +    unsigned int i;
>> >> > +
>> >> > +    rdmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
>> >> > +    rdmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
>> >> > +    rdmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
>> >> > +    rdmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
>> >> > +    for ( i = 0; i < addr_range; i++ )
>> >> > +    {
>> >> > +        rdmsrl(MSR_IA32_RTIT_ADDR_A(i), ctx->addr[i * 2]);
>> >> > +        rdmsrl(MSR_IA32_RTIT_ADDR_B(i), ctx->addr[i * 2 + 1]);
>> >> > +    }
>> >> > +}
>> >>
>> >> So you save/restore them not at context switch, but at VM entry/exit
>> >> time. This means the title is misleading. But it raises efficiency
>> >> questions:
>> >> Is it really necessary to do it this often? In patch 7 you handle
>> >> reads and writes to the MSRs, but you don't disable the MSR
>> >> intercepts (and judging from their titles no other patch is a candidate 
> where you might do that). If all writes are seen by Xen, why
>> would you need to read all the MSRs here, when the majority is - afaict - not 
> modified by hardware?
>> >
>> > when PT in disabled in guest (guest have capability to enable PT but
>> > RTIT_CTL.EN is 0), all the PT msrs read/write are intercepted and we
>> > don't need to save or restore during vm-exit/entry. When PT is enabled
>> > in guest, we need to save or restore the guest stat when vm-exit/entry.
>> 
>> Why for MSRs which don't get changed by hardware?
>> 
>> > What about add a flag to log the value of MSRs' changes so that we
>> > don't need save/restore the MSRs when guest not change these values?
>> 
>> I'm afraid it's not clear to me what "log the value" is supposed to mean here.
> 
> I mean add a new flag to mark if the value of Intel PT MSRs is changed by 
> guest. If guest don't have any change that we don't need to save/restore the 
> guest PT MSRs value to real hardware when VM-exit/entry.

Okay, in which case back to the original question: Without disabling the
intercepts, you know what the guest wrote last. Why read the MSR then?

>> >> > @@ -466,11 +467,16 @@ static int vmx_vcpu_initialise(struct vcpu *v)
>> >> >      if ( v->vcpu_id == 0 )
>> >> >          v->arch.user_regs.rax = 1;
>> >> >
>> >> > +    rc = ipt_initialize(v);
>> >> > +    if ( rc )
>> >> > +        dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor
>> >> > + Trace.\n", v);
>> >>
>> >> For such a message to be helpful, please also log rc. And no full stop in 
> log messages please (again with very few exceptions).
>> >
>> > Not full understand here. What is the " no full stop in log messages " 
> mean?
>> 
>> "full stop" is the final period in a sentence. I.e. you want
>> 
>>         dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor Trace\n", 
> v);
> 
> Change like this ?
> dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor Trace: err=%d.\n", v, rc);

Excuse me - I've told you to omit the full stop, and there it is again.
Apart from that, yes, this is one option. A slightly short one we use here
and there is

dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor Trace (%d)\n", v, rc);

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 06/10] x86: Introduce a new function to get capability of Intel PT
  2018-07-04  8:48         ` Kang, Luwei
@ 2018-07-04  9:09           ` Jan Beulich
  2018-07-04  9:42             ` Kang, Luwei
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2018-07-04  9:09 UTC (permalink / raw)
  To: Luwei Kang
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Jun Nakajima

>>> On 04.07.18 at 10:48, <luwei.kang@intel.com> wrote:
>> >> > +#define IPT_CAP(_n, _l, _r, _m)                               \
>> >> > +    [IPT_CAP_ ## _n] = { .name = __stringify(_n), .leaf = _l, \
>> >> > +        .reg = _r, .mask = _m }
>> >> > +
>> >> > +static struct ipt_cap_desc {
>> >> > +    const char    *name;
>> >> > +    unsigned int  leaf;
>> >> > +    unsigned char reg;
>> >>
>> >> I don't think leaf needs to be full 32 bits wide? Once shrunk by at
>> >> least two bits, the size of the overall structure could go down from 24 to 16 bytes.
>> >
>> > OK, will change it from " unsigned int  " to "unsinged char".
>> 
>> I'd prefer if you used bit fields, as was meant to be implied by my reply.
> 
> like this? If two bits is too few for "leaf"?
> 
> static const struct ipt_cap_desc {
>     const char    *name;
>     unsigned char leaf:2;
>     unsigned char reg:2;
>     unsinged int mask;
> }

Yes. As suggested before I'd use more bits for leaf, to avoid this
needing to change immediately once a new leaf becomes known.
After all the goal is only to have leaf and reg together fit in 32
bits. Making leaf 8 bits wide for now would likely help generated
code. And please don't use unsigned char in cases like this where
you don't really need the more narrow type - be as close to what
the standard allows without extensions as possible; IOW
unsigned int here.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 05/10] x86: Implement Intel Processor Trace context switch
  2018-07-04  9:05           ` Jan Beulich
@ 2018-07-04  9:41             ` Kang, Luwei
  0 siblings, 0 replies; 50+ messages in thread
From: Kang, Luwei @ 2018-07-04  9:41 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Tian, Kevin, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Nakajima, Jun

> >> >> > @@ -40,3 +42,102 @@ static int __init parse_ipt_params(const
> >> >> > char
> >> >> > +static inline void ipt_save_msr(struct ipt_ctx *ctx, unsigned
> >> >> > +int
> >> >> > +addr_range) {
> >> >> > +    unsigned int i;
> >> >> > +
> >> >> > +    rdmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
> >> >> > +    rdmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
> >> >> > +    rdmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
> >> >> > +    rdmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
> >> >> > +    for ( i = 0; i < addr_range; i++ )
> >> >> > +    {
> >> >> > +        rdmsrl(MSR_IA32_RTIT_ADDR_A(i), ctx->addr[i * 2]);
> >> >> > +        rdmsrl(MSR_IA32_RTIT_ADDR_B(i), ctx->addr[i * 2 + 1]);
> >> >> > +    }
> >> >> > +}
> >> >>
> >> >> So you save/restore them not at context switch, but at VM
> >> >> entry/exit time. This means the title is misleading. But it raises
> >> >> efficiency
> >> >> questions:
> >> >> Is it really necessary to do it this often? In patch 7 you handle
> >> >> reads and writes to the MSRs, but you don't disable the MSR
> >> >> intercepts (and judging from their titles no other patch is a
> >> >> candidate
> > where you might do that). If all writes are seen by Xen, why
> >> would you need to read all the MSRs here, when the majority is -
> >> afaict - not
> > modified by hardware?
> >> >
> >> > when PT in disabled in guest (guest have capability to enable PT
> >> > but RTIT_CTL.EN is 0), all the PT msrs read/write are intercepted
> >> > and we don't need to save or restore during vm-exit/entry. When PT
> >> > is enabled in guest, we need to save or restore the guest stat when vm-exit/entry.
> >>
> >> Why for MSRs which don't get changed by hardware?
> >>
> >> > What about add a flag to log the value of MSRs' changes so that we
> >> > don't need save/restore the MSRs when guest not change these values?
> >>
> >> I'm afraid it's not clear to me what "log the value" is supposed to mean here.
> >
> > I mean add a new flag to mark if the value of Intel PT MSRs is changed
> > by guest. If guest don't have any change that we don't need to
> > save/restore the guest PT MSRs value to real hardware when VM-exit/entry.
> 
> Okay, in which case back to the original question: Without disabling the intercepts, you know what the guest wrote last. Why read
> the MSR then?

Oh, understand. I re-check these MSRs in spec and Only IA32_RTIT_STATUS is an exception. It can be changed by hardware.
Bit[3:0] are write ignore. 
Bit[5:4] : these bit are set by hardware and once it set only software can clear it.
Bit[48:32]: write by processor and can clear or modify this filed at any time when PT is enabled.
So, I think this register (IA32_RTIT_STATUS) is needed read after vm-exit. Other MSRs may don't need.
Bit[5:4] should be cleared in any way before vm-entry because only software can clear it.


> 
> >> >> > @@ -466,11 +467,16 @@ static int vmx_vcpu_initialise(struct vcpu *v)
> >> >> >      if ( v->vcpu_id == 0 )
> >> >> >          v->arch.user_regs.rax = 1;
> >> >> >
> >> >> > +    rc = ipt_initialize(v);
> >> >> > +    if ( rc )
> >> >> > +        dprintk(XENLOG_ERR, "%pv: Failed to init Intel
> >> >> > + Processor Trace.\n", v);
> >> >>
> >> >> For such a message to be helpful, please also log rc. And no full
> >> >> stop in
> > log messages please (again with very few exceptions).
> >> >
> >> > Not full understand here. What is the " no full stop in log messages "
> > mean?
> >>
> >> "full stop" is the final period in a sentence. I.e. you want
> >>
> >>         dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor
> >> Trace\n",
> > v);
> >
> > Change like this ?
> > dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor Trace:
> > err=%d.\n", v, rc);
> 
> Excuse me - I've told you to omit the full stop, and there it is again.
> Apart from that, yes, this is one option. A slightly short one we use here and there is
> 
> dprintk(XENLOG_ERR, "%pv: Failed to init Intel Processor Trace (%d)\n", v, rc);

I Just understand. "Full stop" is the  "." at the end of the sentence. We don't need that. :)

Thanks,
Luwei Kang

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 06/10] x86: Introduce a new function to get capability of Intel PT
  2018-07-04  9:09           ` Jan Beulich
@ 2018-07-04  9:42             ` Kang, Luwei
  0 siblings, 0 replies; 50+ messages in thread
From: Kang, Luwei @ 2018-07-04  9:42 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Tian, Kevin, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Nakajima, Jun

> >> >> > +#define IPT_CAP(_n, _l, _r, _m)                               \
> >> >> > +    [IPT_CAP_ ## _n] = { .name = __stringify(_n), .leaf = _l, \
> >> >> > +        .reg = _r, .mask = _m }
> >> >> > +
> >> >> > +static struct ipt_cap_desc {
> >> >> > +    const char    *name;
> >> >> > +    unsigned int  leaf;
> >> >> > +    unsigned char reg;
> >> >>
> >> >> I don't think leaf needs to be full 32 bits wide? Once shrunk by
> >> >> at least two bits, the size of the overall structure could go down from 24 to 16 bytes.
> >> >
> >> > OK, will change it from " unsigned int  " to "unsinged char".
> >>
> >> I'd prefer if you used bit fields, as was meant to be implied by my reply.
> >
> > like this? If two bits is too few for "leaf"?
> >
> > static const struct ipt_cap_desc {
> >     const char    *name;
> >     unsigned char leaf:2;
> >     unsigned char reg:2;
> >     unsinged int mask;
> > }
> 
> Yes. As suggested before I'd use more bits for leaf, to avoid this needing to change immediately once a new leaf becomes known.
> After all the goal is only to have leaf and reg together fit in 32 bits. Making leaf 8 bits wide for now would likely help generated code.
> And please don't use unsigned char in cases like this where you don't really need the more narrow type - be as close to what the
> standard allows without extensions as possible; IOW unsigned int here.
> 

static const struct ipt_cap_desc {
    const char    *name;
    unsigned int leaf:8;
    unsigned int reg:2;
    unsinged int mask;
}

Thanks,
Luwei Kang

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 03/10] x86: Add Intel Processor Trace support for cpuid
  2018-06-28 14:27   ` Jan Beulich
@ 2018-07-12  7:21     ` Kang, Luwei
  2018-07-12  7:48       ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Kang, Luwei @ 2018-07-12  7:21 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Tian, Kevin, Stefano Stabellini, Wei Liu,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, Nakajima, Jun

> >>> On 30.05.18 at 15:27, <luwei.kang@intel.com> wrote:
> > @@ -759,12 +760,19 @@ int xc_cpuid_apply_policy(xc_interface *xch, uint32_t domid,
> >                  continue;
> >          }
> >
> > +        if ( input[0] == 0x14 )
> > +        {
> > +            input[1]++;
> > +            if ( input[1] == 1 )
> > +                continue;
> > +        }
> 
> Together with what's there and what iirc Andrew's series puts here, this should become a switch() imo.

Why use switch() here? I don't think need change to switch() and I can't find any example in this function use switch(). For example leaf 4 is also implement like this.

        /* Intel cache descriptor leaves. */
        if ( input[0] == 4 )
        {
            input[1]++;
            /* More to do? Then loop keeping %%eax==0x00000004. */
            if ( (regs[0] & 0x1f) != 0 )
                continue;
        }

> 
> > @@ -583,7 +584,19 @@ void recalculate_cpuid_policy(struct domain *d)
> >              __clear_bit(X86_FEATURE_VMX, max_fs);
> >              __clear_bit(X86_FEATURE_SVM, max_fs);
> >          }
> > +
> > +        /*
> > +         * Hide Intel Processor trace feature when hardware not support
> > +         * PT-VMX or ipt option is off.
> > +         */
> > +        if ( ipt_mode == IPT_MODE_OFF )
> > +        {
> > +            __clear_bit(X86_FEATURE_IPT, max_fs);
> > +            zero_leaves(p->ipt.raw, 0, ARRAY_SIZE(p->ipt.raw) - 1);
> > +        }
> 
> The clearing of bits in max_fs further up is needed here because this varies depending on domain config. You, otoh, put a
> conditional here which is not going to change post boot. This instead belongs into
> calculate_hvm_max_policy() I believe.

ipt_mode is any global parameter for all domain. Move to calculate_hvm_max_policy() is good to me.
Will fix in next version.

> 
> > @@ -101,6 +102,10 @@ static int update_domain_cpuid_info(struct domain *d,
> >              p->feat.raw[ctl->input[1]] = leaf;
> >              break;
> >
> > +        case IPT_CPUID:
> > +            p->ipt.raw[ctl->input[1]] = leaf;
> > +            break;
> 
> This lacks a bounds check of ctl->input[1] (in the earlier switch()).

Oh, get it. 

> 
> > --- a/xen/include/asm-x86/cpufeature.h
> > +++ b/xen/include/asm-x86/cpufeature.h
> > @@ -102,6 +102,7 @@
> >  #define cpu_has_mpx             boot_cpu_has(X86_FEATURE_MPX)
> >  #define cpu_has_rdseed          boot_cpu_has(X86_FEATURE_RDSEED)
> >  #define cpu_has_smap            boot_cpu_has(X86_FEATURE_SMAP)
> > +#define cpu_has_ipt             boot_cpu_has(X86_FEATURE_IPT)
> 
> This definition is unused.

Will remove it.

> 
> > --- a/xen/include/asm-x86/cpuid.h
> > +++ b/xen/include/asm-x86/cpuid.h
> > @@ -58,10 +58,11 @@ DECLARE_PER_CPU(struct cpuidmasks, cpuidmasks);
> >  /* Default masking MSR values, calculated at boot. */  extern struct
> > cpuidmasks cpuidmask_defaults;
> >
> > -#define CPUID_GUEST_NR_BASIC      (0xdu + 1)
> > +#define CPUID_GUEST_NR_BASIC      (0x14u + 1)
> 
> Is there anything to convince me that the intermediate leaves don't need any further handling added anywhere? Same question btw
> for the libxc side bumping of DEF_MAX_BASE.

They are all zero and meaningless in these intermediate leaves. So I think we don't need do anything. what is your concern ?

> 
> > @@ -166,6 +167,15 @@ struct cpuid_policy
> >          } comp[CPUID_GUEST_NR_XSTATE];
> >      } xstate;
> >
> > +    /* Structured feature leaf: 0x00000014[xx] */
> > +    union {
> > +        struct cpuid_leaf raw[CPUID_GUEST_NR_IPT];
> > +        struct {
> > +            /* Subleaf 0. */
> > +            uint32_t max_subleaf;
> > +        };
> > +    } ipt;
> 
> In particular this looks to be placed earlier than it should be (in other words I'm getting the impression that you failed to insert
> some padding for the skipped leaves).

I think we don't need add some padding for skipped leaves because these are accessed by name (e.g. xstate, ipt ...) not offset.

Thanks,
Luwei Kang

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v2 03/10] x86: Add Intel Processor Trace support for cpuid
  2018-07-12  7:21     ` Kang, Luwei
@ 2018-07-12  7:48       ` Jan Beulich
  0 siblings, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2018-07-12  7:48 UTC (permalink / raw)
  To: Andrew Cooper, Luwei Kang
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, George Dunlap,
	Tim Deegan, Ian Jackson, xen-devel, Julien Grall, Jun Nakajima

>>> On 12.07.18 at 09:21, <luwei.kang@intel.com> wrote:
>> >>> On 30.05.18 at 15:27, <luwei.kang@intel.com> wrote:
>> > @@ -759,12 +760,19 @@ int xc_cpuid_apply_policy(xc_interface *xch, uint32_t domid,
>> >                  continue;
>> >          }
>> >
>> > +        if ( input[0] == 0x14 )
>> > +        {
>> > +            input[1]++;
>> > +            if ( input[1] == 1 )
>> > +                continue;
>> > +        }
>> 
>> Together with what's there and what iirc Andrew's series puts here, this 
> should become a switch() imo.
> 
> Why use switch() here? I don't think need change to switch() and I can't 
> find any example in this function use switch(). For example leaf 4 is also 
> implement like this.
> 
>         /* Intel cache descriptor leaves. */
>         if ( input[0] == 4 )
>         {
>             input[1]++;
>             /* More to do? Then loop keeping %%eax==0x00000004. */
>             if ( (regs[0] & 0x1f) != 0 )
>                 continue;
>         }

For exactly this reason - instead of multiple if()-s evaluating
input[0], switch(input[0]) is preferred.

>> > --- a/xen/include/asm-x86/cpuid.h
>> > +++ b/xen/include/asm-x86/cpuid.h
>> > @@ -58,10 +58,11 @@ DECLARE_PER_CPU(struct cpuidmasks, cpuidmasks);
>> >  /* Default masking MSR values, calculated at boot. */  extern struct
>> > cpuidmasks cpuidmask_defaults;
>> >
>> > -#define CPUID_GUEST_NR_BASIC      (0xdu + 1)
>> > +#define CPUID_GUEST_NR_BASIC      (0x14u + 1)
>> 
>> Is there anything to convince me that the intermediate leaves don't need any 
> further handling added anywhere? Same question btw
>> for the libxc side bumping of DEF_MAX_BASE.
> 
> They are all zero and meaningless in these intermediate leaves. So I think 
> we don't need do anything. what is your concern ?

As said - how can I convince myself that no further handling is needed?
How did you convince yourself? For example, update_domain_cpuid_info()
will then allow to set the intermediate fields to arbitrary values, without
recalculate_cpuid_policy() doing anything with them. After all
cpuid_featureset_to_policy() only sets explicit fields, but doesn't clear the
policy up front. Andrew - isn't this still a left-over piece of white-listing?

>> > @@ -166,6 +167,15 @@ struct cpuid_policy
>> >          } comp[CPUID_GUEST_NR_XSTATE];
>> >      } xstate;
>> >
>> > +    /* Structured feature leaf: 0x00000014[xx] */
>> > +    union {
>> > +        struct cpuid_leaf raw[CPUID_GUEST_NR_IPT];
>> > +        struct {
>> > +            /* Subleaf 0. */
>> > +            uint32_t max_subleaf;
>> > +        };
>> > +    } ipt;
>> 
>> In particular this looks to be placed earlier than it should be (in other 
> words I'm getting the impression that you failed to insert
>> some padding for the skipped leaves).
> 
> I think we don't need add some padding for skipped leaves because these are 
> accessed by name (e.g. xstate, ipt ...) not offset.

Oh, right, I was mistaken here - the immediately containing entity
is a struct, not a union.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2018-07-12  7:48 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-30 13:27 [PATCH v2 00/10] Intel Processor Trace virtulization enabling Luwei Kang
2018-05-30 13:27 ` [PATCH v2 01/10] x86: add an flag to enable Intel Processor Trace in guest Luwei Kang
2018-06-28 14:11   ` Jan Beulich
2018-07-03 10:18     ` Kang, Luwei
2018-07-03 11:58       ` Jan Beulich
2018-05-30 13:27 ` [PATCH v2 02/10] x86: Configure VMCS for Intel Processor Trace virtualization Luwei Kang
2018-05-30 13:27 ` [PATCH v2 03/10] x86: Add Intel Processor Trace support for cpuid Luwei Kang
2018-06-28 14:27   ` Jan Beulich
2018-07-12  7:21     ` Kang, Luwei
2018-07-12  7:48       ` Jan Beulich
2018-06-29 15:17   ` Jan Beulich
2018-07-03 10:19     ` Kang, Luwei
2018-07-03 10:25       ` Andrew Cooper
2018-05-30 13:27 ` [PATCH v2 04/10] x86: Add Intel Processor Trace MSRs and bit definitions Luwei Kang
2018-06-28 14:44   ` Jan Beulich
2018-07-03 10:18     ` Kang, Luwei
2018-07-03 12:00       ` Jan Beulich
2018-05-30 13:27 ` [PATCH v2 05/10] x86: Implement Intel Processor Trace context switch Luwei Kang
2018-06-29 14:12   ` Jan Beulich
2018-07-03 10:18     ` Kang, Luwei
2018-07-03 12:04       ` Jan Beulich
2018-07-04  8:48         ` Kang, Luwei
2018-07-04  9:05           ` Jan Beulich
2018-07-04  9:41             ` Kang, Luwei
2018-05-30 13:28 ` [PATCH v2 06/10] x86: Introduce a new function to get capability of Intel PT Luwei Kang
2018-06-29 14:35   ` Jan Beulich
2018-07-03 10:18     ` Kang, Luwei
2018-07-03 12:09       ` Jan Beulich
2018-07-04  8:48         ` Kang, Luwei
2018-07-04  9:09           ` Jan Beulich
2018-07-04  9:42             ` Kang, Luwei
2018-05-30 13:28 ` [PATCH v2 07/10] x86: Add Intel Processor Trace MSRs read/write emulation Luwei Kang
2018-06-29 14:46   ` Jan Beulich
2018-07-03 10:18     ` Kang, Luwei
2018-07-03 12:10       ` Jan Beulich
2018-05-30 13:28 ` [PATCH v2 08/10] x86: Introduce a function to check the value of RTIT_CTL Luwei Kang
2018-06-29 14:56   ` Jan Beulich
2018-07-03 10:19     ` Kang, Luwei
2018-07-03 12:15       ` Jan Beulich
2018-05-30 13:28 ` [PATCH v2 09/10] x86: Disable Intel Processor Trace when VMXON in L1 guest Luwei Kang
2018-06-29 15:14   ` Jan Beulich
2018-07-03 10:19     ` Kang, Luwei
2018-05-30 13:28 ` [PATCH v2 10/10] x86: Handle new asynchronous exit qualification Luwei Kang
2018-06-29 15:22   ` Jan Beulich
2018-06-29 15:29     ` Andrew Cooper
2018-05-30 15:14 ` [PATCH v2 00/10] Intel Processor Trace virtulization enabling Julien Grall
2018-05-30 23:29   ` Kang, Luwei
2018-05-31  9:10     ` Julien Grall
2018-05-31  9:21       ` Kang, Luwei
2018-06-01  7:49       ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.