* [PATCH v16 00/10] enable Cache Monitoring Technology (CMT) feature
@ 2014-09-25 10:19 Chao Peng
  2014-09-25 10:19 ` [PATCH v16 01/10] x86: add generic resource (e.g. MSR) access hypercall Chao Peng
                   ` (9 more replies)
  0 siblings, 10 replies; 34+ messages in thread
From: Chao Peng @ 2014-09-25 10:19 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, JBeulich, dgdegra

Changes from v15:
 - Keywords change: Intel changed the names for PQOS/CQM in latest SDM.
   Adjust the code accordingly:
     PQOS(Platform QOS) => PSR(Platform Shared Resource)
     CQM(Cache QoS Monitoring) => CMT(Cache Monitoring Technology)
 - Make resource operations cleaner:
   * do_platform_op is the minimum unit of non-preemptible operation; it
     accepts a single operation as well as a small non-preemptible batch.
   * Other preemptible batch operations are performed with multicall mechanism.
 - Add padding field in xenpf_resource_data structure and check for that.

Changes from v14:
 - Address comments from Jan and Andrew, including:
   * Add non-preemption ability to multicall;
   * Build the resource batch operation on top of multicall;
   * Simplify pqos option.

Changes from v13:
 - Address comments from Jan and Andrew, including:
   * Support mixed resource types in one invocation;
   * Remove some unused fields(rmid_min/rmid_inuse);
   * Other minor changes and code clean up;

Changes from v12:
 - Address comments from Jan, such as QoS feature setting at boot,
   avoiding unbounded memory allocation in Xen, putting the resource access
   hypercall in platform_hypercall.c to avoid creating new files,
   specifically enumerating the L3 cache size for CQM (instead of
   x86_cache_size), getting a random socket CPU in the user space tools, and
   some coding style fixes. However, the continue_hypercall_on_cpu()
   suggestion is not adopted in this version due to a potential
   issue in our use case.
 - Add a white list to limit the capability for resource access from
   tool side.
 - Address comments from Ian on the xl/libxl/libxc side.

Changes from v11:
 - Turn off pqos and pqos_monitor in Xen command line by default.
 - Modify the original specific MSR access hypercall into a generic
   resource access hypercall. This hypercall could be used to access
   MSR, Port I/O, etc. Use platform_op to replace sysctl so that both
   dom0 kernel and userspace could use this hypercall.
 - Address various comments from Jan, Ian, Konrad, and Daniel.

Changes from v10:
 - Re-design and re-implement the whole logic. In this version,
   hypervisor provides basic mechanisms (like access MSRs) while all
   policies are put in user space.
   patch 1-3 provide a generic MSR hypercall for toolstack to access
   patch 4-9 implement the cache QoS monitoring feature

Changes from v9:
 - Revise the readonly mapping mechanism to share data between Xen and
   userspace. We create L3C buffer for each socket, share both buffer
   address MFNs and buffer MFNs to userspace.
 - Split the pqos.c into pqos.c and cqm.c for better code structure.
 - Show the total L3 cache size when issuing xl pqos-list cqm command.
 - Abstract a libxl_getcqminfo() function to fetch cqm data from Xen.
 - Several coding style fixes.

Changes from v8:
 - Address comments from Ian Campbell, including:
   * Modify the return handling for xc_sysctl();
   * Add man page items for platform QoS related commands.
   * Fix typo in commit message.

Changes from v7:
 - Address comments from Andrew Cooper, including:
   * Check CQM capability before allocating cpumask memory.
   * Move one function declaration into the correct patch.

Changes from v6:
 - Address comments from Jan Beulich, including:
   * Remove the unnecessary CPUID feature check.
   * Remove the unnecessary socket_cpu_map.
   * Spin_lock related changes, avoid spin_lock_irqsave().
   * Use readonly mapping to pass cqm data between Xen/Userspace,
     to avoid data copying.
   * Optimize RDMSR/WRMSR logic to avoid unnecessary calls.
   * Misc fixes including __read_mostly prefix, return value, etc.

Changes from v5:
 - Address comments from Dario Faggioli, including:
   * Define a new libxl_cqminfo structure to avoid reference of xc
     structure in libxl functions.
   * Use LOGE() instead of the LIBXL__LOG() functions.

Changes from v4:
 - When comparing xl cqm parameter, use strcmp instead of strncmp,
   otherwise, "xl pqos-attach cqmabcd domid" will be considered as
   a valid command line.
 - Address comments from Andrew Cooper, including:
   * Adjust the pqos parameter parsing function.
   * Modify the pqos related documentation.
   * Add a check for opt_cqm_max_rmid in initialization code.
   * Do not IPI CPU that is in same socket with current CPU.
 - Address comments from Dario Faggioli, including:
   * Fix a typo in export symbols.
   * Return correct libxl error code for qos related functions.
   * Abstract the error printing logic into a function.
 - Address comment from Daniel De Graaf, including:
   * Add return value for pqos related check.
 - Address comments from Konrad Rzeszutek Wilk, including:
   * Modify the GPLv2 related file header, remove the address.

Changes from v3:
 - Use structure to better organize CQM related global variables.
 - Address comments from Andrew Cooper, including:
   * Remove the domain creation flag for CQM RMID allocation.
   * Adjust the boot parameter format, use custom_param().
   * Add documentation for the newly added boot parameter.
   * Change QoS type flag to be uint64_t.
   * Initialize the per socket cpu bitmap in system boot time.
   * Remove get_cqm_avail() function.
   * Misc format changes.
 - Address comment from Daniel De Graaf, including:
   * Use avc_current_has_perm() for XEN2__PQOS_OP that belongs to SECCLASS_XEN2.

Changes from v2:
 - Address comments from Andrew Cooper, including:
   * Merging tools stack changes into one patch.
   * Reduce the IPI number to one per socket.
   * Change structures for CQM data exchange between tools and Xen.
   * Misc format/variable/function name changes.
 - Address comments from Konrad Rzeszutek Wilk, including:
   * Simplify the error printing logic.
   * Add xsm check for the newly added hypercalls.

Changes from v1:
 - Address comments from Andrew Cooper, including:
   * Change function names, e.g., alloc_cqm_rmid(), system_supports_cqm(), etc.
   * Change some structure element order to save packing cost.
   * Correct some function's return value.
   * Some programming style changes.
   * ...

The Intel Xeon processor E5 v3 family introduced a resource monitoring capability
in each logical processor to measure specific platform shared resource metrics,
for example, L3 cache occupancy. For details, please refer to Intel SDM
chapter 17.14.

Cache Monitoring Technology provides a layer of abstraction between applications
and logical processors through the use of Resource Monitoring IDs (RMIDs).
In the Xen design, each guest in the system can be assigned an RMID independently,
while RMID=0 is reserved for monitoring domains that don't enable the CMT service.
When any of the domain's vcpus is scheduled on a logical processor, the domain's
RMID is activated by programming the value into one specific MSR, and when
the vcpu is scheduled out, RMID=0 is programmed into that MSR.
The CMT hardware tracks cache utilization of memory accesses according to the
RMIDs and reports monitored data via a counter register. With this solution,
we can learn how much L3 cache is used by a given guest.

To attach CMT service to a certain guest:
xl psr-cmt-attach domid

To detach CMT service from a guest:
xl psr-cmt-detach domid

To get the L3 cache usage:
$ xl psr-cmt-show cache_occupancy <domid>

The data below is only an example of how the CMT related data is exposed to
the end user.

[root@localhost]# xl psr-cmt-show cache_occupancy
Total RMID: 55
Per-Socket L3 Cache Size: 35840 KB
Name                                        ID        Socket 0        Socket 1
Domain-0                                     0        20720 KB        15960 KB
ExampleHVMDomain                             1         4200 KB         2352 KB

Chao Peng (10):
  x86: add generic resource (e.g. MSR) access hypercall
  xsm: add resource operation related xsm policy
  tools: provide interface for generic resource access
  x86: detect and initialize Cache Monitoring Technology feature
  x86: dynamically attach/detach CMT service for a guest
  x86: collect global CMT information
  x86: enable CMT for each domain RMID
  x86: add CMT related MSRs in allowed list
  xsm: add CMT related xsm policies
  tools: CMDs and APIs for Cache Monitoring Technology

 docs/man/xl.pod.1                            |   25 +++
 docs/misc/xen-command-line.markdown          |   12 ++
 tools/flask/policy/policy/modules/xen/xen.if |    2 +-
 tools/flask/policy/policy/modules/xen/xen.te |    6 +-
 tools/libxc/Makefile                         |    2 +
 tools/libxc/xc_msr_x86.h                     |   36 +++++
 tools/libxc/xc_private.h                     |   52 +++++++
 tools/libxc/xc_psr.c                         |  215 ++++++++++++++++++++++++++
 tools/libxc/xc_resource.c                    |  143 +++++++++++++++++
 tools/libxc/xenctrl.h                        |   30 ++++
 tools/libxl/Makefile                         |    2 +-
 tools/libxl/libxl.h                          |   19 +++
 tools/libxl/libxl_psr.c                      |  184 ++++++++++++++++++++++
 tools/libxl/libxl_types.idl                  |    4 +
 tools/libxl/libxl_utils.c                    |   28 ++++
 tools/libxl/xl.h                             |    3 +
 tools/libxl/xl_cmdimpl.c                     |  131 ++++++++++++++++
 tools/libxl/xl_cmdtable.c                    |   17 ++
 xen/arch/x86/Makefile                        |    1 +
 xen/arch/x86/cpu/intel_cacheinfo.c           |   49 +-----
 xen/arch/x86/domain.c                        |    8 +
 xen/arch/x86/domctl.c                        |   29 ++++
 xen/arch/x86/platform_hypercall.c            |   97 ++++++++++++
 xen/arch/x86/psr.c                           |  180 +++++++++++++++++++++
 xen/arch/x86/setup.c                         |    3 +
 xen/arch/x86/sysctl.c                        |   43 ++++++
 xen/arch/x86/x86_64/platform_hypercall.c     |    4 +
 xen/include/asm-x86/cpufeature.h             |   46 ++++++
 xen/include/asm-x86/domain.h                 |    2 +
 xen/include/asm-x86/msr-index.h              |    5 +
 xen/include/asm-x86/psr.h                    |   63 ++++++++
 xen/include/public/domctl.h                  |   12 ++
 xen/include/public/platform.h                |   23 +++
 xen/include/public/sysctl.h                  |   14 ++
 xen/include/xlat.lst                         |    2 +
 xen/xsm/flask/hooks.c                        |   10 ++
 xen/xsm/flask/policy/access_vectors          |   18 ++-
 xen/xsm/flask/policy/security_classes        |    1 +
 38 files changed, 1468 insertions(+), 53 deletions(-)
 create mode 100644 tools/libxc/xc_msr_x86.h
 create mode 100644 tools/libxc/xc_psr.c
 create mode 100644 tools/libxc/xc_resource.c
 create mode 100644 tools/libxl/libxl_psr.c
 create mode 100644 xen/arch/x86/psr.c
 create mode 100644 xen/include/asm-x86/psr.h

-- 
1.7.9.5


* [PATCH v16 01/10] x86: add generic resource (e.g. MSR) access hypercall
  2014-09-25 10:19 [PATCH v16 00/10] enable Cache Monitoring Technology (CMT) feature Chao Peng
@ 2014-09-25 10:19 ` Chao Peng
  2014-09-25 19:57   ` Andrew Cooper
  2014-09-26 15:40   ` Jan Beulich
  2014-09-25 10:19 ` [PATCH v16 02/10] xsm: add resource operation related xsm policy Chao Peng
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 34+ messages in thread
From: Chao Peng @ 2014-09-25 10:19 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, JBeulich, dgdegra

Add a generic resource access hypercall for tool stack or other
components, e.g., accessing MSR, port I/O, etc.

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
---
 xen/arch/x86/platform_hypercall.c        |   90 ++++++++++++++++++++++++++++++
 xen/arch/x86/x86_64/platform_hypercall.c |    4 ++
 xen/include/public/platform.h            |   23 ++++++++
 xen/include/xlat.lst                     |    2 +
 4 files changed, 119 insertions(+)

diff --git a/xen/arch/x86/platform_hypercall.c b/xen/arch/x86/platform_hypercall.c
index 2162811..081d9f5 100644
--- a/xen/arch/x86/platform_hypercall.c
+++ b/xen/arch/x86/platform_hypercall.c
@@ -61,6 +61,68 @@ long cpu_down_helper(void *data);
 long core_parking_helper(void *data);
 uint32_t get_cur_idle_nums(void);
 
+struct xen_resource_access {
+    int32_t ret;
+    uint32_t nr;
+    XEN_GUEST_HANDLE(xenpf_resource_data_t) data;
+};
+
+static bool_t allow_access_msr(unsigned int msr)
+{
+    return 0;
+}
+
+static void resource_access(void *info)
+{
+    struct xen_resource_access *ra = info;
+    xenpf_resource_data_t data;
+    int ret = 0;
+    unsigned int i;
+
+    for ( i = 0; i < ra->nr; i++ )
+    {
+        if ( copy_from_guest_offset(&data, ra->data, i, 1) )
+        {
+            ret = -EFAULT;
+            break;
+        }
+
+        if ( data.rsvd ) {
+            ret = -EINVAL;
+            break;
+        }
+
+        switch ( data.cmd )
+        {
+        case XEN_RESOURCE_OP_MSR_READ:
+        case XEN_RESOURCE_OP_MSR_WRITE:
+            if ( data.idx >> 32 )
+                ret = -EINVAL;
+            else if ( !allow_access_msr(data.idx) )
+                ret = -EACCES;
+            else if ( data.cmd == XEN_RESOURCE_OP_MSR_READ )
+                ret = rdmsr_safe(data.idx, data.val);
+            else
+                ret = wrmsr_safe(data.idx, data.val);
+            break;
+        default:
+            ret = -EINVAL;
+            break;
+        }
+
+        if ( ret )
+            break;
+
+        if ( copy_to_guest_offset(ra->data, i, &data, 1) )
+        {
+            ret = -EFAULT;
+            break;
+        }
+    }
+
+    ra->ret = ret;
+}
+
 ret_t do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op)
 {
     ret_t ret = 0;
@@ -601,6 +663,34 @@ ret_t do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op)
     }
     break;
 
+    case XENPF_resource_op:
+    {
+        struct xen_resource_access ra;
+        struct xenpf_resource_op *rsc_op = &op->u.resource_op;
+        unsigned int cpu = smp_processor_id();
+
+        ra.nr = rsc_op->nr;
+        ra.data = rsc_op->data;
+
+        if ( rsc_op->cpu == cpu )
+            resource_access(&ra);
+        else if ( cpu_online(rsc_op->cpu) )
+            on_selected_cpus(cpumask_of(rsc_op->cpu),
+                         resource_access, &ra, 1);
+        else
+        {
+            ret = -ENODEV;
+            break;
+        }
+
+        if ( ra.ret )
+        {
+            ret = ra.ret;
+            break;
+        }
+    }
+    break;
+
     default:
         ret = -ENOSYS;
         break;
diff --git a/xen/arch/x86/x86_64/platform_hypercall.c b/xen/arch/x86/x86_64/platform_hypercall.c
index b6f380e..4db6622 100644
--- a/xen/arch/x86/x86_64/platform_hypercall.c
+++ b/xen/arch/x86/x86_64/platform_hypercall.c
@@ -32,6 +32,10 @@ CHECK_pf_pcpu_version;
 CHECK_pf_enter_acpi_sleep;
 #undef xen_pf_enter_acpi_sleep
 
+#define xen_pf_resource_data xenpf_resource_data
+CHECK_pf_resource_data;
+#undef xen_pf_resource_data
+
 #define COMPAT
 #define _XEN_GUEST_HANDLE(t) XEN_GUEST_HANDLE(t)
 #define _XEN_GUEST_HANDLE_PARAM(t) XEN_GUEST_HANDLE_PARAM(t)
diff --git a/xen/include/public/platform.h b/xen/include/public/platform.h
index 053b9fa..e4d9091 100644
--- a/xen/include/public/platform.h
+++ b/xen/include/public/platform.h
@@ -527,6 +527,28 @@ struct xenpf_core_parking {
 typedef struct xenpf_core_parking xenpf_core_parking_t;
 DEFINE_XEN_GUEST_HANDLE(xenpf_core_parking_t);
 
+#define XENPF_resource_op   61
+
+#define XEN_RESOURCE_OP_MSR_READ  0
+#define XEN_RESOURCE_OP_MSR_WRITE 1
+
+struct xenpf_resource_data {
+    uint32_t cmd;       /* XEN_RESOURCE_OP_* */
+    uint32_t rsvd;
+    uint64_t idx;
+    uint64_t val;
+};
+typedef struct xenpf_resource_data xenpf_resource_data_t;
+DEFINE_XEN_GUEST_HANDLE(xenpf_resource_data_t);
+
+struct xenpf_resource_op {
+    uint32_t nr;    /* number of data entry */
+    uint32_t cpu;   /* which cpu to run */
+    XEN_GUEST_HANDLE(xenpf_resource_data_t) data;
+};
+typedef struct xenpf_resource_op xenpf_resource_op_t;
+DEFINE_XEN_GUEST_HANDLE(xenpf_resource_op_t);
+
 /*
  * ` enum neg_errnoval
  * ` HYPERVISOR_platform_op(const struct xen_platform_op*);
@@ -553,6 +575,7 @@ struct xen_platform_op {
         struct xenpf_cpu_hotadd        cpu_add;
         struct xenpf_mem_hotadd        mem_add;
         struct xenpf_core_parking      core_parking;
+        struct xenpf_resource_op       resource_op;
         uint8_t                        pad[128];
     } u;
 };
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 9a35dd7..100fcf5 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -88,6 +88,8 @@
 ?	xenpf_enter_acpi_sleep		platform.h
 ?	xenpf_pcpuinfo			platform.h
 ?	xenpf_pcpu_version		platform.h
+?	xenpf_resource_op		platform.h
+?	xenpf_resource_data		platform.h
 !	sched_poll			sched.h
 ?	sched_remote_shutdown		sched.h
 ?	sched_shutdown			sched.h
-- 
1.7.9.5


* [PATCH v16 02/10] xsm: add resource operation related xsm policy
  2014-09-25 10:19 [PATCH v16 00/10] enable Cache Monitoring Technology (CMT) feature Chao Peng
  2014-09-25 10:19 ` [PATCH v16 01/10] x86: add generic resource (e.g. MSR) access hypercall Chao Peng
@ 2014-09-25 10:19 ` Chao Peng
  2014-09-25 10:19 ` [PATCH v16 03/10] tools: provide interface for generic resource access Chao Peng
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Chao Peng @ 2014-09-25 10:19 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, JBeulich, dgdegra

Add xsm policies for resource access related hypercall, such as MSR
access, port I/O read/write, and other related resource operations.

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Acked-by: Daniel De Graaf  <dgdegra@tycho.nsa.gov>
---
 tools/flask/policy/policy/modules/xen/xen.te |    3 +++
 xen/xsm/flask/hooks.c                        |    4 ++++
 xen/xsm/flask/policy/access_vectors          |   14 +++++++++++---
 xen/xsm/flask/policy/security_classes        |    1 +
 4 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/tools/flask/policy/policy/modules/xen/xen.te b/tools/flask/policy/policy/modules/xen/xen.te
index 1937883..6cecf97 100644
--- a/tools/flask/policy/policy/modules/xen/xen.te
+++ b/tools/flask/policy/policy/modules/xen/xen.te
@@ -64,6 +64,9 @@ allow dom0_t xen_t:xen {
 	getidle debug getcpuinfo heap pm_op mca_op lockprof cpupool_op tmem_op
 	tmem_control getscheduler setscheduler
 };
+allow dom0_t xen_t:xen2 {
+    resource_op
+};
 allow dom0_t xen_t:mmu memorymap;
 
 # Allow dom0 to use these domctls on itself. For domctls acting on other
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index df05566..9f36503 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1391,6 +1391,10 @@ static int flask_platform_op(uint32_t op)
     case XENPF_get_cpuinfo:
         return domain_has_xen(current->domain, XEN__GETCPUINFO);
 
+    case XENPF_resource_op:
+        return avc_current_has_perm(SECINITSID_XEN, SECCLASS_XEN2,
+                                    XEN2__RESOURCE_OP, NULL);
+
     default:
         printk("flask_platform_op: Unknown op %d\n", op);
         return -EPERM;
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index d279841..daf0de5 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -3,9 +3,9 @@
 #
 # class class_name { permission_name ... }
 
-# Class xen consists of dom0-only operations dealing with the hypervisor itself.
-# Unless otherwise specified, the source is the domain executing the hypercall,
-# and the target is the xen initial sid (type xen_t).
+# Classes xen and xen2 consist of dom0-only operations dealing with the
+# hypervisor itself. Unless otherwise specified, the source is the domain
+# executing the hypercall, and the target is the xen initial sid (type xen_t).
 class xen
 {
 # XENPF_settime
@@ -75,6 +75,14 @@ class xen
     setscheduler
 }
 
+# This is a continuation of class xen, since only 32 permissions can be
+# defined per class
+class xen2
+{
+# XENPF_resource_op
+    resource_op
+}
+
 # Classes domain and domain2 consist of operations that a domain performs on
 # another domain or on itself.  Unless otherwise specified, the source is the
 # domain executing the hypercall, and the target is the domain being operated on
diff --git a/xen/xsm/flask/policy/security_classes b/xen/xsm/flask/policy/security_classes
index ef134a7..ca191db 100644
--- a/xen/xsm/flask/policy/security_classes
+++ b/xen/xsm/flask/policy/security_classes
@@ -8,6 +8,7 @@
 # for userspace object managers
 
 class xen
+class xen2
 class domain
 class domain2
 class hvm
-- 
1.7.9.5


* [PATCH v16 03/10] tools: provide interface for generic resource access
  2014-09-25 10:19 [PATCH v16 00/10] enable Cache Monitoring Technology (CMT) feature Chao Peng
  2014-09-25 10:19 ` [PATCH v16 01/10] x86: add generic resource (e.g. MSR) access hypercall Chao Peng
  2014-09-25 10:19 ` [PATCH v16 02/10] xsm: add resource operation related xsm policy Chao Peng
@ 2014-09-25 10:19 ` Chao Peng
  2014-09-25 20:06   ` Konrad Rzeszutek Wilk
  2014-09-25 10:19 ` [PATCH v16 04/10] x86: detect and initialize Cache Monitoring Technology feature Chao Peng
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 34+ messages in thread
From: Chao Peng @ 2014-09-25 10:19 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, JBeulich, dgdegra

Xen added a new platform_op hypercall for generic MSR access, and this
is the tool side change to wrap the hypercall into xc APIs.

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxc/Makefile      |    1 +
 tools/libxc/xc_private.h  |   52 +++++++++++++++++
 tools/libxc/xc_resource.c |  143 +++++++++++++++++++++++++++++++++++++++++++++
 tools/libxc/xenctrl.h     |   13 +++++
 4 files changed, 209 insertions(+)
 create mode 100644 tools/libxc/xc_resource.c

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 3b04027..dde6109 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -34,6 +34,7 @@ CTRL_SRCS-y       += xc_foreign_memory.c
 CTRL_SRCS-y       += xc_kexec.c
 CTRL_SRCS-y       += xtl_core.c
 CTRL_SRCS-y       += xtl_logger_stdio.c
+CTRL_SRCS-y       += xc_resource.c
 CTRL_SRCS-$(CONFIG_X86) += xc_pagetab.c
 CTRL_SRCS-$(CONFIG_Linux) += xc_linux.c xc_linux_osdep.c
 CTRL_SRCS-$(CONFIG_FreeBSD) += xc_freebsd.c xc_freebsd_osdep.c
diff --git a/tools/libxc/xc_private.h b/tools/libxc/xc_private.h
index 94df688..fbdfe79 100644
--- a/tools/libxc/xc_private.h
+++ b/tools/libxc/xc_private.h
@@ -46,6 +46,7 @@
 #define DECLARE_SYSCTL struct xen_sysctl sysctl
 #define DECLARE_PHYSDEV_OP struct physdev_op physdev_op
 #define DECLARE_FLASK_OP struct xen_flask_op op
+#define DECLARE_PLATFORM_OP struct xen_platform_op platform_op
 
 #undef PAGE_SHIFT
 #undef PAGE_SIZE
@@ -310,6 +311,57 @@ static inline int do_sysctl(xc_interface *xch, struct xen_sysctl *sysctl)
     return ret;
 }
 
+static inline int do_platform_op(xc_interface *xch,
+                                 struct xen_platform_op *platform_op)
+{
+    int ret = -1;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BOUNCE(platform_op, sizeof(*platform_op),
+                             XC_HYPERCALL_BUFFER_BOUNCE_BOTH);
+
+    platform_op->interface_version = XENPF_INTERFACE_VERSION;
+
+    if ( xc_hypercall_bounce_pre(xch, platform_op) )
+    {
+        PERROR("Could not bounce buffer for platform_op hypercall");
+        goto out1;
+    }
+
+    hypercall.op     = __HYPERVISOR_platform_op;
+    hypercall.arg[0] = HYPERCALL_BUFFER_AS_ARG(platform_op);
+    if ( (ret = do_xen_hypercall(xch, &hypercall)) < 0 )
+    {
+        if ( errno == EACCES )
+            DPRINTF("platform operation failed -- need to"
+                    " rebuild the user-space tool set?\n");
+    }
+
+    xc_hypercall_bounce_post(xch, platform_op);
+ out1:
+    return ret;
+}
+
+static inline int do_multicall_op(xc_interface *xch,
+                                  xc_hypercall_buffer_t *call_list,
+                                  uint32_t nr_calls)
+{
+    int ret = -1;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER_ARGUMENT(call_list);
+
+    hypercall.op     = __HYPERVISOR_multicall;
+    hypercall.arg[0] = HYPERCALL_BUFFER_AS_ARG(call_list);
+    hypercall.arg[1] = nr_calls;
+    if ( (ret = do_xen_hypercall(xch, &hypercall)) < 0 )
+    {
+        if ( errno == EACCES )
+            DPRINTF("multicall operation failed -- need to"
+                    " rebuild the user-space tool set?\n");
+    }
+
+    return ret;
+}
+
 int do_memory_op(xc_interface *xch, int cmd, void *arg, size_t len);
 
 void *xc_map_foreign_ranges(xc_interface *xch, uint32_t dom,
diff --git a/tools/libxc/xc_resource.c b/tools/libxc/xc_resource.c
new file mode 100644
index 0000000..c92910b
--- /dev/null
+++ b/tools/libxc/xc_resource.c
@@ -0,0 +1,143 @@
+/*
+ * xc_resource.c
+ *
+ * Generic resource access API
+ *
+ * Copyright (C) 2014      Intel Corporation
+ * Author Dongxiao Xu <dongxiao.xu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "xc_private.h"
+
+static int xc_resource_op_one(xc_interface *xch, xc_resource_op_t *op)
+{
+    int rc;
+    DECLARE_PLATFORM_OP;
+    DECLARE_NAMED_HYPERCALL_BOUNCE(data, op->entries,
+                                op->nr_entries * sizeof(*op->entries),
+                                XC_HYPERCALL_BUFFER_BOUNCE_BOTH);
+
+    if ( xc_hypercall_bounce_pre(xch, data) )
+        return -1;
+
+    platform_op.cmd = XENPF_resource_op;
+    platform_op.u.resource_op.nr = op->nr_entries;
+    platform_op.u.resource_op.cpu = op->cpu;
+    set_xen_guest_handle(platform_op.u.resource_op.data, data);
+
+    rc = do_platform_op(xch, &platform_op);
+
+    xc_hypercall_bounce_post(xch, data);
+
+    return rc;
+}
+
+static int xc_resource_op_multi(xc_interface *xch, uint32_t nr_ops, xc_resource_op_t *ops)
+{
+    int rc, i, entries_size;
+    xc_resource_op_t *op;
+    multicall_entry_t *call;
+    DECLARE_HYPERCALL_BUFFER(multicall_entry_t, call_list);
+    xc_hypercall_buffer_array_t *platform_ops, *entries_list = NULL;
+
+    call_list = xc_hypercall_buffer_alloc(xch, call_list,
+                                          sizeof(*call_list) * nr_ops);
+    if ( !call_list )
+        return -1;
+
+    platform_ops = xc_hypercall_buffer_array_create(xch, nr_ops);
+    if ( !platform_ops ) {
+        rc = -1;
+        goto out;
+    }
+
+    entries_list = xc_hypercall_buffer_array_create(xch, nr_ops);
+    if ( !entries_list ) {
+        rc = -1;
+        goto out;
+    }
+
+    for ( i = 0; i < nr_ops; i++ ) {
+        DECLARE_HYPERCALL_BUFFER(xen_platform_op_t, platform_op);
+        DECLARE_HYPERCALL_BUFFER(xc_resource_data_t, entries);
+
+        op = ops + i;
+
+        platform_op = xc_hypercall_buffer_array_alloc(xch, platform_ops, i,
+                        platform_op, sizeof(xen_platform_op_t));
+        if ( !platform_op ) {
+            rc = -1;
+            goto out;
+        }
+
+        entries_size = sizeof(xc_resource_data_t) * op->nr_entries;
+        entries = xc_hypercall_buffer_array_alloc(xch, entries_list, i,
+                   entries, entries_size);
+        if ( !entries) {
+            rc = -1;
+            goto out;
+        }
+        memcpy(entries, op->entries, entries_size);
+
+        call = call_list + i;
+        call->op = __HYPERVISOR_platform_op;
+        call->args[0] = HYPERCALL_BUFFER_AS_ARG(platform_op);
+
+        platform_op->interface_version = XENPF_INTERFACE_VERSION;
+        platform_op->cmd = XENPF_resource_op;
+        platform_op->u.resource_op.cpu = op->cpu;
+        platform_op->u.resource_op.nr = op->nr_entries;
+        set_xen_guest_handle(platform_op->u.resource_op.data, entries);
+    }
+
+    rc = do_multicall_op(xch, HYPERCALL_BUFFER(call_list), nr_ops);
+
+    for ( i = 0; i < nr_ops; i++ ) {
+        DECLARE_HYPERCALL_BUFFER(xc_resource_data_t, entries);
+        op = ops + i;
+
+        call = call_list + i;
+        op->result = call->result;
+
+        entries_size = sizeof(xc_resource_data_t) * op->nr_entries;
+        entries = xc_hypercall_buffer_array_get(xch, entries_list, i,
+                   entries, entries_size);
+        memcpy(op->entries, entries, entries_size);
+    }
+
+out:
+    xc_hypercall_buffer_array_destroy(xch, entries_list);
+    xc_hypercall_buffer_array_destroy(xch, platform_ops);
+    xc_hypercall_buffer_free(xch, call_list);
+    return rc;
+}
+
+int xc_resource_op(xc_interface *xch, uint32_t nr_ops, xc_resource_op_t *ops)
+{
+    if ( nr_ops == 1 )
+        return xc_resource_op_one(xch, ops);
+    else if ( nr_ops > 1 )
+        return xc_resource_op_multi(xch, nr_ops, ops);
+    else
+        return -1;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
index 514b241..fb58c44 100644
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -47,6 +47,7 @@
 #include <xen/xsm/flask_op.h>
 #include <xen/tmem.h>
 #include <xen/kexec.h>
+#include <xen/platform.h>
 
 #include "xentoollog.h"
 
@@ -2655,6 +2656,18 @@ int xc_kexec_load(xc_interface *xch, uint8_t type, uint16_t arch,
  */
 int xc_kexec_unload(xc_interface *xch, int type);
 
+typedef xenpf_resource_data_t xc_resource_data_t;
+
+struct xc_resource_op {
+    uint64_t result;
+    uint32_t cpu;
+    uint32_t nr_entries;
+    xc_resource_data_t *entries;
+};
+
+typedef struct xc_resource_op xc_resource_op_t;
+int xc_resource_op(xc_interface *xch, uint32_t nr_ops, xc_resource_op_t *ops);
+
 #endif /* XENCTRL_H */
 
 /*
-- 
1.7.9.5


* [PATCH v16 04/10] x86: detect and initialize Cache Monitoring Technology feature
  2014-09-25 10:19 [PATCH v16 00/10] enable Cache Monitoring Technology (CMT) feature Chao Peng
                   ` (2 preceding siblings ...)
  2014-09-25 10:19 ` [PATCH v16 03/10] tools: provide interface for generic resource access Chao Peng
@ 2014-09-25 10:19 ` Chao Peng
  2014-09-25 20:33   ` Konrad Rzeszutek Wilk
  2014-09-26 15:45   ` Jan Beulich
  2014-09-25 10:19 ` [PATCH v16 05/10] x86: dynamically attach/detach CMT service for a guest Chao Peng
                   ` (5 subsequent siblings)
  9 siblings, 2 replies; 34+ messages in thread
From: Chao Peng @ 2014-09-25 10:19 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, JBeulich, dgdegra

Detect the Cache Monitoring Technology (CMT) feature and enumerate its
resource types, one of which monitors L3 cache occupancy.

Also introduce a Xen command line parameter to control Platform Shared
Resource (PSR) services such as CMT.

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
---
 docs/misc/xen-command-line.markdown |   12 ++++
 xen/arch/x86/Makefile               |    1 +
 xen/arch/x86/psr.c                  |  107 +++++++++++++++++++++++++++++++++++
 xen/arch/x86/setup.c                |    3 +
 xen/include/asm-x86/cpufeature.h    |    1 +
 xen/include/asm-x86/psr.h           |   53 +++++++++++++++++
 6 files changed, 177 insertions(+)
 create mode 100644 xen/arch/x86/psr.c
 create mode 100644 xen/include/asm-x86/psr.h
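
For readers who want to experiment with the `psr=` syntax documented
below outside the hypervisor, here is a minimal user-space sketch of the
same comma/colon splitting. It is an illustration only, not the code in
this patch: struct psr_opts, sketch_parse_bool() and sketch_parse_psr()
are hypothetical names, and sketch_parse_bool() only approximates Xen's
parse_bool().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parsed options; not a Xen structure. */
struct psr_opts {
    bool cmt;               /* cmt:<boolean>, default off */
    unsigned int rmid_max;  /* rmid_max:<integer>, default 255 */
};

/* Simplified stand-in for parse_bool(): 1 = true, 0 = false, -1 = unknown. */
static int sketch_parse_bool(const char *s)
{
    if ( !strcmp(s, "1") || !strcmp(s, "on") || !strcmp(s, "yes") ||
         !strcmp(s, "true") || !strcmp(s, "enable") )
        return 1;
    if ( !strcmp(s, "0") || !strcmp(s, "off") || !strcmp(s, "no") ||
         !strcmp(s, "false") || !strcmp(s, "disable") )
        return 0;
    return -1;
}

/* Parse e.g. "cmt:1,rmid_max:63". The buffer is modified in place. */
static struct psr_opts sketch_parse_psr(char *s)
{
    struct psr_opts o = { .cmt = false, .rmid_max = 255 };
    char *next, *val;

    assert(s != NULL);

    while ( s )
    {
        /* Cut off the current "key[:value]" item at the next comma. */
        next = strchr(s, ',');
        if ( next )
            *next++ = '\0';

        /* Split the item into key and optional value. */
        val = strchr(s, ':');
        if ( val )
            *val++ = '\0';

        if ( !strcmp(s, "cmt") && (!val || sketch_parse_bool(val) == 1) )
            o.cmt = true;
        else if ( val && !strcmp(s, "rmid_max") )
            o.rmid_max = (unsigned int)strtoul(val, NULL, 0);

        s = next;
    }

    return o;
}
```

Passing a writable copy of "cmt:1,rmid_max:63" to sketch_parse_psr()
yields cmt enabled with rmid_max 63, and a bare "cmt" enables CMT while
leaving rmid_max at its default of 255.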

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index af93e17..b106a46 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1005,6 +1005,18 @@ This option can be specified more than once (up to 8 times at present).
 ### ple\_window
 > `= <integer>`
 
+### psr (Intel)
+> `= List of ( cmt:<boolean> | rmid_max:<integer> )`
+
+> Default: `psr=cmt:0,rmid_max:255`
+
+Configure Platform Shared Resource (PSR) services, which are available on the
+Intel Haswell Server family and later platforms.
+
+`cmt` instructs Xen to enable/disable Cache Monitoring Technology (CMT).
+
+`rmid_max` sets the maximum RMID (Resource Monitoring ID) value Xen will use.
+
 ### reboot
 > `= t[riple] | k[bd] | a[cpi] | p[ci] | n[o] [, [w]arm | [c]old]`
 
diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index c1e244d..cf137fd 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -59,6 +59,7 @@ obj-y += crash.o
 obj-y += tboot.o
 obj-y += hpet.o
 obj-y += xstate.o
+obj-y += psr.o
 
 obj-$(crash_debug) += gdbstub.o
 
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
new file mode 100644
index 0000000..9025aeb
--- /dev/null
+++ b/xen/arch/x86/psr.c
@@ -0,0 +1,107 @@
+/*
+ * psr.c: Platform Shared Resource related service for guest.
+ *
+ * Copyright (c) 2014, Intel Corporation
+ * Author: Dongxiao Xu <dongxiao.xu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+#include <xen/init.h>
+#include <xen/cpu.h>
+#include <asm/psr.h>
+
+#define PSR_CMT        (1<<0)
+
+struct psr_cmt *__read_mostly  psr_cmt = NULL;
+static bool_t __initdata opt_psr = 0;
+static unsigned int __initdata opt_rmid_max = 255;
+
+static void __init parse_psr_param(char *s)
+{
+    char *ss, *val_str;
+
+    do {
+        ss = strchr(s, ',');
+        if ( ss )
+            *ss = '\0';
+
+        val_str = strchr(s, ':');
+        if ( val_str )
+            *val_str++ = '\0';
+
+        if ( !strcmp(s, "cmt")
+             && ( !val_str || parse_bool(val_str) == 1 )) {
+            opt_psr |= PSR_CMT;
+        } else if ( val_str && !strcmp(s, "rmid_max") )
+            opt_rmid_max = simple_strtoul(val_str, NULL, 0);
+
+        s = ss + 1;
+    } while ( ss );
+}
+custom_param("psr", parse_psr_param);
+
+static void __init init_psr_cmt(unsigned int rmid_max)
+{
+    unsigned int eax, ebx, ecx, edx;
+    unsigned int rmid;
+
+    if ( !boot_cpu_has(X86_FEATURE_CMT) )
+        return;
+
+    cpuid_count(0xf, 0, &eax, &ebx, &ecx, &edx);
+    if ( !edx )
+        return;
+
+    psr_cmt = xzalloc(struct psr_cmt);
+    if ( !psr_cmt )
+        return;
+
+    psr_cmt->features = edx;
+    psr_cmt->rmid_mask = ~(~0ull << get_count_order(ebx));
+    psr_cmt->rmid_max = min(rmid_max, ebx);
+
+    if ( psr_cmt->features & PSR_RESOURCE_TYPE_L3 )
+    {
+        cpuid_count(0xf, 1, &eax, &ebx, &ecx, &edx);
+        psr_cmt->l3.upscaling_factor = ebx;
+        psr_cmt->l3.rmid_max = ecx;
+        psr_cmt->l3.features = edx;
+    }
+
+    psr_cmt->rmid_max = min(rmid_max, psr_cmt->l3.rmid_max);
+    psr_cmt->rmid_to_dom = xmalloc_array(domid_t, psr_cmt->rmid_max + 1);
+    if ( !psr_cmt->rmid_to_dom )
+    {
+        xfree(psr_cmt);
+        return;
+    }
+    /* Reserve RMID 0 for all domains not being monitored */
+    psr_cmt->rmid_to_dom[0] = DOMID_XEN;
+    for ( rmid = 1; rmid <= psr_cmt->rmid_max; rmid++ )
+        psr_cmt->rmid_to_dom[rmid] = DOMID_INVALID;
+
+    printk(XENLOG_INFO "Cache Monitoring Technology Enabled.\n");
+}
+
+void __init init_psr(void)
+{
+    if ( opt_psr & PSR_CMT && opt_rmid_max )
+        init_psr_cmt(opt_rmid_max);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 8c8b91f..ca4785e 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -49,6 +49,7 @@
 #include <xen/cpu.h>
 #include <asm/nmi.h>
 #include <asm/alternative.h>
+#include <asm/psr.h>
 
 /* opt_nosmp: If true, secondary processors are ignored. */
 static bool_t __initdata opt_nosmp;
@@ -1430,6 +1431,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
     domain_unpause_by_systemcontroller(dom0);
 
+    init_psr();
+
     reset_stack_and_jump(init_done);
 }
 
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 8014241..137d75c 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -148,6 +148,7 @@
 #define X86_FEATURE_ERMS	(7*32+ 9) /* Enhanced REP MOVSB/STOSB */
 #define X86_FEATURE_INVPCID	(7*32+10) /* Invalidate Process Context ID */
 #define X86_FEATURE_RTM 	(7*32+11) /* Restricted Transactional Memory */
+#define X86_FEATURE_CMT 	(7*32+12) /* Cache Monitoring Technology */
 #define X86_FEATURE_NO_FPU_SEL 	(7*32+13) /* FPU CS/DS stored as zero */
 #define X86_FEATURE_MPX		(7*32+14) /* Memory Protection Extensions */
 #define X86_FEATURE_RDSEED	(7*32+18) /* RDSEED instruction */
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
new file mode 100644
index 0000000..e321890
--- /dev/null
+++ b/xen/include/asm-x86/psr.h
@@ -0,0 +1,53 @@
+/*
+ * psr.h: Platform Shared Resource related service for guest.
+ *
+ * Copyright (c) 2014, Intel Corporation
+ * Author: Dongxiao Xu <dongxiao.xu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+#ifndef __ASM_PSR_H__
+#define __ASM_PSR_H__
+
+/* Resource Type Enumeration */
+#define PSR_RESOURCE_TYPE_L3            0x2
+
+/* L3 Monitoring Features */
+#define PSR_CMT_L3_OCCUPANCY           0x1
+
+struct psr_cmt_l3 {
+    unsigned int features;
+    unsigned int upscaling_factor;
+    unsigned int rmid_max;
+};
+
+struct psr_cmt {
+    unsigned long rmid_mask;
+    unsigned int rmid_max;
+    unsigned int features;
+    domid_t *rmid_to_dom;
+    struct psr_cmt_l3 l3;
+};
+
+extern struct psr_cmt *psr_cmt;
+
+void init_psr(void);
+
+#endif /* __ASM_PSR_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.7.9.5


* [PATCH v16 05/10] x86: dynamically attach/detach CMT service for a guest
  2014-09-25 10:19 [PATCH v16 00/10] enable Cache Monitoring Technology (CMT) feature Chao Peng
                   ` (3 preceding siblings ...)
  2014-09-25 10:19 ` [PATCH v16 04/10] x86: detect and initialize Cache Monitoring Technology feature Chao Peng
@ 2014-09-25 10:19 ` Chao Peng
  2014-09-25 20:41   ` Konrad Rzeszutek Wilk
  2014-09-25 10:19 ` [PATCH v16 06/10] x86: collect global CMT information Chao Peng
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 34+ messages in thread
From: Chao Peng @ 2014-09-25 10:19 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, JBeulich, dgdegra

Add hypervisor-side support for dynamically attaching and detaching the
Cache Monitoring Technology (CMT) service for a given guest.

When the CMT service is attached to a guest, the system allocates an
RMID for it. When the service is detached or the guest is shut down,
the RMID is recycled for future use.

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/domain.c        |    3 +++
 xen/arch/x86/domctl.c        |   29 ++++++++++++++++++++++++++
 xen/arch/x86/psr.c           |   46 ++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/domain.h |    2 ++
 xen/include/asm-x86/psr.h    |    9 +++++++++
 xen/include/public/domctl.h  |   12 +++++++++++
 6 files changed, 101 insertions(+)
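
The RMID lifecycle described above (first free slot on attach, slot
returned on detach, RMID 0 never handed out) can be tried in isolation
with the sketch below. This is an illustration under assumptions, not
the hypervisor code: the table size, the SKETCH_* constants and the
sketch_rmid_*() helpers are hypothetical, and no locking is modelled.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_RMID_MAX   3        /* tiny table for illustration only */
#define SKETCH_DOM_NONE   0xffffu  /* stand-in for DOMID_INVALID */
#define SKETCH_DOM_SYSTEM 0xfffeu  /* stand-in for the owner of RMID 0 */

static uint16_t rmid_to_dom[SKETCH_RMID_MAX + 1];

/* Mark RMID 0 as reserved and every other RMID as free. */
static void sketch_rmid_init(void)
{
    unsigned int rmid;

    rmid_to_dom[0] = SKETCH_DOM_SYSTEM;
    for ( rmid = 1; rmid <= SKETCH_RMID_MAX; rmid++ )
        rmid_to_dom[rmid] = SKETCH_DOM_NONE;
}

/* Return the first free RMID (>= 1), now owned by domid, or 0 if none. */
static unsigned int sketch_rmid_alloc(uint16_t domid)
{
    unsigned int rmid;

    assert(domid != SKETCH_DOM_NONE);

    for ( rmid = 1; rmid <= SKETCH_RMID_MAX; rmid++ )
    {
        if ( rmid_to_dom[rmid] == SKETCH_DOM_NONE )
        {
            rmid_to_dom[rmid] = domid;
            return rmid;
        }
    }

    return 0;
}

/* Release an RMID; releasing the reserved RMID 0 is a no-op. */
static void sketch_rmid_free(unsigned int rmid)
{
    assert(rmid <= SKETCH_RMID_MAX);

    if ( rmid != 0 )
        rmid_to_dom[rmid] = SKETCH_DOM_NONE;
}
```

A linear scan keeps the sketch short, and with at most a few hundred
RMIDs and attach being a rare operation it is unlikely to matter in
practice.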

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 7b1dfe6..3cfd8f4 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -60,6 +60,7 @@
 #include <xen/numa.h>
 #include <xen/iommu.h>
 #include <compat/vcpu.h>
+#include <asm/psr.h>
 
 DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
 DEFINE_PER_CPU(unsigned long, cr4);
@@ -647,6 +648,8 @@ void arch_domain_destroy(struct domain *d)
 
     free_xenheap_page(d->shared_info);
     cleanup_domain_irq_mapping(d);
+
+    psr_free_rmid(d);
 }
 
 unsigned long pv_guest_cr4_fixup(const struct vcpu *v, unsigned long guest_cr4)
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 7a5de43..6ed480a 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -35,6 +35,7 @@
 #include <asm/mem_sharing.h>
 #include <asm/xstate.h>
 #include <asm/debugger.h>
+#include <asm/psr.h>
 
 static int gdbsx_guest_mem_io(
     domid_t domid, struct xen_domctl_gdbsx_memio *iop)
@@ -1319,6 +1320,34 @@ long arch_do_domctl(
     }
     break;
 
+    case XEN_DOMCTL_psr_cmt_op:
+        if ( !psr_cmt_enabled() )
+        {
+            ret = -ENODEV;
+            break;
+        }
+
+        switch ( domctl->u.psr_cmt_op.cmd )
+        {
+        case XEN_DOMCTL_PSR_CMT_OP_ATTACH:
+            ret = psr_alloc_rmid(d);
+            break;
+        case XEN_DOMCTL_PSR_CMT_OP_DETACH:
+            if ( d->arch.psr_rmid > 0 )
+                psr_free_rmid(d);
+            else
+                ret = -ENOENT;
+            break;
+        case XEN_DOMCTL_PSR_CMT_OP_QUERY_RMID:
+            domctl->u.psr_cmt_op.data = d->arch.psr_rmid;
+            copyback = 1;
+            break;
+        default:
+            ret = -ENOSYS;
+            break;
+        }
+        break;
+
     default:
         ret = iommu_do_domctl(domctl, d, u_domctl);
         break;
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 9025aeb..41f7496 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -15,6 +15,7 @@
  */
 #include <xen/init.h>
 #include <xen/cpu.h>
+#include <xen/sched.h>
 #include <asm/psr.h>
 
 #define PSR_CMT        (1<<0)
@@ -96,6 +97,51 @@ void __init init_psr(void)
         init_psr_cmt(opt_rmid_max);
 }
 
+/* Called with domain lock held, no psr specific lock needed */
+int psr_alloc_rmid(struct domain *d)
+{
+    unsigned int rmid;
+
+    ASSERT(psr_cmt_enabled());
+
+    if ( d->arch.psr_rmid > 0 )
+        return -EEXIST;
+
+    for ( rmid = 1; rmid <= psr_cmt->rmid_max; rmid++ )
+    {
+        if ( psr_cmt->rmid_to_dom[rmid] != DOMID_INVALID)
+            continue;
+
+        psr_cmt->rmid_to_dom[rmid] = d->domain_id;
+        break;
+    }
+
+    /* No RMID available, assign RMID=0 by default */
+    if ( rmid > psr_cmt->rmid_max )
+    {
+        d->arch.psr_rmid = 0;
+        return -EUSERS;
+    }
+
+    d->arch.psr_rmid = rmid;
+
+    return 0;
+}
+
+/* Called with domain lock held, no psr specific lock needed */
+void psr_free_rmid(struct domain *d)
+{
+    unsigned int rmid;
+
+    rmid = d->arch.psr_rmid;
+    /* We do not free system reserved "RMID=0" */
+    if ( rmid == 0 )
+        return;
+
+    psr_cmt->rmid_to_dom[rmid] = DOMID_INVALID;
+    d->arch.psr_rmid = 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 7abe1b3..2be1d1e 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -314,6 +314,8 @@ struct arch_domain
     /* Shared page for notifying that explicit PIRQ EOI is required. */
     unsigned long *pirq_eoi_map;
     unsigned long pirq_eoi_map_mfn;
+
+    unsigned int psr_rmid; /* RMID assigned to the domain for CMT */
 } __cacheline_aligned;
 
 #define has_arch_pdevs(d)    (!list_empty(&(d)->arch.pdev_list))
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
index e321890..930b22b 100644
--- a/xen/include/asm-x86/psr.h
+++ b/xen/include/asm-x86/psr.h
@@ -16,6 +16,8 @@
 #ifndef __ASM_PSR_H__
 #define __ASM_PSR_H__
 
+#include <xen/types.h>
+
 /* Resource Type Enumeration */
 #define PSR_RESOURCE_TYPE_L3            0x2
 
@@ -38,7 +40,14 @@ struct psr_cmt {
 
 extern struct psr_cmt *psr_cmt;
 
+static inline bool_t psr_cmt_enabled(void)
+{
+    return !!psr_cmt;
+}
+
 void init_psr(void);
+int psr_alloc_rmid(struct domain *d);
+void psr_free_rmid(struct domain *d);
 
 #endif /* __ASM_PSR_H__ */
 
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index cfa39b3..59220ed 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -965,6 +965,16 @@ struct xen_domctl_vnuma {
 typedef struct xen_domctl_vnuma xen_domctl_vnuma_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_vnuma_t);
 
+struct xen_domctl_psr_cmt_op {
+#define XEN_DOMCTL_PSR_CMT_OP_DETACH         0
+#define XEN_DOMCTL_PSR_CMT_OP_ATTACH         1
+#define XEN_DOMCTL_PSR_CMT_OP_QUERY_RMID     2
+    uint32_t cmd;
+    uint32_t data;
+};
+typedef struct xen_domctl_psr_cmt_op xen_domctl_psr_cmt_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cmt_op_t);
+
 struct xen_domctl {
     uint32_t cmd;
 #define XEN_DOMCTL_createdomain                   1
@@ -1038,6 +1048,7 @@ struct xen_domctl {
 #define XEN_DOMCTL_get_vcpu_msrs                 72
 #define XEN_DOMCTL_set_vcpu_msrs                 73
 #define XEN_DOMCTL_setvnumainfo                  74
+#define XEN_DOMCTL_psr_cmt_op                    75
 #define XEN_DOMCTL_gdbsx_guestmemio            1000
 #define XEN_DOMCTL_gdbsx_pausevcpu             1001
 #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
@@ -1099,6 +1110,7 @@ struct xen_domctl {
         struct xen_domctl_gdbsx_pauseunp_vcpu gdbsx_pauseunp_vcpu;
         struct xen_domctl_gdbsx_domstatus   gdbsx_domstatus;
         struct xen_domctl_vnuma             vnuma;
+        struct xen_domctl_psr_cmt_op        psr_cmt_op;
         uint8_t                             pad[128];
     } u;
 };
-- 
1.7.9.5


* [PATCH v16 06/10] x86: collect global CMT information
  2014-09-25 10:19 [PATCH v16 00/10] enable Cache Monitoring Technology (CMT) feature Chao Peng
                   ` (4 preceding siblings ...)
  2014-09-25 10:19 ` [PATCH v16 05/10] x86: dynamically attach/detach CMT service for a guest Chao Peng
@ 2014-09-25 10:19 ` Chao Peng
  2014-09-25 20:53   ` Konrad Rzeszutek Wilk
  2014-09-25 10:19 ` [PATCH v16 07/10] x86: enable CMT for each domain RMID Chao Peng
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 34+ messages in thread
From: Chao Peng @ 2014-09-25 10:19 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, JBeulich, dgdegra

This implementation tries to keep all policy in user space, so some
global CMT information needs to be exposed to it, such as the total RMID
count and the L3 upscaling factor.

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/cpu/intel_cacheinfo.c |   49 ++----------------------------------
 xen/arch/x86/sysctl.c              |   43 +++++++++++++++++++++++++++++++
 xen/include/asm-x86/cpufeature.h   |   45 +++++++++++++++++++++++++++++++++
 xen/include/public/sysctl.h        |   14 +++++++++++
 4 files changed, 104 insertions(+), 47 deletions(-)
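
To show how user space might combine the values exposed here, the
sketch below computes a cache size from raw CPUID leaf 4 fields and
scales a raw occupancy counter by the upscaling factor. It assumes the
usual "value minus one" encoding of the leaf 4 fields and that the
upscaling factor converts a raw counter value to bytes. All sketch_*()
names are hypothetical and the numbers in the note are made up.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Cache size in bytes from the raw CPUID leaf 4 fields. Each field is
 * encoded as "value minus one", hence the +1 on every term.
 */
static uint64_t sketch_cache_size_bytes(uint32_t ways, uint32_t partitions,
                                        uint32_t line_size, uint32_t sets)
{
    return (uint64_t)(ways + 1) * (partitions + 1) *
           (line_size + 1) * ((uint64_t)sets + 1);
}

/* L3 occupancy in bytes: raw counter value times the upscaling factor. */
static uint64_t sketch_l3_occupancy_bytes(uint64_t raw,
                                          uint32_t upscaling_factor)
{
    return raw * upscaling_factor;
}

/* Occupancy as a percentage of an L3 size given in KB. */
static unsigned int sketch_l3_occupancy_pct(uint64_t bytes,
                                            uint64_t l3_size_kb)
{
    assert(l3_size_kb != 0);

    return (unsigned int)(bytes * 100 / (l3_size_kb * 1024));
}
```

For example, raw fields (19, 0, 63, 16383) describe a 20-way cache with
64-byte lines and 16384 sets, i.e. 20971520 bytes or 20480 KB. A raw
counter of 160 with an upscaling factor of 65536 is then 10485760
bytes, half of that cache.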

diff --git a/xen/arch/x86/cpu/intel_cacheinfo.c b/xen/arch/x86/cpu/intel_cacheinfo.c
index 430f939..48970c0 100644
--- a/xen/arch/x86/cpu/intel_cacheinfo.c
+++ b/xen/arch/x86/cpu/intel_cacheinfo.c
@@ -81,54 +81,9 @@ static struct _cache_table cache_table[] __cpuinitdata =
 	{ 0x00, 0, 0}
 };
 
-
-enum _cache_type
-{
-	CACHE_TYPE_NULL	= 0,
-	CACHE_TYPE_DATA = 1,
-	CACHE_TYPE_INST = 2,
-	CACHE_TYPE_UNIFIED = 3
-};
-
-union _cpuid4_leaf_eax {
-	struct {
-		enum _cache_type	type:5;
-		unsigned int		level:3;
-		unsigned int		is_self_initializing:1;
-		unsigned int		is_fully_associative:1;
-		unsigned int		reserved:4;
-		unsigned int		num_threads_sharing:12;
-		unsigned int		num_cores_on_die:6;
-	} split;
-	u32 full;
-};
-
-union _cpuid4_leaf_ebx {
-	struct {
-		unsigned int		coherency_line_size:12;
-		unsigned int		physical_line_partition:10;
-		unsigned int		ways_of_associativity:10;
-	} split;
-	u32 full;
-};
-
-union _cpuid4_leaf_ecx {
-	struct {
-		unsigned int		number_of_sets:32;
-	} split;
-	u32 full;
-};
-
-struct _cpuid4_info {
-	union _cpuid4_leaf_eax eax;
-	union _cpuid4_leaf_ebx ebx;
-	union _cpuid4_leaf_ecx ecx;
-	unsigned long size;
-};
-
 unsigned short			num_cache_leaves;
 
-static int __cpuinit cpuid4_cache_lookup(int index, struct _cpuid4_info *this_leaf)
+int cpuid4_cache_lookup(int index, struct cpuid4_info *this_leaf)
 {
 	union _cpuid4_leaf_eax 	eax;
 	union _cpuid4_leaf_ebx 	ebx;
@@ -185,7 +140,7 @@ unsigned int __cpuinit init_intel_cacheinfo(struct cpuinfo_x86 *c)
 		 * parameters cpuid leaf to find the cache details
 		 */
 		for (i = 0; i < num_cache_leaves; i++) {
-			struct _cpuid4_info this_leaf;
+			struct cpuid4_info this_leaf;
 
 			int retval;
 
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index 15d4b91..b95408f 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -28,6 +28,7 @@
 #include <xen/nodemask.h>
 #include <xen/cpu.h>
 #include <xsm/xsm.h>
+#include <asm/psr.h>
 
 #define get_xen_guest_handle(val, hnd)  do { val = (hnd).p; } while (0)
 
@@ -101,6 +102,48 @@ long arch_do_sysctl(
     }
     break;
 
+    case XEN_SYSCTL_psr_cmt_op:
+        if ( !psr_cmt_enabled() )
+            return -ENODEV;
+
+        if ( sysctl->u.psr_cmt_op.flags != 0 )
+            return -EINVAL;
+
+        switch ( sysctl->u.psr_cmt_op.cmd )
+        {
+        case XEN_SYSCTL_PSR_CMT_enabled:
+            sysctl->u.psr_cmt_op.data =
+                (psr_cmt->features & PSR_RESOURCE_TYPE_L3) &&
+                (psr_cmt->l3.features & PSR_CMT_L3_OCCUPANCY);
+            break;
+        case XEN_SYSCTL_PSR_CMT_get_total_rmid:
+            sysctl->u.psr_cmt_op.data = psr_cmt->rmid_max;
+            break;
+        case XEN_SYSCTL_PSR_CMT_get_l3_upscaling_factor:
+            sysctl->u.psr_cmt_op.data = psr_cmt->l3.upscaling_factor;
+            break;
+        case XEN_SYSCTL_PSR_CMT_get_l3_cache_size:
+        {
+            struct cpuid4_info info;
+
+            ret = cpuid4_cache_lookup(3, &info);
+            if ( ret < 0 )
+                break;
+
+            sysctl->u.psr_cmt_op.data = info.size / 1024; /* in KB unit */
+        }
+        break;
+        default:
+            sysctl->u.psr_cmt_op.data = 0;
+            ret = -ENOSYS;
+            break;
+        }
+
+        if ( __copy_to_guest(u_sysctl, sysctl, 1) )
+            ret = -EFAULT;
+
+        break;
+
     default:
         ret = -ENOSYS;
         break;
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 137d75c..d3bd14d 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -215,6 +215,51 @@
 #define cpu_has_vmx		boot_cpu_has(X86_FEATURE_VMXE)
 
 #define cpu_has_cpuid_faulting	boot_cpu_has(X86_FEATURE_CPUID_FAULTING)
+
+enum _cache_type {
+    CACHE_TYPE_NULL = 0,
+    CACHE_TYPE_DATA = 1,
+    CACHE_TYPE_INST = 2,
+    CACHE_TYPE_UNIFIED = 3
+};
+
+union _cpuid4_leaf_eax {
+    struct {
+        enum _cache_type type:5;
+        unsigned int level:3;
+        unsigned int is_self_initializing:1;
+        unsigned int is_fully_associative:1;
+        unsigned int reserved:4;
+        unsigned int num_threads_sharing:12;
+        unsigned int num_cores_on_die:6;
+    } split;
+    u32 full;
+};
+
+union _cpuid4_leaf_ebx {
+    struct {
+        unsigned int coherency_line_size:12;
+        unsigned int physical_line_partition:10;
+        unsigned int ways_of_associativity:10;
+    } split;
+    u32 full;
+};
+
+union _cpuid4_leaf_ecx {
+    struct {
+        unsigned int number_of_sets:32;
+    } split;
+    u32 full;
+};
+
+struct cpuid4_info {
+    union _cpuid4_leaf_eax eax;
+    union _cpuid4_leaf_ebx ebx;
+    union _cpuid4_leaf_ecx ecx;
+    unsigned long size;
+};
+
+int cpuid4_cache_lookup(int index, struct cpuid4_info *this_leaf);
 #endif
 
 #endif /* __ASM_I386_CPUFEATURE_H */
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 3588698..66b6e47 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -636,6 +636,18 @@ struct xen_sysctl_coverage_op {
 typedef struct xen_sysctl_coverage_op xen_sysctl_coverage_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_coverage_op_t);
 
+#define XEN_SYSCTL_PSR_CMT_get_total_rmid            0
+#define XEN_SYSCTL_PSR_CMT_get_l3_upscaling_factor   1
+/* The L3 cache size is returned in KB unit */
+#define XEN_SYSCTL_PSR_CMT_get_l3_cache_size         2
+#define XEN_SYSCTL_PSR_CMT_enabled                   3
+struct xen_sysctl_psr_cmt_op {
+    uint32_t cmd;
+    uint32_t flags;      /* padding variable, may be extended for future use */
+    uint64_t data;
+};
+typedef struct xen_sysctl_psr_cmt_op xen_sysctl_psr_cmt_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cmt_op_t);
 
 struct xen_sysctl {
     uint32_t cmd;
@@ -658,6 +670,7 @@ struct xen_sysctl {
 #define XEN_SYSCTL_cpupool_op                    18
 #define XEN_SYSCTL_scheduler_op                  19
 #define XEN_SYSCTL_coverage_op                   20
+#define XEN_SYSCTL_psr_cmt_op                    21
     uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
     union {
         struct xen_sysctl_readconsole       readconsole;
@@ -679,6 +692,7 @@ struct xen_sysctl {
         struct xen_sysctl_cpupool_op        cpupool_op;
         struct xen_sysctl_scheduler_op      scheduler_op;
         struct xen_sysctl_coverage_op       coverage_op;
+        struct xen_sysctl_psr_cmt_op        psr_cmt_op;
         uint8_t                             pad[128];
     } u;
 };
-- 
1.7.9.5


* [PATCH v16 07/10] x86: enable CMT for each domain RMID
  2014-09-25 10:19 [PATCH v16 00/10] enable Cache Monitoring Technology (CMT) feature Chao Peng
                   ` (5 preceding siblings ...)
  2014-09-25 10:19 ` [PATCH v16 06/10] x86: collect global CMT information Chao Peng
@ 2014-09-25 10:19 ` Chao Peng
  2014-09-25 21:23   ` Andrew Cooper
  2014-09-25 10:19 ` [PATCH v16 08/10] x86: add CMT related MSRs in allowed list Chao Peng
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 34+ messages in thread
From: Chao Peng @ 2014-09-25 10:19 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, JBeulich, dgdegra

If the CMT service is attached to a domain, the domain's RMID is
programmed into the hardware (MSR_IA32_PQR_ASSOC) whenever one of its
vcpus is scheduled in. When the vcpu is scheduled out, RMID 0 (system
reserved) is programmed instead.

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/domain.c           |    5 +++++
 xen/arch/x86/psr.c              |   27 +++++++++++++++++++++++++++
 xen/include/asm-x86/msr-index.h |    3 +++
 xen/include/asm-x86/psr.h       |    1 +
 4 files changed, 36 insertions(+)
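
The read-modify-write with a cached copy that psr_assoc_rmid() performs
can be exercised in user space with the sketch below. The MSR is
simulated by a plain variable, and fake_msr, fake_msr_writes and the
sketch_*() helpers are hypothetical. sketch_count_order() is assumed to
match get_count_order() for non-zero inputs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint64_t fake_msr;            /* stands in for MSR_IA32_PQR_ASSOC */
static unsigned int fake_msr_writes; /* counts simulated wrmsr calls */

/* Smallest order such that (1 << order) >= n. */
static unsigned int sketch_count_order(uint32_t n)
{
    unsigned int order = 0;

    while ( order < 32 && (1ull << order) < n )
        order++;

    return order;
}

/* Mask covering the RMID field, derived from the CPUID value as in psr.c. */
static uint64_t sketch_rmid_mask(uint32_t cpuid_ebx)
{
    return ~(~0ull << sketch_count_order(cpuid_ebx));
}

/* Replace only the RMID bits and skip the write if nothing changes. */
static void sketch_assoc_rmid(uint64_t *cached, uint64_t mask,
                              unsigned int rmid)
{
    uint64_t new_val;

    assert(cached != NULL);

    new_val = (*cached & ~mask) | (rmid & mask);
    if ( new_val != *cached )
    {
        fake_msr = new_val;
        fake_msr_writes++;
        *cached = new_val;
    }
}
```

Keeping the last written value per CPU means a context switch between
two vcpus that share an RMID costs no MSR write at all, which is the
point of the cached copy.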

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 3cfd8f4..04a6719 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1418,6 +1418,8 @@ static void __context_switch(void)
     {
         memcpy(&p->arch.user_regs, stack_regs, CTXT_SWITCH_STACK_BYTES);
         vcpu_save_fpu(p);
+        if ( psr_cmt_enabled() )
+            psr_assoc_rmid(0);
         p->arch.ctxt_switch_from(p);
     }
 
@@ -1442,6 +1444,9 @@ static void __context_switch(void)
         }
         vcpu_restore_fpu_eager(n);
         n->arch.ctxt_switch_to(n);
+
+        if ( psr_cmt_enabled() && n->domain->arch.psr_rmid > 0 )
+            psr_assoc_rmid(n->domain->arch.psr_rmid);
     }
 
     gdt = !is_pv_32on64_vcpu(n) ? per_cpu(gdt_table, cpu) :
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 41f7496..56163bd 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -20,9 +20,15 @@
 
 #define PSR_CMT        (1<<0)
 
+struct pqr_assoc {
+    uint64_t val;
+    bool_t initialized;
+};
+
 struct psr_cmt *__read_mostly  psr_cmt = NULL;
 static bool_t __initdata opt_psr = 0;
 static unsigned int __initdata opt_rmid_max = 255;
+static DEFINE_PER_CPU(struct pqr_assoc, pqr_assoc);
 
 static void __init parse_psr_param(char *s)
 {
@@ -142,6 +148,27 @@ void psr_free_rmid(struct domain *d)
     d->arch.psr_rmid = 0;
 }
 
+void psr_assoc_rmid(unsigned int rmid)
+{
+    uint64_t val;
+    uint64_t new_val;
+    struct pqr_assoc *pqr = &this_cpu(pqr_assoc);
+
+    if ( !pqr->initialized )
+    {
+        rdmsrl(MSR_IA32_PQR_ASSOC, pqr->val);
+        pqr->initialized = 1;
+    }
+    val = pqr->val;
+
+    new_val = (val & ~psr_cmt->rmid_mask) | (rmid & psr_cmt->rmid_mask);
+    if ( val != new_val )
+    {
+        wrmsrl(MSR_IA32_PQR_ASSOC, new_val);
+        pqr->val = new_val;
+    }
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 542222e..dcb2b87 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -323,6 +323,9 @@
 #define MSR_IA32_TSC_DEADLINE		0x000006E0
 #define MSR_IA32_ENERGY_PERF_BIAS	0x000001b0
 
+/* Platform Shared Resource MSRs */
+#define MSR_IA32_PQR_ASSOC		0x00000c8f
+
 /* Intel Model 6 */
 #define MSR_P6_PERFCTR(n)		(0x000000c1 + (n))
 #define MSR_P6_EVNTSEL(n)		(0x00000186 + (n))
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
index 930b22b..00b4625 100644
--- a/xen/include/asm-x86/psr.h
+++ b/xen/include/asm-x86/psr.h
@@ -48,6 +48,7 @@ static inline bool_t psr_cmt_enabled(void)
 void init_psr(void);
 int psr_alloc_rmid(struct domain *d);
 void psr_free_rmid(struct domain *d);
+void psr_assoc_rmid(unsigned int rmid);
 
 #endif /* __ASM_PSR_H__ */
 
-- 
1.7.9.5


* [PATCH v16 08/10] x86: add CMT related MSRs in allowed list
  2014-09-25 10:19 [PATCH v16 00/10] enable Cache Monitoring Technology (CMT) feature Chao Peng
                   ` (6 preceding siblings ...)
  2014-09-25 10:19 ` [PATCH v16 07/10] x86: enable CMT for each domain RMID Chao Peng
@ 2014-09-25 10:19 ` Chao Peng
  2014-09-25 20:58   ` Konrad Rzeszutek Wilk
  2014-09-25 10:19 ` [PATCH v16 09/10] xsm: add CMT related xsm policies Chao Peng
  2014-09-25 10:19 ` [PATCH v16 10/10] tools: CMDs and APIs for Cache Monitoring Technology Chao Peng
  9 siblings, 1 reply; 34+ messages in thread
From: Chao Peng @ 2014-09-25 10:19 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, JBeulich, dgdegra

The tool stack accesses two MSRs (MSR_IA32_QOSEVTSEL and MSR_IA32_QMC)
to perform CMT related operations, so add them to the allowed list.

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
---
 xen/arch/x86/platform_hypercall.c |    7 +++++++
 xen/include/asm-x86/msr-index.h   |    2 ++
 2 files changed, 9 insertions(+)

diff --git a/xen/arch/x86/platform_hypercall.c b/xen/arch/x86/platform_hypercall.c
index 081d9f5..be06f3a 100644
--- a/xen/arch/x86/platform_hypercall.c
+++ b/xen/arch/x86/platform_hypercall.c
@@ -69,6 +69,13 @@ struct xen_resource_access {
 
 static bool_t allow_access_msr(unsigned int msr)
 {
+    switch ( msr )
+    {
+    case MSR_IA32_QOSEVTSEL:
+    case MSR_IA32_QMC:
+        return 1;
+    }
+
     return 0;
 }
 
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index dcb2b87..ae089fb 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -324,6 +324,8 @@
 #define MSR_IA32_ENERGY_PERF_BIAS	0x000001b0
 
 /* Platform Shared Resource MSRs */
+#define MSR_IA32_QOSEVTSEL		0x00000c8d
+#define MSR_IA32_QMC			0x00000c8e
 #define MSR_IA32_PQR_ASSOC		0x00000c8f
 
 /* Intel Model 6 */
-- 
1.7.9.5


* [PATCH v16 09/10] xsm: add CMT related xsm policies
  2014-09-25 10:19 [PATCH v16 00/10] enable Cache Monitoring Technology (CMT) feature Chao Peng
                   ` (7 preceding siblings ...)
  2014-09-25 10:19 ` [PATCH v16 08/10] x86: add CMT related MSRs in allowed list Chao Peng
@ 2014-09-25 10:19 ` Chao Peng
  2014-09-25 10:19 ` [PATCH v16 10/10] tools: CMDs and APIs for Cache Monitoring Technology Chao Peng
  9 siblings, 0 replies; 34+ messages in thread
From: Chao Peng @ 2014-09-25 10:19 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, JBeulich, dgdegra

Add xsm policies for CMT related hypercalls.

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
 tools/flask/policy/policy/modules/xen/xen.if |    2 +-
 tools/flask/policy/policy/modules/xen/xen.te |    3 ++-
 xen/xsm/flask/hooks.c                        |    6 ++++++
 xen/xsm/flask/policy/access_vectors          |    4 ++++
 4 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/tools/flask/policy/policy/modules/xen/xen.if b/tools/flask/policy/policy/modules/xen/xen.if
index 32b51b6..641c797 100644
--- a/tools/flask/policy/policy/modules/xen/xen.if
+++ b/tools/flask/policy/policy/modules/xen/xen.if
@@ -49,7 +49,7 @@ define(`create_domain_common', `
 			getdomaininfo hypercall setvcpucontext setextvcpucontext
 			getscheduler getvcpuinfo getvcpuextstate getaddrsize
 			getaffinity setaffinity };
-	allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim set_max_evtchn set_vnumainfo get_vnumainfo };
+	allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim set_max_evtchn set_vnumainfo get_vnumainfo psr_cmt_op };
 	allow $1 $2:security check_context;
 	allow $1 $2:shadow enable;
 	allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op };
diff --git a/tools/flask/policy/policy/modules/xen/xen.te b/tools/flask/policy/policy/modules/xen/xen.te
index 6cecf97..d214470 100644
--- a/tools/flask/policy/policy/modules/xen/xen.te
+++ b/tools/flask/policy/policy/modules/xen/xen.te
@@ -66,6 +66,7 @@ allow dom0_t xen_t:xen {
 };
 allow dom0_t xen_t:xen2 {
     resource_op
+    psr_cmt_op
 };
 allow dom0_t xen_t:mmu memorymap;
 
@@ -79,7 +80,7 @@ allow dom0_t dom0_t:domain {
 	getpodtarget setpodtarget set_misc_info set_virq_handler
 };
 allow dom0_t dom0_t:domain2 {
-	set_cpuid gettsc settsc setscheduler set_max_evtchn set_vnumainfo get_vnumainfo
+	set_cpuid gettsc settsc setscheduler set_max_evtchn set_vnumainfo get_vnumainfo psr_cmt_op
 };
 allow dom0_t dom0_t:resource { add remove };
 
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 9f36503..1bb3c00 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -722,6 +722,8 @@ static int flask_domctl(struct domain *d, int cmd)
 
     case XEN_DOMCTL_setvnumainfo:
         return current_has_perm(d, SECCLASS_DOMAIN, DOMAIN2__SET_VNUMAINFO);
+    case XEN_DOMCTL_psr_cmt_op:
+        return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__PSR_CMT_OP);
 
     default:
         printk("flask_domctl: Unknown op %d\n", cmd);
@@ -778,6 +780,10 @@ static int flask_sysctl(int cmd)
     case XEN_SYSCTL_numainfo:
         return domain_has_xen(current->domain, XEN__PHYSINFO);
 
+    case XEN_SYSCTL_psr_cmt_op:
+        return avc_current_has_perm(SECINITSID_XEN, SECCLASS_XEN2,
+                                    XEN2__PSR_CMT_OP, NULL);
+
     default:
         printk("flask_sysctl: Unknown op %d\n", cmd);
         return -EPERM;
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index daf0de5..de0c707 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -81,6 +81,8 @@ class xen2
 {
 # XENPF_resource_op
     resource_op
+# XEN_SYSCTL_psr_cmt_op
+    psr_cmt_op
 }
 
 # Classes domain and domain2 consist of operations that a domain performs on
@@ -212,6 +214,8 @@ class domain2
     set_vnumainfo
 # XENMEM_getvnumainfo
     get_vnumainfo
+# XEN_DOMCTL_psr_cmt_op
+    psr_cmt_op
 }
 
 # Similar to class domain, but primarily contains domctls related to HVM domains
-- 
1.7.9.5

* [PATCH v16 10/10] tools: CMDs and APIs for Cache Monitoring Technology
  2014-09-25 10:19 [PATCH v16 00/10] enable Cache Monitoring Technology (CMT) feature Chao Peng
                   ` (8 preceding siblings ...)
  2014-09-25 10:19 ` [PATCH v16 09/10] xsm: add CMT related xsm policies Chao Peng
@ 2014-09-25 10:19 ` Chao Peng
  2014-09-25 21:14   ` Konrad Rzeszutek Wilk
  9 siblings, 1 reply; 34+ messages in thread
From: Chao Peng @ 2014-09-25 10:19 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, JBeulich, dgdegra

Introduce new xl commands to enable/disable the Cache Monitoring
Technology (CMT) feature.

The following two commands attach/detach the CMT feature
to/from a given domain.

$ xl psr-cmt-attach domid
$ xl psr-cmt-detach domid

The following command displays CMT information, such as L3 cache
occupancy.

$ xl psr-cmt-show cache_occupancy <domid>
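
For readers who want the arithmetic behind the occupancy figure, here is
a minimal stand-alone sketch of what the tools side in this patch does:
build the event-select value from an RMID and event ID, reject a counter
reading with either of the two top (error) bits set, and scale the raw
count to KB. The helper names cmt_evtsel() and cmt_counter_to_kb() are
made up for illustration and are not part of libxc or libxl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EVTID_L3_OCCUPANCY      0x1
#define IA32_QM_CTR_ERROR_MASK  (0x3ull << 62)

/* Event-select value: RMID from bit 32 upwards, event ID in the low bits. */
static uint64_t cmt_evtsel(uint32_t rmid, uint32_t evtid)
{
    return ((uint64_t)rmid << 32) | evtid;
}

/*
 * Convert a raw counter reading to KB.  Returns 0 on success, or -1 if
 * either of the two top (error) bits is set in the reading.
 */
static int cmt_counter_to_kb(uint64_t counter, uint32_t upscaling_factor,
                             uint32_t *kb)
{
    assert(kb != NULL);

    if (counter & IA32_QM_CTR_ERROR_MASK)
        return -1;

    /* Scale to bytes with the upscaling factor, then to KB; 64-bit math. */
    *kb = (uint32_t)((uint64_t)upscaling_factor * counter / 1024);
    return 0;
}
```

In the patch itself the same steps are spread across xc_psr_cmt_get_data()
and libxl_psr_cmt_get_cache_occupancy(); the sketch only collects the
arithmetic in one place.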

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
---
 docs/man/xl.pod.1           |   25 +++++
 tools/libxc/Makefile        |    1 +
 tools/libxc/xc_msr_x86.h    |   36 ++++++++
 tools/libxc/xc_psr.c        |  215 +++++++++++++++++++++++++++++++++++++++++++
 tools/libxc/xenctrl.h       |   17 ++++
 tools/libxl/Makefile        |    2 +-
 tools/libxl/libxl.h         |   19 ++++
 tools/libxl/libxl_psr.c     |  184 ++++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_types.idl |    4 +
 tools/libxl/libxl_utils.c   |   28 ++++++
 tools/libxl/xl.h            |    3 +
 tools/libxl/xl_cmdimpl.c    |  131 ++++++++++++++++++++++++++
 tools/libxl/xl_cmdtable.c   |   17 ++++
 13 files changed, 681 insertions(+), 1 deletion(-)
 create mode 100644 tools/libxc/xc_msr_x86.h
 create mode 100644 tools/libxc/xc_psr.c
 create mode 100644 tools/libxl/libxl_psr.c

diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index f1e95db..ef8cd24 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -1387,6 +1387,31 @@ Load FLASK policy from the given policy file. The initial policy is provided to
 the hypervisor as a multiboot module; this command allows runtime updates to the
 policy. Loading new security policy will reset runtime changes to device labels.
 
+=head1 Cache Monitoring Technology
+
+Some new hardware may offer monitoring capability in each logical processor to
+measure specific platform shared resource metrics, for example, L3 cache
+occupancy. In the Xen implementation, the monitoring granularity is per domain.
+To monitor a specific domain, attach the domain id to the monitoring
+service. When the domain no longer needs to be monitored, detach the
+domain id from the monitoring service.
+
+=over 4
+
+=item B<psr-cmt-attach> [I<domain-id>]
+
+attach: Attach the platform shared resource monitoring service to a domain.
+
+=item B<psr-cmt-detach> [I<domain-id>]
+
+detach: Detach the platform shared resource monitoring service from a domain.
+
+=item B<psr-cmt-show> [I<psr-monitor-type>] [I<domain-id>]
+
+Show monitoring data for a certain domain or all domains. Currently supported
+monitor types are:
+ - "cache_occupancy": showing the L3 cache occupancy.
+
 =back
 
 =head1 TO BE DOCUMENTED
diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index dde6109..d8cc21b 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -35,6 +35,7 @@ CTRL_SRCS-y       += xc_kexec.c
 CTRL_SRCS-y       += xtl_core.c
 CTRL_SRCS-y       += xtl_logger_stdio.c
 CTRL_SRCS-y       += xc_resource.c
+CTRL_SRCS-y       += xc_psr.c
 CTRL_SRCS-$(CONFIG_X86) += xc_pagetab.c
 CTRL_SRCS-$(CONFIG_Linux) += xc_linux.c xc_linux_osdep.c
 CTRL_SRCS-$(CONFIG_FreeBSD) += xc_freebsd.c xc_freebsd_osdep.c
diff --git a/tools/libxc/xc_msr_x86.h b/tools/libxc/xc_msr_x86.h
new file mode 100644
index 0000000..1e0ee99
--- /dev/null
+++ b/tools/libxc/xc_msr_x86.h
@@ -0,0 +1,36 @@
+/*
+ * xc_msr_x86.h
+ *
+ * MSR definition macros
+ *
+ * Copyright (C) 2014      Intel Corporation
+ * Author Dongxiao Xu <dongxiao.xu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#ifndef XC_MSR_X86_H
+#define XC_MSR_X86_H
+
+#define MSR_IA32_QOSEVTSEL      0x00000c8d
+#define MSR_IA32_QMC            0x00000c8e
+
+#endif
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/tools/libxc/xc_psr.c b/tools/libxc/xc_psr.c
new file mode 100644
index 0000000..0b5a227
--- /dev/null
+++ b/tools/libxc/xc_psr.c
@@ -0,0 +1,215 @@
+/*
+ * xc_psr.c
+ *
+ * platform shared resource related API functions.
+ *
+ * Copyright (C) 2014      Intel Corporation
+ * Author Dongxiao Xu <dongxiao.xu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "xc_private.h"
+#include "xc_msr_x86.h"
+
+#define IA32_QM_CTR_ERROR_MASK         (0x3ull << 62)
+
+#define EVTID_L3_OCCUPANCY             0x1
+
+int xc_psr_cmt_attach(xc_interface *xch, uint32_t domid)
+{
+    DECLARE_DOMCTL;
+
+    domctl.cmd = XEN_DOMCTL_psr_cmt_op;
+    domctl.domain = (domid_t)domid;
+    domctl.u.psr_cmt_op.cmd = XEN_DOMCTL_PSR_CMT_OP_ATTACH;
+
+    return do_domctl(xch, &domctl);
+}
+
+int xc_psr_cmt_detach(xc_interface *xch, uint32_t domid)
+{
+    DECLARE_DOMCTL;
+
+    domctl.cmd = XEN_DOMCTL_psr_cmt_op;
+    domctl.domain = (domid_t)domid;
+    domctl.u.psr_cmt_op.cmd = XEN_DOMCTL_PSR_CMT_OP_DETACH;
+
+    return do_domctl(xch, &domctl);
+}
+
+int xc_psr_cmt_get_domain_rmid(xc_interface *xch, uint32_t domid,
+                                    uint32_t *rmid)
+{
+    int rc;
+    DECLARE_DOMCTL;
+
+    domctl.cmd = XEN_DOMCTL_psr_cmt_op;
+    domctl.domain = (domid_t)domid;
+    domctl.u.psr_cmt_op.cmd = XEN_DOMCTL_PSR_CMT_OP_QUERY_RMID;
+
+    rc = do_domctl(xch, &domctl);
+
+    if ( !rc )
+        *rmid = domctl.u.psr_cmt_op.data;
+
+    return rc;
+}
+
+int xc_psr_cmt_get_total_rmid(xc_interface *xch, uint32_t *total_rmid)
+{
+    static int val = 0;
+    int rc;
+    DECLARE_SYSCTL;
+
+    if ( val )
+    {
+        *total_rmid = val;
+        return 0;
+    }
+
+    sysctl.cmd = XEN_SYSCTL_psr_cmt_op;
+    sysctl.u.psr_cmt_op.cmd = XEN_SYSCTL_PSR_CMT_get_total_rmid;
+    sysctl.u.psr_cmt_op.flags = 0;
+
+    rc = xc_sysctl(xch, &sysctl);
+    if ( !rc )
+        val = *total_rmid = sysctl.u.psr_cmt_op.data;
+
+    return rc;
+}
+
+int xc_psr_cmt_get_l3_upscaling_factor(xc_interface *xch,
+                                            uint32_t *upscaling_factor)
+{
+    static int val = 0;
+    int rc;
+    DECLARE_SYSCTL;
+
+    if ( val )
+    {
+        *upscaling_factor = val;
+        return 0;
+    }
+
+    sysctl.cmd = XEN_SYSCTL_psr_cmt_op;
+    sysctl.u.psr_cmt_op.cmd =
+        XEN_SYSCTL_PSR_CMT_get_l3_upscaling_factor;
+    sysctl.u.psr_cmt_op.flags = 0;
+
+    rc = xc_sysctl(xch, &sysctl);
+    if ( !rc )
+        val = *upscaling_factor = sysctl.u.psr_cmt_op.data;
+
+    return rc;
+}
+
+int xc_psr_cmt_get_l3_cache_size(xc_interface *xch,
+                                      uint32_t *l3_cache_size)
+{
+    static int val = 0;
+    int rc;
+    DECLARE_SYSCTL;
+
+    if ( val )
+    {
+        *l3_cache_size = val;
+        return 0;
+    }
+
+    sysctl.cmd = XEN_SYSCTL_psr_cmt_op;
+    sysctl.u.psr_cmt_op.cmd =
+        XEN_SYSCTL_PSR_CMT_get_l3_cache_size;
+    sysctl.u.psr_cmt_op.flags = 0;
+
+    rc = xc_sysctl(xch, &sysctl);
+    if ( !rc )
+        val = *l3_cache_size = sysctl.u.psr_cmt_op.data;
+
+    return rc;
+}
+
+int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid,
+    uint32_t cpu, xc_psr_cmt_type type, uint64_t *monitor_data)
+{
+    xc_resource_op_t op;
+    xc_resource_data_t entries[2];
+    uint32_t evtid;
+    int rc;
+
+    switch ( type )
+    {
+    case XC_PSR_CMT_L3_OCCUPANCY:
+        evtid = EVTID_L3_OCCUPANCY;
+        break;
+    default:
+        return -1;
+    }
+
+    entries[0].cmd = XEN_RESOURCE_OP_MSR_WRITE;
+    entries[0].idx = MSR_IA32_QOSEVTSEL;
+    entries[0].val = (uint64_t)rmid << 32 | evtid;
+    entries[0].rsvd = 0;
+
+    entries[1].cmd = XEN_RESOURCE_OP_MSR_READ;
+    entries[1].idx = MSR_IA32_QMC;
+    entries[1].val = 0;
+    entries[1].rsvd = 0;
+
+    op.result = 0;
+    op.cpu = cpu;
+    op.nr_entries = 2;
+    op.entries = entries;
+
+    rc = xc_resource_op(xch, 1, &op);
+    if ( rc )
+        return rc;
+
+    if ( op.result || entries[1].val & IA32_QM_CTR_ERROR_MASK )
+        return -1;
+
+    *monitor_data = entries[1].val;
+
+    return 0;
+}
+
+int xc_psr_cmt_enabled(xc_interface *xch)
+{
+    static int val = -1;
+    int rc;
+    DECLARE_SYSCTL;
+
+    if ( val >= 0 )
+        return val;
+
+    sysctl.cmd = XEN_SYSCTL_psr_cmt_op;
+    sysctl.u.psr_cmt_op.cmd = XEN_SYSCTL_PSR_CMT_enabled;
+    sysctl.u.psr_cmt_op.flags = 0;
+
+    rc = do_sysctl(xch, &sysctl);
+    if ( !rc )
+    {
+        val = sysctl.u.psr_cmt_op.data;
+        return val;
+    }
+
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
index fb58c44..d5a3c0a 100644
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -2668,6 +2668,23 @@ struct xc_resource_op {
 typedef struct xc_resource_op xc_resource_op_t;
 int xc_resource_op(xc_interface *xch, uint32_t nr_ops, xc_resource_op_t *ops);
 
+enum xc_psr_cmt_type {
+    XC_PSR_CMT_L3_OCCUPANCY,
+};
+typedef enum xc_psr_cmt_type xc_psr_cmt_type;
+int xc_psr_cmt_attach(xc_interface *xch, uint32_t domid);
+int xc_psr_cmt_detach(xc_interface *xch, uint32_t domid);
+int xc_psr_cmt_get_domain_rmid(xc_interface *xch, uint32_t domid,
+    uint32_t *rmid);
+int xc_psr_cmt_get_total_rmid(xc_interface *xch, uint32_t *total_rmid);
+int xc_psr_cmt_get_l3_upscaling_factor(xc_interface *xch,
+    uint32_t *upscaling_factor);
+int xc_psr_cmt_get_l3_cache_size(xc_interface *xch,
+    uint32_t *l3_cache_size);
+int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid,
+    uint32_t cpu, xc_psr_cmt_type type, uint64_t *monitor_data);
+int xc_psr_cmt_enabled(xc_interface *xch);
+
 #endif /* XENCTRL_H */
 
 /*
diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 990414b..fd9ae28 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -43,7 +43,7 @@ LIBXL_OBJS-y += libxl_blktap2.o
 else
 LIBXL_OBJS-y += libxl_noblktap2.o
 endif
-LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o
+LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
 LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o
 
 ifeq ($(CONFIG_NetBSD),y)
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index bc68cac..5acc933 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -640,6 +640,13 @@ typedef uint8_t libxl_mac[6];
 #define LIBXL_MAC_BYTES(mac) mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]
 void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src);
 
+/*
+ * LIBXL_HAVE_PSR_CMT
+ *
+ * If this is defined, the Cache Monitoring Technology feature is supported.
+ */
+#define LIBXL_HAVE_PSR_CMT 1
+
 typedef char **libxl_string_list;
 void libxl_string_list_dispose(libxl_string_list *sl);
 int libxl_string_list_length(const libxl_string_list *sl);
@@ -1380,6 +1387,18 @@ bool libxl_ms_vm_genid_is_zero(const libxl_ms_vm_genid *id);
 void libxl_ms_vm_genid_copy(libxl_ctx *ctx, libxl_ms_vm_genid *dst,
                             libxl_ms_vm_genid *src);
 
+int libxl_get_socket_cpu(libxl_ctx *ctx, uint32_t socketid);
+
+int libxl_psr_cmt_attach(libxl_ctx *ctx, uint32_t domid);
+int libxl_psr_cmt_detach(libxl_ctx *ctx, uint32_t domid);
+int libxl_psr_cmt_domain_attached(libxl_ctx *ctx, uint32_t domid);
+int libxl_psr_cmt_enabled(libxl_ctx *ctx);
+int libxl_psr_cmt_get_total_rmid(libxl_ctx *ctx, uint32_t *total_rmid);
+int libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx,
+    uint32_t *l3_cache_size);
+int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx, uint32_t domid,
+    uint32_t socketid, uint32_t *l3_cache_occupancy);
+
 /* misc */
 
 /* Each of these sets or clears the flag according to whether the
diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
new file mode 100644
index 0000000..5551be8
--- /dev/null
+++ b/tools/libxl/libxl_psr.c
@@ -0,0 +1,184 @@
+/*
+ * Copyright (C) 2014      Intel Corporation
+ * Author Dongxiao Xu <dongxiao.xu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+#include "libxl_internal.h"
+
+
+#define IA32_QM_CTR_ERROR_MASK         (0x3ull << 62)
+
+static void libxl_psr_cmt_err_msg(libxl_ctx *ctx, int err)
+{
+    GC_INIT(ctx);
+
+    const char *msg;
+
+    switch (err) {
+    case ENOSYS:
+        msg = "unsupported operation";
+        break;
+    case ENODEV:
+        msg = "Cache Monitoring Technology is not supported in this system";
+        break;
+    case EEXIST:
+        msg = "Cache Monitoring Technology is already attached to this domain";
+        break;
+    case ENOENT:
+        msg = "Cache Monitoring Technology is not attached to this domain";
+        break;
+    case EUSERS:
+        msg = "there is no free RMID available";
+        break;
+    case ESRCH:
+        msg = "is this Domain ID valid?";
+        break;
+    case EFAULT:
+        msg = "failed to exchange data with Xen";
+        break;
+    default:
+        msg = "unknown error";
+        break;
+    }
+
+    LOGE(ERROR, "%s", msg);
+
+    GC_FREE;
+}
+
+int libxl_psr_cmt_attach(libxl_ctx *ctx, uint32_t domid)
+{
+    int rc;
+
+    rc = xc_psr_cmt_attach(ctx->xch, domid);
+    if (rc < 0) {
+        libxl_psr_cmt_err_msg(ctx, errno);
+        return ERROR_FAIL;
+    }
+
+    return 0;
+}
+
+int libxl_psr_cmt_detach(libxl_ctx *ctx, uint32_t domid)
+{
+    int rc;
+
+    rc = xc_psr_cmt_detach(ctx->xch, domid);
+    if (rc < 0) {
+        libxl_psr_cmt_err_msg(ctx, errno);
+        return ERROR_FAIL;
+    }
+
+    return 0;
+}
+
+int libxl_psr_cmt_domain_attached(libxl_ctx *ctx, uint32_t domid)
+{
+    int rc;
+    uint32_t rmid;
+
+    rc = xc_psr_cmt_get_domain_rmid(ctx->xch, domid, &rmid);
+    if (rc < 0)
+        return 0;
+
+    return !!rmid;
+}
+
+int libxl_psr_cmt_enabled(libxl_ctx *ctx)
+{
+    return xc_psr_cmt_enabled(ctx->xch);
+}
+
+int libxl_psr_cmt_get_total_rmid(libxl_ctx *ctx, uint32_t *total_rmid)
+{
+    int rc;
+
+    rc = xc_psr_cmt_get_total_rmid(ctx->xch, total_rmid);
+    if (rc < 0) {
+        libxl_psr_cmt_err_msg(ctx, errno);
+        return ERROR_FAIL;
+    }
+
+    return 0;
+}
+
+int libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx,
+                                         uint32_t *l3_cache_size)
+{
+    int rc;
+
+    rc = xc_psr_cmt_get_l3_cache_size(ctx->xch, l3_cache_size);
+    if (rc < 0) {
+        libxl_psr_cmt_err_msg(ctx, errno);
+        return ERROR_FAIL;
+    }
+
+    return 0;
+}
+
+int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx, uint32_t domid,
+    uint32_t socketid, uint32_t *l3_cache_occupancy)
+{
+    GC_INIT(ctx);
+
+    uint32_t rmid;
+    uint32_t upscaling_factor;
+    uint64_t monitor_data;
+    int cpu, rc;
+    xc_psr_cmt_type type;
+
+    rc = xc_psr_cmt_get_domain_rmid(ctx->xch, domid, &rmid);
+    if (rc < 0 || rmid == 0) {
+        LOGE(ERROR, "failed to get the domain rmid, "
+            "or domain is not attached to the CMT monitoring service");
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    cpu = libxl_get_socket_cpu(ctx, socketid);
+    if (cpu < 0) {
+        LOGE(ERROR, "failed to get socket cpu");
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    type = XC_PSR_CMT_L3_OCCUPANCY;
+    rc = xc_psr_cmt_get_data(ctx->xch, rmid, cpu, type, &monitor_data);
+    if (rc < 0) {
+        LOGE(ERROR, "failed to get monitoring data");
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    rc = xc_psr_cmt_get_l3_upscaling_factor(ctx->xch, &upscaling_factor);
+    if (rc < 0) {
+        LOGE(ERROR, "failed to get L3 upscaling factor");
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    *l3_cache_occupancy = upscaling_factor * monitor_data / 1024;
+    rc = 0;
+out:
+    GC_FREE;
+    return rc;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index f1fcbc3..27a5022 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -635,3 +635,7 @@ libxl_event = Struct("event",[
                                  ])),
            ("domain_create_console_available", None),
            ]))])
+
+libxl_psr_cmt_type = Enumeration("psr_cmt_type", [
+    (1, "CACHE_OCCUPANCY"),
+    ])
diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
index 58df4f3..8ec822d 100644
--- a/tools/libxl/libxl_utils.c
+++ b/tools/libxl/libxl_utils.c
@@ -1065,6 +1065,34 @@ int libxl__random_bytes(libxl__gc *gc, uint8_t *buf, size_t len)
     return ret;
 }
 
+int libxl_get_socket_cpu(libxl_ctx *ctx, uint32_t socketid)
+{
+    int i, j, cpu, nr_cpus;
+    libxl_cputopology *topology;
+    int *socket_cpus;
+
+    topology = libxl_get_cpu_topology(ctx, &nr_cpus);
+    if (!topology)
+        return ERROR_FAIL;
+
+    socket_cpus = malloc(sizeof(int) * nr_cpus);
+    if (!socket_cpus) {
+        free(topology);
+        return ERROR_FAIL;
+    }
+
+    for (i = 0, j = 0; i < nr_cpus; i++)
+        if (topology[i].socket == socketid)
+            socket_cpus[j++] = i;
+
+    cpu = j ? socket_cpus[rand() % j] : ERROR_FAIL;
+
+    free(socket_cpus);
+    free(topology);
+
+    return cpu;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/xl.h b/tools/libxl/xl.h
index 10a2e66..abb1fac 100644
--- a/tools/libxl/xl.h
+++ b/tools/libxl/xl.h
@@ -110,6 +110,9 @@ int main_loadpolicy(int argc, char **argv);
 int main_remus(int argc, char **argv);
 #endif
 int main_devd(int argc, char **argv);
+int main_psr_cmt_attach(int argc, char **argv);
+int main_psr_cmt_detach(int argc, char **argv);
+int main_psr_cmt_show(int argc, char **argv);
 
 void help(const char *command);
 
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 698b3bc..1fa8755 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -7428,6 +7428,137 @@ out:
     return ret;
 }
 
+static int psr_cmt_show_cache_occupancy(uint32_t domid)
+{
+    uint32_t i, socketid, nr_sockets, total_rmid;
+    uint32_t l3_cache_size, l3_cache_occupancy;
+    libxl_physinfo info;
+    char *domain_name;
+    int rc, print_header, nr_domains;
+    libxl_dominfo *dominfo;
+
+    if (!libxl_psr_cmt_enabled(ctx)) {
+        printf("CMT is not supported in the system\n");
+        return -1;
+    }
+
+    rc = libxl_get_physinfo(ctx, &info);
+    if (rc < 0) {
+        printf("failed to get system socket number\n");
+        return -1;
+    }
+    nr_sockets = info.nr_cpus / info.threads_per_core / info.cores_per_socket;
+
+    rc = libxl_psr_cmt_get_total_rmid(ctx, &total_rmid);
+    if (rc < 0) {
+        printf("failed to get system total rmid number\n");
+        return -1;
+    }
+
+    rc = libxl_psr_cmt_get_l3_cache_size(ctx, &l3_cache_size);
+    if (rc < 0) {
+        printf("failed to get system l3 cache size\n");
+        return -1;
+    }
+
+    printf("Total RMID: %u\n", total_rmid);
+    printf("Per-Socket L3 Cache Size: %u KB\n", l3_cache_size);
+
+    print_header = 1;
+    if (!(dominfo = libxl_list_domain(ctx, &nr_domains))) {
+        fprintf(stderr, "libxl_list_domain failed.\n");
+        return -1;
+    }
+    for (i = 0; i < nr_domains; i++) {
+        if (domid != ~0 && dominfo[i].domid != domid)
+            continue;
+        if (!libxl_psr_cmt_domain_attached(ctx, dominfo[i].domid))
+            continue;
+        if (print_header) {
+            printf("%-40s %5s", "Name", "ID");
+            for (socketid = 0; socketid < nr_sockets; socketid++)
+                printf("%14s %u", "Socket", socketid);
+            printf("\n");
+            print_header = 0;
+        }
+        domain_name = libxl_domid_to_name(ctx, dominfo[i].domid);
+        printf("%-40s %5d", domain_name, dominfo[i].domid);
+        free(domain_name);
+        for (socketid = 0; socketid < nr_sockets; socketid++) {
+            rc = libxl_psr_cmt_get_cache_occupancy(ctx, dominfo[i].domid,
+                     socketid, &l3_cache_occupancy);
+            printf("%13u KB", l3_cache_occupancy);
+        }
+        printf("\n");
+    }
+    libxl_dominfo_list_free(dominfo, nr_domains);
+
+    return 0;
+}
+
+int main_psr_cmt_attach(int argc, char **argv)
+{
+    uint32_t domid;
+    int opt, ret = 0;
+
+    SWITCH_FOREACH_OPT(opt, "", NULL, "psr-cmt-attach", 1) {
+        /* No options */
+    }
+
+    domid = find_domain(argv[optind]);
+    ret = libxl_psr_cmt_attach(ctx, domid);
+
+    return ret;
+}
+
+int main_psr_cmt_detach(int argc, char **argv)
+{
+    uint32_t domid;
+    int opt, ret = 0;
+
+    SWITCH_FOREACH_OPT(opt, "", NULL, "psr-cmt-detach", 1) {
+        /* No options */
+    }
+
+    domid = find_domain(argv[optind]);
+    ret = libxl_psr_cmt_detach(ctx, domid);
+
+    return ret;
+}
+
+int main_psr_cmt_show(int argc, char **argv)
+{
+    int opt, ret = 0;
+    uint32_t domid;
+    libxl_psr_cmt_type type;
+
+    SWITCH_FOREACH_OPT(opt, "", NULL, "psr-cmt-show", 1) {
+        /* No options */
+    }
+
+    if (libxl_psr_cmt_type_from_string(argv[optind], &type))
+        type = 0; /* unknown type: handled by the default case below */
+    if (optind + 1 >= argc)
+        domid = ~0;
+    else if (optind + 1 == argc - 1)
+        domid = find_domain(argv[optind + 1]);
+    else {
+        help("psr-cmt-show");
+        return 2;
+    }
+
+    switch (type) {
+    case LIBXL_PSR_CMT_TYPE_CACHE_OCCUPANCY:
+        ret = psr_cmt_show_cache_occupancy(domid);
+        break;
+    default:
+        help("psr-cmt-show");
+        return 2;
+    }
+
+    return ret;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/xl_cmdtable.c b/tools/libxl/xl_cmdtable.c
index 35f02c4..da25c3f 100644
--- a/tools/libxl/xl_cmdtable.c
+++ b/tools/libxl/xl_cmdtable.c
@@ -502,6 +502,23 @@ struct cmd_spec cmd_table[] = {
       "[options]",
       "-F                      Run in the foreground",
     },
+    { "psr-cmt-attach",
+      &main_psr_cmt_attach, 0, 1,
+      "Attach Cache Monitoring Technology service to a domain",
+      "<Domain>",
+    },
+    { "psr-cmt-detach",
+      &main_psr_cmt_detach, 0, 1,
+      "Detach Cache Monitoring Technology service from a domain",
+      "<Domain>",
+    },
+    { "psr-cmt-show",
+      &main_psr_cmt_show, 0, 1,
+      "Show Cache Monitoring Technology information",
+      "<PSR-CMT-Type> <Domain>",
+      "Available monitor types:\n"
+      "\"cache_occupancy\":         Show L3 cache occupancy\n",
+    },
 };
 
 int cmdtable_len = sizeof(cmd_table)/sizeof(struct cmd_spec);
-- 
1.7.9.5

* Re: [PATCH v16 01/10] x86: add generic resource (e.g. MSR) access hypercall
  2014-09-25 10:19 ` [PATCH v16 01/10] x86: add generic resource (e.g. MSR) access hypercall Chao Peng
@ 2014-09-25 19:57   ` Andrew Cooper
  2014-09-25 20:12     ` Konrad Rzeszutek Wilk
                       ` (2 more replies)
  2014-09-26 15:40   ` Jan Beulich
  1 sibling, 3 replies; 34+ messages in thread
From: Andrew Cooper @ 2014-09-25 19:57 UTC (permalink / raw)
  To: Chao Peng, xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	Ian.Jackson, JBeulich, dgdegra

On 25/09/14 11:19, Chao Peng wrote:
> Add a generic resource access hypercall for tool stack or other
> components, e.g., accessing MSR, port I/O, etc.
>
> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> ---
>  xen/arch/x86/platform_hypercall.c        |   90 ++++++++++++++++++++++++++++++
>  xen/arch/x86/x86_64/platform_hypercall.c |    4 ++
>  xen/include/public/platform.h            |   23 ++++++++
>  xen/include/xlat.lst                     |    2 +
>  4 files changed, 119 insertions(+)
>
> diff --git a/xen/arch/x86/platform_hypercall.c b/xen/arch/x86/platform_hypercall.c
> index 2162811..081d9f5 100644
> --- a/xen/arch/x86/platform_hypercall.c
> +++ b/xen/arch/x86/platform_hypercall.c
> @@ -61,6 +61,68 @@ long cpu_down_helper(void *data);
>  long core_parking_helper(void *data);
>  uint32_t get_cur_idle_nums(void);
>  
> +struct xen_resource_access {
> +    int32_t ret;
> +    uint32_t nr;
> +    XEN_GUEST_HANDLE(xenpf_resource_data_t) data;
> +};
> +
> +static bool_t allow_access_msr(unsigned int msr)
> +{
> +    return 0;
> +}
> +
> +static void resource_access(void *info)
> +{
> +    struct xen_resource_access *ra = info;
> +    xenpf_resource_data_t data;
> +    int ret = 0;
> +    unsigned int i;
> +
> +    for ( i = 0; i < ra->nr; i++ )
> +    {
> +        if ( copy_from_guest_offset(&data, ra->data, i, 1) )

You cannot use copy_{to,from}_guest() here.  You are almost certainly on
the wrong cpu, and running on the wrong set of pagetables.

At best, you will copy bogus data from the wrong process/domain, and
likely corrupt it when copying back.

> +        {
> +            ret = -EFAULT;
> +            break;
> +        }
> +
> +        if ( data.rsvd ) {
> +            ret = -EINVAL;
> +            break;
> +        }
> +
> +        switch ( data.cmd )
> +        {
> +        case XEN_RESOURCE_OP_MSR_READ:
> +        case XEN_RESOURCE_OP_MSR_WRITE:
> +            if ( data.idx >> 32 )
> +                ret = -EINVAL;
> +            else if ( !allow_access_msr(data.idx) )
> +                ret = -EACCES;
> +            else if ( data.cmd == XEN_RESOURCE_OP_MSR_READ )
> +                ret = rdmsr_safe(data.idx, data.val);
> +            else
> +                ret = wrmsr_safe(data.idx, data.val);
> +            break;
> +        default:
> +            ret = -EINVAL;
> +            break;
> +        }
> +
> +        if ( ret )
> +            break;
> +
> +        if ( copy_to_guest_offset(ra->data, i, &data, 1) )
> +        {
> +            ret = -EFAULT;
> +            break;
> +        }
> +    }
> +
> +    ra->ret = ret;
> +}
> +
>  ret_t do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op)
>  {
>      ret_t ret = 0;
> @@ -601,6 +663,34 @@ ret_t do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op)
>      }
>      break;
>  
> +    case XENPF_resource_op:
> +    {
> +        struct xen_resource_access ra;
> +        struct xenpf_resource_op *rsc_op = &op->u.resource_op;
> +        unsigned int cpu = smp_processor_id();
> +
> +        ra.nr = rsc_op->nr;
> +        ra.data = rsc_op->data;

You must do all copy_{from,to}_user() here, and strictly only pass Xen
pointers to resource_access().

This means you will need to xmalloc() yourself some space for the
xenpf_resource_data_t array.


On a different note, you need to enforce a maximum resource_op.nr of
something rather low (16/32 perhaps?) to prevent a toolstack asking
for 0xffffffff non-preemptible operations.
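
To illustrate the shape being asked for, here is a minimal user-space
sketch (not Xen code): the handler bounds nr, bounces the entries into a
private buffer, hands only that buffer to the worker, and copies the
results back afterwards. The cap value, the -EINVAL return for an
oversized request and the memcpy() calls standing in for the guest copy
routines are all assumptions made for the example, and the "MSR access"
is a placeholder.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define RESOURCE_OP_MAX_ENTRIES 16  /* hypothetical cap, value not settled */

struct resource_data {
    uint32_t cmd;
    uint32_t rsvd;
    uint64_t idx;
    uint64_t val;
};

/* Stand-in for the remote-CPU worker: it only sees the private buffer. */
static int resource_access(struct resource_data *entries, uint32_t nr)
{
    uint32_t i;

    assert(entries != NULL);

    for (i = 0; i < nr; i++) {
        if (entries[i].rsvd)
            return -EINVAL;
        entries[i].val = entries[i].idx + 1;  /* placeholder for the MSR access */
    }
    return 0;
}

/* Stand-in for the hypercall handler, running in the caller's context. */
static int resource_op(struct resource_data *guest, uint32_t nr)
{
    struct resource_data *buf;
    int ret;

    if (nr == 0)
        return 0;
    if (nr > RESOURCE_OP_MAX_ENTRIES)
        return -EINVAL;

    buf = malloc(nr * sizeof(*buf));            /* xmalloc'd space in Xen */
    if (!buf)
        return -ENOMEM;

    memcpy(buf, guest, nr * sizeof(*buf));      /* stands in for the copy-in */
    ret = resource_access(buf, nr);             /* would run on the target cpu */
    if (!ret)
        memcpy(guest, buf, nr * sizeof(*buf));  /* stands in for the copy-out */

    free(buf);
    return ret;
}
```

With this split the worker never dereferences a guest handle, so it does
not matter which pagetables the target cpu happens to be running on.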

~Andrew

> +
> +        if ( rsc_op->cpu == cpu )
> +            resource_access(&ra);
> +        else if ( cpu_online(rsc_op->cpu) )
> +            on_selected_cpus(cpumask_of(rsc_op->cpu),
> +                         resource_access, &ra, 1);
> +        else
> +        {
> +            ret = -ENODEV;
> +            break;
> +        }
> +
> +        if ( ra.ret )
> +        {
> +            ret = ra.ret;
> +            break;
> +        }
> +    }
> +    break;
> +
>      default:
>          ret = -ENOSYS;
>          break;
> diff --git a/xen/arch/x86/x86_64/platform_hypercall.c b/xen/arch/x86/x86_64/platform_hypercall.c
> index b6f380e..4db6622 100644
> --- a/xen/arch/x86/x86_64/platform_hypercall.c
> +++ b/xen/arch/x86/x86_64/platform_hypercall.c
> @@ -32,6 +32,10 @@ CHECK_pf_pcpu_version;
>  CHECK_pf_enter_acpi_sleep;
>  #undef xen_pf_enter_acpi_sleep
>  
> +#define xen_pf_resource_data xenpf_resource_data
> +CHECK_pf_resource_data;
> +#undef xen_pf_resource_data
> +
>  #define COMPAT
>  #define _XEN_GUEST_HANDLE(t) XEN_GUEST_HANDLE(t)
>  #define _XEN_GUEST_HANDLE_PARAM(t) XEN_GUEST_HANDLE_PARAM(t)
> diff --git a/xen/include/public/platform.h b/xen/include/public/platform.h
> index 053b9fa..e4d9091 100644
> --- a/xen/include/public/platform.h
> +++ b/xen/include/public/platform.h
> @@ -527,6 +527,28 @@ struct xenpf_core_parking {
>  typedef struct xenpf_core_parking xenpf_core_parking_t;
>  DEFINE_XEN_GUEST_HANDLE(xenpf_core_parking_t);
>  
> +#define XENPF_resource_op   61
> +
> +#define XEN_RESOURCE_OP_MSR_READ  0
> +#define XEN_RESOURCE_OP_MSR_WRITE 1
> +
> +struct xenpf_resource_data {
> +    uint32_t cmd;       /* XEN_RESOURCE_OP_* */
> +    uint32_t rsvd;
> +    uint64_t idx;
> +    uint64_t val;
> +};
> +typedef struct xenpf_resource_data xenpf_resource_data_t;
> +DEFINE_XEN_GUEST_HANDLE(xenpf_resource_data_t);
> +
> +struct xenpf_resource_op {
> +    uint32_t nr;    /* number of data entry */
> +    uint32_t cpu;   /* which cpu to run */
> +    XEN_GUEST_HANDLE(xenpf_resource_data_t) data;
> +};
> +typedef struct xenpf_resource_op xenpf_resource_op_t;
> +DEFINE_XEN_GUEST_HANDLE(xenpf_resource_op_t);
> +
>  /*
>   * ` enum neg_errnoval
>   * ` HYPERVISOR_platform_op(const struct xen_platform_op*);
> @@ -553,6 +575,7 @@ struct xen_platform_op {
>          struct xenpf_cpu_hotadd        cpu_add;
>          struct xenpf_mem_hotadd        mem_add;
>          struct xenpf_core_parking      core_parking;
> +        struct xenpf_resource_op       resource_op;
>          uint8_t                        pad[128];
>      } u;
>  };
> diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
> index 9a35dd7..100fcf5 100644
> --- a/xen/include/xlat.lst
> +++ b/xen/include/xlat.lst
> @@ -88,6 +88,8 @@
>  ?	xenpf_enter_acpi_sleep		platform.h
>  ?	xenpf_pcpuinfo			platform.h
>  ?	xenpf_pcpu_version		platform.h
> +?	xenpf_resource_op		platform.h
> +?	xenpf_resource_data		platform.h
>  !	sched_poll			sched.h
>  ?	sched_remote_shutdown		sched.h
>  ?	sched_shutdown			sched.h


* Re: [PATCH v16 03/10] tools: provide interface for generic resource access
  2014-09-25 10:19 ` [PATCH v16 03/10] tools: provide interface for generic resource access Chao Peng
@ 2014-09-25 20:06   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 34+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-09-25 20:06 UTC (permalink / raw)
  To: Chao Peng
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, xen-devel, JBeulich, dgdegra

On Thu, Sep 25, 2014 at 06:19:03PM +0800, Chao Peng wrote:
> Xen added a new platform_op hypercall for generic MSR access, and this
> is the tool side change to wrap the hypercall into xc APIs.
> 
> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> ---
>  tools/libxc/Makefile      |    1 +
>  tools/libxc/xc_private.h  |   52 +++++++++++++++++
>  tools/libxc/xc_resource.c |  143 +++++++++++++++++++++++++++++++++++++++++++++
>  tools/libxc/xenctrl.h     |   13 +++++
>  4 files changed, 209 insertions(+)
>  create mode 100644 tools/libxc/xc_resource.c
> 
> diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
> index 3b04027..dde6109 100644
> --- a/tools/libxc/Makefile
> +++ b/tools/libxc/Makefile
> @@ -34,6 +34,7 @@ CTRL_SRCS-y       += xc_foreign_memory.c
>  CTRL_SRCS-y       += xc_kexec.c
>  CTRL_SRCS-y       += xtl_core.c
>  CTRL_SRCS-y       += xtl_logger_stdio.c
> +CTRL_SRCS-y       += xc_resource.c
>  CTRL_SRCS-$(CONFIG_X86) += xc_pagetab.c
>  CTRL_SRCS-$(CONFIG_Linux) += xc_linux.c xc_linux_osdep.c
>  CTRL_SRCS-$(CONFIG_FreeBSD) += xc_freebsd.c xc_freebsd_osdep.c
> diff --git a/tools/libxc/xc_private.h b/tools/libxc/xc_private.h
> index 94df688..fbdfe79 100644
> --- a/tools/libxc/xc_private.h
> +++ b/tools/libxc/xc_private.h
> @@ -46,6 +46,7 @@
>  #define DECLARE_SYSCTL struct xen_sysctl sysctl
>  #define DECLARE_PHYSDEV_OP struct physdev_op physdev_op
>  #define DECLARE_FLASK_OP struct xen_flask_op op
> +#define DECLARE_PLATFORM_OP struct xen_platform_op platform_op
>  
>  #undef PAGE_SHIFT
>  #undef PAGE_SIZE
> @@ -310,6 +311,57 @@ static inline int do_sysctl(xc_interface *xch, struct xen_sysctl *sysctl)
>      return ret;
>  }
>  
> +static inline int do_platform_op(xc_interface *xch,
> +                                 struct xen_platform_op *platform_op)
> +{
> +    int ret = -1;
> +    DECLARE_HYPERCALL;
> +    DECLARE_HYPERCALL_BOUNCE(platform_op, sizeof(*platform_op),
> +                             XC_HYPERCALL_BUFFER_BOUNCE_BOTH);
> +
> +    platform_op->interface_version = XENPF_INTERFACE_VERSION;
> +
> +    if ( xc_hypercall_bounce_pre(xch, platform_op) )
> +    {
> +        PERROR("Could not bounce buffer for platform_op hypercall");
> +        goto out1;

Just return -1;
> +    }
> +
> +    hypercall.op     = __HYPERVISOR_platform_op;
> +    hypercall.arg[0] = HYPERCALL_BUFFER_AS_ARG(platform_op);
> +    if ( (ret = do_xen_hypercall(xch, &hypercall)) < 0 )
> +    {
> +        if ( errno == EACCES )
> +            DPRINTF("platform operation failed -- need to"
> +                    " rebuild the user-space tool set?\n");
> +    }
> +
> +    xc_hypercall_bounce_post(xch, platform_op);
> + out1:

And remove this label.

> +    return ret;
> +}
> +
> +static inline int do_multicall_op(xc_interface *xch,
> +                                  xc_hypercall_buffer_t *call_list,
> +                                  uint32_t nr_calls)
> +{
> +    int ret = -1;
> +    DECLARE_HYPERCALL;
> +    DECLARE_HYPERCALL_BUFFER_ARGUMENT(call_list);
> +
> +    hypercall.op     = __HYPERVISOR_multicall;
> +    hypercall.arg[0] = HYPERCALL_BUFFER_AS_ARG(call_list);
> +    hypercall.arg[1] = nr_calls;
> +    if ( (ret = do_xen_hypercall(xch, &hypercall)) < 0 )
> +    {
> +        if ( errno == EACCES )
> +            DPRINTF("multicall operation failed -- need to"
> +                    " rebuild the user-space tool set?\n");
> +    }
> +
> +    return ret;
> +}
> +
>  int do_memory_op(xc_interface *xch, int cmd, void *arg, size_t len);
>  
>  void *xc_map_foreign_ranges(xc_interface *xch, uint32_t dom,
> diff --git a/tools/libxc/xc_resource.c b/tools/libxc/xc_resource.c
> new file mode 100644
> index 0000000..c92910b
> --- /dev/null
> +++ b/tools/libxc/xc_resource.c
> @@ -0,0 +1,143 @@
> +/*
> + * xc_resource.c
> + *
> + * Generic resource access API
> + *
> + * Copyright (C) 2014      Intel Corporation
> + * Author Dongxiao Xu <dongxiao.xu@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as published
> + * by the Free Software Foundation; version 2.1 only. with the special
> + * exception on linking described in file LICENSE.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU Lesser General Public License for more details.
> + */
> +
> +#include "xc_private.h"
> +
> +static int xc_resource_op_one(xc_interface *xch, xc_resource_op_t *op)
> +{
> +    int rc;
> +    DECLARE_PLATFORM_OP;
> +    DECLARE_NAMED_HYPERCALL_BOUNCE(data, op->entries,
> +                                op->nr_entries * sizeof(*op->entries),
> +                                XC_HYPERCALL_BUFFER_BOUNCE_BOTH);
> +
> +    if ( xc_hypercall_bounce_pre(xch, data) )
> +        return -1;
> +
> +    platform_op.cmd = XENPF_resource_op;
> +    platform_op.u.resource_op.nr = op->nr_entries;
> +    platform_op.u.resource_op.cpu = op->cpu;
> +    set_xen_guest_handle(platform_op.u.resource_op.data, data);
> +
> +    rc = do_platform_op(xch, &platform_op);
> +
> +    xc_hypercall_bounce_post(xch, data);
> +
> +    return rc;
> +}
> +
> +static int xc_resource_op_multi(xc_interface *xch, uint32_t nr_ops, xc_resource_op_t *ops)
> +{
> +    int rc, i, entries_size;
> +    xc_resource_op_t *op;
> +    multicall_entry_t *call;
> +    DECLARE_HYPERCALL_BUFFER(multicall_entry_t, call_list);
> +    xc_hypercall_buffer_array_t *platform_ops, *entries_list = NULL;
> +
> +    call_list = xc_hypercall_buffer_alloc(xch, call_list,
> +                                          sizeof(*call_list) * nr_ops);
> +    if ( !call_list )
> +        return -1;
> +
> +    platform_ops = xc_hypercall_buffer_array_create(xch, nr_ops);
> +    if ( !platform_ops ) {

In the previous file you had '{' on a new line. Can you make it uniform please?

> +        rc = -1;
> +        goto out;
> +    }
> +
> +    entries_list = xc_hypercall_buffer_array_create(xch, nr_ops);
> +    if ( !entries_list ) {

Ditto.
> +        rc = -1;
> +        goto out;
> +    }
> +
> +    for ( i = 0; i < nr_ops; i++ ) {

And here?

> +        DECLARE_HYPERCALL_BUFFER(xen_platform_op_t, platform_op);
> +        DECLARE_HYPERCALL_BUFFER(xc_resource_data_t, entries);
> +
> +        op = ops + i;
> +
> +        platform_op = xc_hypercall_buffer_array_alloc(xch, platform_ops, i,
> +                        platform_op, sizeof(xen_platform_op_t));
> +        if ( !platform_op ) {

?
> +            rc = -1;
> +            goto out;
> +        }
> +
> +        entries_size = sizeof(xc_resource_data_t) * op->nr_entries;
> +        entries = xc_hypercall_buffer_array_alloc(xch, entries_list, i,
> +                   entries, entries_size);
> +        if ( !entries) {

?
> +            rc = -1;
> +            goto out;
> +        }
> +        memcpy(entries, op->entries, entries_size);
> +
> +        call = call_list + i;
> +        call->op = __HYPERVISOR_platform_op;
> +        call->args[0] = HYPERCALL_BUFFER_AS_ARG(platform_op);
> +
> +        platform_op->interface_version = XENPF_INTERFACE_VERSION;
> +        platform_op->cmd = XENPF_resource_op;
> +        platform_op->u.resource_op.cpu = op->cpu;
> +        platform_op->u.resource_op.nr = op->nr_entries;
> +        set_xen_guest_handle(platform_op->u.resource_op.data, entries);
> +    }
> +
> +    rc = do_multicall_op(xch, HYPERCALL_BUFFER(call_list), nr_ops);
> +
> +    for ( i = 0; i < nr_ops; i++ ) {
> +        DECLARE_HYPERCALL_BUFFER(xc_resource_data_t, entries);
> +        op = ops + i;
> +
> +        call = call_list + i;
> +        op->result = call->result;
> +
> +        entries_size = sizeof(xc_resource_data_t) * op->nr_entries;
> +        entries = xc_hypercall_buffer_array_get(xch, entries_list, i,
> +                   entries, entries_size);
> +        memcpy(op->entries, entries, entries_size);
> +    }
> +
> +out:
> +    xc_hypercall_buffer_array_destroy(xch, entries_list);
> +    xc_hypercall_buffer_array_destroy(xch, platform_ops);
> +    xc_hypercall_buffer_free(xch, call_list);
> +    return rc;
> +}
> +
> +int xc_resource_op(xc_interface *xch, uint32_t nr_ops, xc_resource_op_t *ops)
> +{
> +    if ( nr_ops == 1 )
> +        return xc_resource_op_one(xch, ops);
> +    else if ( nr_ops > 1 )
> +        return xc_resource_op_multi(xch, nr_ops, ops);
> +    else
> +        return -1;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
> index 514b241..fb58c44 100644
> --- a/tools/libxc/xenctrl.h
> +++ b/tools/libxc/xenctrl.h
> @@ -47,6 +47,7 @@
>  #include <xen/xsm/flask_op.h>
>  #include <xen/tmem.h>
>  #include <xen/kexec.h>
> +#include <xen/platform.h>
>  
>  #include "xentoollog.h"
>  
> @@ -2655,6 +2656,18 @@ int xc_kexec_load(xc_interface *xch, uint8_t type, uint16_t arch,
>   */
>  int xc_kexec_unload(xc_interface *xch, int type);
>  
> +typedef xenpf_resource_data_t xc_resource_data_t;
> +

A comment would be nice.
> +struct xc_resource_op {
> +    uint64_t result;
> +    uint32_t cpu;
> +    uint32_t nr_entries;
> +    xc_resource_data_t *entries;
> +};
> +
> +typedef struct xc_resource_op xc_resource_op_t;
> +int xc_resource_op(xc_interface *xch, uint32_t nr_ops, xc_resource_op_t *ops);
> +
>  #endif /* XENCTRL_H */
>  
>  /*
> -- 
> 1.7.9.5
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


* Re: [PATCH v16 01/10] x86: add generic resource (e.g. MSR) access hypercall
  2014-09-25 19:57   ` Andrew Cooper
@ 2014-09-25 20:12     ` Konrad Rzeszutek Wilk
  2014-09-25 20:17       ` Konrad Rzeszutek Wilk
  2014-09-26  1:34       ` Chao Peng
  2014-09-26  1:19     ` Chao Peng
  2014-09-26  8:28     ` Jan Beulich
  2 siblings, 2 replies; 34+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-09-25 20:12 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	Ian.Jackson, xen-devel, JBeulich, Chao Peng, dgdegra

On Thu, Sep 25, 2014 at 08:57:38PM +0100, Andrew Cooper wrote:
> On 25/09/14 11:19, Chao Peng wrote:
> > Add a generic resource access hypercall for tool stack or other
> > components, e.g., accessing MSR, port I/O, etc.
> >

You should include a bit more information in the description.
Please give a bit of information on what kind of parameters the
hypercall is to have.

> > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> > Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> > ---
> >  xen/arch/x86/platform_hypercall.c        |   90 ++++++++++++++++++++++++++++++
> >  xen/arch/x86/x86_64/platform_hypercall.c |    4 ++
> >  xen/include/public/platform.h            |   23 ++++++++
> >  xen/include/xlat.lst                     |    2 +
> >  4 files changed, 119 insertions(+)
> >
> > diff --git a/xen/arch/x86/platform_hypercall.c b/xen/arch/x86/platform_hypercall.c
> > index 2162811..081d9f5 100644
> > --- a/xen/arch/x86/platform_hypercall.c
> > +++ b/xen/arch/x86/platform_hypercall.c
> > @@ -61,6 +61,68 @@ long cpu_down_helper(void *data);
> >  long core_parking_helper(void *data);
> >  uint32_t get_cur_idle_nums(void);
> >  
> > +struct xen_resource_access {
> > +    int32_t ret;
> > +    uint32_t nr;
> > +    XEN_GUEST_HANDLE(xenpf_resource_data_t) data;
> > +};
> > +
> > +static bool_t allow_access_msr(unsigned int msr)
> > +{
> > +    return 0;
> > +}
> > +
> > +static void resource_access(void *info)
> > +{
> > +    struct xen_resource_access *ra = info;
> > +    xenpf_resource_data_t data;
> > +    int ret = 0;
> > +    unsigned int i;
> > +
> > +    for ( i = 0; i < ra->nr; i++ )
> > +    {
> > +        if ( copy_from_guest_offset(&data, ra->data, i, 1) )
> 
> You cannot use copy_{to,from}_guest() here.  You are almost certainly on
> the wrong cpu, and running on the wrong set of pagetables.
> 
> At best, you will copy bogus data from the wrong process/domain, and
> likely corrupt it when copying back.
> 
> > +        {
> > +            ret = -EFAULT;
> > +            break;
> > +        }
> > +
> > +        if ( data.rsvd ) {

That '{' needs to be on its own line.

> > +            ret = -EINVAL;
> > +            break;
> > +        }
> > +
> > +        switch ( data.cmd )
> > +        {
> > +        case XEN_RESOURCE_OP_MSR_READ:
> > +        case XEN_RESOURCE_OP_MSR_WRITE:
> > +            if ( data.idx >> 32 )
> > +                ret = -EINVAL;
> > +            else if ( !allow_access_msr(data.idx) )
> > +                ret = -EACCES;
> > +            else if ( data.cmd == XEN_RESOURCE_OP_MSR_READ )
> > +                ret = rdmsr_safe(data.idx, data.val);
> > +            else
> > +                ret = wrmsr_safe(data.idx, data.val);
> > +            break;
> > +        default:
> > +            ret = -EINVAL;
> > +            break;
> > +        }
> > +
> > +        if ( ret )
> > +            break;
> > +
> > +        if ( copy_to_guest_offset(ra->data, i, &data, 1) )
> > +        {
> > +            ret = -EFAULT;
> > +            break;
> > +        }
> > +    }
> > +
> > +    ra->ret = ret;
> > +}
> > +
> >  ret_t do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op)
> >  {
> >      ret_t ret = 0;
> > @@ -601,6 +663,34 @@ ret_t do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op)
> >      }
> >      break;
> >  
> > +    case XENPF_resource_op:
> > +    {
> > +        struct xen_resource_access ra;
> > +        struct xenpf_resource_op *rsc_op = &op->u.resource_op;
> > +        unsigned int cpu = smp_processor_id();
> > +
> > +        ra.nr = rsc_op->nr;
> > +        ra.data = rsc_op->data;
> 
> You must do all copy_{from,to}_user() here, and strictly only pass Xen
> pointers to resource_access().
> 
> This means you will need to xmalloc() yourself some space for the
> xenpf_resource_data_t array.
> 
> 
> On a different note, you need to enforce a maximum resource_op.nr of
> something rather low (16/32 perhaps?) to prevent a toolstack asking
> for 0xffffffff non-preemptible operations.

The toolstack has:

int xc_resource_op(xc_interface *xch, uint32_t nr_ops, xc_resource_op_t *ops)
{
    if ( nr_ops == 1 )
        return xc_resource_op_one(xch, ops);

And with that expectation we ought to have a similar check here, in the form
of:

	if ( ra.nr != 1 )
		return -EINVAL;

> 
> ~Andrew
> 
> > +
> > +        if ( rsc_op->cpu == cpu )
> > +            resource_access(&ra);
> > +        else if ( cpu_online(rsc_op->cpu) )
> > +            on_selected_cpus(cpumask_of(rsc_op->cpu),
> > +                         resource_access, &ra, 1);
> > +        else
> > +        {
> > +            ret = -ENODEV;
> > +            break;
> > +        }
> > +
> > +        if ( ra.ret )
> > +        {
> > +            ret = ra.ret;
> > +            break;
> > +        }
> > +    }
> > +    break;
> > +
> >      default:
> >          ret = -ENOSYS;
> >          break;
> > diff --git a/xen/arch/x86/x86_64/platform_hypercall.c b/xen/arch/x86/x86_64/platform_hypercall.c
> > index b6f380e..4db6622 100644
> > --- a/xen/arch/x86/x86_64/platform_hypercall.c
> > +++ b/xen/arch/x86/x86_64/platform_hypercall.c
> > @@ -32,6 +32,10 @@ CHECK_pf_pcpu_version;
> >  CHECK_pf_enter_acpi_sleep;
> >  #undef xen_pf_enter_acpi_sleep
> >  
> > +#define xen_pf_resource_data xenpf_resource_data
> > +CHECK_pf_resource_data;
> > +#undef xen_pf_resource_data
> > +
> >  #define COMPAT
> >  #define _XEN_GUEST_HANDLE(t) XEN_GUEST_HANDLE(t)
> >  #define _XEN_GUEST_HANDLE_PARAM(t) XEN_GUEST_HANDLE_PARAM(t)
> > diff --git a/xen/include/public/platform.h b/xen/include/public/platform.h
> > index 053b9fa..e4d9091 100644
> > --- a/xen/include/public/platform.h
> > +++ b/xen/include/public/platform.h
> > @@ -527,6 +527,28 @@ struct xenpf_core_parking {
> >  typedef struct xenpf_core_parking xenpf_core_parking_t;
> >  DEFINE_XEN_GUEST_HANDLE(xenpf_core_parking_t);
> >  
> > +#define XENPF_resource_op   61

More details please.
> > +
> > +#define XEN_RESOURCE_OP_MSR_READ  0
> > +#define XEN_RESOURCE_OP_MSR_WRITE 1
> > +
> > +struct xenpf_resource_data {
> > +    uint32_t cmd;       /* XEN_RESOURCE_OP_* */
> > +    uint32_t rsvd;
> > +    uint64_t idx;
> > +    uint64_t val;

More details please. Pls say what the 'rsvd' is for, what
the expected values are for 'idx', and 'val'.

Do also say which ones are IN or OUT.

Put yourself in the mindset of somebody who wants to use this
and does not want to dive in the hypervisor to figure this out.
Give as much information as possible in the headers.

> > +typedef struct xenpf_resource_data xenpf_resource_data_t;
> > +DEFINE_XEN_GUEST_HANDLE(xenpf_resource_data_t);
> > +
> > +struct xenpf_resource_op {
> > +    uint32_t nr;    /* number of data entry */
> > +    uint32_t cpu;   /* which cpu to run */
> > +    XEN_GUEST_HANDLE(xenpf_resource_data_t) data;
> > +};
> > +typedef struct xenpf_resource_op xenpf_resource_op_t;
> > +DEFINE_XEN_GUEST_HANDLE(xenpf_resource_op_t);
> > +
> >  /*
> >   * ` enum neg_errnoval
> >   * ` HYPERVISOR_platform_op(const struct xen_platform_op*);
> > @@ -553,6 +575,7 @@ struct xen_platform_op {
> >          struct xenpf_cpu_hotadd        cpu_add;
> >          struct xenpf_mem_hotadd        mem_add;
> >          struct xenpf_core_parking      core_parking;
> > +        struct xenpf_resource_op       resource_op;

resource_op?  I would really call this 'msr' or 'msr_data'


> >          uint8_t                        pad[128];
> >      } u;
> >  };
> > diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
> > index 9a35dd7..100fcf5 100644
> > --- a/xen/include/xlat.lst
> > +++ b/xen/include/xlat.lst
> > @@ -88,6 +88,8 @@
> >  ?	xenpf_enter_acpi_sleep		platform.h
> >  ?	xenpf_pcpuinfo			platform.h
> >  ?	xenpf_pcpu_version		platform.h
> > +?	xenpf_resource_op		platform.h
> > +?	xenpf_resource_data		platform.h
> >  !	sched_poll			sched.h
> >  ?	sched_remote_shutdown		sched.h
> >  ?	sched_shutdown			sched.h
> 
> 


* Re: [PATCH v16 01/10] x86: add generic resource (e.g. MSR) access hypercall
  2014-09-25 20:12     ` Konrad Rzeszutek Wilk
@ 2014-09-25 20:17       ` Konrad Rzeszutek Wilk
  2014-09-26  1:34       ` Chao Peng
  1 sibling, 0 replies; 34+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-09-25 20:17 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	Ian.Jackson, xen-devel, JBeulich, Chao Peng, dgdegra

> > > index 053b9fa..e4d9091 100644
> > > --- a/xen/include/public/platform.h
> > > +++ b/xen/include/public/platform.h
> > > @@ -527,6 +527,28 @@ struct xenpf_core_parking {
> > >  typedef struct xenpf_core_parking xenpf_core_parking_t;
> > >  DEFINE_XEN_GUEST_HANDLE(xenpf_core_parking_t);
> > >  
> > > +#define XENPF_resource_op   61
> 
> More details please.
> > > +
> > > +#define XEN_RESOURCE_OP_MSR_READ  0
> > > +#define XEN_RESOURCE_OP_MSR_WRITE 1
> > > +
> > > +struct xenpf_resource_data {
> > > +    uint32_t cmd;       /* XEN_RESOURCE_OP_* */
> > > +    uint32_t rsvd;
> > > +    uint64_t idx;
> > > +    uint64_t val;
> 
> More details please. Pls say what the 'rsvd' is for, what
> the expected values are for 'idx', and 'val'.
> 
> Do also say which ones are IN or OUT.
> 
> Put yourself in the mindset of somebody who wants to use this
> and does not want to dive in the hypervisor to figure this out.
> Give as much information as possible in the headers.
> 
> > > +typedef struct xenpf_resource_data xenpf_resource_data_t;
> > > +DEFINE_XEN_GUEST_HANDLE(xenpf_resource_data_t);
> > > +
> > > +struct xenpf_resource_op {
> > > +    uint32_t nr;    /* number of data entry */
> > > +    uint32_t cpu;   /* which cpu to run */
> > > +    XEN_GUEST_HANDLE(xenpf_resource_data_t) data;
> > > +};
> > > +typedef struct xenpf_resource_op xenpf_resource_op_t;
> > > +DEFINE_XEN_GUEST_HANDLE(xenpf_resource_op_t);
> > > +
> > >  /*
> > >   * ` enum neg_errnoval
> > >   * ` HYPERVISOR_platform_op(const struct xen_platform_op*);
> > > @@ -553,6 +575,7 @@ struct xen_platform_op {
> > >          struct xenpf_cpu_hotadd        cpu_add;
> > >          struct xenpf_mem_hotadd        mem_add;
> > >          struct xenpf_core_parking      core_parking;
> > > +        struct xenpf_resource_op       resource_op;
> 
> resource_op?  I would really call this 'msr' or 'msr_data'

Though on the other hand - you are trying to make this
generic (so it can be used for 'port I/O' or other).
In which case 'resource' looks like the best name.


* Re: [PATCH v16 04/10] x86: detect and initialize Cache Monitoring Technology feature
  2014-09-25 10:19 ` [PATCH v16 04/10] x86: detect and initialize Cache Monitoring Technology feature Chao Peng
@ 2014-09-25 20:33   ` Konrad Rzeszutek Wilk
  2014-09-25 21:14     ` Andrew Cooper
  2014-09-26 15:45   ` Jan Beulich
  1 sibling, 1 reply; 34+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-09-25 20:33 UTC (permalink / raw)
  To: Chao Peng
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, xen-devel, JBeulich, dgdegra

On Thu, Sep 25, 2014 at 06:19:04PM +0800, Chao Peng wrote:
> Detect Cache Monitoring Technology(CMT) feature and enumerate the
> resource types, one of which is to monitor the L3 cache occupancy.
> 
> Also introduce a Xen command line parameter to control the Platform
> Shared Resource such as CMT.
> 
> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> ---
>  docs/misc/xen-command-line.markdown |   12 ++++
>  xen/arch/x86/Makefile               |    1 +
>  xen/arch/x86/psr.c                  |  107 +++++++++++++++++++++++++++++++++++
>  xen/arch/x86/setup.c                |    3 +
>  xen/include/asm-x86/cpufeature.h    |    1 +
>  xen/include/asm-x86/psr.h           |   53 +++++++++++++++++
>  6 files changed, 177 insertions(+)
>  create mode 100644 xen/arch/x86/psr.c
>  create mode 100644 xen/include/asm-x86/psr.h
> 
> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> index af93e17..b106a46 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1005,6 +1005,18 @@ This option can be specified more than once (up to 8 times at present).
>  ### ple\_window
>  > `= <integer>`
>  
> +### psr (Intel)
> +> `= List of ( cmt:<boolean> | rmid_max:<integer> )`

Please explain what 'psr' is (the full name) and why one would want
to use it.

> +
> +> Default: `psr=cmt:0,rmid_max:255`
> +
> +Configure platform shared resource services, which are available on Intel
> +Haswell Server family and future platforms.
> +
> +`cmt` instructs Xen to enable/disable Cache Monitoring Technology.

Please include the default value.

> +
> +`rmid_max` indicates the max value for rmid.

A couple of issues:
 - It reads as not optional (from the documentation) - so what are the values
   that can be used? What are the ranges?
 - What is the default value?
 - What is 'RMID'?

Please expand more on this. You want users to be able to easily
read it and understand it right away without having to search for a
whitepaper on it.
> +
>  ### reboot
>  > `= t[riple] | k[bd] | a[cpi] | p[ci] | n[o] [, [w]arm | [c]old]`
>  
> diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
> index c1e244d..cf137fd 100644
> --- a/xen/arch/x86/Makefile
> +++ b/xen/arch/x86/Makefile
> @@ -59,6 +59,7 @@ obj-y += crash.o
>  obj-y += tboot.o
>  obj-y += hpet.o
>  obj-y += xstate.o
> +obj-y += psr.o
>  
>  obj-$(crash_debug) += gdbstub.o
>  
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> new file mode 100644
> index 0000000..9025aeb
> --- /dev/null
> +++ b/xen/arch/x86/psr.c
> @@ -0,0 +1,107 @@
> +/*
> + * pqos.c: Platform Shared Resource related service for guest.
> + *
> + * Copyright (c) 2014, Intel Corporation
> + * Author: Dongxiao Xu <dongxiao.xu@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + */
> +#include <xen/init.h>
> +#include <xen/cpu.h>
> +#include <asm/psr.h>
> +
> +#define PSR_CMT        (1<<0)
> +
> +struct psr_cmt *__read_mostly  psr_cmt = NULL;

Extra space. No need for the NULL assignment (objects with static storage
duration are zero-initialized anyway).

> +static bool_t __initdata opt_psr = 0;

Ditto. No need to assign 0.

> +static unsigned int __initdata opt_rmid_max = 255;

It is not really an 'optional' value as the default is '255'.

I would just call it 'rmid_max'.
> +
> +static void __init parse_psr_param(char *s)
> +{
> +    char *ss, *val_str;
> +
> +    do {
> +        ss = strchr(s, ',');
> +        if ( ss )
> +            *ss = '\0';
> +
> +        val_str = strchr(s, ':');
> +        if ( val_str )
> +            *val_str++ = '\0';
> +
> +        if ( !strcmp(s, "cmt")
> +             && ( !val_str || parse_bool(val_str) == 1 )) {
> +            opt_psr &= PSR_CMT;

&= ?

Not opt_psr |= ?
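
To spell out the difference, a small stand-alone sketch (the helper names are
made up for the example): starting from a zeroed flag word, '&=' can only
clear bits, so the CMT bit never gets set, whereas '|=' sets it.

```c
#define PSR_CMT (1u << 0)

/* What the patch does: AND-ing can only clear bits, never set them. */
static unsigned int enable_with_and(unsigned int flags)
{
    flags &= PSR_CMT;
    return flags;
}

/* What was presumably intended: OR-ing sets the CMT bit. */
static unsigned int enable_with_or(unsigned int flags)
{
    flags |= PSR_CMT;
    return flags;
}
```

Since opt_psr starts out as 0, the '&=' form would leave it at 0 even with
"psr=cmt:1" on the command line, and init_psr() would then never call
init_psr_cmt().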

> +        } else if ( val_str && !strcmp(s, "rmid_max") )
> +            opt_rmid_max = simple_strtoul(val_str, NULL, 0);
> +
> +        s = ss + 1;
> +    } while ( ss );
> +}
> +custom_param("psr", parse_psr_param);
> +
> +static void __init init_psr_cmt(unsigned int rmid_max)
> +{
> +    unsigned int eax, ebx, ecx, edx;
> +    unsigned int rmid;
> +
> +    if ( !boot_cpu_has(X86_FEATURE_CMT) )
> +        return;
> +
> +    cpuid_count(0xf, 0, &eax, &ebx, &ecx, &edx);
> +    if ( !edx )
> +        return;
> +
> +    psr_cmt = xzalloc(struct psr_cmt);
> +    if ( !psr_cmt )
> +        return;
> +
> +    psr_cmt->features = edx;
> +    psr_cmt->rmid_mask = ~(~0ull << get_count_order(ebx));
> +    psr_cmt->rmid_max = min(rmid_max, ebx);
> +
> +    if ( psr_cmt->features & PSR_RESOURCE_TYPE_L3 )
> +    {
> +        cpuid_count(0xf, 1, &eax, &ebx, &ecx, &edx);
> +        psr_cmt->l3.upscaling_factor = ebx;
> +        psr_cmt->l3.rmid_max = ecx;
> +        psr_cmt->l3.features = edx;
> +    }
> +
> +    psr_cmt->rmid_max = min(rmid_max, psr_cmt->l3.rmid_max);
> +    psr_cmt->rmid_to_dom = xmalloc_array(domid_t, psr_cmt->rmid_max + 1);
> +    if ( !psr_cmt->rmid_to_dom )
> +    {
> +        xfree(psr_cmt);

And:
	psr_cmt = NULL;

?
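
A user-space sketch of the concern, with hypothetical names (cmt_state stands
in for psr_cmt, and the fail_table argument only exists to simulate the
second allocation failing). It assumes, as seems likely, that later code
tests the global pointer to decide whether CMT is usable:

```c
#include <stdlib.h>

struct cmt_state {
    unsigned int rmid_max;
    unsigned short *rmid_to_dom;
};

/* Stand-in for the global psr_cmt pointer. */
static struct cmt_state *cmt_state;

/* fail_table != 0 simulates the rmid_to_dom allocation failing. */
static void init_cmt(unsigned int rmid_max, int fail_table)
{
    cmt_state = calloc(1, sizeof(*cmt_state));
    if ( !cmt_state )
        return;

    cmt_state->rmid_max = rmid_max;
    cmt_state->rmid_to_dom = fail_table ? NULL
        : malloc((rmid_max + 1) * sizeof(*cmt_state->rmid_to_dom));
    if ( !cmt_state->rmid_to_dom )
    {
        free(cmt_state);
        /* Without this, the global keeps pointing at freed memory. */
        cmt_state = NULL;
        return;
    }
}

/* The kind of check that later users of the global would make. */
static int cmt_enabled(void)
{
    return cmt_state != NULL;
}
```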
> +        return;
> +    }
> +    /* Reserve RMID 0 for all domains not being monitored */

Full stop missing.

Why do you reserve RMID 0? Can you include the explanation
in the comment please?

> +    psr_cmt->rmid_to_dom[0] = DOMID_XEN;
> +    for ( rmid = 1; rmid <= psr_cmt->rmid_max; rmid++ )
> +        psr_cmt->rmid_to_dom[rmid] = DOMID_INVALID;
> +
> +    printk(XENLOG_INFO "Cache Monitoring Technology Enabled.\n");
> +}
> +
> +void __init init_psr(void)


> +{
> +    if ( opt_psr & PSR_CMT && opt_rmid_max )
> +        init_psr_cmt(opt_rmid_max);
> +}

__initcall(init_psr);

> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index 8c8b91f..ca4785e 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -49,6 +49,7 @@
>  #include <xen/cpu.h>
>  #include <asm/nmi.h>
>  #include <asm/alternative.h>
> +#include <asm/psr.h>
>  
>  /* opt_nosmp: If true, secondary processors are ignored. */
>  static bool_t __initdata opt_nosmp;
> @@ -1430,6 +1431,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>  
>      domain_unpause_by_systemcontroller(dom0);
>  
> +    init_psr();
> +

And then you can remove this.

>      reset_stack_and_jump(init_done);
>  }
>  
> diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
> index 8014241..137d75c 100644
> --- a/xen/include/asm-x86/cpufeature.h
> +++ b/xen/include/asm-x86/cpufeature.h
> @@ -148,6 +148,7 @@
>  #define X86_FEATURE_ERMS	(7*32+ 9) /* Enhanced REP MOVSB/STOSB */
>  #define X86_FEATURE_INVPCID	(7*32+10) /* Invalidate Process Context ID */
>  #define X86_FEATURE_RTM 	(7*32+11) /* Restricted Transactional Memory */
> +#define X86_FEATURE_CMT 	(7*32+12) /* Cache Monitoring Technology */
>  #define X86_FEATURE_NO_FPU_SEL 	(7*32+13) /* FPU CS/DS stored as zero */
>  #define X86_FEATURE_MPX		(7*32+14) /* Memory Protection Extensions */
>  #define X86_FEATURE_RDSEED	(7*32+18) /* RDSEED instruction */
> diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
> new file mode 100644
> index 0000000..e321890
> --- /dev/null
> +++ b/xen/include/asm-x86/psr.h
> @@ -0,0 +1,53 @@
> +/*
> + * psr.h: Platform Shared Resource related service for guest.
> + *
> + * Copyright (c) 2014, Intel Corporation
> + * Author: Dongxiao Xu <dongxiao.xu@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + */
> +#ifndef __ASM_PSR_H__
> +#define __ASM_PSR_H__
> +
> +/* Resource Type Enumeration */
> +#define PSR_RESOURCE_TYPE_L3            0x2
> +
> +/* L3 Monitoring Features */
> +#define PSR_CMT_L3_OCCUPANCY           0x1
> +
> +struct psr_cmt_l3 {
> +    unsigned int features;
> +    unsigned int upscaling_factor;
> +    unsigned int rmid_max;
> +};
> +
> +struct psr_cmt {
> +    unsigned long rmid_mask;
> +    unsigned int rmid_max;
> +    unsigned int features;
> +    domid_t *rmid_to_dom;
> +    struct psr_cmt_l3 l3;
> +};
> +
> +extern struct psr_cmt *psr_cmt;
> +
> +void init_psr(void);
> +
> +#endif /* __ASM_PSR_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> -- 
> 1.7.9.5
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


* Re: [PATCH v16 05/10] x86: dynamically attach/detach CMT service for a guest
  2014-09-25 10:19 ` [PATCH v16 05/10] x86: dynamically attach/detach CMT service for a guest Chao Peng
@ 2014-09-25 20:41   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 34+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-09-25 20:41 UTC (permalink / raw)
  To: Chao Peng
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, xen-devel, JBeulich, dgdegra

On Thu, Sep 25, 2014 at 06:19:05PM +0800, Chao Peng wrote:
> Add hypervisor-side support for dynamically attaching and detaching
> the Cache Monitoring Technology (CMT) service for a given guest.
> 
> When the CMT service is attached to a guest, the system allocates an
> RMID for it. When the service is detached or the guest is shut down,
> the RMID is recycled for future use.
> 
> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> ---
>  xen/arch/x86/domain.c        |    3 +++
>  xen/arch/x86/domctl.c        |   29 ++++++++++++++++++++++++++
>  xen/arch/x86/psr.c           |   46 ++++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-x86/domain.h |    2 ++
>  xen/include/asm-x86/psr.h    |    9 +++++++++
>  xen/include/public/domctl.h  |   12 +++++++++++
>  6 files changed, 101 insertions(+)
> 
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index 7b1dfe6..3cfd8f4 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -60,6 +60,7 @@
>  #include <xen/numa.h>
>  #include <xen/iommu.h>
>  #include <compat/vcpu.h>
> +#include <asm/psr.h>
>  
>  DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
>  DEFINE_PER_CPU(unsigned long, cr4);
> @@ -647,6 +648,8 @@ void arch_domain_destroy(struct domain *d)
>  
>      free_xenheap_page(d->shared_info);
>      cleanup_domain_irq_mapping(d);
> +
> +    psr_free_rmid(d);
>  }
>  
>  unsigned long pv_guest_cr4_fixup(const struct vcpu *v, unsigned long guest_cr4)
> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
> index 7a5de43..6ed480a 100644
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -35,6 +35,7 @@
>  #include <asm/mem_sharing.h>
>  #include <asm/xstate.h>
>  #include <asm/debugger.h>
> +#include <asm/psr.h>
>  
>  static int gdbsx_guest_mem_io(
>      domid_t domid, struct xen_domctl_gdbsx_memio *iop)
> @@ -1319,6 +1320,34 @@ long arch_do_domctl(
>      }
>      break;
>  
> +    case XEN_DOMCTL_psr_cmt_op:
> +        if ( !psr_cmt_enabled() )
> +        {
> +            ret = -ENODEV;
> +            break;
> +        }
> +
> +        switch ( domctl->u.psr_cmt_op.cmd )
> +        {
> +        case XEN_DOMCTL_PSR_CMT_OP_ATTACH:
> +            ret = psr_alloc_rmid(d);
> +            break;
> +        case XEN_DOMCTL_PSR_CMT_OP_DETACH:
> +            if ( d->arch.psr_rmid > 0 )
> +                psr_free_rmid(d);
> +            else
> +                ret = -ENOENT;
> +            break;
> +        case XEN_DOMCTL_PSR_CMT_OP_QUERY_RMID:
> +            domctl->u.psr_cmt_op.data = d->arch.psr_rmid;
> +            copyback = 1;
> +            break;
> +        default:
> +            ret = -ENOSYS;
> +            break;
> +        }
> +        break;
> +
>      default:
>          ret = iommu_do_domctl(domctl, d, u_domctl);
>          break;
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index 9025aeb..41f7496 100644
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -15,6 +15,7 @@
>   */
>  #include <xen/init.h>
>  #include <xen/cpu.h>
> +#include <xen/sched.h>
>  #include <asm/psr.h>
>  
>  #define PSR_CMT        (1<<0)
> @@ -96,6 +97,51 @@ void __init init_psr(void)
>          init_psr_cmt(opt_rmid_max);
>  }
>  
> +/* Called with domain lock held, no psr specific lock needed */
> +int psr_alloc_rmid(struct domain *d)
> +{
> +    unsigned int rmid;
> +
> +    ASSERT(psr_cmt_enabled());
> +
> +    if ( d->arch.psr_rmid > 0 )
> +        return -EEXIST;
> +
> +    for ( rmid = 1; rmid <= psr_cmt->rmid_max; rmid++ )
> +    {
> +        if ( psr_cmt->rmid_to_dom[rmid] != DOMID_INVALID)

You need a space before the ')'
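
For illustration, a minimal standalone sketch of the allocation and
free logic with the spacing fixed. It models rmid_to_dom as a plain
array. RMID_MAX, DOMID_NONE, rmid_table_init(), alloc_rmid() and
free_rmid() are stand-ins made up for this example, not Xen symbols,
and the per-domain RMID is passed in by pointer instead of living in
struct arch_domain.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#ifndef EUSERS
#define EUSERS 87            /* Linux value; fallback for other libcs */
#endif

#define RMID_MAX   3u        /* hypothetical; Xen derives this from CPUID */
#define DOMID_NONE 0xffffu   /* stand-in for DOMID_INVALID */

static uint16_t rmid_to_dom[RMID_MAX + 1];

static void rmid_table_init(void)
{
    unsigned int rmid;

    for ( rmid = 0; rmid <= RMID_MAX; rmid++ )
        rmid_to_dom[rmid] = DOMID_NONE;
}

/* Returns 0 and stores the new RMID in *out, or a negative errno. */
static int alloc_rmid(uint16_t domid, unsigned int *out)
{
    unsigned int rmid;

    assert(out != NULL);

    if ( *out > 0 )
        return -EEXIST;

    /* RMID 0 is reserved, so the search starts at 1. */
    for ( rmid = 1; rmid <= RMID_MAX; rmid++ )
    {
        if ( rmid_to_dom[rmid] != DOMID_NONE )
            continue;

        rmid_to_dom[rmid] = domid;
        *out = rmid;
        return 0;
    }

    /* No RMID available, fall back to RMID 0. */
    *out = 0;
    return -EUSERS;
}

static void free_rmid(unsigned int *rmid)
{
    assert(rmid != NULL);

    /* The reserved RMID 0 is never freed. */
    if ( *rmid == 0 )
        return;

    rmid_to_dom[*rmid] = DOMID_NONE;
    *rmid = 0;
}
```

With RMID_MAX set to 3 in this sketch, a fourth attach fails with
-EUSERS until an earlier domain detaches, which is the behaviour the
ATTACH/DETACH domctl pair above appears to rely on.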

> +            continue;
> +
> +        psr_cmt->rmid_to_dom[rmid] = d->domain_id;
> +        break;
> +    }
> +
> +    /* No RMID available, assign RMID=0 by default */

Full stop missing.

> +    if ( rmid > psr_cmt->rmid_max )
> +    {
> +        d->arch.psr_rmid = 0;
> +        return -EUSERS;
> +    }
> +
> +    d->arch.psr_rmid = rmid;
> +
> +    return 0;
> +}
> +
> +/* Called with domain lock held, no psr specific lock needed */
> +void psr_free_rmid(struct domain *d)
> +{
> +    unsigned int rmid;
> +
> +    rmid = d->arch.psr_rmid;
> +    /* We do not free system reserved "RMID=0" */
> +    if ( rmid == 0 )
> +        return;
> +
> +    psr_cmt->rmid_to_dom[rmid] = DOMID_INVALID;
> +    d->arch.psr_rmid = 0;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
> index 7abe1b3..2be1d1e 100644
> --- a/xen/include/asm-x86/domain.h
> +++ b/xen/include/asm-x86/domain.h
> @@ -314,6 +314,8 @@ struct arch_domain
>      /* Shared page for notifying that explicit PIRQ EOI is required. */
>      unsigned long *pirq_eoi_map;
>      unsigned long pirq_eoi_map_mfn;
> +
> +    unsigned int psr_rmid; /* RMID assigned to the domain for CMT */
>  } __cacheline_aligned;
>  
>  #define has_arch_pdevs(d)    (!list_empty(&(d)->arch.pdev_list))
> diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
> index e321890..930b22b 100644
> --- a/xen/include/asm-x86/psr.h
> +++ b/xen/include/asm-x86/psr.h
> @@ -16,6 +16,8 @@
>  #ifndef __ASM_PSR_H__
>  #define __ASM_PSR_H__
>  
> +#include <xen/types.h>
> +
>  /* Resource Type Enumeration */
>  #define PSR_RESOURCE_TYPE_L3            0x2
>  
> @@ -38,7 +40,14 @@ struct psr_cmt {
>  
>  extern struct psr_cmt *psr_cmt;
>  
> +static inline bool_t psr_cmt_enabled(void)
> +{
> +    return !!psr_cmt;
> +}
> +
>  void init_psr(void);
> +int psr_alloc_rmid(struct domain *d);
> +void psr_free_rmid(struct domain *d);
>  
>  #endif /* __ASM_PSR_H__ */
>  
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index cfa39b3..59220ed 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -965,6 +965,16 @@ struct xen_domctl_vnuma {
>  typedef struct xen_domctl_vnuma xen_domctl_vnuma_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_vnuma_t);
>  
> +struct xen_domctl_psr_cmt_op {
> +#define XEN_DOMCTL_PSR_CMT_OP_DETACH         0
> +#define XEN_DOMCTL_PSR_CMT_OP_ATTACH         1
> +#define XEN_DOMCTL_PSR_CMT_OP_QUERY_RMID     2
> +    uint32_t cmd;
> +    uint32_t data;
> +};
> +typedef struct xen_domctl_psr_cmt_op xen_domctl_psr_cmt_op_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_domctl_psr_cmt_op_t);
> +
>  struct xen_domctl {
>      uint32_t cmd;
>  #define XEN_DOMCTL_createdomain                   1
> @@ -1038,6 +1048,7 @@ struct xen_domctl {
>  #define XEN_DOMCTL_get_vcpu_msrs                 72
>  #define XEN_DOMCTL_set_vcpu_msrs                 73
>  #define XEN_DOMCTL_setvnumainfo                  74
> +#define XEN_DOMCTL_psr_cmt_op                    75
>  #define XEN_DOMCTL_gdbsx_guestmemio            1000
>  #define XEN_DOMCTL_gdbsx_pausevcpu             1001
>  #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
> @@ -1099,6 +1110,7 @@ struct xen_domctl {
>          struct xen_domctl_gdbsx_pauseunp_vcpu gdbsx_pauseunp_vcpu;
>          struct xen_domctl_gdbsx_domstatus   gdbsx_domstatus;
>          struct xen_domctl_vnuma             vnuma;
> +        struct xen_domctl_psr_cmt_op        psr_cmt_op;
>          uint8_t                             pad[128];
>      } u;
>  };
> -- 
> 1.7.9.5
> 
> 


* Re: [PATCH v16 06/10] x86: collect global CMT information
  2014-09-25 10:19 ` [PATCH v16 06/10] x86: collect global CMT information Chao Peng
@ 2014-09-25 20:53   ` Konrad Rzeszutek Wilk
  2014-09-26  9:21     ` Chao Peng
  0 siblings, 1 reply; 34+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-09-25 20:53 UTC (permalink / raw)
  To: Chao Peng
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, xen-devel, JBeulich, dgdegra

On Thu, Sep 25, 2014 at 06:19:06PM +0800, Chao Peng wrote:
> This implementation tries to put all policies into user space, thus some
> global CMT information needs to be exposed, such as the total RMID count,
> L3 upscaling factor, etc.
> 
> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> ---
>  xen/arch/x86/cpu/intel_cacheinfo.c |   49 ++----------------------------------
>  xen/arch/x86/sysctl.c              |   43 +++++++++++++++++++++++++++++++
>  xen/include/asm-x86/cpufeature.h   |   45 +++++++++++++++++++++++++++++++++
>  xen/include/public/sysctl.h        |   14 +++++++++++
>  4 files changed, 104 insertions(+), 47 deletions(-)
> 
> diff --git a/xen/arch/x86/cpu/intel_cacheinfo.c b/xen/arch/x86/cpu/intel_cacheinfo.c
> index 430f939..48970c0 100644
> --- a/xen/arch/x86/cpu/intel_cacheinfo.c
> +++ b/xen/arch/x86/cpu/intel_cacheinfo.c
> @@ -81,54 +81,9 @@ static struct _cache_table cache_table[] __cpuinitdata =
>  	{ 0x00, 0, 0}
>  };
>  
> -
> -enum _cache_type
> -{
> -	CACHE_TYPE_NULL	= 0,
> -	CACHE_TYPE_DATA = 1,
> -	CACHE_TYPE_INST = 2,
> -	CACHE_TYPE_UNIFIED = 3
> -};
> -
> -union _cpuid4_leaf_eax {
> -	struct {
> -		enum _cache_type	type:5;
> -		unsigned int		level:3;
> -		unsigned int		is_self_initializing:1;
> -		unsigned int		is_fully_associative:1;
> -		unsigned int		reserved:4;
> -		unsigned int		num_threads_sharing:12;
> -		unsigned int		num_cores_on_die:6;
> -	} split;
> -	u32 full;
> -};
> -
> -union _cpuid4_leaf_ebx {
> -	struct {
> -		unsigned int		coherency_line_size:12;
> -		unsigned int		physical_line_partition:10;
> -		unsigned int		ways_of_associativity:10;
> -	} split;
> -	u32 full;
> -};
> -
> -union _cpuid4_leaf_ecx {
> -	struct {
> -		unsigned int		number_of_sets:32;
> -	} split;
> -	u32 full;
> -};
> -
> -struct _cpuid4_info {
> -	union _cpuid4_leaf_eax eax;
> -	union _cpuid4_leaf_ebx ebx;
> -	union _cpuid4_leaf_ecx ecx;
> -	unsigned long size;
> -};
> -
>  unsigned short			num_cache_leaves;
>  
> -static int __cpuinit cpuid4_cache_lookup(int index, struct _cpuid4_info *this_leaf)
> +int cpuid4_cache_lookup(int index, struct cpuid4_info *this_leaf)
>  {
>  	union _cpuid4_leaf_eax 	eax;
>  	union _cpuid4_leaf_ebx 	ebx;
> @@ -185,7 +140,7 @@ unsigned int __cpuinit init_intel_cacheinfo(struct cpuinfo_x86 *c)
>  		 * parameters cpuid leaf to find the cache details
>  		 */
>  		for (i = 0; i < num_cache_leaves; i++) {
> -			struct _cpuid4_info this_leaf;
> +			struct cpuid4_info this_leaf;
>  
>  			int retval;
>  
> diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
> index 15d4b91..b95408f 100644
> --- a/xen/arch/x86/sysctl.c
> +++ b/xen/arch/x86/sysctl.c
> @@ -28,6 +28,7 @@
>  #include <xen/nodemask.h>
>  #include <xen/cpu.h>
>  #include <xsm/xsm.h>
> +#include <asm/psr.h>
>  
>  #define get_xen_guest_handle(val, hnd)  do { val = (hnd).p; } while (0)
>  
> @@ -101,6 +102,48 @@ long arch_do_sysctl(
>      }
>      break;
>  
> +    case XEN_SYSCTL_psr_cmt_op:
> +        if ( !psr_cmt_enabled() )
> +            return -ENODEV;
> +
> +        if ( sysctl->u.psr_cmt_op.flags != 0 )
> +            return -EINVAL;
> +
> +        switch ( sysctl->u.psr_cmt_op.cmd )
> +        {
> +        case XEN_SYSCTL_PSR_CMT_enabled:
> +            sysctl->u.psr_cmt_op.data =
> +                (psr_cmt->features & PSR_RESOURCE_TYPE_L3) &&
> +                (psr_cmt->l3.features & PSR_CMT_L3_OCCUPANCY);
> +            break;
> +        case XEN_SYSCTL_PSR_CMT_get_total_rmid:
> +            sysctl->u.psr_cmt_op.data = psr_cmt->rmid_max;
> +            break;
> +        case XEN_SYSCTL_PSR_CMT_get_l3_upscaling_factor:
> +            sysctl->u.psr_cmt_op.data = psr_cmt->l3.upscaling_factor;
> +            break;
> +        case XEN_SYSCTL_PSR_CMT_get_l3_cache_size:
> +        {
> +            struct cpuid4_info info;
> +
> +            ret = cpuid4_cache_lookup(3, &info);

Couldn't you use 'struct cpuinfo_x86' and extend it if you need to?


> +            if ( ret < 0 )
> +                break;
> +
> +            sysctl->u.psr_cmt_op.data = info.size / 1024; /* in KB unit */

With the Haswell EP there is this odd setup where there are 8 cores
on one side and 10 cores on the other, and the cache sizes differ too
(20MB LLC and 25MB LLC). Given that, wouldn't you want to specify
exactly _which_ CPU's cache to enumerate, instead of the cache of the
CPU you happen to be running on?

Or is my reading of the diagrams wrong, and the OS never sees the split
and just gets 45MB?
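
To make the question concrete: the size reported here is computed from
the CPUID leaf 4 fields of whichever CPU executes the lookup. Below is
a minimal sketch of that computation, using the field layout from the
unions quoted in this patch. cpuid4_cache_size_kb() and pack_ebx() are
hypothetical names for this example, and the register values are
supplied by the caller rather than read from hardware.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Cache size in KB from raw CPUID leaf 4 EBX/ECX values.
 * Every field is encoded as (value - 1).
 */
static uint64_t cpuid4_cache_size_kb(uint32_t ebx, uint32_t ecx)
{
    uint64_t line_size  = (ebx & 0xfff) + 1;          /* EBX[11:0]  */
    uint64_t partitions = ((ebx >> 12) & 0x3ff) + 1;  /* EBX[21:12] */
    uint64_t ways       = ((ebx >> 22) & 0x3ff) + 1;  /* EBX[31:22] */
    uint64_t sets       = (uint64_t)ecx + 1;          /* ECX[31:0]  */

    return ways * partitions * line_size * sets / 1024;
}

/* Helper for the example: pack the fields the way CPUID reports them. */
static uint32_t pack_ebx(uint32_t ways, uint32_t partitions,
                         uint32_t line_size)
{
    assert(ways >= 1 && partitions >= 1 && line_size >= 1);

    return ((ways - 1) << 22) | ((partitions - 1) << 12) | (line_size - 1);
}
```

Since the inputs are per-CPU, two sockets with different LLC geometry
would yield different results from the same call, depending on where
the hypercall happens to run. The 20-way, 64-byte-line geometries used
to exercise this sketch are made-up inputs, not measured values.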


> +        }
> +        break;
> +        default:
> +            sysctl->u.psr_cmt_op.data = 0;
> +            ret = -ENOSYS;
> +            break;
> +        }
> +
> +        if ( __copy_to_guest(u_sysctl, sysctl, 1) )
> +            ret = -EFAULT;
> +
> +        break;
> +
>      default:
>          ret = -ENOSYS;
>          break;
> diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
> index 137d75c..d3bd14d 100644
> --- a/xen/include/asm-x86/cpufeature.h
> +++ b/xen/include/asm-x86/cpufeature.h
> @@ -215,6 +215,51 @@
>  #define cpu_has_vmx		boot_cpu_has(X86_FEATURE_VMXE)
>  
>  #define cpu_has_cpuid_faulting	boot_cpu_has(X86_FEATURE_CPUID_FAULTING)
> +
> +enum _cache_type {
> +    CACHE_TYPE_NULL = 0,
> +    CACHE_TYPE_DATA = 1,
> +    CACHE_TYPE_INST = 2,
> +    CACHE_TYPE_UNIFIED = 3
> +};
> +
> +union _cpuid4_leaf_eax {
> +    struct {
> +        enum _cache_type type:5;
> +        unsigned int level:3;
> +        unsigned int is_self_initializing:1;
> +        unsigned int is_fully_associative:1;
> +        unsigned int reserved:4;
> +        unsigned int num_threads_sharing:12;
> +        unsigned int num_cores_on_die:6;
> +    } split;
> +    u32 full;
> +};
> +
> +union _cpuid4_leaf_ebx {
> +    struct {
> +        unsigned int coherency_line_size:12;
> +        unsigned int physical_line_partition:10;
> +        unsigned int ways_of_associativity:10;
> +    } split;
> +    u32 full;
> +};
> +
> +union _cpuid4_leaf_ecx {
> +    struct {
> +        unsigned int number_of_sets:32;
> +    } split;
> +    u32 full;
> +};
> +
> +struct cpuid4_info {
> +    union _cpuid4_leaf_eax eax;
> +    union _cpuid4_leaf_ebx ebx;
> +    union _cpuid4_leaf_ecx ecx;
> +    unsigned long size;
> +};
> +
> +int cpuid4_cache_lookup(int index, struct cpuid4_info *this_leaf);
>  #endif
>  
>  #endif /* __ASM_I386_CPUFEATURE_H */
> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
> index 3588698..66b6e47 100644
> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -636,6 +636,18 @@ struct xen_sysctl_coverage_op {
>  typedef struct xen_sysctl_coverage_op xen_sysctl_coverage_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_coverage_op_t);
>  
> +#define XEN_SYSCTL_PSR_CMT_get_total_rmid            0
> +#define XEN_SYSCTL_PSR_CMT_get_l3_upscaling_factor   1
> +/* The L3 cache size is returned in KB unit */
> +#define XEN_SYSCTL_PSR_CMT_get_l3_cache_size         2
> +#define XEN_SYSCTL_PSR_CMT_enabled                   3
> +struct xen_sysctl_psr_cmt_op {
> +    uint32_t cmd;
> +    uint32_t flags;      /* padding variable, may be extended for future use */
> +    uint64_t data;
> +};
> +typedef struct xen_sysctl_psr_cmt_op xen_sysctl_psr_cmt_op_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cmt_op_t);
>  
>  struct xen_sysctl {
>      uint32_t cmd;
> @@ -658,6 +670,7 @@ struct xen_sysctl {
>  #define XEN_SYSCTL_cpupool_op                    18
>  #define XEN_SYSCTL_scheduler_op                  19
>  #define XEN_SYSCTL_coverage_op                   20
> +#define XEN_SYSCTL_psr_cmt_op                    21
>      uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
>      union {
>          struct xen_sysctl_readconsole       readconsole;
> @@ -679,6 +692,7 @@ struct xen_sysctl {
>          struct xen_sysctl_cpupool_op        cpupool_op;
>          struct xen_sysctl_scheduler_op      scheduler_op;
>          struct xen_sysctl_coverage_op       coverage_op;
> +        struct xen_sysctl_psr_cmt_op        psr_cmt_op;
>          uint8_t                             pad[128];
>      } u;
>  };
> -- 
> 1.7.9.5
> 
> 


* Re: [PATCH v16 08/10] x86: add CMT related MSRs in allowed list
  2014-09-25 10:19 ` [PATCH v16 08/10] x86: add CMT related MSRs in allowed list Chao Peng
@ 2014-09-25 20:58   ` Konrad Rzeszutek Wilk
  2014-09-26  8:38     ` Jan Beulich
  0 siblings, 1 reply; 34+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-09-25 20:58 UTC (permalink / raw)
  To: Chao Peng
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, xen-devel, JBeulich, dgdegra

On Thu, Sep 25, 2014 at 06:19:08PM +0800, Chao Peng wrote:
> The tool stack accesses these two MSRs to perform CMT-related
> operations, so add them to the allowed list.
> 
> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> ---
>  xen/arch/x86/platform_hypercall.c |    7 +++++++
>  xen/include/asm-x86/msr-index.h   |    2 ++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/xen/arch/x86/platform_hypercall.c b/xen/arch/x86/platform_hypercall.c
> index 081d9f5..be06f3a 100644
> --- a/xen/arch/x86/platform_hypercall.c
> +++ b/xen/arch/x86/platform_hypercall.c
> @@ -69,6 +69,13 @@ struct xen_resource_access {
>  
>  static bool_t allow_access_msr(unsigned int msr)
>  {
> +    switch ( msr )
> +    {
> +    case MSR_IA32_QOSEVTSEL:
> +    case MSR_IA32_QMC:
> +        return 1;
> +    }
> +
>      return 0;
>  }
>  
> diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
> index dcb2b87..ae089fb 100644
> --- a/xen/include/asm-x86/msr-index.h
> +++ b/xen/include/asm-x86/msr-index.h
> @@ -324,6 +324,8 @@
>  #define MSR_IA32_ENERGY_PERF_BIAS	0x000001b0
>  
>  /* Platform Shared Resource MSRs */
> +#define MSR_IA32_QOSEVTSEL		0x00000c8d
> +#define MSR_IA32_QMC			0x00000c8e

Could you mention where they are documented in the SDM?

Thank you.
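
For context, as far as I can tell from patch 10 the two MSRs are used
as a select/read pair: write the RMID and event ID to the first, then
read the counter from the second. A minimal sketch of that encoding,
with hypothetical helper names (qm_evtsel_encode(), qm_ctr_decode());
the RMID shift and the error mask are taken from the xc_psr.c code in
this series, the conversion to bytes via the upscaling factor is my
understanding of the SDM, and no MSR is actually touched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QM_CTR_ERROR_MASK  (0x3ull << 62)  /* same mask as in xc_psr.c */
#define EVTID_L3_OCCUPANCY 0x1u

/* Value to write to MSR_IA32_QOSEVTSEL: RMID in the high half, event ID low. */
static uint64_t qm_evtsel_encode(uint32_t rmid, uint32_t evtid)
{
    return ((uint64_t)rmid << 32) | evtid;
}

/*
 * Interpret a value read back from MSR_IA32_QMC.
 * Returns 0 and stores the occupancy in bytes, or -1 if either of the
 * two top (error) bits is set.
 */
static int qm_ctr_decode(uint64_t raw, uint32_t upscaling_factor,
                         uint64_t *bytes)
{
    assert(bytes != NULL);

    if ( raw & QM_CTR_ERROR_MASK )
        return -1;

    *bytes = raw * upscaling_factor;
    return 0;
}
```

The write and the read have to happen back to back on the same CPU,
which is presumably why the series routes them through the
non-preemptible resource-access path rather than two separate calls.
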
>  #define MSR_IA32_PQR_ASSOC		0x00000c8f
>  
>  /* Intel Model 6 */
> -- 
> 1.7.9.5
> 
> 


* Re: [PATCH v16 10/10] tools: CMDs and APIs for Cache Monitoring Technology
  2014-09-25 10:19 ` [PATCH v16 10/10] tools: CMDs and APIs for Cache Monitoring Technology Chao Peng
@ 2014-09-25 21:14   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 34+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-09-25 21:14 UTC (permalink / raw)
  To: Chao Peng
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, xen-devel, JBeulich, dgdegra

On Thu, Sep 25, 2014 at 06:19:10PM +0800, Chao Peng wrote:
> Introduce new xl commands to enable/disable the Cache Monitoring
> Technology (CMT) feature.
> 
> The following two commands attach/detach the CMT feature
> to/from a given domain.
> 
> $ xl psr-cmt-attach domid
> $ xl psr-cmt-detach domid
> 
> This command displays CMT information, such as L3 cache
> occupancy.
> 
> $ xl psr-cmt-show cache_occupancy <domid>
> 
> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> ---
>  docs/man/xl.pod.1           |   25 +++++
>  tools/libxc/Makefile        |    1 +
>  tools/libxc/xc_msr_x86.h    |   36 ++++++++
>  tools/libxc/xc_psr.c        |  215 +++++++++++++++++++++++++++++++++++++++++++
>  tools/libxc/xenctrl.h       |   17 ++++
>  tools/libxl/Makefile        |    2 +-
>  tools/libxl/libxl.h         |   19 ++++
>  tools/libxl/libxl_psr.c     |  184 ++++++++++++++++++++++++++++++++++++
>  tools/libxl/libxl_types.idl |    4 +
>  tools/libxl/libxl_utils.c   |   28 ++++++
>  tools/libxl/xl.h            |    3 +
>  tools/libxl/xl_cmdimpl.c    |  131 ++++++++++++++++++++++++++
>  tools/libxl/xl_cmdtable.c   |   17 ++++
>  13 files changed, 681 insertions(+), 1 deletion(-)
>  create mode 100644 tools/libxc/xc_msr_x86.h
>  create mode 100644 tools/libxc/xc_psr.c
>  create mode 100644 tools/libxl/libxl_psr.c
> 
> diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
> index f1e95db..ef8cd24 100644
> --- a/docs/man/xl.pod.1
> +++ b/docs/man/xl.pod.1
> @@ -1387,6 +1387,31 @@ Load FLASK policy from the given policy file. The initial policy is provided to
>  the hypervisor as a multiboot module; this command allows runtime updates to the
>  policy. Loading new security policy will reset runtime changes to device labels.
>  
> +=head1 Cache Monitoring Technology
> +
> +Some new hardware may offer monitoring capability in each logical processor to
> +measure specific platform shared resource metric, for example, L3 cache
> +occupancy. In Xen implementation, the monitoring granularity is domain level.
> +To monitor a specific domain, just attach the domain id with the monitoring
> +service. When the domain doesn't need to be monitored any more, detach the
> +domain id from the monitoring service.
> +
> +=over 4
> +
> +=item B<psr-cmt-attach> [I<domain-id>]
> +
> +attach: Attach the platform shared resource monitoring service to a domain.
> +
> +=item B<psr-cmt-detach> [I<domain-id>]
> +
> +detach: Detach the platform shared resource monitoring service from a domain.
> +
> +=item B<psr-cmt-show> [I<psr-monitor-type>] [I<domain-id>]
> +
> +Show monitoring data for a certain domain or all domains. Current supported
> +monitor types are:
> + - "cache-occupancy": showing the L3 cache occupancy.
> +
>  =back
>  
>  =head1 TO BE DOCUMENTED
> diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
> index dde6109..d8cc21b 100644
> --- a/tools/libxc/Makefile
> +++ b/tools/libxc/Makefile
> @@ -35,6 +35,7 @@ CTRL_SRCS-y       += xc_kexec.c
>  CTRL_SRCS-y       += xtl_core.c
>  CTRL_SRCS-y       += xtl_logger_stdio.c
>  CTRL_SRCS-y       += xc_resource.c
> +CTRL_SRCS-y       += xc_psr.c
>  CTRL_SRCS-$(CONFIG_X86) += xc_pagetab.c
>  CTRL_SRCS-$(CONFIG_Linux) += xc_linux.c xc_linux_osdep.c
>  CTRL_SRCS-$(CONFIG_FreeBSD) += xc_freebsd.c xc_freebsd_osdep.c
> diff --git a/tools/libxc/xc_msr_x86.h b/tools/libxc/xc_msr_x86.h
> new file mode 100644
> index 0000000..1e0ee99
> --- /dev/null
> +++ b/tools/libxc/xc_msr_x86.h
> @@ -0,0 +1,36 @@
> +/*
> + * xc_msr_x86.h
> + *
> + * MSR definition macros
> + *
> + * Copyright (C) 2014      Intel Corporation
> + * Author Dongxiao Xu <dongxiao.xu@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as published
> + * by the Free Software Foundation; version 2.1 only. with the special
> + * exception on linking described in file LICENSE.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU Lesser General Public License for more details.
> + */
> +
> +#ifndef XC_MSR_X86_H
> +#define XC_MSR_X86_H
> +
> +#define MSR_IA32_QOSEVTSEL      0x00000c8d
> +#define MSR_IA32_QMC            0x00000c8e
> +
> +#endif
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/tools/libxc/xc_psr.c b/tools/libxc/xc_psr.c
> new file mode 100644
> index 0000000..0b5a227
> --- /dev/null
> +++ b/tools/libxc/xc_psr.c
> @@ -0,0 +1,215 @@
> +/*
> + * xc_psr.c
> + *
> + * platform shared resource related API functions.
> + *
> + * Copyright (C) 2014      Intel Corporation
> + * Author Dongxiao Xu <dongxiao.xu@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as published
> + * by the Free Software Foundation; version 2.1 only. with the special
> + * exception on linking described in file LICENSE.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU Lesser General Public License for more details.
> + */
> +
> +#include "xc_private.h"
> +#include "xc_msr_x86.h"
> +
> +#define IA32_QM_CTR_ERROR_MASK         (0x3ull << 62)
> +
> +#define EVTID_L3_OCCUPANCY             0x1
> +
> +int xc_psr_cmt_attach(xc_interface *xch, uint32_t domid)
> +{
> +    DECLARE_DOMCTL;
> +
> +    domctl.cmd = XEN_DOMCTL_psr_cmt_op;
> +    domctl.domain = (domid_t)domid;
> +    domctl.u.psr_cmt_op.cmd = XEN_DOMCTL_PSR_CMT_OP_ATTACH;
> +
> +    return do_domctl(xch, &domctl);
> +}
> +
> +int xc_psr_cmt_detach(xc_interface *xch, uint32_t domid)
> +{
> +    DECLARE_DOMCTL;
> +
> +    domctl.cmd = XEN_DOMCTL_psr_cmt_op;
> +    domctl.domain = (domid_t)domid;
> +    domctl.u.psr_cmt_op.cmd = XEN_DOMCTL_PSR_CMT_OP_DETACH;
> +
> +    return do_domctl(xch, &domctl);
> +}
> +
> +int xc_psr_cmt_get_domain_rmid(xc_interface *xch, uint32_t domid,
> +                                    uint32_t *rmid)
> +{
> +    int rc;
> +    DECLARE_DOMCTL;
> +
> +    domctl.cmd = XEN_DOMCTL_psr_cmt_op;
> +    domctl.domain = (domid_t)domid;
> +    domctl.u.psr_cmt_op.cmd = XEN_DOMCTL_PSR_CMT_OP_QUERY_RMID;
> +
> +    rc = do_domctl(xch, &domctl);
> +
> +    if ( !rc )
> +        *rmid = domctl.u.psr_cmt_op.data;
> +
> +    return rc;
> +}
> +
> +int xc_psr_cmt_get_total_rmid(xc_interface *xch, uint32_t *total_rmid)
> +{
> +    static int val = 0;
> +    int rc;
> +    DECLARE_SYSCTL;
> +
> +    if ( val )
> +    {
> +        *total_rmid = val;
> +        return 0;
> +    }
> +
> +    sysctl.cmd = XEN_SYSCTL_psr_cmt_op;
> +    sysctl.u.psr_cmt_op.cmd = XEN_SYSCTL_PSR_CMT_get_total_rmid;
> +    sysctl.u.psr_cmt_op.flags = 0;
> +
> +    rc = xc_sysctl(xch, &sysctl);
> +    if ( !rc )
> +        val = *total_rmid = sysctl.u.psr_cmt_op.data;
> +
> +    return rc;
> +}
> +
> +int xc_psr_cmt_get_l3_upscaling_factor(xc_interface *xch,
> +                                            uint32_t *upscaling_factor)
> +{
> +    static int val = 0;
> +    int rc;
> +    DECLARE_SYSCTL;
> +
> +    if ( val )
> +    {
> +        *upscaling_factor = val;
> +        return 0;
> +    }
> +
> +    sysctl.cmd = XEN_SYSCTL_psr_cmt_op;
> +    sysctl.u.psr_cmt_op.cmd =
> +        XEN_SYSCTL_PSR_CMT_get_l3_upscaling_factor;
> +    sysctl.u.psr_cmt_op.flags = 0;
> +
> +    rc = xc_sysctl(xch, &sysctl);
> +    if ( !rc )
> +        val = *upscaling_factor = sysctl.u.psr_cmt_op.data;
> +
> +    return rc;
> +}
> +
> +int xc_psr_cmt_get_l3_cache_size(xc_interface *xch,
> +                                      uint32_t *l3_cache_size)
> +{
> +    static int val = 0;
> +    int rc;
> +    DECLARE_SYSCTL;
> +
> +    if ( val )
> +    {
> +        *l3_cache_size = val;
> +        return 0;
> +    }
> +
> +    sysctl.cmd = XEN_SYSCTL_psr_cmt_op;
> +    sysctl.u.psr_cmt_op.cmd =
> +        XEN_SYSCTL_PSR_CMT_get_l3_cache_size;
> +    sysctl.u.psr_cmt_op.flags = 0;
> +
> +    rc = xc_sysctl(xch, &sysctl);
> +    if ( !rc )
> +        val = *l3_cache_size= sysctl.u.psr_cmt_op.data;
> +
> +    return rc;
> +}
> +
> +int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid,
> +    uint32_t cpu, xc_psr_cmt_type type, uint64_t *monitor_data)
> +{
> +    xc_resource_op_t op;
> +    xc_resource_data_t entries[2];
> +    uint32_t evtid;
> +    int rc;
> +
> +    switch ( type )
> +    {
> +    case XC_PSR_CMT_L3_OCCUPANCY:
> +        evtid = EVTID_L3_OCCUPANCY;
> +        break;
> +    default:
> +        return -1;
> +    }
> +
> +    entries[0].cmd = XEN_RESOURCE_OP_MSR_WRITE;
> +    entries[0].idx = MSR_IA32_QOSEVTSEL;
> +    entries[0].val = (uint64_t)rmid << 32 | evtid;
> +    entries[0].rsvd = 0;
> +
> +    entries[1].cmd = XEN_RESOURCE_OP_MSR_READ;
> +    entries[1].idx = MSR_IA32_QMC;
> +    entries[1].val = 0;
> +    entries[1].rsvd = 0;
> +
> +    op.result = 0;
> +    op.cpu = cpu;
> +    op.nr_entries = 2;
> +    op.entries = entries;
> +
> +    rc = xc_resource_op(xch, 1, &op);
> +    if ( rc )
> +        return rc;
> +
> +    if ( op.result || entries[1].val & IA32_QM_CTR_ERROR_MASK )
> +        return -1;
> +
> +    *monitor_data = entries[1].val;
> +
> +    return 0;
> +}
> +
> +int xc_psr_cmt_enabled(xc_interface *xch)
> +{
> +    static int val = -1;
> +    int rc;
> +    DECLARE_SYSCTL;
> +
> +    if ( val >= 0 )
> +        return val;
> +
> +    sysctl.cmd = XEN_SYSCTL_psr_cmt_op;
> +    sysctl.u.psr_cmt_op.cmd = XEN_SYSCTL_PSR_CMT_enabled;
> +    sysctl.u.psr_cmt_op.flags = 0;
> +
> +    rc = do_sysctl(xch, &sysctl);
> +    if ( !rc )
> +    {
> +        val = sysctl.u.psr_cmt_op.data;
> +        return val;
> +    }
> +
> +    return 0;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
> index fb58c44..d5a3c0a 100644
> --- a/tools/libxc/xenctrl.h
> +++ b/tools/libxc/xenctrl.h
> @@ -2668,6 +2668,23 @@ struct xc_resource_op {
>  typedef struct xc_resource_op xc_resource_op_t;
>  int xc_resource_op(xc_interface *xch, uint32_t nr_ops, xc_resource_op_t *ops);
>  
> +enum xc_psr_cmt_type {
> +    XC_PSR_CMT_L3_OCCUPANCY,
> +};
> +typedef enum xc_psr_cmt_type xc_psr_cmt_type;
> +int xc_psr_cmt_attach(xc_interface *xch, uint32_t domid);
> +int xc_psr_cmt_detach(xc_interface *xch, uint32_t domid);
> +int xc_psr_cmt_get_domain_rmid(xc_interface *xch, uint32_t domid,
> +    uint32_t *rmid);
> +int xc_psr_cmt_get_total_rmid(xc_interface *xch, uint32_t *total_rmid);
> +int xc_psr_cmt_get_l3_upscaling_factor(xc_interface *xch,
> +    uint32_t *upscaling_factor);
> +int xc_psr_cmt_get_l3_cache_size(xc_interface *xch,
> +    uint32_t *l3_cache_size);
> +int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid,
> +    uint32_t cpu, uint32_t psr_cmt_type, uint64_t *monitor_data);
> +int xc_psr_cmt_enabled(xc_interface *xch);
> +
>  #endif /* XENCTRL_H */
>  
>  /*
> diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
> index 990414b..fd9ae28 100644
> --- a/tools/libxl/Makefile
> +++ b/tools/libxl/Makefile
> @@ -43,7 +43,7 @@ LIBXL_OBJS-y += libxl_blktap2.o
>  else
>  LIBXL_OBJS-y += libxl_noblktap2.o
>  endif
> -LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o
> +LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
>  LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o
>  
>  ifeq ($(CONFIG_NetBSD),y)
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index bc68cac..5acc933 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -640,6 +640,13 @@ typedef uint8_t libxl_mac[6];
>  #define LIBXL_MAC_BYTES(mac) mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]
>  void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src);
>  
> +/*
> + * LIBXL_HAVE_PSR_CMT
> + *
> + * If this is defined, the Cache Monitoring Technology feature is supported.
> + */
> +#define LIBXL_HAVE_PSR_CMT 1
> +
>  typedef char **libxl_string_list;
>  void libxl_string_list_dispose(libxl_string_list *sl);
>  int libxl_string_list_length(const libxl_string_list *sl);
> @@ -1380,6 +1387,18 @@ bool libxl_ms_vm_genid_is_zero(const libxl_ms_vm_genid *id);
>  void libxl_ms_vm_genid_copy(libxl_ctx *ctx, libxl_ms_vm_genid *dst,
>                              libxl_ms_vm_genid *src);
>  
> +int libxl_get_socket_cpu(libxl_ctx *ctx, uint32_t socketid);
> +
> +int libxl_psr_cmt_attach(libxl_ctx *ctx, uint32_t domid);
> +int libxl_psr_cmt_detach(libxl_ctx *ctx, uint32_t domid);
> +int libxl_psr_cmt_domain_attached(libxl_ctx *ctx, uint32_t domid);
> +int libxl_psr_cmt_enabled(libxl_ctx *ctx);
> +int libxl_psr_cmt_get_total_rmid(libxl_ctx *ctx, uint32_t *total_rmid);
> +int libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx,
> +    uint32_t *l3_cache_size);
> +int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx, uint32_t domid,
> +    uint32_t socketid, uint32_t *l3_cache_occupancy);
> +
>  /* misc */
>  
>  /* Each of these sets or clears the flag according to whether the
> diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
> new file mode 100644
> index 0000000..5551be8
> --- /dev/null
> +++ b/tools/libxl/libxl_psr.c
> @@ -0,0 +1,184 @@
> +/*
> + * Copyright (C) 2014      Intel Corporation
> + * Author Dongxiao Xu <dongxiao.xu@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as published
> + * by the Free Software Foundation; version 2.1 only. with the special
> + * exception on linking described in file LICENSE.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU Lesser General Public License for more details.
> + */
> +
> +#include "libxl_osdeps.h" /* must come before any other headers */
> +#include "libxl_internal.h"
> +
> +
> +#define IA32_QM_CTR_ERROR_MASK         (0x3ul << 62)
> +
> +static void libxl_psr_cmt_err_msg(libxl_ctx *ctx, int err)
> +{
> +    GC_INIT(ctx);
> +
> +    char *msg;
> +
> +    switch (err) {
> +    case ENOSYS:
> +        msg = "unsupported operation";
> +        break;
> +    case ENODEV:
> +        msg = "Cache Monitoring Technology is not supported in this system";
> +        break;
> +    case EEXIST:
> +        msg = "Cache Monitoring Technology is already attached to this domain";
> +        break;
> +    case ENOENT:
> +        msg = "Cache Monitoring Technology is not attached to this domain";
> +        break;
> +    case EUSERS:
> +        msg = "there is no free RMID available";
> +        break;
> +    case ESRCH:
> +        msg = "is this Domain ID valid?";
> +        break;
> +    case EFAULT:
> +        msg = "failed to exchange data with Xen";
> +        break;
> +    default:
> +        msg = "unknown error";
> +        break;
> +    }
> +
> +    LOGE(ERROR, "%s", msg);
> +
> +    GC_FREE;
> +}
> +
> +int libxl_psr_cmt_attach(libxl_ctx *ctx, uint32_t domid)
> +{
> +    int rc;
> +
> +    rc = xc_psr_cmt_attach(ctx->xch, domid);
> +    if (rc < 0) {
> +        libxl_psr_cmt_err_msg(ctx, errno);
> +        return ERROR_FAIL;
> +    }
> +
> +    return 0;
> +}
> +
> +int libxl_psr_cmt_detach(libxl_ctx *ctx, uint32_t domid)
> +{
> +    int rc;
> +
> +    rc = xc_psr_cmt_detach(ctx->xch, domid);
> +    if (rc < 0) {
> +        libxl_psr_cmt_err_msg(ctx, errno);
> +        return ERROR_FAIL;
> +    }
> +
> +    return 0;
> +}
> +
> +int libxl_psr_cmt_domain_attached(libxl_ctx *ctx, uint32_t domid)
> +{
> +    int rc;
> +    uint32_t rmid;
> +
> +    rc = xc_psr_cmt_get_domain_rmid(ctx->xch, domid, &rmid);
> +    if (rc < 0)
> +        return 0;
> +
> +    return !!rmid;
> +}
> +
> +int libxl_psr_cmt_enabled(libxl_ctx *ctx)
> +{
> +    return xc_psr_cmt_enabled(ctx->xch);
> +}
> +
> +int libxl_psr_cmt_get_total_rmid(libxl_ctx *ctx, uint32_t *total_rmid)
> +{
> +    int rc;
> +
> +    rc = xc_psr_cmt_get_total_rmid(ctx->xch, total_rmid);
> +    if (rc < 0) {
> +        libxl_psr_cmt_err_msg(ctx, errno);
> +        return ERROR_FAIL;
> +    }
> +
> +    return 0;
> +}
> +
> +int libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx,
> +                                         uint32_t *l3_cache_size)
> +{
> +    int rc;
> +
> +    rc = xc_psr_cmt_get_l3_cache_size(ctx->xch, l3_cache_size);
> +    if (rc < 0) {
> +        libxl_psr_cmt_err_msg(ctx, errno);
> +        return ERROR_FAIL;
> +    }
> +
> +    return 0;
> +}
> +
> +int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx, uint32_t domid,
> +    uint32_t socketid, uint32_t *l3_cache_occupancy)
> +{
> +    GC_INIT(ctx);
> +
> +    unsigned int rmid;
> +    uint32_t upscaling_factor;
> +    uint64_t monitor_data;
> +    int cpu, rc;
> +    xc_psr_cmt_type type;
> +
> +    rc = xc_psr_cmt_get_domain_rmid(ctx->xch, domid, &rmid);
> +    if (rc < 0 || rmid == 0) {
> +        LOGE(ERROR, "fail to get the domain rmid, "
> +            "or domain is not attached with platform QoS monitoring service");
> +        rc = ERROR_FAIL;
> +        goto out;
> +    }
> +
> +    cpu = libxl_get_socket_cpu(ctx, socketid);
> +    if (cpu < 0) {
> +        LOGE(ERROR, "failed to get socket cpu");
> +        rc = ERROR_FAIL;
> +        goto out;
> +    }
> +
> +    type = XC_PSR_CMT_L3_OCCUPANCY;
> +    rc = xc_psr_cmt_get_data(ctx->xch, rmid, cpu, type, &monitor_data);
> +    if (rc < 0) {
> +        LOGE(ERROR, "failed to get monitoring data");
> +        rc = ERROR_FAIL;
> +        goto out;
> +    }
> +
> +    rc = xc_psr_cmt_get_l3_upscaling_factor(ctx->xch, &upscaling_factor);
> +    if (rc < 0) {
> +        LOGE(ERROR, "failed to get L3 upscaling factor");
> +        rc = ERROR_FAIL;
> +        goto out;
> +    }
> +
> +    *l3_cache_occupancy = upscaling_factor * monitor_data / 1024;
> +    rc = 0;
> +out:
> +    GC_FREE;
> +    return 0;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index f1fcbc3..27a5022 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -635,3 +635,7 @@ libxl_event = Struct("event",[
>                                   ])),
>             ("domain_create_console_available", None),
>             ]))])
> +
> +libxl_psr_cmt_type = Enumeration("psr_cmt_type", [
> +    (1, "CACHE_OCCUPANCY"),
> +    ])
> diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
> index 58df4f3..8ec822d 100644
> --- a/tools/libxl/libxl_utils.c
> +++ b/tools/libxl/libxl_utils.c
> @@ -1065,6 +1065,34 @@ int libxl__random_bytes(libxl__gc *gc, uint8_t *buf, size_t len)
>      return ret;
>  }
>  
> +int libxl_get_socket_cpu(libxl_ctx *ctx, uint32_t socketid)
> +{
> +    int i, j, cpu, nr_cpus;
> +    libxl_cputopology *topology;
> +    int *socket_cpus;
> +
> +    topology = libxl_get_cpu_topology(ctx, &nr_cpus);
> +    if (!topology)
> +        return ERROR_FAIL;
> +
> +    socket_cpus = malloc(sizeof(int) * nr_cpus);
> +    if (!socket_cpus) {
> +        free(topology);

Shouldn't this be libxl_cputopology_list_free(topology, nr_cpus)?
That is how 'libxl_nodemap_to_cpumap', for example, does it.


> +        return ERROR_FAIL;
> +    }
> +
> +    for (i = 0, j = 0; i < nr_cpus; i++)
> +        if (topology[i].socket == socketid)
> +            socket_cpus[j++] = i;
> +
> +    cpu = socket_cpus[rand() % j];


Could you describe the reasoning behind this please?
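
Separately from why a random CPU is picked: j stays 0 when no CPU
matches socketid, and rand() % 0 is undefined. A minimal standalone
sketch of a guarded pick follows. It is only an illustration, not a
proposed replacement: pick_socket_cpu is a hypothetical helper, and it
takes a plain array of per-CPU socket ids instead of libxl_cputopology.

```c
#include <stdlib.h>

/*
 * Return the index of a randomly chosen CPU whose socket id equals
 * socketid, or -1 if no CPU matches or the allocation fails.
 * cpu_socket[i] is assumed to hold the socket id of CPU i.
 */
static int pick_socket_cpu(const unsigned int *cpu_socket, int nr_cpus,
                           unsigned int socketid)
{
    int i, j = 0, cpu;
    int *candidates;

    if (nr_cpus <= 0)
        return -1;

    candidates = malloc(sizeof(*candidates) * (size_t)nr_cpus);
    if (!candidates)
        return -1;

    /* Collect every CPU that sits on the requested socket. */
    for (i = 0; i < nr_cpus; i++)
        if (cpu_socket[i] == socketid)
            candidates[j++] = i;

    /* Guard the empty case: rand() % 0 would be undefined behaviour. */
    cpu = j ? candidates[rand() % j] : -1;

    free(candidates);
    return cpu;
}
```

With socket ids { 0, 0, 1, 1 }, asking for socket 1 yields CPU 2 or 3,
and asking for socket 2 yields -1 rather than faulting.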

> +
> +    free(socket_cpus);
> +    free(topology);
> +
> +    return cpu;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/tools/libxl/xl.h b/tools/libxl/xl.h
> index 10a2e66..abb1fac 100644
> --- a/tools/libxl/xl.h
> +++ b/tools/libxl/xl.h
> @@ -110,6 +110,9 @@ int main_loadpolicy(int argc, char **argv);
>  int main_remus(int argc, char **argv);
>  #endif
>  int main_devd(int argc, char **argv);
> +int main_psr_cmt_attach(int argc, char **argv);
> +int main_psr_cmt_detach(int argc, char **argv);
> +int main_psr_cmt_show(int argc, char **argv);
>  
>  void help(const char *command);
>  
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index 698b3bc..1fa8755 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -7428,6 +7428,137 @@ out:
>      return ret;
>  }
>  
> +static int psr_cmt_show_cache_occupancy(uint32_t domid)
> +{
> +    uint32_t i, socketid, nr_sockets, total_rmid;
> +    uint32_t l3_cache_size, l3_cache_occupancy;
> +    libxl_physinfo info;
> +    char *domain_name;
> +    int rc, print_header, nr_domains;
> +    libxl_dominfo *dominfo;
> +
> +    if (!libxl_psr_cmt_enabled(ctx)) {
> +        printf("CMT is not supported in the system\n");

s/not supported/not supported (or disabled)/

> +        return -1;
> +    }
> +
> +    rc = libxl_get_physinfo(ctx, &info);
> +    if (rc < 0) {
> +        printf("failed to get system socket number\n");

That is not exactly what this hypercall gets you. I would just
say: "Failed getting physinfo, rc: %d."

> +        return -1;
> +    }
> +    nr_sockets = info.nr_cpus / info.threads_per_core / info.cores_per_socket;
> +
> +    rc = libxl_psr_cmt_get_total_rmid(ctx, &total_rmid);
> +    if (rc < 0) {
> +        printf("failed to get system total rmid number\n");

I would say: "Failed to get max RMID value."
> +        return -1;
> +    }
> +
> +    rc = libxl_psr_cmt_get_l3_cache_size(ctx, &l3_cache_size);
> +    if (rc < 0) {
> +        printf("failed to get system l3 cache size\n");

Missing full stop.

> +        return -1;
> +    }
> +
> +    printf("Total RMID: %d\n", total_rmid);
> +    printf("Per-Socket L3 Cache Size: %d KB\n", l3_cache_size);
> +
> +    print_header = 1;
> +    if (!(dominfo = libxl_list_domain(ctx, &nr_domains))) {
> +        fprintf(stderr, "libxl_list_domain failed.\n");
> +        return -1;
> +    }
> +    for (i = 0; i < nr_domains; i++) {
> +        if (domid != ~0 && dominfo[i].domid != domid)
> +            continue;
> +        if (!libxl_psr_cmt_domain_attached(ctx, dominfo[i].domid))
> +            continue;
> +        if (print_header) {
> +            printf("%-40s %5s", "Name", "ID");
> +            for (socketid = 0; socketid < nr_sockets; socketid++)
> +                printf("%14s %d", "Socket", socketid);
> +            printf("\n");
> +            print_header = 0;
> +        }
> +        domain_name = libxl_domid_to_name(ctx, dominfo[i].domid);
> +        printf("%-40s %5d", domain_name, dominfo[i].domid);
> +        free(domain_name);
> +        for (socketid = 0; socketid < nr_sockets; socketid++) {
> +            rc = libxl_psr_cmt_get_cache_occupancy(ctx, dominfo[i].domid,
> +                     socketid, &l3_cache_occupancy);

Not checking the 'rc' ?

> +            printf("%13u KB", l3_cache_occupancy);
> +        }
> +        printf("\n");
> +    }
> +    libxl_dominfo_list_free(dominfo, nr_domains);
> +
> +    return 0;
> +}
> +
> +int main_psr_cmt_attach(int argc, char **argv)
> +{
> +    uint32_t domid;
> +    int opt, ret = 0;
> +
> +    SWITCH_FOREACH_OPT(opt, "", NULL, "psr-cmt-attach", 1) {
> +        /* No options */
> +    }
> +
> +    domid = find_domain(argv[optind]);
> +    ret = libxl_psr_cmt_attach(ctx, domid);
> +
> +    return ret;
> +}
> +
> +int main_psr_cmt_detach(int argc, char **argv)
> +{
> +    uint32_t domid;
> +    int opt, ret = 0;
> +
> +    SWITCH_FOREACH_OPT(opt, "", NULL, "psr-cmt-detach", 1) {
> +        /* No options */
> +    }
> +
> +    domid = find_domain(argv[optind]);
> +    ret = libxl_psr_cmt_detach(ctx, domid);
> +
> +    return ret;
> +}
> +
> +int main_psr_cmt_show(int argc, char **argv)
> +{
> +    int opt, ret = 0;
> +    uint32_t domid;
> +    libxl_psr_cmt_type type;
> +
> +    SWITCH_FOREACH_OPT(opt, "", NULL, "psr-cmt-show", 1) {
> +        /* No options */
> +    }
> +
> +    libxl_psr_cmt_type_from_string(argv[optind], &type);
> +
> +    if (optind + 1 >= argc)
> +        domid = ~0;
> +    else if (optind + 1 == argc - 1)
> +        domid = find_domain(argv[optind + 1]);
> +    else {
> +        help("psr-cmt-show");
> +        return 2;
> +    }
> +
> +    switch (type) {
> +    case LIBXL_PSR_CMT_TYPE_CACHE_OCCUPANCY:
> +        ret = psr_cmt_show_cache_occupancy(domid);
> +        break;
> +    default:
> +        help("psr-cmt-show");
> +        return 2;
> +    }
> +
> +    return ret;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/tools/libxl/xl_cmdtable.c b/tools/libxl/xl_cmdtable.c
> index 35f02c4..da25c3f 100644
> --- a/tools/libxl/xl_cmdtable.c
> +++ b/tools/libxl/xl_cmdtable.c
> @@ -502,6 +502,23 @@ struct cmd_spec cmd_table[] = {
>        "[options]",
>        "-F                      Run in the foreground",
>      },
> +    { "psr-cmt-attach",
> +      &main_psr_cmt_attach, 0, 1,
> +      "Attach Cache Monitoring Technology service to a domain",
> +      "<Domain>",
> +    },
> +    { "psr-cmt-detach",
> +      &main_psr_cmt_detach, 0, 1,
> +      "Detach Cache Monitoring Technology service from a domain",
> +      "<Domain>",
> +    },
> +    { "psr-cmt-show",
> +      &main_psr_cmt_show, 0, 1,
> +      "Show Cache Monitoring Technology information",
> +      "<PSR-CMT-Type> <Domain>",
> +      "Available monitor types:\n"
> +      "\"cache_occupancy\":         Show L3 cache occupancy\n",
> +    },
>  };
>  
>  int cmdtable_len = sizeof(cmd_table)/sizeof(struct cmd_spec);
> -- 
> 1.7.9.5
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


* Re: [PATCH v16 04/10] x86: detect and initialize Cache Monitoring Technology feature
  2014-09-25 20:33   ` Konrad Rzeszutek Wilk
@ 2014-09-25 21:14     ` Andrew Cooper
  2014-09-26  1:54       ` Chao Peng
  0 siblings, 1 reply; 34+ messages in thread
From: Andrew Cooper @ 2014-09-25 21:14 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, Chao Peng
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	Ian.Jackson, xen-devel, JBeulich, dgdegra

On 25/09/2014 21:33, Konrad Rzeszutek Wilk wrote:
> On Thu, Sep 25, 2014 at 06:19:04PM +0800, Chao Peng wrote:
>> Detect Cache Monitoring Technology(CMT) feature and enumerate the
>> resource types, one of which is to monitor the L3 cache occupancy.
>>
>> Also introduce a Xen command line parameter to control the Platform
>> Shared Resource such as CMT.
>>
>> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
>> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
>> ---
>>  docs/misc/xen-command-line.markdown |   12 ++++
>>  xen/arch/x86/Makefile               |    1 +
>>  xen/arch/x86/psr.c                  |  107 +++++++++++++++++++++++++++++++++++
>>  xen/arch/x86/setup.c                |    3 +
>>  xen/include/asm-x86/cpufeature.h    |    1 +
>>  xen/include/asm-x86/psr.h           |   53 +++++++++++++++++
>>  6 files changed, 177 insertions(+)
>>  create mode 100644 xen/arch/x86/psr.c
>>  create mode 100644 xen/include/asm-x86/psr.h
>>
>> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
>> index af93e17..b106a46 100644
>> --- a/docs/misc/xen-command-line.markdown
>> +++ b/docs/misc/xen-command-line.markdown
>> @@ -1005,6 +1005,18 @@ This option can be specified more than once (up to 8 times at present).
>>  ### ple\_window
>>  > `= <integer>`
>>  
>> +### psr (Intel)
>> +> `= List of ( cmt:<boolean> | rmid_max:<integer> )`
> Please explain what 'psr' is (the full name) and why one would want
> to use it.
>
>> +
>> +> Default: `psr=cmt:0,rmid_max:255`
>> +

Personally, I think these first 5 lines of context are fine.  They
follow the standard layout of all other parameters in this document wrt
names, valid values and default values.

>> +Configure platform shared resource services, which are available on Intel
>> +Haswell Server family and future platforms.
>> +
>> +`cmt` instructs Xen to enable/disable Cache Monitoring Technology.

I feel that the wording here could be improved for extra clarity.  How
about:

Platform Shared Resource Services.  Intel Haswell and later server
platforms offer information about the sharing of resources.

The following resources are available:
* Cache Monitoring Technology (Haswell and later).  Information
regarding the L3 cache occupancy.

(I seem to remember another one about L3 data bandwidth to local and
non-local memory controllers, but can't remember its name offhand)

> Please include the default value.
>
>> +
>> +`rmid_max` indicates the max value for rmid.
> Couple of issues:
>  - It reads as not optional (from the documentation) - so what are the values
>    that can be used? What are the ranges?
>  - What is the default value?
>  - What is 'RMID'? 

The hardware has a maximum supported RMID, but instead of allocating
memory based on a u32 out of cpuid, I insisted on a command line max
parameter to provide a sane upper bound.

I would however agree that a sentence or two describing what an RMID is
would be useful here, although being strictly the command line
documentation, I don't think it warrants a full whitepaper's worth of detail.
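
As a rough illustration of the bounding described above (a standalone
sketch, not the patch's code; effective_rmid_max and rmid_table_entries
are hypothetical names), the usable maximum is the smaller of the
command line cap and what the hardware reports, so the size of an
RMID-indexed table never depends on CPUID alone:

```c
#include <stddef.h>

/* Smaller of the command line cap and the hardware-reported maximum. */
static unsigned int effective_rmid_max(unsigned int opt_rmid_max,
                                       unsigned int hw_rmid_max)
{
    return opt_rmid_max < hw_rmid_max ? opt_rmid_max : hw_rmid_max;
}

/* Entries needed for a table indexed by RMID 0..max inclusive. */
static size_t rmid_table_entries(unsigned int opt_rmid_max,
                                 unsigned int hw_rmid_max)
{
    return (size_t)effective_rmid_max(opt_rmid_max, hw_rmid_max) + 1;
}
```

With the default cap of 255, even a hardware value of 0xffffffff gives a
256-entry table, while hardware reporting 55 gives 56 entries.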

>
> Please please expand more on this. You want users to be able to easily
> read it and understand it right away without having to search for a
> whitepaper on it.
>> +
>>  ### reboot
>>  > `= t[riple] | k[bd] | a[cpi] | p[ci] | n[o] [, [w]arm | [c]old]`
>>  
>> diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
>> index c1e244d..cf137fd 100644
>> --- a/xen/arch/x86/Makefile
>> +++ b/xen/arch/x86/Makefile
>> @@ -59,6 +59,7 @@ obj-y += crash.o
>>  obj-y += tboot.o
>>  obj-y += hpet.o
>>  obj-y += xstate.o
>> +obj-y += psr.o
>>  
>>  obj-$(crash_debug) += gdbstub.o
>>  
>> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
>> new file mode 100644
>> index 0000000..9025aeb
>> --- /dev/null
>> +++ b/xen/arch/x86/psr.c
>> @@ -0,0 +1,107 @@
>> +/*
>> + * pqos.c: Platform Shared Resource related service for guest.
>> + *
>> + * Copyright (c) 2014, Intel Corporation
>> + * Author: Dongxiao Xu <dongxiao.xu@intel.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + */
>> +#include <xen/init.h>
>> +#include <xen/cpu.h>
>> +#include <asm/psr.h>
>> +
>> +#define PSR_CMT        (1<<0)
>> +
>> +struct psr_cmt *__read_mostly  psr_cmt = NULL;
> Extra space. No need for the NULL assignment (as this is in .rodata section).
>
>> +static bool_t __initdata opt_psr = 0;
> Ditto. No need to assign 0.
>
>> +static unsigned int __initdata opt_rmid_max = 255;
> It is not really an 'optional' value as the default is '255'.
>
> I would just call it 'rmid_max'.

It is a value provided on the command line, which likely differs from
CPUID's max reported RMID.  opt_$FOO is the correct variable name.

>> +
>> +static void __init parse_psr_param(char *s)
>> +{
>> +    char *ss, *val_str;
>> +
>> +    do {
>> +        ss = strchr(s, ',');
>> +        if ( ss )
>> +            *ss = '\0';
>> +
>> +        val_str = strchr(s, ':');
>> +        if ( val_str )
>> +            *val_str++ = '\0';
>> +
>> +        if ( !strcmp(s, "cmt")
>> +             && ( !val_str || parse_bool(val_str) == 1 )) {
>> +            opt_psr &= PSR_CMT;
> &= ?
>
> Not opt_psr |= ?

Agreed - this appears to disable cmt if specified.
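
To make the effect concrete, here is a minimal standalone sketch (plain
C, not Xen code; apply_and and apply_or are hypothetical helper names)
of what each operator does to a flag word that starts at zero, as
opt_psr does:

```c
#include <stdint.h>

#define PSR_CMT (1u << 0)

/* What the patch does: AND-ing can only clear bits, never set them. */
static uint8_t apply_and(uint8_t flags)
{
    flags &= PSR_CMT;
    return flags;
}

/* What was presumably intended: OR-ing sets the CMT bit. */
static uint8_t apply_or(uint8_t flags)
{
    flags |= PSR_CMT;
    return flags;
}
```

Starting from 0, apply_and leaves the flag at 0, so "psr=cmt:1" could
never turn CMT on, whereas apply_or yields PSR_CMT.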

>
>> +        } else if ( val_str && !strcmp(s, "rmid_max") )
>> +            opt_rmid_max = simple_strtoul(val_str, NULL, 0);
>> +
>> +        s = ss + 1;
>> +    } while ( ss );
>> +}
>> +custom_param("psr", parse_psr_param);
>> +
>> +static void __init init_psr_cmt(unsigned int rmid_max)
>> +{
>> +    unsigned int eax, ebx, ecx, edx;
>> +    unsigned int rmid;
>> +
>> +    if ( !boot_cpu_has(X86_FEATURE_CMT) )
>> +        return;
>> +
>> +    cpuid_count(0xf, 0, &eax, &ebx, &ecx, &edx);
>> +    if ( !edx )
>> +        return;
>> +
>> +    psr_cmt = xzalloc(struct psr_cmt);
>> +    if ( !psr_cmt )
>> +        return;
>> +
>> +    psr_cmt->features = edx;
>> +    psr_cmt->rmid_mask = ~(~0ull << get_count_order(ebx));
>> +    psr_cmt->rmid_max = min(rmid_max, ebx);
>> +
>> +    if ( psr_cmt->features & PSR_RESOURCE_TYPE_L3 )
>> +    {
>> +        cpuid_count(0xf, 1, &eax, &ebx, &ecx, &edx);
>> +        psr_cmt->l3.upscaling_factor = ebx;
>> +        psr_cmt->l3.rmid_max = ecx;
>> +        psr_cmt->l3.features = edx;
>> +    }
>> +
>> +    psr_cmt->rmid_max = min(rmid_max, psr_cmt->l3.rmid_max);
>> +    psr_cmt->rmid_to_dom = xmalloc_array(domid_t, psr_cmt->rmid_max + 1);
>> +    if ( !psr_cmt->rmid_to_dom )
>> +    {
>> +        xfree(psr_cmt);
> And:
> 	psr_cmt = NULL;
>
> ?

Good catch, as "psr_cmt == NULL" is the check for psr being enabled.
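
A minimal standalone sketch of the failure path with the pointer reset
(plain C, with calloc/malloc/free standing in for xzalloc/xmalloc_array/
xfree; struct cmt_state, cmt_init, cmt_enabled and the fail_table switch
are all hypothetical and exist only for illustration):

```c
#include <stdlib.h>

struct cmt_state {
    unsigned int rmid_max;
    unsigned short *rmid_to_dom;
};

static struct cmt_state *cmt;

/* "Enabled" is defined purely by the global pointer being non-NULL. */
static int cmt_enabled(void)
{
    return cmt != NULL;
}

/* A non-zero fail_table simulates the second allocation failing. */
static int cmt_init(unsigned int rmid_max, int fail_table)
{
    cmt = calloc(1, sizeof(*cmt));
    if (!cmt)
        return -1;

    cmt->rmid_max = rmid_max;
    if (!fail_table)
        cmt->rmid_to_dom = malloc(sizeof(*cmt->rmid_to_dom) *
                                  ((size_t)rmid_max + 1));

    if (!cmt->rmid_to_dom) {
        free(cmt);
        cmt = NULL; /* otherwise cmt_enabled() would be true for freed memory */
        return -1;
    }

    return 0;
}
```

Without the reset, every later caller of the enabled check would go on
to dereference a dangling pointer.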

~Andrew

>> +        return;
>> +    }
>> +    /* Reserve RMID 0 for all domains not being monitored */
> Full stop missing.
>
> Why do you reserve RMID 0? Can you include the explanation
> in the comment please?
>
>> +    psr_cmt->rmid_to_dom[0] = DOMID_XEN;
>> +    for ( rmid = 1; rmid <= psr_cmt->rmid_max; rmid++ )
>> +        psr_cmt->rmid_to_dom[rmid] = DOMID_INVALID;
>> +
>> +    printk(XENLOG_INFO "Cache Monitoring Technology Enabled.\n");
>> +}
>> +
>> +void __init init_psr(void)
>
>> +{
>> +    if ( opt_psr & PSR_CMT && opt_rmid_max )
>> +        init_psr_cmt(opt_rmid_max);
>> +}
> __initcall(init_psr);
>
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * tab-width: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
>> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
>> index 8c8b91f..ca4785e 100644
>> --- a/xen/arch/x86/setup.c
>> +++ b/xen/arch/x86/setup.c
>> @@ -49,6 +49,7 @@
>>  #include <xen/cpu.h>
>>  #include <asm/nmi.h>
>>  #include <asm/alternative.h>
>> +#include <asm/psr.h>
>>  
>>  /* opt_nosmp: If true, secondary processors are ignored. */
>>  static bool_t __initdata opt_nosmp;
>> @@ -1430,6 +1431,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>>  
>>      domain_unpause_by_systemcontroller(dom0);
>>  
>> +    init_psr();
>> +
> And then you can remove this.
>
>>      reset_stack_and_jump(init_done);
>>  }
>>  
>> diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
>> index 8014241..137d75c 100644
>> --- a/xen/include/asm-x86/cpufeature.h
>> +++ b/xen/include/asm-x86/cpufeature.h
>> @@ -148,6 +148,7 @@
>>  #define X86_FEATURE_ERMS	(7*32+ 9) /* Enhanced REP MOVSB/STOSB */
>>  #define X86_FEATURE_INVPCID	(7*32+10) /* Invalidate Process Context ID */
>>  #define X86_FEATURE_RTM 	(7*32+11) /* Restricted Transactional Memory */
>> +#define X86_FEATURE_CMT 	(7*32+12) /* Cache Monitoring Technology */
>>  #define X86_FEATURE_NO_FPU_SEL 	(7*32+13) /* FPU CS/DS stored as zero */
>>  #define X86_FEATURE_MPX		(7*32+14) /* Memory Protection Extensions */
>>  #define X86_FEATURE_RDSEED	(7*32+18) /* RDSEED instruction */
>> diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
>> new file mode 100644
>> index 0000000..e321890
>> --- /dev/null
>> +++ b/xen/include/asm-x86/psr.h
>> @@ -0,0 +1,53 @@
>> +/*
>> + * psr.h: Platform Shared Resource related service for guest.
>> + *
>> + * Copyright (c) 2014, Intel Corporation
>> + * Author: Dongxiao Xu <dongxiao.xu@intel.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + */
>> +#ifndef __ASM_PSR_H__
>> +#define __ASM_PSR_H__
>> +
>> +/* Resource Type Enumeration */
>> +#define PSR_RESOURCE_TYPE_L3            0x2
>> +
>> +/* L3 Monitoring Features */
>> +#define PSR_CMT_L3_OCCUPANCY           0x1
>> +
>> +struct psr_cmt_l3 {
>> +    unsigned int features;
>> +    unsigned int upscaling_factor;
>> +    unsigned int rmid_max;
>> +};
>> +
>> +struct psr_cmt {
>> +    unsigned long rmid_mask;
>> +    unsigned int rmid_max;
>> +    unsigned int features;
>> +    domid_t *rmid_to_dom;
>> +    struct psr_cmt_l3 l3;
>> +};
>> +
>> +extern struct psr_cmt *psr_cmt;
>> +
>> +void init_psr(void);
>> +
>> +#endif /* __ASM_PSR_H__ */
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * tab-width: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
>> -- 
>> 1.7.9.5
>>
>>


* Re: [PATCH v16 07/10] x86: enable CMT for each domain RMID
  2014-09-25 10:19 ` [PATCH v16 07/10] x86: enable CMT for each domain RMID Chao Peng
@ 2014-09-25 21:23   ` Andrew Cooper
  0 siblings, 0 replies; 34+ messages in thread
From: Andrew Cooper @ 2014-09-25 21:23 UTC (permalink / raw)
  To: Chao Peng, xen-devel
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	Ian.Jackson, JBeulich, dgdegra

On 25/09/2014 11:19, Chao Peng wrote:
> If the CMT service is attached to a domain, its related RMID
> will be set to hardware for monitoring when the domain's vcpu is
> scheduled in. When the domain's vcpu is scheduled out, RMID 0
> (system reserved) will be set for monitoring.
>
> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

> ---
>  xen/arch/x86/domain.c           |    5 +++++
>  xen/arch/x86/psr.c              |   27 +++++++++++++++++++++++++++
>  xen/include/asm-x86/msr-index.h |    3 +++
>  xen/include/asm-x86/psr.h       |    1 +
>  4 files changed, 36 insertions(+)
>
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index 3cfd8f4..04a6719 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1418,6 +1418,8 @@ static void __context_switch(void)
>      {
>          memcpy(&p->arch.user_regs, stack_regs, CTXT_SWITCH_STACK_BYTES);
>          vcpu_save_fpu(p);
> +        if ( psr_cmt_enabled() )
> +            psr_assoc_rmid(0);
>          p->arch.ctxt_switch_from(p);
>      }
>  
> @@ -1442,6 +1444,9 @@ static void __context_switch(void)
>          }
>          vcpu_restore_fpu_eager(n);
>          n->arch.ctxt_switch_to(n);
> +
> +        if ( psr_cmt_enabled() && n->domain->arch.psr_rmid > 0 )
> +            psr_assoc_rmid(n->domain->arch.psr_rmid);
>      }
>  
>      gdt = !is_pv_32on64_vcpu(n) ? per_cpu(gdt_table, cpu) :
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index 41f7496..56163bd 100644
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -20,9 +20,15 @@
>  
>  #define PSR_CMT        (1<<0)
>  
> +struct pqr_assoc {
> +    uint64_t val;
> +    bool_t initialized;
> +};
> +
>  struct psr_cmt *__read_mostly  psr_cmt = NULL;
>  static bool_t __initdata opt_psr = 0;
>  static unsigned int __initdata opt_rmid_max = 255;
> +static DEFINE_PER_CPU(struct pqr_assoc, pqr_assoc);
>  
>  static void __init parse_psr_param(char *s)
>  {
> @@ -142,6 +148,27 @@ void psr_free_rmid(struct domain *d)
>      d->arch.psr_rmid = 0;
>  }
>  
> +void psr_assoc_rmid(unsigned int rmid)
> +{
> +    uint64_t val;
> +    uint64_t new_val;
> +    struct pqr_assoc *pqr = &this_cpu(pqr_assoc);
> +
> +    if ( !pqr->initialized )
> +    {
> +        rdmsrl(MSR_IA32_PQR_ASSOC, pqr->val);
> +        pqr->initialized = 1;
> +    }
> +    val = pqr->val;
> +
> +    new_val = (val & ~psr_cmt->rmid_mask) | (rmid & psr_cmt->rmid_mask);
> +    if ( val != new_val )
> +    {
> +        wrmsrl(MSR_IA32_PQR_ASSOC, new_val);
> +        pqr->val = new_val;
> +    }
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
> index 542222e..dcb2b87 100644
> --- a/xen/include/asm-x86/msr-index.h
> +++ b/xen/include/asm-x86/msr-index.h
> @@ -323,6 +323,9 @@
>  #define MSR_IA32_TSC_DEADLINE		0x000006E0
>  #define MSR_IA32_ENERGY_PERF_BIAS	0x000001b0
>  
> +/* Platform Shared Resource MSRs */
> +#define MSR_IA32_PQR_ASSOC		0x00000c8f
> +
>  /* Intel Model 6 */
>  #define MSR_P6_PERFCTR(n)		(0x000000c1 + (n))
>  #define MSR_P6_EVNTSEL(n)		(0x00000186 + (n))
> diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
> index 930b22b..00b4625 100644
> --- a/xen/include/asm-x86/psr.h
> +++ b/xen/include/asm-x86/psr.h
> @@ -48,6 +48,7 @@ static inline bool_t psr_cmt_enabled(void)
>  void init_psr(void);
>  int psr_alloc_rmid(struct domain *d);
>  void psr_free_rmid(struct domain *d);
> +void psr_assoc_rmid(unsigned int rmid);
>  
>  #endif /* __ASM_PSR_H__ */
>  


* Re: [PATCH v16 01/10] x86: add generic resource (e.g. MSR) access hypercall
  2014-09-25 19:57   ` Andrew Cooper
  2014-09-25 20:12     ` Konrad Rzeszutek Wilk
@ 2014-09-26  1:19     ` Chao Peng
  2014-09-26  8:28     ` Jan Beulich
  2 siblings, 0 replies; 34+ messages in thread
From: Chao Peng @ 2014-09-26  1:19 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	Ian.Jackson, xen-devel, JBeulich, dgdegra

On Thu, Sep 25, 2014 at 08:57:38PM +0100, Andrew Cooper wrote:
> On 25/09/14 11:19, Chao Peng wrote:
> > Add a generic resource access hypercall for tool stack or other
> > components, e.g., accessing MSR, port I/O, etc.
> >
> > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> > Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> > ---
> >  xen/arch/x86/platform_hypercall.c        |   90 ++++++++++++++++++++++++++++++
> >  xen/arch/x86/x86_64/platform_hypercall.c |    4 ++
> >  xen/include/public/platform.h            |   23 ++++++++
> >  xen/include/xlat.lst                     |    2 +
> >  4 files changed, 119 insertions(+)
> >
> > diff --git a/xen/arch/x86/platform_hypercall.c b/xen/arch/x86/platform_hypercall.c
> > index 2162811..081d9f5 100644
> > --- a/xen/arch/x86/platform_hypercall.c
> > +++ b/xen/arch/x86/platform_hypercall.c
> > @@ -61,6 +61,68 @@ long cpu_down_helper(void *data);
> >  long core_parking_helper(void *data);
> >  uint32_t get_cur_idle_nums(void);
> >  
> > +struct xen_resource_access {
> > +    int32_t ret;
> > +    uint32_t nr;
> > +    XEN_GUEST_HANDLE(xenpf_resource_data_t) data;
> > +};
> > +
> > +static bool_t allow_access_msr(unsigned int msr)
> > +{
> > +    return 0;
> > +}
> > +
> > +static void resource_access(void *info)
> > +{
> > +    struct xen_resource_access *ra = info;
> > +    xenpf_resource_data_t data;
> > +    int ret = 0;
> > +    unsigned int i;
> > +
> > +    for ( i = 0; i < ra->nr; i++ )
> > +    {
> > +        if ( copy_from_guest_offset(&data, ra->data, i, 1) )
> 
> You cannot use copy_{to,from}_guest() here.  You are almost certainly on
> the wrong cpu, and running on the wrong set of pagetables.
> 
> At best, you will copy bogus data from the wrong process/domain, and
> likely corrupt it when copying back.
> 
> > +        {
> > +            ret = -EFAULT;
> > +            break;
> > +        }
> > +
> > +        if ( data.rsvd ) {
> > +            ret = -EINVAL;
> > +            break;
> > +        }
> > +
> > +        switch ( data.cmd )
> > +        {
> > +        case XEN_RESOURCE_OP_MSR_READ:
> > +        case XEN_RESOURCE_OP_MSR_WRITE:
> > +            if ( data.idx >> 32 )
> > +                ret = -EINVAL;
> > +            else if ( !allow_access_msr(data.idx) )
> > +                ret = -EACCES;
> > +            else if ( data.cmd == XEN_RESOURCE_OP_MSR_READ )
> > +                ret = rdmsr_safe(data.idx, data.val);
> > +            else
> > +                ret = wrmsr_safe(data.idx, data.val);
> > +            break;
> > +        default:
> > +            ret = -EINVAL;
> > +            break;
> > +        }
> > +
> > +        if ( ret )
> > +            break;
> > +
> > +        if ( copy_to_guest_offset(ra->data, i, &data, 1) )
> > +        {
> > +            ret = -EFAULT;
> > +            break;
> > +        }
> > +    }
> > +
> > +    ra->ret = ret;
> > +}
> > +
> >  ret_t do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op)
> >  {
> >      ret_t ret = 0;
> > @@ -601,6 +663,34 @@ ret_t do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op)
> >      }
> >      break;
> >  
> > +    case XENPF_resource_op:
> > +    {
> > +        struct xen_resource_access ra;
> > +        struct xenpf_resource_op *rsc_op = &op->u.resource_op;
> > +        unsigned int cpu = smp_processor_id();
> > +
> > +        ra.nr = rsc_op->nr;
> > +        ra.data = rsc_op->data;
> 
> You must do all copy_{from,to}_user() here, and strictly only pass Xen
> pointers to resource_access().
> 
> This means you will need to xmalloc() yourself some space for the
> xenpf_resource_data_t array.
> 
> 
> On a different note, you need to enforce a maximum resource_op.nr of
> something rather low (16/32 perhaps?) to prevent a toolstack asking
> for 0xffffffff non-preemptible operations.
> 
Correct, thanks Andrew.
Chao
> 
> > +
> > +        if ( rsc_op->cpu == cpu )
> > +            resource_access(&ra);
> > +        else if ( cpu_online(rsc_op->cpu) )
> > +            on_selected_cpus(cpumask_of(rsc_op->cpu),
> > +                         resource_access, &ra, 1);
> > +        else
> > +        {
> > +            ret = -ENODEV;
> > +            break;
> > +        }
> > +
> > +        if ( ra.ret )
> > +        {
> > +            ret = ra.ret;
> > +            break;
> > +        }
> > +    }
> > +    break;
> > +
> >      default:
> >          ret = -ENOSYS;
> >          break;
> > diff --git a/xen/arch/x86/x86_64/platform_hypercall.c b/xen/arch/x86/x86_64/platform_hypercall.c
> > index b6f380e..4db6622 100644
> > --- a/xen/arch/x86/x86_64/platform_hypercall.c
> > +++ b/xen/arch/x86/x86_64/platform_hypercall.c
> > @@ -32,6 +32,10 @@ CHECK_pf_pcpu_version;
> >  CHECK_pf_enter_acpi_sleep;
> >  #undef xen_pf_enter_acpi_sleep
> >  
> > +#define xen_pf_resource_data xenpf_resource_data
> > +CHECK_pf_resource_data;
> > +#undef xen_pf_resource_data
> > +
> >  #define COMPAT
> >  #define _XEN_GUEST_HANDLE(t) XEN_GUEST_HANDLE(t)
> >  #define _XEN_GUEST_HANDLE_PARAM(t) XEN_GUEST_HANDLE_PARAM(t)
> > diff --git a/xen/include/public/platform.h b/xen/include/public/platform.h
> > index 053b9fa..e4d9091 100644
> > --- a/xen/include/public/platform.h
> > +++ b/xen/include/public/platform.h
> > @@ -527,6 +527,28 @@ struct xenpf_core_parking {
> >  typedef struct xenpf_core_parking xenpf_core_parking_t;
> >  DEFINE_XEN_GUEST_HANDLE(xenpf_core_parking_t);
> >  
> > +#define XENPF_resource_op   61
> > +
> > +#define XEN_RESOURCE_OP_MSR_READ  0
> > +#define XEN_RESOURCE_OP_MSR_WRITE 1
> > +
> > +struct xenpf_resource_data {
> > +    uint32_t cmd;       /* XEN_RESOURCE_OP_* */
> > +    uint32_t rsvd;
> > +    uint64_t idx;
> > +    uint64_t val;
> > +};
> > +typedef struct xenpf_resource_data xenpf_resource_data_t;
> > +DEFINE_XEN_GUEST_HANDLE(xenpf_resource_data_t);
> > +
> > +struct xenpf_resource_op {
> > +    uint32_t nr;    /* number of data entry */
> > +    uint32_t cpu;   /* which cpu to run */
> > +    XEN_GUEST_HANDLE(xenpf_resource_data_t) data;
> > +};
> > +typedef struct xenpf_resource_op xenpf_resource_op_t;
> > +DEFINE_XEN_GUEST_HANDLE(xenpf_resource_op_t);
> > +
> >  /*
> >   * ` enum neg_errnoval
> >   * ` HYPERVISOR_platform_op(const struct xen_platform_op*);
> > @@ -553,6 +575,7 @@ struct xen_platform_op {
> >          struct xenpf_cpu_hotadd        cpu_add;
> >          struct xenpf_mem_hotadd        mem_add;
> >          struct xenpf_core_parking      core_parking;
> > +        struct xenpf_resource_op       resource_op;
> >          uint8_t                        pad[128];
> >      } u;
> >  };
> > diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
> > index 9a35dd7..100fcf5 100644
> > --- a/xen/include/xlat.lst
> > +++ b/xen/include/xlat.lst
> > @@ -88,6 +88,8 @@
> >  ?	xenpf_enter_acpi_sleep		platform.h
> >  ?	xenpf_pcpuinfo			platform.h
> >  ?	xenpf_pcpu_version		platform.h
> > +?	xenpf_resource_op		platform.h
> > +?	xenpf_resource_data		platform.h
> >  !	sched_poll			sched.h
> >  ?	sched_remote_shutdown		sched.h
> >  ?	sched_shutdown			sched.h
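
For illustration, the restructuring Andrew asks for above might look roughly
like the following. This is a minimal user-space sketch, not hypervisor code:
memcpy() stands in for copy_from_guest()/copy_to_guest(), malloc() for
xmalloc(), a direct call for on_selected_cpus(), and both
RESOURCE_OP_MAX_ENTRIES and the placeholder "MSR access" are assumptions made
up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cap on entries per op; not a value taken from the patch. */
#define RESOURCE_OP_MAX_ENTRIES 2

/* Same layout as xenpf_resource_data in the patch. */
struct resource_data {
    uint32_t cmd;
    uint32_t rsvd;
    uint64_t idx;
    uint64_t val;
};

struct resource_access {
    int ret;
    unsigned int nr;
    struct resource_data *data;   /* private copy, never a guest handle */
};

/*
 * Stand-in for the worker that would run on the target CPU.  It only
 * touches the private buffer, so it does not depend on whose page tables
 * are currently loaded.
 */
static void resource_access(void *info)
{
    struct resource_access *ra = info;
    unsigned int i;

    assert(ra->data != NULL);
    ra->ret = 0;

    for ( i = 0; i < ra->nr; i++ )
    {
        if ( ra->data[i].rsvd )
        {
            ra->ret = -EINVAL;
            break;
        }
        /* Placeholder for rdmsr_safe()/wrmsr_safe(). */
        ra->data[i].val = ra->data[i].idx + 1;
    }
}

/* Stand-in for the XENPF_resource_op case; guest_buf models guest memory. */
static int resource_op(struct resource_data *guest_buf, unsigned int nr)
{
    struct resource_access ra;

    /* Bound the amount of non-preemptible work before allocating. */
    if ( nr == 0 || nr > RESOURCE_OP_MAX_ENTRIES )
        return -EINVAL;

    ra.nr = nr;
    ra.data = malloc(nr * sizeof(*ra.data));
    if ( !ra.data )
        return -ENOMEM;

    /* Copy in while still in the caller's context. */
    memcpy(ra.data, guest_buf, nr * sizeof(*ra.data));

    resource_access(&ra);          /* would be dispatched to rsc_op->cpu */

    /* Copy results back, again in the caller's context. */
    if ( !ra.ret )
        memcpy(guest_buf, ra.data, nr * sizeof(*ra.data));

    free(ra.data);
    return ra.ret;
}
```

The point of the split is that every guest access happens in resource_op(),
and the worker sees nothing but hypervisor-owned memory of a bounded size.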


* Re: [PATCH v16 01/10] x86: add generic resource (e.g. MSR) access hypercall
  2014-09-25 20:12     ` Konrad Rzeszutek Wilk
  2014-09-25 20:17       ` Konrad Rzeszutek Wilk
@ 2014-09-26  1:34       ` Chao Peng
  1 sibling, 0 replies; 34+ messages in thread
From: Chao Peng @ 2014-09-26  1:34 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	Andrew Cooper, Ian.Jackson, xen-devel, JBeulich, dgdegra

On Thu, Sep 25, 2014 at 04:12:18PM -0400, Konrad Rzeszutek Wilk wrote:
> On Thu, Sep 25, 2014 at 08:57:38PM +0100, Andrew Cooper wrote:
> > On 25/09/14 11:19, Chao Peng wrote:
> > > Add a generic resource access hypercall for tool stack or other
> > > components, e.g., accessing MSR, port I/O, etc.
> > >
> 
> You should include a bit more information in the description.
> Please give a bit of information on what kind of parameters the
> hypercall is to have.
ok
> 
> > > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> > > Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> > > ---
> > >  xen/arch/x86/platform_hypercall.c        |   90 ++++++++++++++++++++++++++++++
> > >  xen/arch/x86/x86_64/platform_hypercall.c |    4 ++
> > >  xen/include/public/platform.h            |   23 ++++++++
> > >  xen/include/xlat.lst                     |    2 +
> > >  4 files changed, 119 insertions(+)
> > >
> > > diff --git a/xen/arch/x86/platform_hypercall.c b/xen/arch/x86/platform_hypercall.c
> > > index 2162811..081d9f5 100644
> > > --- a/xen/arch/x86/platform_hypercall.c
> > > +++ b/xen/arch/x86/platform_hypercall.c
> > > @@ -61,6 +61,68 @@ long cpu_down_helper(void *data);
> > >  long core_parking_helper(void *data);
> > >  uint32_t get_cur_idle_nums(void);
> > >  
> > > +struct xen_resource_access {
> > > +    int32_t ret;
> > > +    uint32_t nr;
> > > +    XEN_GUEST_HANDLE(xenpf_resource_data_t) data;
> > > +};
> > > +
> > > +static bool_t allow_access_msr(unsigned int msr)
> > > +{
> > > +    return 0;
> > > +}
> > > +
> > > +static void resource_access(void *info)
> > > +{
> > > +    struct xen_resource_access *ra = info;
> > > +    xenpf_resource_data_t data;
> > > +    int ret = 0;
> > > +    unsigned int i;
> > > +
> > > +    for ( i = 0; i < ra->nr; i++ )
> > > +    {
> > > +        if ( copy_from_guest_offset(&data, ra->data, i, 1) )
> > 
> > You cannot use copy_{to,from}_guest() here.  You are almost certainly on
> > the wrong cpu, and running on the wrong set of pagetables.
> > 
> > At best, you will copy bogus data from the wrong process/domain, and
> > likely corrupt it when copying back.
> > 
> > > +        {
> > > +            ret = -EFAULT;
> > > +            break;
> > > +        }
> > > +
> > > +        if ( data.rsvd ) {
> 
> That '{' needs to be on its own line.
> 
> > > +            ret = -EINVAL;
> > > +            break;
> > > +        }
> > > +
> > > +        switch ( data.cmd )
> > > +        {
> > > +        case XEN_RESOURCE_OP_MSR_READ:
> > > +        case XEN_RESOURCE_OP_MSR_WRITE:
> > > +            if ( data.idx >> 32 )
> > > +                ret = -EINVAL;
> > > +            else if ( !allow_access_msr(data.idx) )
> > > +                ret = -EACCES;
> > > +            else if ( data.cmd == XEN_RESOURCE_OP_MSR_READ )
> > > +                ret = rdmsr_safe(data.idx, data.val);
> > > +            else
> > > +                ret = wrmsr_safe(data.idx, data.val);
> > > +            break;
> > > +        default:
> > > +            ret = -EINVAL;
> > > +            break;
> > > +        }
> > > +
> > > +        if ( ret )
> > > +            break;
> > > +
> > > +        if ( copy_to_guest_offset(ra->data, i, &data, 1) )
> > > +        {
> > > +            ret = -EFAULT;
> > > +            break;
> > > +        }
> > > +    }
> > > +
> > > +    ra->ret = ret;
> > > +}
> > > +
> > >  ret_t do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op)
> > >  {
> > >      ret_t ret = 0;
> > > @@ -601,6 +663,34 @@ ret_t do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op)
> > >      }
> > >      break;
> > >  
> > > +    case XENPF_resource_op:
> > > +    {
> > > +        struct xen_resource_access ra;
> > > +        struct xenpf_resource_op *rsc_op = &op->u.resource_op;
> > > +        unsigned int cpu = smp_processor_id();
> > > +
> > > +        ra.nr = rsc_op->nr;
> > > +        ra.data = rsc_op->data;
> > 
> > You must do all copy_{from,to}_user() here, and strictly only pass Xen
> > pointers to resource_access().
> > 
> > This means you will need to xmalloc() yourself some space for the
> > xenpf_resource_data_t array.
> > 
> > 
> > On a different note, you need to enforce a maximum resource_op.nr of
> > something rather low (16/32 perhaps?) to prevent a toolstack asking
> > for 0xffffffff non-preemptible operations.
> 
> The toolstack has:
> 
> int xc_resource_op(xc_interface *xch, uint32_t nr_ops, xc_resource_op_t *ops)
> {
>     if ( nr_ops == 1 )
>         return xc_resource_op_one(xch, ops);
> 
> And with that expectation we ought to have a similar check here, in the form
> of:
> 
> 	if ( ra.nr != 1 )
> 		return -EINVAL;
> 
The 'nr' here is not 'nr_ops' but the number of entries contained in one 'op'.
This hypercall processes only one 'op' at a time; multiple 'ops' are batched
with multicall, which passes each 'op' down to this hypercall.

I think I can improve the name here, though (nr => nr_entries).
Thanks.

> > 
> > ~Andrew
> > 
> > > +
> > > +        if ( rsc_op->cpu == cpu )
> > > +            resource_access(&ra);
> > > +        else if ( cpu_online(rsc_op->cpu) )
> > > +            on_selected_cpus(cpumask_of(rsc_op->cpu),
> > > +                         resource_access, &ra, 1);
> > > +        else
> > > +        {
> > > +            ret = -ENODEV;
> > > +            break;
> > > +        }
> > > +
> > > +        if ( ra.ret )
> > > +        {
> > > +            ret = ra.ret;
> > > +            break;
> > > +        }
> > > +    }
> > > +    break;
> > > +
> > >      default:
> > >          ret = -ENOSYS;
> > >          break;
> > > diff --git a/xen/arch/x86/x86_64/platform_hypercall.c b/xen/arch/x86/x86_64/platform_hypercall.c
> > > index b6f380e..4db6622 100644
> > > --- a/xen/arch/x86/x86_64/platform_hypercall.c
> > > +++ b/xen/arch/x86/x86_64/platform_hypercall.c
> > > @@ -32,6 +32,10 @@ CHECK_pf_pcpu_version;
> > >  CHECK_pf_enter_acpi_sleep;
> > >  #undef xen_pf_enter_acpi_sleep
> > >  
> > > +#define xen_pf_resource_data xenpf_resource_data
> > > +CHECK_pf_resource_data;
> > > +#undef xen_pf_resource_data
> > > +
> > >  #define COMPAT
> > >  #define _XEN_GUEST_HANDLE(t) XEN_GUEST_HANDLE(t)
> > >  #define _XEN_GUEST_HANDLE_PARAM(t) XEN_GUEST_HANDLE_PARAM(t)
> > > diff --git a/xen/include/public/platform.h b/xen/include/public/platform.h
> > > index 053b9fa..e4d9091 100644
> > > --- a/xen/include/public/platform.h
> > > +++ b/xen/include/public/platform.h
> > > @@ -527,6 +527,28 @@ struct xenpf_core_parking {
> > >  typedef struct xenpf_core_parking xenpf_core_parking_t;
> > >  DEFINE_XEN_GUEST_HANDLE(xenpf_core_parking_t);
> > >  
> > > +#define XENPF_resource_op   61
> 
> More details please.
ok
> > > +
> > > +#define XEN_RESOURCE_OP_MSR_READ  0
> > > +#define XEN_RESOURCE_OP_MSR_WRITE 1
> > > +
> > > +struct xenpf_resource_data {
> > > +    uint32_t cmd;       /* XEN_RESOURCE_OP_* */
> > > +    uint32_t rsvd;
> > > +    uint64_t idx;
> > > +    uint64_t val;
> 
> More details please. Pls say what the 'rsvd' is for, what
> the expected values are for 'idx', and 'val'.
> 
> Do also say which ones are IN or OUT.
> 
> Put yourself in the mindset of somebody who wants to use this
> and does not want to dive in the hypervisor to figure this out.
> Give as much information as possible in the headers.
> 
Good suggestion, I will follow this.
> > > +typedef struct xenpf_resource_data xenpf_resource_data_t;
> > > +DEFINE_XEN_GUEST_HANDLE(xenpf_resource_data_t);
> > > +
> > > +struct xenpf_resource_op {
> > > +    uint32_t nr;    /* number of data entry */
> > > +    uint32_t cpu;   /* which cpu to run */
> > > +    XEN_GUEST_HANDLE(xenpf_resource_data_t) data;
> > > +};
> > > +typedef struct xenpf_resource_op xenpf_resource_op_t;
> > > +DEFINE_XEN_GUEST_HANDLE(xenpf_resource_op_t);
> > > +
> > >  /*
> > >   * ` enum neg_errnoval
> > >   * ` HYPERVISOR_platform_op(const struct xen_platform_op*);
> > > @@ -553,6 +575,7 @@ struct xen_platform_op {
> > >          struct xenpf_cpu_hotadd        cpu_add;
> > >          struct xenpf_mem_hotadd        mem_add;
> > >          struct xenpf_core_parking      core_parking;
> > > +        struct xenpf_resource_op       resource_op;
> 
> resource_op?  I would really call this 'msr' or 'msr_data'
So you already know that this is not just for MSRs.
> 
> 
> > >          uint8_t                        pad[128];
> > >      } u;
> > >  };
> > > diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
> > > index 9a35dd7..100fcf5 100644
> > > --- a/xen/include/xlat.lst
> > > +++ b/xen/include/xlat.lst
> > > @@ -88,6 +88,8 @@
> > >  ?	xenpf_enter_acpi_sleep		platform.h
> > >  ?	xenpf_pcpuinfo			platform.h
> > >  ?	xenpf_pcpu_version		platform.h
> > > +?	xenpf_resource_op		platform.h
> > > +?	xenpf_resource_data		platform.h
> > >  !	sched_poll			sched.h
> > >  ?	sched_remote_shutdown		sched.h
> > >  ?	sched_shutdown			sched.h
> > 
> > 


* Re: [PATCH v16 04/10] x86: detect and initialize Cache Monitoring Technology feature
  2014-09-25 21:14     ` Andrew Cooper
@ 2014-09-26  1:54       ` Chao Peng
  0 siblings, 0 replies; 34+ messages in thread
From: Chao Peng @ 2014-09-26  1:54 UTC (permalink / raw)
  To: Andrew Cooper, Konrad Rzeszutek Wilk
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	Ian.Jackson, xen-devel, JBeulich, dgdegra

On Thu, Sep 25, 2014 at 10:14:35PM +0100, Andrew Cooper wrote:
> On 25/09/2014 21:33, Konrad Rzeszutek Wilk wrote:
> > On Thu, Sep 25, 2014 at 06:19:04PM +0800, Chao Peng wrote:
> >> Detect Cache Monitoring Technology(CMT) feature and enumerate the
> >> resource types, one of which is to monitor the L3 cache occupancy.
> >>
> >> Also introduce a Xen command line parameter to control the Platform
> >> Shared Resource such as CMT.
> >>
> >> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> >> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> >> ---
> >>  docs/misc/xen-command-line.markdown |   12 ++++
> >>  xen/arch/x86/Makefile               |    1 +
> >>  xen/arch/x86/psr.c                  |  107 +++++++++++++++++++++++++++++++++++
> >>  xen/arch/x86/setup.c                |    3 +
> >>  xen/include/asm-x86/cpufeature.h    |    1 +
> >>  xen/include/asm-x86/psr.h           |   53 +++++++++++++++++
> >>  6 files changed, 177 insertions(+)
> >>  create mode 100644 xen/arch/x86/psr.c
> >>  create mode 100644 xen/include/asm-x86/psr.h
> >>
> >> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> >> index af93e17..b106a46 100644
> >> --- a/docs/misc/xen-command-line.markdown
> >> +++ b/docs/misc/xen-command-line.markdown
> >> @@ -1005,6 +1005,18 @@ This option can be specified more than once (up to 8 times at present).
> >>  ### ple\_window
> >>  > `= <integer>`
> >>  
> >> +### psr (Intel)
> >> +> `= List of ( cmt:<boolean> | rmid_max:<integer> )`
> > Please explain what 'psr' is (the full name) and why one would want
> > to use it.
> >
> >> +
> >> +> Default: `psr=cmt:0,rmid_max:255`
> >> +
> 
> Personally, I think these first 5 lines of context are fine.  They
> follow the standard layout of all other parameters in this document wrt
> names, valid values and default values.
> 
> >> +Configure platform shared resource services, which are available on Intel
> >> +Haswell Server family and future platforms.
> >> +
> >> +`cmt` instructs Xen to enable/disable Cache Monitoring Technology.
> 
> I feel that the wording here could be improved for extra clarity.  How
> about:
> 
> Platform Shared Resource Services.  Intel Haswell and later server
> platforms offer information about the sharing of resources.
> 
> The following resources are available:
> * Cache Monitoring Technology (Haswell and later).  Information
> regarding the L3 cache occupancy.
This is good.
> 
> (I seem to remember another one about L3 data bandwidth to local and
> non-local memory controllers, but can't remember its name offhand)
> 
Yeah, there is local/total L3 data bandwidth monitoring. It is planned for
another patchset, so I would rather not mention it here.
> > Please include the default value.
> >
> >> +
> >> +`rmid_max` indicates the max value for rmid.
> > Couple of issues:
> >  - It reads as not optional (from the documentation) - so what are the values
> >    that can be used? What are the ranges?
> >  - What is the default value?
> >  - What is 'RMID'? 
> 
> The hardware has a maximum supported RMID, but instead of allocating
> memory based on a u32 out of cpuid, I insisted on a command line max
> parameter to provide a sane upper bound.
> 
> I would however agree that a sentence or two describing what an RMID is
> would be useful here, although being strictly the command line
> documentation, I don't think it warrants a full whitepaper's worth of detail.
> 
Sure.
> >
> > Please please expand more on this. You want users to be able to easily
> > read it and understand it right away without having to search for a
> > whitepaper on it.
> >> +
> >>  ### reboot
> >>  > `= t[riple] | k[bd] | a[cpi] | p[ci] | n[o] [, [w]arm | [c]old]`
> >>  
> >> diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
> >> index c1e244d..cf137fd 100644
> >> --- a/xen/arch/x86/Makefile
> >> +++ b/xen/arch/x86/Makefile
> >> @@ -59,6 +59,7 @@ obj-y += crash.o
> >>  obj-y += tboot.o
> >>  obj-y += hpet.o
> >>  obj-y += xstate.o
> >> +obj-y += psr.o
> >>  
> >>  obj-$(crash_debug) += gdbstub.o
> >>  
> >> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> >> new file mode 100644
> >> index 0000000..9025aeb
> >> --- /dev/null
> >> +++ b/xen/arch/x86/psr.c
> >> @@ -0,0 +1,107 @@
> >> +/*
> >> + * pqos.c: Platform Shared Resource related service for guest.
> >> + *
> >> + * Copyright (c) 2014, Intel Corporation
> >> + * Author: Dongxiao Xu <dongxiao.xu@intel.com>
> >> + *
> >> + * This program is free software; you can redistribute it and/or modify it
> >> + * under the terms and conditions of the GNU General Public License,
> >> + * version 2, as published by the Free Software Foundation.
> >> + *
> >> + * This program is distributed in the hope it will be useful, but WITHOUT
> >> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> >> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> >> + * more details.
> >> + */
> >> +#include <xen/init.h>
> >> +#include <xen/cpu.h>
> >> +#include <asm/psr.h>
> >> +
> >> +#define PSR_CMT        (1<<0)
> >> +
> >> +struct psr_cmt *__read_mostly  psr_cmt = NULL;
> > Extra space. No need for the NULL assignment (as this is in .rodata section).
> >
> >> +static bool_t __initdata opt_psr = 0;
> > Ditto. No need to assign 0.
> >
> >> +static unsigned int __initdata opt_rmid_max = 255;
> > It is not really an 'optional' value as the default is '255'.
> >
> > I would just call it 'rmid_max'.
> 
> It is a value provided on the command line, which likely differs from
> CPUID's max reported RMID.  opt_$FOO is the correct variable name.
> 
> >> +
> >> +static void __init parse_psr_param(char *s)
> >> +{
> >> +    char *ss, *val_str;
> >> +
> >> +    do {
> >> +        ss = strchr(s, ',');
> >> +        if ( ss )
> >> +            *ss = '\0';
> >> +
> >> +        val_str = strchr(s, ':');
> >> +        if ( val_str )
> >> +            *val_str++ = '\0';
> >> +
> >> +        if ( !strcmp(s, "cmt")
> >> +             && ( !val_str || parse_bool(val_str) == 1 )) {
> >> +            opt_psr &= PSR_CMT;
> > &= ?
> >
> > Not opt_psr |= ?
> 
> Agreed - this appears to disable cmt if specified.
> 
> >
> >> +        } else if ( val_str && !strcmp(s, "rmid_max") )
> >> +            opt_rmid_max = simple_strtoul(val_str, NULL, 0);
> >> +
> >> +        s = ss + 1;
> >> +    } while ( ss );
> >> +}
> >> +custom_param("psr", parse_psr_param);
> >> +
> >> +static void __init init_psr_cmt(unsigned int rmid_max)
> >> +{
> >> +    unsigned int eax, ebx, ecx, edx;
> >> +    unsigned int rmid;
> >> +
> >> +    if ( !boot_cpu_has(X86_FEATURE_CMT) )
> >> +        return;
> >> +
> >> +    cpuid_count(0xf, 0, &eax, &ebx, &ecx, &edx);
> >> +    if ( !edx )
> >> +        return;
> >> +
> >> +    psr_cmt = xzalloc(struct psr_cmt);
> >> +    if ( !psr_cmt )
> >> +        return;
> >> +
> >> +    psr_cmt->features = edx;
> >> +    psr_cmt->rmid_mask = ~(~0ull << get_count_order(ebx));
> >> +    psr_cmt->rmid_max = min(rmid_max, ebx);
> >> +
> >> +    if ( psr_cmt->features & PSR_RESOURCE_TYPE_L3 )
> >> +    {
> >> +        cpuid_count(0xf, 1, &eax, &ebx, &ecx, &edx);
> >> +        psr_cmt->l3.upscaling_factor = ebx;
> >> +        psr_cmt->l3.rmid_max = ecx;
> >> +        psr_cmt->l3.features = edx;
> >> +    }
> >> +
> >> +    psr_cmt->rmid_max = min(rmid_max, psr_cmt->l3.rmid_max);
> >> +    psr_cmt->rmid_to_dom = xmalloc_array(domid_t, psr_cmt->rmid_max + 1);
> >> +    if ( !psr_cmt->rmid_to_dom )
> >> +    {
> >> +        xfree(psr_cmt);
> > And:
> > 	psr_cmt = NULL;
> >
> > ?
> 
> Good catch, as "psr_cmt == NULL" is the check for psr being enabled.
Thanks Konrad.
> ~Andrew
> 
> >> +        return;
> >> +    }
> >> +    /* Reserve RMID 0 for all domains not being monitored */
> > Full stop missing.
> >
> > Why do you reserve RMID 0? Can you include the explanation
> > in the comment please?
ok
> >
> >> +    psr_cmt->rmid_to_dom[0] = DOMID_XEN;
> >> +    for ( rmid = 1; rmid <= psr_cmt->rmid_max; rmid++ )
> >> +        psr_cmt->rmid_to_dom[rmid] = DOMID_INVALID;
> >> +
> >> +    printk(XENLOG_INFO "Cache Monitoring Technology Enabled.\n");
> >> +}
> >> +
> >> +void __init init_psr(void)
> >
> >> +{
> >> +    if ( opt_psr & PSR_CMT && opt_rmid_max )
> >> +        init_psr_cmt(opt_rmid_max);
> >> +}
> > __initcall(init_psr);
> >
> >> +
> >> +/*
> >> + * Local variables:
> >> + * mode: C
> >> + * c-file-style: "BSD"
> >> + * c-basic-offset: 4
> >> + * tab-width: 4
> >> + * indent-tabs-mode: nil
> >> + * End:
> >> + */
> >> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> >> index 8c8b91f..ca4785e 100644
> >> --- a/xen/arch/x86/setup.c
> >> +++ b/xen/arch/x86/setup.c
> >> @@ -49,6 +49,7 @@
> >>  #include <xen/cpu.h>
> >>  #include <asm/nmi.h>
> >>  #include <asm/alternative.h>
> >> +#include <asm/psr.h>
> >>  
> >>  /* opt_nosmp: If true, secondary processors are ignored. */
> >>  static bool_t __initdata opt_nosmp;
> >> @@ -1430,6 +1431,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
> >>  
> >>      domain_unpause_by_systemcontroller(dom0);
> >>  
> >> +    init_psr();
> >> +
> > And then you can remove this.
> >
> >>      reset_stack_and_jump(init_done);
> >>  }
> >>  
> >> diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
> >> index 8014241..137d75c 100644
> >> --- a/xen/include/asm-x86/cpufeature.h
> >> +++ b/xen/include/asm-x86/cpufeature.h
> >> @@ -148,6 +148,7 @@
> >>  #define X86_FEATURE_ERMS	(7*32+ 9) /* Enhanced REP MOVSB/STOSB */
> >>  #define X86_FEATURE_INVPCID	(7*32+10) /* Invalidate Process Context ID */
> >>  #define X86_FEATURE_RTM 	(7*32+11) /* Restricted Transactional Memory */
> >> +#define X86_FEATURE_CMT 	(7*32+12) /* Cache Monitoring Technology */
> >>  #define X86_FEATURE_NO_FPU_SEL 	(7*32+13) /* FPU CS/DS stored as zero */
> >>  #define X86_FEATURE_MPX		(7*32+14) /* Memory Protection Extensions */
> >>  #define X86_FEATURE_RDSEED	(7*32+18) /* RDSEED instruction */
> >> diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
> >> new file mode 100644
> >> index 0000000..e321890
> >> --- /dev/null
> >> +++ b/xen/include/asm-x86/psr.h
> >> @@ -0,0 +1,53 @@
> >> +/*
> >> + * psr.h: Platform Shared Resource related service for guest.
> >> + *
> >> + * Copyright (c) 2014, Intel Corporation
> >> + * Author: Dongxiao Xu <dongxiao.xu@intel.com>
> >> + *
> >> + * This program is free software; you can redistribute it and/or modify it
> >> + * under the terms and conditions of the GNU General Public License,
> >> + * version 2, as published by the Free Software Foundation.
> >> + *
> >> + * This program is distributed in the hope it will be useful, but WITHOUT
> >> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> >> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> >> + * more details.
> >> + */
> >> +#ifndef __ASM_PSR_H__
> >> +#define __ASM_PSR_H__
> >> +
> >> +/* Resource Type Enumeration */
> >> +#define PSR_RESOURCE_TYPE_L3            0x2
> >> +
> >> +/* L3 Monitoring Features */
> >> +#define PSR_CMT_L3_OCCUPANCY           0x1
> >> +
> >> +struct psr_cmt_l3 {
> >> +    unsigned int features;
> >> +    unsigned int upscaling_factor;
> >> +    unsigned int rmid_max;
> >> +};
> >> +
> >> +struct psr_cmt {
> >> +    unsigned long rmid_mask;
> >> +    unsigned int rmid_max;
> >> +    unsigned int features;
> >> +    domid_t *rmid_to_dom;
> >> +    struct psr_cmt_l3 l3;
> >> +};
> >> +
> >> +extern struct psr_cmt *psr_cmt;
> >> +
> >> +void init_psr(void);
> >> +
> >> +#endif /* __ASM_PSR_H__ */
> >> +
> >> +/*
> >> + * Local variables:
> >> + * mode: C
> >> + * c-file-style: "BSD"
> >> + * c-basic-offset: 4
> >> + * tab-width: 4
> >> + * indent-tabs-mode: nil
> >> + * End:
> >> + */
> >> -- 
> >> 1.7.9.5
> >>
> >>
> >> _______________________________________________
> >> Xen-devel mailing list
> >> Xen-devel@lists.xen.org
> >> http://lists.xen.org/xen-devel
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
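
As a sketch of the "&=" vs "|=" point raised above, the option parser could
be exercised in user space along these lines. Everything here is a stand-in:
parse_bool_sketch() only approximates Xen's parse_bool(), strtoul() replaces
simple_strtoul(), and struct psr_opts is a made-up container for what the
patch keeps in opt_psr and opt_rmid_max.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PSR_CMT (1u << 0)

/* Hypothetical container for the two command line settings. */
struct psr_opts {
    unsigned int flags;
    unsigned int rmid_max;
};

/* Rough approximation of parse_bool(): 1 true, 0 false, -1 unrecognised. */
static int parse_bool_sketch(const char *s)
{
    if ( !strcmp(s, "1") || !strcmp(s, "yes") || !strcmp(s, "on") ||
         !strcmp(s, "true") || !strcmp(s, "enable") )
        return 1;
    if ( !strcmp(s, "0") || !strcmp(s, "no") || !strcmp(s, "off") ||
         !strcmp(s, "false") || !strcmp(s, "disable") )
        return 0;
    return -1;
}

/* Parses "cmt:<bool>,rmid_max:<int>" in place; the buffer is modified. */
static void parse_psr(char *s, struct psr_opts *o)
{
    char *ss, *val_str;

    do {
        ss = strchr(s, ',');
        if ( ss )
            *ss = '\0';

        val_str = strchr(s, ':');
        if ( val_str )
            *val_str++ = '\0';

        if ( !strcmp(s, "cmt") &&
             (!val_str || parse_bool_sketch(val_str) == 1) )
            o->flags |= PSR_CMT;   /* "&=" on a zero value could never set it */
        else if ( val_str && !strcmp(s, "rmid_max") )
            o->rmid_max = strtoul(val_str, NULL, 0);

        s = ss ? ss + 1 : NULL;
    } while ( ss );
}
```

With "|=" a plain "psr=cmt" or "psr=cmt:1" sets the flag, while "psr=cmt:0"
leaves the default (off) untouched, which is what the documented default
suggests was intended.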


* Re: [PATCH v16 01/10] x86: add generic resource (e.g. MSR) access hypercall
  2014-09-25 19:57   ` Andrew Cooper
  2014-09-25 20:12     ` Konrad Rzeszutek Wilk
  2014-09-26  1:19     ` Chao Peng
@ 2014-09-26  8:28     ` Jan Beulich
  2014-09-26  8:58       ` Chao Peng
  2 siblings, 1 reply; 34+ messages in thread
From: Jan Beulich @ 2014-09-26  8:28 UTC (permalink / raw)
  To: Andrew Cooper, Chao Peng, xen-devel
  Cc: keir, Ian.Campbell, George.Dunlap, stefano.stabellini,
	Ian.Jackson, dgdegra

>>> On 25.09.14 at 21:57, <andrew.cooper3@citrix.com> wrote:
> On a different note, you need to enforce a maximum resource_op.nr of
> something rather low (16/32 perhaps?) to prevent a toolstack asking
> for 0xffffffff non-preemptible operations.

I'd start out as low as we can, i.e. right now just 2 (write followed
by read). Any increase of the threshold will then need proper (and
trackable) reasoning.
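
For reference, the write-followed-by-read pair could be built like the sketch
below. The MSR numbers are the ones from patch 08/10; the event ID and the
bit positions (RMID starting at bit 32 of the event select register, error
and unavailable flags in bits 63 and 62 of the counter) are my reading of the
SDM and should be double-checked there. The struct and helper names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_IA32_QOSEVTSEL 0x00000c8d   /* as named in patch 08/10 */
#define MSR_IA32_QMC       0x00000c8e

#define RESOURCE_OP_MSR_READ  0
#define RESOURCE_OP_MSR_WRITE 1

/* Assumed from the SDM, not from this series. */
#define QM_EVTID_L3_OCCUPANCY 0x1
#define QMC_ERROR             (1ull << 63)
#define QMC_UNAVAILABLE       (1ull << 62)

struct resource_entry {
    uint32_t cmd;
    uint32_t rsvd;
    uint64_t idx;
    uint64_t val;
};

/* Entry 0 selects (RMID, event); entry 1 reads the matching counter. */
static void build_l3_occupancy_query(struct resource_entry e[2], uint32_t rmid)
{
    e[0].cmd  = RESOURCE_OP_MSR_WRITE;
    e[0].rsvd = 0;
    e[0].idx  = MSR_IA32_QOSEVTSEL;
    e[0].val  = ((uint64_t)rmid << 32) | QM_EVTID_L3_OCCUPANCY;

    e[1].cmd  = RESOURCE_OP_MSR_READ;
    e[1].rsvd = 0;
    e[1].idx  = MSR_IA32_QMC;
    e[1].val  = 0;
}

/* Scales the raw counter to bytes; returns -1 if the value is not usable. */
static int decode_l3_occupancy(uint64_t raw, uint32_t upscaling_factor,
                               uint64_t *bytes)
{
    if ( raw & (QMC_ERROR | QMC_UNAVAILABLE) )
        return -1;

    *bytes = raw * upscaling_factor;
    return 0;
}
```

The two entries need to run without preemption in between, since another
write to the event select register on the same CPU would change what the
subsequent read returns.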

Jan


* Re: [PATCH v16 08/10] x86: add CMT related MSRs in allowed list
  2014-09-25 20:58   ` Konrad Rzeszutek Wilk
@ 2014-09-26  8:38     ` Jan Beulich
  2014-09-26 13:14       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 34+ messages in thread
From: Jan Beulich @ 2014-09-26  8:38 UTC (permalink / raw)
  To: Chao Peng, Konrad Rzeszutek Wilk
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, xen-devel, dgdegra

>>> On 25.09.14 at 22:58, <konrad.wilk@oracle.com> wrote:
> On Thu, Sep 25, 2014 at 06:19:08PM +0800, Chao Peng wrote:
>> Tool stack will try to access the two MSRs to perform CMT
>> related operations, thus added them in the allowed list.
>> 
>> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
>> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
>> ---
>>  xen/arch/x86/platform_hypercall.c |    7 +++++++
>>  xen/include/asm-x86/msr-index.h   |    2 ++
>>  2 files changed, 9 insertions(+)
>> 
>> diff --git a/xen/arch/x86/platform_hypercall.c b/xen/arch/x86/platform_hypercall.c
>> index 081d9f5..be06f3a 100644
>> --- a/xen/arch/x86/platform_hypercall.c
>> +++ b/xen/arch/x86/platform_hypercall.c
>> @@ -69,6 +69,13 @@ struct xen_resource_access {
>>  
>>  static bool_t allow_access_msr(unsigned int msr)
>>  {
>> +    switch ( msr )
>> +    {
>> +    case MSR_IA32_QOSEVTSEL:
>> +    case MSR_IA32_QMC:
>> +        return 1;
>> +    }
>> +
>>      return 0;
>>  }
>>  
>> diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
>> index dcb2b87..ae089fb 100644
>> --- a/xen/include/asm-x86/msr-index.h
>> +++ b/xen/include/asm-x86/msr-index.h
>> @@ -324,6 +324,8 @@
>>  #define MSR_IA32_ENERGY_PERF_BIAS	0x000001b0
>>  
>>  /* Platform Shared Resource MSRs */
>> +#define MSR_IA32_QOSEVTSEL		0x00000c8d
>> +#define MSR_IA32_QMC			0x00000c8e
> 
> Could you mention where they are in the SDM ?

In the MSR related appendix of course. Let's not go overboard with
adding all kinds of information here that was never added for other
MSRs: The header's purpose is just giving names to numbers.

Jan


* Re: [PATCH v16 01/10] x86: add generic resource (e.g. MSR) access hypercall
  2014-09-26  8:28     ` Jan Beulich
@ 2014-09-26  8:58       ` Chao Peng
  0 siblings, 0 replies; 34+ messages in thread
From: Chao Peng @ 2014-09-26  8:58 UTC (permalink / raw)
  To: Jan Beulich
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	Andrew Cooper, Ian.Jackson, xen-devel, dgdegra

On Fri, Sep 26, 2014 at 09:28:21AM +0100, Jan Beulich wrote:
> >>> On 25.09.14 at 21:57, <andrew.cooper3@citrix.com> wrote:
> > On a different note, you need to enforce a maximum resource_op.nr of
> > something rather low (16/32 perhaps?) to prevent a toolstack asking
> > for 0xffffffff non-preemptible operations.
> 
> I'd start out as low as we can, i.e. right now just 2 (write followed
> by read). Any increase of the threshold will then need proper (and
> trackable) reasoning.
Agreed, especially when we have xmalloc() here.
> 
> Jan


* Re: [PATCH v16 06/10] x86: collect global CMT information
  2014-09-25 20:53   ` Konrad Rzeszutek Wilk
@ 2014-09-26  9:21     ` Chao Peng
  2014-09-26 13:23       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 34+ messages in thread
From: Chao Peng @ 2014-09-26  9:21 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, xen-devel, JBeulich, dgdegra

On Thu, Sep 25, 2014 at 04:53:58PM -0400, Konrad Rzeszutek Wilk wrote:
> On Thu, Sep 25, 2014 at 06:19:06PM +0800, Chao Peng wrote:
> > This implementation tries to put all policies into user space, thus some
> > global CMT information needs to be exposed, such as the total RMID count,
> > L3 upscaling factor, etc.
> > 
> > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> > Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> > Acked-by: Jan Beulich <jbeulich@suse.com>
> > ---
> >  xen/arch/x86/cpu/intel_cacheinfo.c |   49 ++----------------------------------
> >  xen/arch/x86/sysctl.c              |   43 +++++++++++++++++++++++++++++++
> >  xen/include/asm-x86/cpufeature.h   |   45 +++++++++++++++++++++++++++++++++
> >  xen/include/public/sysctl.h        |   14 +++++++++++
> >  4 files changed, 104 insertions(+), 47 deletions(-)
> > 
> > diff --git a/xen/arch/x86/cpu/intel_cacheinfo.c b/xen/arch/x86/cpu/intel_cacheinfo.c
> > index 430f939..48970c0 100644
> > --- a/xen/arch/x86/cpu/intel_cacheinfo.c
> > +++ b/xen/arch/x86/cpu/intel_cacheinfo.c
> > @@ -81,54 +81,9 @@ static struct _cache_table cache_table[] __cpuinitdata =
> >  	{ 0x00, 0, 0}
> >  };
> >  
> > -
> > -enum _cache_type
> > -{
> > -	CACHE_TYPE_NULL	= 0,
> > -	CACHE_TYPE_DATA = 1,
> > -	CACHE_TYPE_INST = 2,
> > -	CACHE_TYPE_UNIFIED = 3
> > -};
> > -
> > -union _cpuid4_leaf_eax {
> > -	struct {
> > -		enum _cache_type	type:5;
> > -		unsigned int		level:3;
> > -		unsigned int		is_self_initializing:1;
> > -		unsigned int		is_fully_associative:1;
> > -		unsigned int		reserved:4;
> > -		unsigned int		num_threads_sharing:12;
> > -		unsigned int		num_cores_on_die:6;
> > -	} split;
> > -	u32 full;
> > -};
> > -
> > -union _cpuid4_leaf_ebx {
> > -	struct {
> > -		unsigned int		coherency_line_size:12;
> > -		unsigned int		physical_line_partition:10;
> > -		unsigned int		ways_of_associativity:10;
> > -	} split;
> > -	u32 full;
> > -};
> > -
> > -union _cpuid4_leaf_ecx {
> > -	struct {
> > -		unsigned int		number_of_sets:32;
> > -	} split;
> > -	u32 full;
> > -};
> > -
> > -struct _cpuid4_info {
> > -	union _cpuid4_leaf_eax eax;
> > -	union _cpuid4_leaf_ebx ebx;
> > -	union _cpuid4_leaf_ecx ecx;
> > -	unsigned long size;
> > -};
> > -
> >  unsigned short			num_cache_leaves;
> >  
> > -static int __cpuinit cpuid4_cache_lookup(int index, struct _cpuid4_info *this_leaf)
> > +int cpuid4_cache_lookup(int index, struct cpuid4_info *this_leaf)
> >  {
> >  	union _cpuid4_leaf_eax 	eax;
> >  	union _cpuid4_leaf_ebx 	ebx;
> > @@ -185,7 +140,7 @@ unsigned int __cpuinit init_intel_cacheinfo(struct cpuinfo_x86 *c)
> >  		 * parameters cpuid leaf to find the cache details
> >  		 */
> >  		for (i = 0; i < num_cache_leaves; i++) {
> > -			struct _cpuid4_info this_leaf;
> > +			struct cpuid4_info this_leaf;
> >  
> >  			int retval;
> >  
> > diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
> > index 15d4b91..b95408f 100644
> > --- a/xen/arch/x86/sysctl.c
> > +++ b/xen/arch/x86/sysctl.c
> > @@ -28,6 +28,7 @@
> >  #include <xen/nodemask.h>
> >  #include <xen/cpu.h>
> >  #include <xsm/xsm.h>
> > +#include <asm/psr.h>
> >  
> >  #define get_xen_guest_handle(val, hnd)  do { val = (hnd).p; } while (0)
> >  
> > @@ -101,6 +102,48 @@ long arch_do_sysctl(
> >      }
> >      break;
> >  
> > +    case XEN_SYSCTL_psr_cmt_op:
> > +        if ( !psr_cmt_enabled() )
> > +            return -ENODEV;
> > +
> > +        if ( sysctl->u.psr_cmt_op.flags != 0 )
> > +            return -EINVAL;
> > +
> > +        switch ( sysctl->u.psr_cmt_op.cmd )
> > +        {
> > +        case XEN_SYSCTL_PSR_CMT_enabled:
> > +            sysctl->u.psr_cmt_op.data =
> > +                (psr_cmt->features & PSR_RESOURCE_TYPE_L3) &&
> > +                (psr_cmt->l3.features & PSR_CMT_L3_OCCUPANCY);
> > +            break;
> > +        case XEN_SYSCTL_PSR_CMT_get_total_rmid:
> > +            sysctl->u.psr_cmt_op.data = psr_cmt->rmid_max;
> > +            break;
> > +        case XEN_SYSCTL_PSR_CMT_get_l3_upscaling_factor:
> > +            sysctl->u.psr_cmt_op.data = psr_cmt->l3.upscaling_factor;
> > +            break;
> > +        case XEN_SYSCTL_PSR_CMT_get_l3_cache_size:
> > +        {
> > +            struct cpuid4_info info;
> > +
> > +            ret = cpuid4_cache_lookup(3, &info);
> 
> Couldn't you use 'struct cpuinfo_x86' and extend it if you need to?
I can, indeed. Field 'x86_cache_size' is actually the L3 cache size if it is
available. I would still need to add a new field to indicate that it is L3
in order to use it this way.
> 
> 
> > +            if ( ret < 0 )
> > +                break;
> > +
> > +            sysctl->u.psr_cmt_op.data = info.size / 1024; /* in KB unit */
> 
> With the Haswell EP they have this weird setup where there
> are 8 cores on one side and 10 cores on another. Also the cache size is
> different (20MB LLC and 25MB LLC). With that wouldn't you want to enumerate
> exactly _which_ CPU cache you want instead of the one you running at?
> 
> Or is my reading of the diagrams wrong and OS never sees the split and
> gets 45MB?
Not sure, as I don't have such a machine. If this is the case, it would be
better to use a per-socket value here.
> 
> 
> > +        }
> > +        break;
> > +        default:
> > +            sysctl->u.psr_cmt_op.data = 0;
> > +            ret = -ENOSYS;
> > +            break;
> > +        }
> > +
> > +        if ( __copy_to_guest(u_sysctl, sysctl, 1) )
> > +            ret = -EFAULT;
> > +
> > +        break;
> > +
> >      default:
> >          ret = -ENOSYS;
> >          break;
> > diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
> > index 137d75c..d3bd14d 100644
> > --- a/xen/include/asm-x86/cpufeature.h
> > +++ b/xen/include/asm-x86/cpufeature.h
> > @@ -215,6 +215,51 @@
> >  #define cpu_has_vmx		boot_cpu_has(X86_FEATURE_VMXE)
> >  
> >  #define cpu_has_cpuid_faulting	boot_cpu_has(X86_FEATURE_CPUID_FAULTING)
> > +
> > +enum _cache_type {
> > +    CACHE_TYPE_NULL = 0,
> > +    CACHE_TYPE_DATA = 1,
> > +    CACHE_TYPE_INST = 2,
> > +    CACHE_TYPE_UNIFIED = 3
> > +};
> > +
> > +union _cpuid4_leaf_eax {
> > +    struct {
> > +        enum _cache_type type:5;
> > +        unsigned int level:3;
> > +        unsigned int is_self_initializing:1;
> > +        unsigned int is_fully_associative:1;
> > +        unsigned int reserved:4;
> > +        unsigned int num_threads_sharing:12;
> > +        unsigned int num_cores_on_die:6;
> > +    } split;
> > +    u32 full;
> > +};
> > +
> > +union _cpuid4_leaf_ebx {
> > +    struct {
> > +        unsigned int coherency_line_size:12;
> > +        unsigned int physical_line_partition:10;
> > +        unsigned int ways_of_associativity:10;
> > +    } split;
> > +    u32 full;
> > +};
> > +
> > +union _cpuid4_leaf_ecx {
> > +    struct {
> > +        unsigned int number_of_sets:32;
> > +    } split;
> > +    u32 full;
> > +};
> > +
> > +struct cpuid4_info {
> > +    union _cpuid4_leaf_eax eax;
> > +    union _cpuid4_leaf_ebx ebx;
> > +    union _cpuid4_leaf_ecx ecx;
> > +    unsigned long size;
> > +};
> > +
> > +int cpuid4_cache_lookup(int index, struct cpuid4_info *this_leaf);
> >  #endif
> >  
> >  #endif /* __ASM_I386_CPUFEATURE_H */
> > diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
> > index 3588698..66b6e47 100644
> > --- a/xen/include/public/sysctl.h
> > +++ b/xen/include/public/sysctl.h
> > @@ -636,6 +636,18 @@ struct xen_sysctl_coverage_op {
> >  typedef struct xen_sysctl_coverage_op xen_sysctl_coverage_op_t;
> >  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_coverage_op_t);
> >  
> > +#define XEN_SYSCTL_PSR_CMT_get_total_rmid            0
> > +#define XEN_SYSCTL_PSR_CMT_get_l3_upscaling_factor   1
> > +/* The L3 cache size is returned in KB unit */
> > +#define XEN_SYSCTL_PSR_CMT_get_l3_cache_size         2
> > +#define XEN_SYSCTL_PSR_CMT_enabled                   3
> > +struct xen_sysctl_psr_cmt_op {
> > +    uint32_t cmd;
> > +    uint32_t flags;      /* padding variable, may be extended for future use */
> > +    uint64_t data;
> > +};
> > +typedef struct xen_sysctl_psr_cmt_op xen_sysctl_psr_cmt_op_t;
> > +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cmt_op_t);
> >  
> >  struct xen_sysctl {
> >      uint32_t cmd;
> > @@ -658,6 +670,7 @@ struct xen_sysctl {
> >  #define XEN_SYSCTL_cpupool_op                    18
> >  #define XEN_SYSCTL_scheduler_op                  19
> >  #define XEN_SYSCTL_coverage_op                   20
> > +#define XEN_SYSCTL_psr_cmt_op                    21
> >      uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
> >      union {
> >          struct xen_sysctl_readconsole       readconsole;
> > @@ -679,6 +692,7 @@ struct xen_sysctl {
> >          struct xen_sysctl_cpupool_op        cpupool_op;
> >          struct xen_sysctl_scheduler_op      scheduler_op;
> >          struct xen_sysctl_coverage_op       coverage_op;
> > +        struct xen_sysctl_psr_cmt_op        psr_cmt_op;
> >          uint8_t                             pad[128];
> >      } u;
> >  };
> > -- 
> > 1.7.9.5
> > 
> > 
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xen.org
> > http://lists.xen.org/xen-devel

* Re: [PATCH v16 08/10] x86: add CMT related MSRs in allowed list
  2014-09-26  8:38     ` Jan Beulich
@ 2014-09-26 13:14       ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 34+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-09-26 13:14 UTC (permalink / raw)
  To: Jan Beulich
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, xen-devel, Chao Peng, dgdegra

On Fri, Sep 26, 2014 at 09:38:35AM +0100, Jan Beulich wrote:
> >>> On 25.09.14 at 22:58, <konrad.wilk@oracle.com> wrote:
> > On Thu, Sep 25, 2014 at 06:19:08PM +0800, Chao Peng wrote:
> >> Tool stack will try to access the two MSRs to perform CMT
> >> related operations, thus added them in the allowed list.
> >> 
> >> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> >> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> >> ---
> >>  xen/arch/x86/platform_hypercall.c |    7 +++++++
> >>  xen/include/asm-x86/msr-index.h   |    2 ++
> >>  2 files changed, 9 insertions(+)
> >> 
> >> diff --git a/xen/arch/x86/platform_hypercall.c 
> > b/xen/arch/x86/platform_hypercall.c
> >> index 081d9f5..be06f3a 100644
> >> --- a/xen/arch/x86/platform_hypercall.c
> >> +++ b/xen/arch/x86/platform_hypercall.c
> >> @@ -69,6 +69,13 @@ struct xen_resource_access {
> >>  
> >>  static bool_t allow_access_msr(unsigned int msr)
> >>  {
> >> +    switch ( msr )
> >> +    {
> >> +    case MSR_IA32_QOSEVTSEL:
> >> +    case MSR_IA32_QMC:
> >> +        return 1;
> >> +    }
> >> +
> >>      return 0;
> >>  }
> >>  
> >> diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
> >> index dcb2b87..ae089fb 100644
> >> --- a/xen/include/asm-x86/msr-index.h
> >> +++ b/xen/include/asm-x86/msr-index.h
> >> @@ -324,6 +324,8 @@
> >>  #define MSR_IA32_ENERGY_PERF_BIAS	0x000001b0
> >>  
> >>  /* Platform Shared Resource MSRs */
> >> +#define MSR_IA32_QOSEVTSEL		0x00000c8d
> >> +#define MSR_IA32_QMC			0x00000c8e
> > 
> > Could you mention where they are in the SDM ?
> 
> In the MSR related appendix of course. Let's not go overboard with
> adding all kinds of information here that was never added for other
> MSRs: The header's purpose is just giving names to numbers.

They are of course there, but the explanation of how to use them or what
they are good for is usually in other chapters of the SDM. I was hoping for
a pointer to just that chapter. But if the header is not the right location,
then perhaps in the 'allow_access_msr' function?

Anyhow not going to argue strongly for it - as it is a minor part
of the patches - so either way I am comfortable with it.
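
For context on what the two MSRs are used for, here is a minimal sketch of
the select-then-read sequence, based on my understanding of the SDM's
monitoring description (where the registers appear as IA32_QM_EVTSEL and
IA32_QM_CTR). The helper names and macros below are hypothetical; only the
bit positions are meant to reflect the SDM, and they should be checked
against it before being relied on:

```c
#include <stdbool.h>
#include <stdint.h>

/* Event ID for L3 cache occupancy, as I read the SDM. */
#define QM_EVTID_L3_OCCUPANCY 0x1u

/* Counter MSR status bits and data field (bits 61:0). */
#define QM_CTR_ERROR     (1ull << 63)
#define QM_CTR_UNAVAIL   (1ull << 62)
#define QM_CTR_DATA_MASK ((1ull << 62) - 1)

/* Value to write to the event-select MSR: RMID from bit 32 up, event ID in bits 7:0. */
static uint64_t qm_evtsel_encode(uint32_t rmid, uint8_t event_id)
{
    return ((uint64_t)rmid << 32) | event_id;
}

/*
 * Convert a raw counter MSR value into bytes using the upscaling factor.
 * Returns false if the hardware flagged the sample as an error or as
 * unavailable; *bytes is only written on success.
 */
static bool qm_ctr_to_bytes(uint64_t ctr, uint32_t upscaling_factor,
                            uint64_t *bytes)
{
    if ( ctr & (QM_CTR_ERROR | QM_CTR_UNAVAIL) )
        return false;

    *bytes = (ctr & QM_CTR_DATA_MASK) * upscaling_factor;

    return true;
}
```

This is also why the pair is a natural fit for a two-entry batch: one write
to select the RMID and event, one read to fetch the counter.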

> 
> Jan
> 

* Re: [PATCH v16 06/10] x86: collect global CMT information
  2014-09-26  9:21     ` Chao Peng
@ 2014-09-26 13:23       ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 34+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-09-26 13:23 UTC (permalink / raw)
  To: Chao Peng
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, xen-devel, JBeulich, dgdegra

> > > +        case XEN_SYSCTL_PSR_CMT_get_l3_cache_size:
> > > +        {
> > > +            struct cpuid4_info info;
> > > +
> > > +            ret = cpuid4_cache_lookup(3, &info);
> > 
> > Couldn't you use 'struct cpuinfo_x86' and extend it if you need to?
> I can, indeed. Field 'x86_cache_size' is actully l3 cache size if it is
> available. I still need to add a new field to indicate it's l3 to use
> in this way.

That should make it easier I would think? As you would not actually do
the cpuid call anymore and just pull the data from a 'new' field?

You would naturally have to make sure it also reports a sensible
value on AMD CPUs. If it is too difficult to do that in the
AMD code that fills 'struct cpuinfo_x86', then let's ignore this
whole suggestion and just keep your patch as is with regard to the
'cpuid' call.

> > 
> > 
> > > +            if ( ret < 0 )
> > > +                break;
> > > +
> > > +            sysctl->u.psr_cmt_op.data = info.size / 1024; /* in KB unit */
> > 
> > With the Haswell EP they have this weird setup where there
> > are 8 cores on one side and 10 cores on another. Also the cache size is
> > different (20MB LLC and 25MB LLC). With that wouldn't you want to enumerate
> > exactly _which_ CPU cache you want instead of the one you running at?
> > 
> > Or is my reading of the diagrams wrong and OS never sees the split and
> > gets 45MB?
> Not sure as I don't have such machine. If this is the case, better to
> use per-socket value here.

<nods> In which case my comment above is irrelevant.
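
For reference, a minimal sketch of how a cache size follows from the raw
CPUID leaf 4 registers, using the same field layout as the
_cpuid4_leaf_ebx/_cpuid4_leaf_ecx unions in the patch. The function name is
hypothetical. Since CPUID reports the cache of the CPU that executes it, a
per-socket value would presumably require running this on one CPU of each
socket:

```c
#include <stdint.h>

/*
 * Cache size in bytes from CPUID leaf 4 EBX/ECX.
 * Every field is encoded as "value minus one".
 */
static uint64_t cpuid4_cache_size_bytes(uint32_t ebx, uint32_t ecx)
{
    uint64_t line_size  = (ebx & 0xfff) + 1;          /* bits 11:0  */
    uint64_t partitions = ((ebx >> 12) & 0x3ff) + 1;  /* bits 21:12 */
    uint64_t ways       = ((ebx >> 22) & 0x3ff) + 1;  /* bits 31:22 */
    uint64_t sets       = (uint64_t)ecx + 1;          /* bits 31:0  */

    return ways * partitions * line_size * sets;
}
```

For example, 20 ways, 1 partition, 64-byte lines and 16384 sets give
20 * 1 * 64 * 16384 = 20971520 bytes, i.e. 20480 KB in the unit the sysctl
returns.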

* Re: [PATCH v16 01/10] x86: add generic resource (e.g. MSR) access hypercall
  2014-09-25 10:19 ` [PATCH v16 01/10] x86: add generic resource (e.g. MSR) access hypercall Chao Peng
  2014-09-25 19:57   ` Andrew Cooper
@ 2014-09-26 15:40   ` Jan Beulich
  2014-09-28  2:47     ` Chao Peng
  1 sibling, 1 reply; 34+ messages in thread
From: Jan Beulich @ 2014-09-26 15:40 UTC (permalink / raw)
  To: Chao Peng
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, xen-devel, dgdegra

>>> On 25.09.14 at 12:19, <chao.p.peng@linux.intel.com> wrote:
> +        if ( ret )
> +            break;
> +
> +        if ( copy_to_guest_offset(ra->data, i, &data, 1) )

As said (I think multiple times) before, considering the earlier
copy-in this should be __copy_to_guest_offset().

> +    case XENPF_resource_op:
> +    {
> +        struct xen_resource_access ra;
> +        struct xenpf_resource_op *rsc_op = &op->u.resource_op;
> +        unsigned int cpu = smp_processor_id();

This variable is used just once and hence not warranted.

> +
> +        ra.nr = rsc_op->nr;

Apart from the missing upper bound check I think you also ought
to drop out (successfully, but without causing any IPIs) when the
count is zero. And with you needing to move the copy-in of the
array here too, doing some of the error checking here without
sending IPIs might be worthwhile too.
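
A minimal sketch of that ordering, with the zero-count exit and the
per-entry checks done before any cross-CPU call. Everything here other than
allow_access_msr() and the two MSR numbers is a hypothetical stand-in
(struct resource_entry, run_on_target_cpu(), the ipis_sent counter and the
-EPERM return value), not code from the patch:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_IA32_QOSEVTSEL 0x00000c8d
#define MSR_IA32_QMC       0x00000c8e

/* Hypothetical, simplified stand-in for one batch entry. */
struct resource_entry {
    uint32_t msr;
    uint64_t val;
};

static unsigned int ipis_sent; /* counts simulated cross-CPU calls */

static bool allow_access_msr(uint32_t msr)
{
    switch ( msr )
    {
    case MSR_IA32_QOSEVTSEL:
    case MSR_IA32_QMC:
        return true;
    }

    return false;
}

/* Stand-in for the step that executes the batch on the target CPU. */
static void run_on_target_cpu(const struct resource_entry *entries,
                              uint32_t nr)
{
    (void)entries;
    (void)nr;
    ipis_sent++;
}

static int resource_op(const struct resource_entry *entries, uint32_t nr)
{
    uint32_t i;

    if ( nr == 0 )
        return 0;            /* nothing to do: succeed without an IPI */

    for ( i = 0; i < nr; i++ )
        if ( !allow_access_msr(entries[i].msr) )
            return -EPERM;   /* fail early, still without an IPI */

    run_on_target_cpu(entries, nr);

    return 0;
}
```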

> --- a/xen/include/xlat.lst
> +++ b/xen/include/xlat.lst
> @@ -88,6 +88,8 @@
>  ?	xenpf_enter_acpi_sleep		platform.h
>  ?	xenpf_pcpuinfo			platform.h
>  ?	xenpf_pcpu_version		platform.h
> +?	xenpf_resource_op		platform.h
> +?	xenpf_resource_data		platform.h

Alphabetically please. But then again - why is _op being put here
anyway? I realize there are a number of bad examples in the
compat wrapper source file, but we shouldn't extend this (and
you only put a check for _data there anyway).

Jan

* Re: [PATCH v16 04/10] x86: detect and initialize Cache Monitoring Technology feature
  2014-09-25 10:19 ` [PATCH v16 04/10] x86: detect and initialize Cache Monitoring Technology feature Chao Peng
  2014-09-25 20:33   ` Konrad Rzeszutek Wilk
@ 2014-09-26 15:45   ` Jan Beulich
  1 sibling, 0 replies; 34+ messages in thread
From: Jan Beulich @ 2014-09-26 15:45 UTC (permalink / raw)
  To: Chao Peng
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, xen-devel, dgdegra

>>> On 25.09.14 at 12:19, <chao.p.peng@linux.intel.com> wrote:
> --- a/xen/arch/x86/Makefile
> +++ b/xen/arch/x86/Makefile
> @@ -59,6 +59,7 @@ obj-y += crash.o
>  obj-y += tboot.o
>  obj-y += hpet.o
>  obj-y += xstate.o
> +obj-y += psr.o

The list upwards from here is at least roughly alphabetically ordered
(or once was), so please put your addition elsewhere than at the end.

> +static void __init parse_psr_param(char *s)
> +{
> +    char *ss, *val_str;
> +
> +    do {
> +        ss = strchr(s, ',');
> +        if ( ss )
> +            *ss = '\0';
> +
> +        val_str = strchr(s, ':');
> +        if ( val_str )
> +            *val_str++ = '\0';
> +
> +        if ( !strcmp(s, "cmt")
> +             && ( !val_str || parse_bool(val_str) == 1 )) {
> +            opt_psr &= PSR_CMT;
> +        } else if ( val_str && !strcmp(s, "rmid_max") )

Coding style.
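
A sketch of how the hunk might look once restyled, assuming the objection is
to the opening brace on the `if` line, the leading `&&`, and the spacing
inside the inner parentheses. Note two further assumptions: the sketch sets
the flag with `|=` (the quoted hunk has `&=`, which cannot set a bit), and
the body of the "rmid_max" branch is a guess because the quote cuts off
there. parse_bool(), PSR_CMT and the default below are simplified
stand-ins so that the example is self-contained:

```c
#include <stdlib.h>
#include <string.h>

#define PSR_CMT (1u << 0)               /* hypothetical bit value */

static unsigned int opt_psr;
static unsigned int opt_rmid_max = 255; /* hypothetical default */

/* Simplified stand-in for Xen's parse_bool(): 1, 0, or -1 if unknown. */
static int parse_bool(const char *s)
{
    if ( !strcmp(s, "1") || !strcmp(s, "yes") || !strcmp(s, "on") ||
         !strcmp(s, "true") )
        return 1;

    if ( !strcmp(s, "0") || !strcmp(s, "no") || !strcmp(s, "off") ||
         !strcmp(s, "false") )
        return 0;

    return -1;
}

static void parse_psr_param(char *s)
{
    char *ss, *val_str;

    do {
        /* Split off the next comma-separated item. */
        ss = strchr(s, ',');
        if ( ss )
            *ss = '\0';

        /* Split "name:value" into name and value. */
        val_str = strchr(s, ':');
        if ( val_str )
            *val_str++ = '\0';

        if ( !strcmp(s, "cmt") &&
             (!val_str || parse_bool(val_str) == 1) )
            opt_psr |= PSR_CMT;
        else if ( val_str && !strcmp(s, "rmid_max") )
            opt_rmid_max = strtoul(val_str, NULL, 0);

        s = ss ? ss + 1 : NULL;
    } while ( s );
}
```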

> +    printk(XENLOG_INFO "Cache Monitoring Technology Enabled.\n");

"... enabled\n"

Jan

* Re: [PATCH v16 01/10] x86: add generic resource (e.g. MSR) access hypercall
  2014-09-26 15:40   ` Jan Beulich
@ 2014-09-28  2:47     ` Chao Peng
  0 siblings, 0 replies; 34+ messages in thread
From: Chao Peng @ 2014-09-28  2:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: keir, Ian.Campbell, stefano.stabellini, George.Dunlap,
	andrew.cooper3, Ian.Jackson, xen-devel, dgdegra

On Fri, Sep 26, 2014 at 04:40:08PM +0100, Jan Beulich wrote:
> >>> On 25.09.14 at 12:19, <chao.p.peng@linux.intel.com> wrote:
> > +        if ( ret )
> > +            break;
> > +
> > +        if ( copy_to_guest_offset(ra->data, i, &data, 1) )
> 
> As said (I think multiple times) before, considering the earlier
> copy-in this should be __copy_to_guest_offset().
Agree, this can be optimized.
> 
> > +    case XENPF_resource_op:
> > +    {
> > +        struct xen_resource_access ra;
> > +        struct xenpf_resource_op *rsc_op = &op->u.resource_op;
> > +        unsigned int cpu = smp_processor_id();
> 
> This variable is used just once and hence not warranted.
Sure, will call it in place.
> 
> > +
> > +        ra.nr = rsc_op->nr;
> 
> Apart from the missing upper bound check I think you also ought
> to drop out (successfully, but without causing any IPIs) when the
> count is zero. And with you needing to move the copy-in of the
> array here too, doing some of the error checking here without
> sending IPIs might be worthwhile too.
NP.
> 
> > --- a/xen/include/xlat.lst
> > +++ b/xen/include/xlat.lst
> > @@ -88,6 +88,8 @@
> >  ?	xenpf_enter_acpi_sleep		platform.h
> >  ?	xenpf_pcpuinfo			platform.h
> >  ?	xenpf_pcpu_version		platform.h
> > +?	xenpf_resource_op		platform.h
> > +?	xenpf_resource_data		platform.h
> 
> Alphabetically please. But then again - why is _op being put here
> anyway? I realize there are a number of bad examples in the
> compat wrapper source file, but we shouldn't extend this (and
> you only put a check for _data there anyway).
Yes, _op should be dropped here.
> 
> Jan
> 

end of thread, other threads:[~2014-09-28  2:47 UTC | newest]

Thread overview: 34+ messages
2014-09-25 10:19 [PATCH v16 00/10] enable Cache Monitoring Technology (CMT) feature Chao Peng
2014-09-25 10:19 ` [PATCH v16 01/10] x86: add generic resource (e.g. MSR) access hypercall Chao Peng
2014-09-25 19:57   ` Andrew Cooper
2014-09-25 20:12     ` Konrad Rzeszutek Wilk
2014-09-25 20:17       ` Konrad Rzeszutek Wilk
2014-09-26  1:34       ` Chao Peng
2014-09-26  1:19     ` Chao Peng
2014-09-26  8:28     ` Jan Beulich
2014-09-26  8:58       ` Chao Peng
2014-09-26 15:40   ` Jan Beulich
2014-09-28  2:47     ` Chao Peng
2014-09-25 10:19 ` [PATCH v16 02/10] xsm: add resource operation related xsm policy Chao Peng
2014-09-25 10:19 ` [PATCH v16 03/10] tools: provide interface for generic resource access Chao Peng
2014-09-25 20:06   ` Konrad Rzeszutek Wilk
2014-09-25 10:19 ` [PATCH v16 04/10] x86: detect and initialize Cache Monitoring Technology feature Chao Peng
2014-09-25 20:33   ` Konrad Rzeszutek Wilk
2014-09-25 21:14     ` Andrew Cooper
2014-09-26  1:54       ` Chao Peng
2014-09-26 15:45   ` Jan Beulich
2014-09-25 10:19 ` [PATCH v16 05/10] x86: dynamically attach/detach CMT service for a guest Chao Peng
2014-09-25 20:41   ` Konrad Rzeszutek Wilk
2014-09-25 10:19 ` [PATCH v16 06/10] x86: collect global CMT information Chao Peng
2014-09-25 20:53   ` Konrad Rzeszutek Wilk
2014-09-26  9:21     ` Chao Peng
2014-09-26 13:23       ` Konrad Rzeszutek Wilk
2014-09-25 10:19 ` [PATCH v16 07/10] x86: enable CMT for each domain RMID Chao Peng
2014-09-25 21:23   ` Andrew Cooper
2014-09-25 10:19 ` [PATCH v16 08/10] x86: add CMT related MSRs in allowed list Chao Peng
2014-09-25 20:58   ` Konrad Rzeszutek Wilk
2014-09-26  8:38     ` Jan Beulich
2014-09-26 13:14       ` Konrad Rzeszutek Wilk
2014-09-25 10:19 ` [PATCH v16 09/10] xsm: add CMT related xsm policies Chao Peng
2014-09-25 10:19 ` [PATCH v16 10/10] tools: CMDs and APIs for Cache Monitoring Technology Chao Peng
2014-09-25 21:14   ` Konrad Rzeszutek Wilk
