All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c
@ 2017-04-01 13:53 Yi Sun
  2017-04-01 13:53 ` [PATCH v10 01/25] docs: create Cache Allocation Technology (CAT) and Code and Data Prioritization (CDP) feature document Yi Sun
                   ` (24 more replies)
  0 siblings, 25 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

Hi all,

We plan to bring a new PSR (Platform Shared Resource) feature called
Intel L2 Cache Allocation Technology (L2 CAT) to Xen. It has been enabled
in Linux Kernel.

Besides the L2 CAT implementaion, we refactor the psr.c to make it more
flexible and easily to extend to add new features. We abstract the general
operations of all features and encapsulate them into a structure. Then,
the development of new feature is simple to mainly implement these callback
functions.

The patch set can be found at:
https://github.com/yisun-git/xen.git l2_cat_v10

---
Acked and Reviewed list before V10:

a - Acked-by
r - Reviewed-by

  r  patch 1  - docs: create Cache Allocation Technology (CAT) and Code and
                Data Prioritization (CDP) feature document
  ar patch 2  - x86: refactor psr: remove L3 CAT/CDP codes.
  a  patch 4  - x86: move cpuid_count_leaf from cpuid.c to processor.h.
  a  patch 25 - docs: add L2 CAT description in docs.

---
V10 change list:

Patch 3:
    - remove initialization for 'PSR_SOCKET_L3_CAT'.
      (suggested by Jan Beulich)
    - rename 'feat_ops' to 'feat_props'.
      (suggested by Jan Beulich)
    - move 'cbm_len' to 'feat_props' because it is feature independent so far.
      (suggested by Jan Beulich)
    - move 'cos_max' to 'feat_props' because it is feature independent.
      (suggested by Jan Beulich)
    - move 'cos_num' to 'feat_props' because it is feature independent.
      (suggested by Jan Beulich)
    - remove union 'info' and struct 'psr_cat_hw_info'.
    - remove 'get_cos_max' from 'feat_props'.
      (suggested by Jan Beulich)
    - remove 'feat_mask' from 'psr_socket_info' because we can use 'features[]'
      to check if any feature is initialized.
      (suggested by Jan Beulich)
    - move 'ref_lock' above 'cos_ref'.
      (suggested by Jan Beulich)
    - adjust comments and commit message according to above changes.
Patch 4:
    - Acked by Jan.
Patch 5:
    - remove 'asm/x86_emulate.h' inclusion as it has been indirectly included.
      (suggested by Jan Beulich)
    - remove 'CAT_COS_NUM' as it is only used once.
      (suggested by Jan Beulich)
    - remove 'feat_mask'.
      (suggested by Jan Beulich)
    - changes about 'feat_props'.
      (suggested by Jan Beulich)
    - remove 'get_cos_max' hook declaration.
      (suggested by Jan Beulich)
    - modify 'cat_default_val' implementation.
      (suggested by Jan Beulich)
    - modify 'psr_alloc_feat_enabled' implementation to make it simple.
      (suggested by Jan Beulich)
    - rename 'free_feature' to 'free_socket_resources' because it is executed
      when socket is offline. It needs free resources related to the socket.
      (suggested by Jan Beulich)
    - define 'feat_init_done' to iterate feature array to check if any feature
      has been initialized.
      (suggested by Jan Beulich)
    - input 'struct cpuid_leaf' pointer into 'cat_init_feature' to avoid memory
      copy.
      (suggested by Jan Beulich)
    - modify 'cat_init_feature' to use switch and things related to above
      changes.
      (suggested by Jan Beulich)
    - add an indentation for label.
      (suggested by Jan Beulich)
Patch 6:
    - remove 'cat_get_cos_max' as 'cos_max' is a feature property now which
      can be directly used.
      (suggested by Jan Beulich)
    - replace 'info->feat_mask' check to ''feat_init_done'.
      (suggested by Jan Beulich)
Patch 7:
    - remove 'PSR_SOCKET_UNKNOWN' and use 'ASSERT_UNREACHABLE()' to handle
      this case.
      (suggested by Jan Beulich)
    - check 'feat_type'.
      (suggested by Jan Beulich)
    - adjust macros names and values to make them more appropriate.
      (suggested by Jan Beulich)
    - use 'feat_init_done'.
      (suggested by Jan Beulich)
    - changes about 'cbm_len'.
      (suggested by Jan Beulich)
Patch 8:
    - use an intermediate variable to get value and avoid cast in domctl.
      (suggested by Jan Beulich)
    - remove 'type' in 'get_val' parameters and will add it back when
      implementing CDP.
      (suggested by Jan Beulich)
    - add 'err' in 'psr_get_feat' parameters to get error number back.
      (suggested by Jan Beulich)
    - remove unnecessary variable in 'psr_get_feat'.
      (suggested by Jan Beulich)
    - use 'ASSET' to check input parameter in 'psr_get_val'.
      (suggested by Jan Beulich)
    - changes about 'feat_props'.
      (suggested by Jan Beulich)
Patch 9:
    - restore domain cos id to 0 when socket is offline.
      (suggested by Jan Beulich)
    - check 'psr_cat_op.data' to make sure only lower 32 bits are valid.
      (suggested by Jan Beulich)
    - remove unnecessary fixed width type of parameters and variables.
      (suggested by Jan Beulich)
    - rename 'insert_new_val_to_array' to 'insert_val_to_array'.
      (suggested by Jan Beulich)
    - input 'ref_lock' pointer into functions to check if it has been locked.
      (suggested by Jan Beulich)
    - add comment to declare the set process is protected by 'domctl_lock'.
      (suggested by Jan Beulich)
    - check 'feat_type'.
      (suggested by Jan Beulich)
    - remove 'feat_mask'.
      (suggested by Jan Beulich)
    - remove unnecessary criteria of ASSERT.
      (suggested by Jan Beulich)
    - adjust flow of 'psr_set_val' to avoid 'goto' for successful cases.
      (suggested by Jan Beulich)
    - use ASSERT to check 'socket_info' in 'psr_free_cos'.
      (suggested by Jan Beulich)
    - remove unnecessary comment in 'psr_free_cos'.
      (suggested by Jan Beulich)
Patch 10:
    - remove 'get_old_val' to directly call 'get_val' to get needed val.
      (suggested by Jan Beulich)
    - move 'psr_check_cbm' into 'insert_val_to_array'.
      (suggested by Jan Beulich)
    - change type of 'cbm' in 'psr_check_cbm' to 'unsigned long'.
      (suggested by Jan Beulich)
    - remove 'set_new_val' as it can be handled in generic process.
    - changes related to 'feat_props'.
      (suggested by Jan Beulich)
    - adjust flow in 'gather_val_array' to avoid array cross.
      (suggested by Jan Beulich)
    - adjust flow in 'insert_val_to_array' to avoid array cross.
      (suggested by Jan Beulich)
Patch 11:
    - remove 'compare_val' hook and its CAT implementation. Make its
      functionality be generic in 'find_cos' flow.
      (suggested by Jan Beulich)
    - changes related to 'props'.
      (suggested by Jan Beulich)
    - rename 'val_array' to 'val_ptr'.
      (suggested by Jan Beulich)
    - rename 'find' to 'found'.
      (suggested by Jan Beulich)
    - move some variables declaration and initialization into loop.
      (suggested by Jan Beulich)
    - adjust codes positions.
      (suggested by Jan Beulich)
Patch 12:
    - remove 'fits_cos_max' hook and CAT implementation. Move the process into
      generic flow.
      (suggested by Jan Beulich)
    - changes about 'props'.
      (suggested by Jan Beulich)
    - adjust codes positions.
      (suggested by Jan Beulich)
Patch 13:
    - remove 'type' from 'write_msr' parameter list. Will add it back when
      implementing CDP.
      (suggested by Jan Beulich)
    - remove unnecessary casts.
      (suggested by Jan Beulich)
    - changes about 'props'.
      (suggested by Jan Beulich)
Patch 14:
    - fix comment.
      (suggested by Jan Beulich)
    - use swith in 'cat_init_feature' to handle different feature types.
      (suggested by Jan Beulich)
    - changes about 'props'.
      (suggested by Jan Beulich)
    - restore MSRs to default value when cpu online.
      (suggested by Jan Beulich)
    - remove feat_mask.
      (suggested by Jan Beulich)
Patch 15:
    - update renamed macros used by psr_get_info.
      (suggested by Jan Beulich)
    - change 'psr_get_info' flow to cover CDP case to make codes in sysctl
      more simple.
      (suggested by Jan Beulich)
    - remove sysctl redundant codes after applying above changes.
      (suggested by Jan Beulich)
Patch 16:
    - add 'enum cbm_type type' into 'get_val' parameters to handle CDP case.
      (suggested by Jan Beulich)
Patch 17:
    - remove 'l3_cdp_get_old_val' and use 'l3_cdp_get_val' to replace it.
      (suggested by Jan Beulich)
    - remvoe 'l3_cdp_set_new_val'.
    - modify 'insert_val_to_array' flow to handle multiple COSs case.
      (suggested by Jan Beulich)
    - remove 'l3_cdp_compare_val' and implement a generic function
      'comapre_val'.
      (suggested by Jan Beulich)
    - remove 'l3_cdp_fits_cos_max'.
      (suggested by Jan Beulich)
    - introduce macro 'PSR_MAX_COS_NUM'.
    - introduce a new member in 'feat_props', 'type[PSR_MAX_COS_NUM]' to record
      all 'cbm_type' the feature has.
      (suggested by Jan Beulich)
    - modify 'gather_val_array' flow to handle multiple COSs case.
      (suggested by Jan Beulich)
    - modify 'find_cos' flow and implement 'compare_val' to handle multiple
      COSs case.
      (suggested by Jan Beulich)
    - modify 'fits_cos_max' flow to handle multiple COSs case.
      (suggested by Jan Beulich)
    - changes about 'props'.
      (suggested by Jan Beulich)
    - remove cast in 'l3_cdp_write_msr'.
      (suggested by Jan Beulich)
    - implement 'compare_val' function to compare if feature values are what
      we expect in finding flow.
    - implement 'restore_default_val' function to restore feature's COS values
      to default if the feature has multiple COSs. It is called when the COS
      ID is reduced to 0.
Patch 18:
    - implement L2 CAT case in 'cat_init_feature'.
      (suggested by Jan Beulich)
    - changes about 'props'.
      (suggested by Jan Beulich)
    - introduce 'PSR_CBM_TYPE_L2'.
Patch 19:
    - modify macro name according to previous patch change.
      (suggested by Jan Beulich)
Patch 20:
    - remove cast in domctl.
      (suggested by Jan Beulich)
Patch 21:
    - check input data and remove cast in domctl.
      (suggested by Jan Beulich)
    - remove some hooks assignment due to previous patches changes.
      (suggested by Jan Beulich)
    - remove cast in 'l2_cat_write_msr'.
      (suggested by Jan Beulich)
Patch 22:
    - change macros names according to previous changes.
      (suggested by Jan Beulich)
Patch 24:
    - fix comments.
      (suggested by Wei Liu)

Yi Sun (25):
  docs: create Cache Allocation Technology (CAT) and Code and Data
    Prioritization (CDP) feature document
  x86: refactor psr: remove L3 CAT/CDP codes.
  x86: refactor psr: implement main data structures.
  x86: move cpuid_count_leaf from cpuid.c to processor.h.
  x86: refactor psr: L3 CAT: implement CPU init and free flow.
  x86: refactor psr: L3 CAT: implement Domain init/free and schedule
    flows.
  x86: refactor psr: L3 CAT: implement get hw info flow.
  x86: refactor psr: L3 CAT: implement get value flow.
  x86: refactor psr: L3 CAT: set value: implement framework.
  x86: refactor psr: L3 CAT: set value: assemble features value array.
  x86: refactor psr: L3 CAT: set value: implement cos finding flow.
  x86: refactor psr: L3 CAT: set value: implement cos id picking flow.
  x86: refactor psr: L3 CAT: set value: implement write msr flow.
  x86: refactor psr: CDP: implement CPU init and free flow.
  x86: refactor psr: CDP: implement get hw info flow.
  x86: refactor psr: CDP: implement get value flow.
  x86: refactor psr: CDP: implement set value callback functions.
  x86: L2 CAT: implement CPU init and free flow.
  x86: L2 CAT: implement get hw info flow.
  x86: L2 CAT: implement get value flow.
  x86: L2 CAT: implement set value flow.
  tools: L2 CAT: support get HW info for L2 CAT.
  tools: L2 CAT: support show cbm for L2 CAT.
  tools: L2 CAT: support set cbm for L2 CAT.
  docs: add L2 CAT description in docs.

 docs/features/intel_psr_cat_cdp.pandoc |  469 +++++++++++
 docs/man/xl.pod.1.in                   |   25 +-
 docs/misc/xl-psr.markdown              |   18 +-
 tools/libxc/include/xenctrl.h          |    7 +-
 tools/libxc/xc_psr.c                   |   45 +-
 tools/libxl/libxl.h                    |    9 +
 tools/libxl/libxl_psr.c                |   68 +-
 tools/libxl/libxl_types.idl            |    1 +
 tools/xl/xl_cmdtable.c                 |    6 +-
 tools/xl/xl_psr.c                      |  168 ++--
 xen/arch/x86/cpuid.c                   |    6 -
 xen/arch/x86/domctl.c                  |   81 +-
 xen/arch/x86/psr.c                     | 1450 +++++++++++++++++++++++++-------
 xen/arch/x86/sysctl.c                  |   40 +-
 xen/include/asm-x86/msr-index.h        |    1 +
 xen/include/asm-x86/processor.h        |    7 +
 xen/include/asm-x86/psr.h              |   26 +-
 xen/include/public/domctl.h            |    2 +
 xen/include/public/sysctl.h            |    3 +-
 19 files changed, 1989 insertions(+), 443 deletions(-)
 create mode 100644 docs/features/intel_psr_cat_cdp.pandoc

-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* [PATCH v10 01/25] docs: create Cache Allocation Technology (CAT) and Code and Data Prioritization (CDP) feature document
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-01 13:53 ` [PATCH v10 02/25] x86: refactor psr: remove L3 CAT/CDP codes Yi Sun
                   ` (23 subsequent siblings)
  24 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

This patch creates CAT and CDP feature document in doc/features/. It describes
key points to implement L3 CAT/CDP and L2 CAT which is described in details in
Intel SDM "INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) ALLOCATION FEATURES".

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v9:
    - add 'CMT' to the list of acronyms.
      (suggested by Wei Liu)
    - change feature list to feature array.
    - modify data structure descriptions according to latest codes.
    - modify revision.
v8:
    - change revision info.
      (suggested by Konrad Rzeszutek Wilk)
    - add content int 'Areas for improvement'.
      (suggested by Konrad Rzeszutek Wilk)
v7:
    - correct typo.
      (suggested by Konrad Rzeszutek Wilk)
    - replace application/VM to domain.
      (suggested by Konrad Rzeszutek Wilk)
    - amend description of `feat_mask` to make it clearer.
      (suggested by Konrad Rzeszutek Wilk)
    - update revision.
      (suggested by Konrad Rzeszutek Wilk)
    - other minor fixes.
      (suggested by Konrad Rzeszutek Wilk)
v6:
    - write a new feature document to cover L3 CAT/CDP and L2 CAT.
      (suggested by Kevin Tian)
    - adjust 'Terminology' position in document.
      (suggested by Dario Faggioli)
    - fix wordings.
      (suggested by Dario Faggioli, Kevin Tian and Konrad Rzeszutek Wilk)
    - add SDM chapter title in commit message.
      (suggested by Konrad Rzeszutek Wilk)
    - add more explanations.
      (suggested by Kevin Tian)
v4:
    - change file name to be more descriptive, 'intel_psr_l2_cat.pandoc'.
      (suggested by Dario Faggioli)
    - remove 'Ext' and 'New' prefixes.
      (suggested by Dario Faggioli)
    - remove change log in Revison part.
      (suggested by Dario Faggioli)
    - adjust Xen release number to 4.9 to show this feature targets 4.9.
      (suggested by Dario Faggioli)
    - provide 'Terminology' and more sections.
      (suggested by Dario Faggioli)
    - fix wordings.
      (suggested by Konrad Rzeszutek Wilk)
    - remove chapter number.
      (suggested by Konrad Rzeszutek Wilk)
v3:
    - make design document be a patch.
      (suggested by Konrad Rzeszutek Wilk)
v2:
    - provide chapter for the L2 CAT.
      (suggested by Meng Xu)
---
 docs/features/intel_psr_cat_cdp.pandoc | 469 +++++++++++++++++++++++++++++++++
 1 file changed, 469 insertions(+)
 create mode 100644 docs/features/intel_psr_cat_cdp.pandoc

diff --git a/docs/features/intel_psr_cat_cdp.pandoc b/docs/features/intel_psr_cat_cdp.pandoc
new file mode 100644
index 0000000..022fbdc
--- /dev/null
+++ b/docs/features/intel_psr_cat_cdp.pandoc
@@ -0,0 +1,469 @@
+% Intel Cache Allocation Technology and Code and Data Prioritization Features
+% Revision 1.9
+
+\clearpage
+
+# Basics
+
+---------------- ----------------------------------------------------
+         Status: **Tech Preview**
+
+Architecture(s): Intel x86
+
+   Component(s): Hypervisor, toolstack
+
+       Hardware: L3 CAT: Haswell and beyond CPUs
+                 CDP   : Broadwell and beyond CPUs
+                 L2 CAT: Atom codename Goldmont and beyond CPUs
+---------------- ----------------------------------------------------
+
+# Terminology
+
+* CAT         Cache Allocation Technology
+* CBM         Capacity BitMasks
+* CDP         Code and Data Prioritization
+* CMT         Cache Monitoring Technology
+* COS/CLOS    Class of Service
+* MSRs        Machine Specific Registers
+* PSR         Intel Platform Shared Resource
+
+# Overview
+
+Intel provides a set of allocation capabilities including Cache Allocatation
+Technology (CAT) and Code and Data Prioritization (CDP).
+
+CAT allows an OS or hypervisor to control allocation of a CPU's shared cache
+based on application/domain priority or Class of Service (COS). Each COS is
+configured using capacity bitmasks (CBMs) which represent cache capacity and
+indicate the degree of overlap and isolation between classes. Once CAT is co-
+nfigured, the processor allows access to portions of cache according to the
+established COS. Intel Xeon processor E5 v4 family (and some others) introduce
+capabilities to configure and make use of the CAT mechanism on the L3 cache.
+Intel Goldmont processor provides support for control over the L2 cache.
+
+Code and Data Prioritization (CDP) Technology is an extension of CAT. CDP
+enables isolation and separate prioritization of code and data fetches to
+the L3 cache in a SW configurable manner, which can enable workload priorit-
+ization and tuning of cache capacity to the characteristics of the workload.
+CDP extends CAT by providing separate code and data masks per Class of Service
+(COS). When SW configures to enable CDP, L3 CAT is disabled.
+
+# User details
+
+* Feature Enabling:
+
+  Add "psr=cat" to boot line parameter to enable all supported level CAT featu-
+  res. Add "psr=cdp" to enable L3 CDP but disables L3 CAT by SW.
+
+* xl interfaces:
+
+  1. `psr-cat-show [OPTIONS] domain-id`:
+
+     Show L2 CAT or L3 CAT/CDP CBM of the domain designated by Xen domain-id.
+
+     Option `-l`:
+     `-l2`: Show cbm for L2 cache.
+     `-l3`: Show cbm for L3 cache.
+
+     If `-lX` is specified and LX is not supported, print error.
+     If no `-l` is specified, level 3 is the default option.
+
+  2. `psr-cat-set [OPTIONS] domain-id cbm`:
+
+     Set L2 CAT or L3 CAT/CDP CBM to the domain designated by Xen domain-id.
+
+     Option `-s`: Specify the socket to process, otherwise all sockets are
+     processed.
+
+     Option `-l`:
+     `-l2`: Specify cbm for L2 cache.
+     `-l3`: Specify cbm for L3 cache.
+
+     If `-lX` is specified and LX is not supported, print error.
+     If no `-l` is specified, level 3 is the default option.
+
+     Option `-c` or `-d`:
+     `-c`: Set L3 CDP code cbm.
+     `-d`: Set L3 CDP data cbm.
+
+  3. `psr-hwinfo [OPTIONS]`:
+
+     Show CMT & L2 CAT & L3 CAT/CDP HW information on every socket.
+
+     Option `-m, --cmt`: Show Cache Monitoring Technology (CMT) hardware info.
+
+     Option `-a, --cat`: Show CAT/CDP hardware info.
+
+# Technical details
+
+L3 CAT/CDP and L2 CAT are all members of Intel PSR features, they share the base
+PSR infrastructure in Xen.
+
+## Hardware perspective
+
+  CAT/CDP defines a range of MSRs to assign different cache access patterns
+  which are known as CBMs, each CBM is associated with a COS.
+
+  ```
+  E.g. L2 CAT:
+                          +----------------------------+----------------+
+     IA32_PQR_ASSOC       | MSR (per socket)           |    Address     |
+   +----+---+-------+     +----------------------------+----------------+
+   |    |COS|       |     | IA32_L2_QOS_MASK_0         |     0xD10      |
+   +----+---+-------+     +----------------------------+----------------+
+          └-------------> | ...                        |  ...           |
+                          +----------------------------+----------------+
+                          | IA32_L2_QOS_MASK_n         | 0xD10+n (n<64) |
+                          +----------------------------+----------------+
+  ```
+
+  L3 CAT/CDP uses a range of MSRs from 0xC90 ~ 0xC90+n (n<128).
+
+  L2 CAT uses a range of MSRs from 0xD10 ~ 0xD10+n (n<64), following the L3
+  CAT/CDP MSRs, setting different L2 cache accessing patterns from L3 cache is
+  supported.
+
+  Every MSR stores a CBM value. A capacity bitmask (CBM) provides a hint to the
+  hardware indicating the cache space a domain should be limited to as well as
+  providing an indication of overlap and isolation in the CAT-capable cache from
+  other domains contending for the cache.
+
+  Sample cache capacity bitmasks for a bitlength of 8 are shown below. Please
+  note that all (and only) contiguous '1' combinations are allowed (e.g. FFFFH,
+  0FF0H, 003CH, etc.).
+
+  ```
+       +----+----+----+----+----+----+----+----+
+       | M7 | M6 | M5 | M4 | M3 | M2 | M1 | M0 |
+       +----+----+----+----+----+----+----+----+
+  COS0 | A  | A  | A  | A  | A  | A  | A  | A  | Default Bitmask
+       +----+----+----+----+----+----+----+----+
+  COS1 | A  | A  | A  | A  | A  | A  | A  | A  |
+       +----+----+----+----+----+----+----+----+
+  COS2 | A  | A  | A  | A  | A  | A  | A  | A  |
+       +----+----+----+----+----+----+----+----+
+
+       +----+----+----+----+----+----+----+----+
+       | M7 | M6 | M5 | M4 | M3 | M2 | M1 | M0 |
+       +----+----+----+----+----+----+----+----+
+  COS0 | A  | A  | A  | A  | A  | A  | A  | A  | Overlapped Bitmask
+       +----+----+----+----+----+----+----+----+
+  COS1 |    |    |    |    | A  | A  | A  | A  |
+       +----+----+----+----+----+----+----+----+
+  COS2 |    |    |    |    |    |    | A  | A  |
+       +----+----+----+----+----+----+----+----+
+
+       +----+----+----+----+----+----+----+----+
+       | M7 | M6 | M5 | M4 | M3 | M2 | M1 | M0 |
+       +----+----+----+----+----+----+----+----+
+  COS0 | A  | A  | A  | A  |    |    |    |    | Isolated Bitmask
+       +----+----+----+----+----+----+----+----+
+  COS1 |    |    |    |    | A  | A  |    |    |
+       +----+----+----+----+----+----+----+----+
+  COS2 |    |    |    |    |    |    | A  | A  |
+       +----+----+----+----+----+----+----+----+
+  ```
+
+  We can get the CBM length through CPUID. The default value of CBM is calcul-
+  ated by `(1ull << cbm_len) - 1`. That is a fully open bitmask, all ones bitm-
+  ask. The COS[0] always stores the default value without change.
+
+  There is a `IA32_PQR_ASSOC` register which stores the COS ID of the VCPU. HW
+  enforces cache allocation according to the corresponding CBM.
+
+## The relationship between L3 CAT/CDP and L2 CAT
+
+  HW may support all features. By default, CDP is disabled on the processor.
+  If the L3 CAT MSRs are used without enabling CDP, the processor operates in
+  a traditional CAT-only mode. When CDP is enabled:
+  * the CAT mask MSRs are re-mapped into interleaved pairs of mask MSRs for
+    data or code fetches.
+  * the range of COS for CAT is re-indexed, with the lower-half of the COS
+    range available for CDP.
+
+  L2 CAT is independent of L3 CAT/CDP, which means L2 CAT can be enabled while
+  L3 CAT/CDP is disabled, or L2 CAT and L3 CAT/CDP are both enabled.
+
+  As a requirement, the bits of CBM of CAT/CDP must be continuous.
+
+  N.B. L2 CAT and L3 CAT/CDP share the same COS field in the same associate
+  register `IA32_PQR_ASSOC`, which means one COS is associated with a pair of
+  L2 CAT CBM and L3 CAT/CDP CBM.
+
+  Besides, the max COS of L2 CAT may be different from L3 CAT/CDP (or other
+  PSR features in future). In some cases, a domain is permitted to have a COS
+  that is beyond one (or more) of PSR features but within the others. For
+  instance, let's assume the max COS of L2 CAT is 8 but the max COS of L3
+  CAT is 16, when a domain is assigned 9 as COS, the L3 CAT CBM associated to
+  COS 9 would be enforced, but for L2 CAT, the HW works as default value is
+  set since COS 9 is beyond the max COS (8) of L2 CAT.
+
+## Design Overview
+
+* Core COS/CBM association
+
+  When enforcing CAT/CDP, all cores of domains have the same default COS (COS0)
+  which is associated with the fully open CBM (all ones bitmask) to access all
+  cache. The default COS is used only in hypervisor and is transparent to tool
+  stack and user.
+
+  System administrator can change PSR allocation policy at runtime by tool stack.
+  Since L2 CAT shares COS with L3 CAT/CDP, a COS corresponds to a 2-tuple, like
+  [L2 CBM, L3 CBM] with only-CAT enabled, when CDP is enabled, one COS correspo-
+  nds to a 3-tuple, like [L2 CBM, L3 Code_CBM, L3 Data_CBM]. If neither L3 CAT
+  nor L3 CDP is enabled, things would be easier, one COS corresponds to one L2
+  CBM.
+
+* VCPU schedule
+
+  When context switch happens, the COS of VCPU is written to per-thread MSR
+  `IA32_PQR_ASSOC`, and then hardware enforces cache allocation according to
+  the corresponding CBM.
+
+* Multi-sockets
+
+  Different sockets may have different CAT/CDP capability (e.g. max COS) alth-
+  ough it is consistent on the same socket. So the capability of per-socket CAT/
+  CDP is specified.
+
+  'psr-cat-set' can set CBM for one domain per socket. On each socket, we main-
+  tain a COS array for all domains. One domain uses one COS at one time. One COS
+  stores the CBM of the domain to work. So, when a VCPU of the domain is migrat-
+  ed from socket 1 to socket 2, it follows configuration on socket 2.
+
+  E.g. user sets domain 1 CBM on socket 1 to 0x7f which uses COS 9 but sets do-
+  main 1 CBM on socket 2 to 0x3f which uses COS 7. When VCPU of this domain
+  is migrated from socket 1 to 2, the COS ID used is 7, that means 0x3f is the
+  CBM to work for this domain 1 now.
+
+## Implementation Description
+
+* Hypervisor interfaces:
+
+  1. Boot line parameter "psr=cat" enables L2 CAT and L3 CAT if hardware suppo-
+     rted. "psr=cdp" enables CDP if hardware supported.
+
+  2. SYSCTL:
+          - XEN_SYSCTL_PSR_CAT_get_l3_info: Get L3 CAT/CDP information.
+          - XEN_SYSCTL_PSR_CAT_get_l2_info: Get L2 CAT information.
+
+  3. DOMCTL:
+          - XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM: Get L3 CBM for a domain.
+          - XEN_DOMCTL_PSR_CAT_OP_SET_L3_CBM: Set L3 CBM for a domain.
+          - XEN_DOMCTL_PSR_CAT_OP_GET_L3_CODE: Get CDP Code CBM for a domain.
+          - XEN_DOMCTL_PSR_CAT_OP_SET_L3_CODE: Set CDP Code CBM for a domain.
+          - XEN_DOMCTL_PSR_CAT_OP_GET_L3_DATA: Get CDP Data CBM for a domain.
+          - XEN_DOMCTL_PSR_CAT_OP_SET_L3_DATA: Set CDP Data CBM for a domain.
+          - XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM: Get L2 CBM for a domain.
+          - XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM: Set L2 CBM for a domain.
+
+* xl interfaces:
+
+  1. psr-cat-show -lX domain-id
+          Show LX cbm for a domain.
+          => XEN_SYSCTL_PSR_CAT_get_l3_info    /
+             XEN_SYSCTL_PSR_CAT_get_l2_info    /
+             XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM  /
+             XEN_DOMCTL_PSR_CAT_OP_GET_L3_CODE /
+             XEN_DOMCTL_PSR_CAT_OP_GET_L3_DATA /
+             XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM
+
+  2. psr-cat-set -lX domain-id cbm
+          Set LX cbm for a domain.
+          => XEN_DOMCTL_PSR_CAT_OP_SET_L3_CBM  /
+             XEN_DOMCTL_PSR_CAT_OP_SET_L3_CODE /
+             XEN_DOMCTL_PSR_CAT_OP_SET_L3_DATA /
+             XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM
+
+  3. psr-hwinfo
+          Show PSR HW information, including L3 CAT/CDP/L2 CAT
+          => XEN_SYSCTL_PSR_CAT_get_l3_info /
+             XEN_SYSCTL_PSR_CAT_get_l2_info
+
+* Key data structure:
+
+   1. Feature HW info
+
+      ```
+      struct psr_cat_hw_info {
+          unsigned int cbm_len;
+          unsigned int cos_max;
+      };
+      ```
+
+      - Member `cbm_len`
+
+        `cbm_len` is one of the hardware info of CAT. It means the max number
+        of bits to set.
+
+      - Member `cos_max`
+
+        `cos_max` is one of the hardware info of CAT. It means the max number
+        of COS registers.
+
+   2. Feature node
+
+      ```
+      struct feat_node {
+          struct feat_ops {
+              unsigned int (*get_cos_max)(const struct feat_node *feat);
+              bool (*get_feat_info)(const struct feat_node *feat,
+                                    uint32_t data[], unsigned int array_len);
+              void (*get_val)(const struct feat_node *feat, unsigned int cos,
+                              enum cbm_type type, uint32_t *val);
+              void (*get_old_val)(uint32_t val[],
+                                  const struct feat_node *feat,
+                                  unsigned int old_cos);
+              int (*set_new_val)(uint32_t val[],
+                                 const struct feat_node *feat,
+                                 enum cbm_type type,
+                                 uint32_t new_val);
+              int (*compare_val)(const uint32_t val[], const struct feat_node *feat,
+                                 unsigned int cos);
+              bool (*fits_cos_max)(const uint32_t val[],
+                                   const struct feat_node *feat,
+                                   unsigned int cos);
+              void (*write_msr)(unsigned int cos, uint32_t val,
+                                enum cbm_type type, struct feat_node *feat);
+          } ops;
+
+          union {
+              struct psr_cat_hw_info cat_info;
+          } info;
+
+          uint32_t cos_reg_val[MAX_COS_REG_CNT];
+          unsigned int cos_num;
+      };
+      ```
+
+      When a PSR enforcement feature is enabled, it will be added into a
+      feature array.
+
+      - Member `ops`
+
+        `ops` maintains a callback function list of the feature.
+
+        We abstract above callback functions to encapsulate the feature specific
+        behaviors into them. Then, it is easy to add a new feature:
+          1) Implement such ops and callback functions for every feature.
+          2) Register the ops into `struct feat_node`.
+          3) Add the feature into feature array during CPU initialization.
+
+      - Member `info`
+
+        `info` maintains the feature HW information which is provided to
+        psr_hwinfo command.
+
+      - Member `cos_reg_val`
+
+        `cos_reg_val` is an array to maintain the value set in all COS registers
+        of the feature. The array is indexed by COS ID.
+
+      - Member `cos_num`
+
+        `cos_num` is the number of COS registers the feature uses, e.g. L3/L2
+        CAT uses 1 register but CDP uses 2 registers.
+
+   3. Per-socket PSR features information structure
+
+      ```
+      struct psr_socket_info {
+          unsigned int feat_mask;
+          struct feat_node *features[PSR_SOCKET_MAX_FEAT];
+          unsigned int cos_ref[MAX_COS_REG_NUM];
+          spinlock_t ref_lock;
+      };
+      ```
+
+      We collect all PSR allocation features information of a socket in this
+      `struct psr_socket_info`.
+
+      - Member `feat_mask`
+
+        `feat_mask` is a bitmap, to indicate which feature is enabled on current
+        socket. See values defined in `enum psr_feat_type`. E.g.
+
+        bit 0: L3 CAT status.
+        bit 1: L3 CDP status.
+        bit 2: L2 CAT status.
+
+      - Member `features`
+
+        `features` is a pointer array to save all enabled features poniters
+        according to feature position defined in `enum psr_feat_type`.
+
+      - Member `cos_ref`
+
+        `cos_ref` is an array which maintains the reference of one COS. It maps
+        to cos_reg_val[MAX_COS_REG_NUM] in `struct feat_node`. If one COS is
+        used by one domain, the corresponding reference will increase by one. If
+        a domain releases the COS, the reference will decrease by one. The array
+        is indexed by COS ID.
+
+# Limitations
+
+CAT/CDP can only work on HW which enables it(check by CPUID). So far, there is
+no HW which enables both L2 CAT and L3 CAT/CDP. But SW implementation has cons-
+idered such scenario to enable both L2 CAT and L3 CAT/CDP.
+
+# Testing
+
+We can execute above xl commands to verify L2 CAT and L3 CAT/CDP on different
+HWs support them.
+
+For example:
+    root@:~$ xl psr-hwinfo --cat
+    Cache Allocation Technology (CAT): L2
+    Socket ID       : 0
+    Maximum COS     : 3
+    CBM length      : 8
+    Default CBM     : 0xff
+
+    root@:~$ xl psr-cat-cbm-set -l2 1 0x7f
+
+    root@:~$ xl psr-cat-show -l2 1
+    Socket ID       : 0
+    Default CBM     : 0xff
+       ID                     NAME             CBM
+        1                 ubuntu14            0x7f
+
+# Areas for improvement
+
+A hexadecimal number is used to set/show CBM for a domain now. Although this
+is convenient to cover overlap/isolated bitmask requirement, it is not user-
+friendly.
+
+To improve this, the libxl interfaces can be wrapped in libvirt to provide more
+usr-friendly interfaces to user, e.g. a percentage number of the cache to set
+and show.
+
+# Known issues
+
+N/A
+
+# References
+
+"INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) ALLOCATION FEATURES" [Intel® 64 and IA-32 Architectures Software Developer Manuals, vol3](http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html)
+
+# History
+
+------------------------------------------------------------------------
+Date       Revision Version  Notes
+---------- -------- -------- -------------------------------------------
+2016-08-12 1.0      Xen 4.9  Design document written
+2017-02-13 1.7      Xen 4.9  Changes:
+                             1. Modify the design document to cover L3
+                                CAT/CDP and L2 CAT;
+                             2. Fix typos;
+                             3. Amend description of `feat_mask` to make
+                                it clearer;
+                             4. Other minor changes.
+2017-02-15 1.8      Xen 4.9  Changes:
+                             1. Add content in 'Areas for improvement';
+                             2. Adjust revision number.
+2017-03-16 1.9      Xen 4.9  Changes:
+                             1. Add 'CMT' in 'Terminology';
+                             2. Change 'feature list' to 'feature array'.
+                             3. Modify data structure descriptions.
+                             4. Adjust revision number.
+---------- -------- -------- -------------------------------------------
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 02/25] x86: refactor psr: remove L3 CAT/CDP codes.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
  2017-04-01 13:53 ` [PATCH v10 01/25] docs: create Cache Allocation Technology (CAT) and Code and Data Prioritization (CDP) feature document Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-01 13:53 ` [PATCH v10 03/25] x86: refactor psr: implement main data structures Yi Sun
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

The current cache allocation codes in psr.c do not consider
future features addition and are not friendly to extend.

To make psr.c be more flexible to add new features and fulfill
the program principle, open for extension but closed for
modification, we have to refactor the psr.c:
1. Analyze cache allocation features and abstract general data
   structures.
2. Analyze the init and all other functions flow, abstract all
   steps that different features may have different implementations.
   Make these steps be callback functions and register feature
   specific fuctions. Then, the main processes will not be changed
   when introducing a new feature.

Because the quantity of refactor codes is big and the logics are
changed a lot, it will cause reviewers confused if just change
old codes. Reviewers have to understand both old codes and new
implementations. After review iterations from V1 to V3, Jan has
proposed to remove all old cache allocation codes firstly, then
implement new codes step by step. This will help to make codes
be more easily reviewable.

There is no construction without destruction. So, this patch
removes all current L3 CAT/CDP codes in psr.c. The following
patches will introduce the new mechanism.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v4:
    - create this patch to make codes easily understand.
      (suggested by Jan Beulich)
---
 xen/arch/x86/psr.c | 470 +----------------------------------------------------
 1 file changed, 5 insertions(+), 465 deletions(-)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 0b5073c..96a8589 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -23,24 +23,6 @@
 #define PSR_CAT        (1<<1)
 #define PSR_CDP        (1<<2)
 
-struct psr_cat_cbm {
-    union {
-        uint64_t cbm;
-        struct {
-            uint64_t code;
-            uint64_t data;
-        };
-    };
-    unsigned int ref;
-};
-
-struct psr_cat_socket_info {
-    unsigned int cbm_len;
-    unsigned int cos_max;
-    struct psr_cat_cbm *cos_to_cbm;
-    spinlock_t cbm_lock;
-};
-
 struct psr_assoc {
     uint64_t val;
     uint64_t cos_mask;
@@ -48,26 +30,11 @@ struct psr_assoc {
 
 struct psr_cmt *__read_mostly psr_cmt;
 
-static unsigned long *__read_mostly cat_socket_enable;
-static struct psr_cat_socket_info *__read_mostly cat_socket_info;
-static unsigned long *__read_mostly cdp_socket_enable;
-
 static unsigned int opt_psr;
 static unsigned int __initdata opt_rmid_max = 255;
-static unsigned int __read_mostly opt_cos_max = 255;
 static uint64_t rmid_mask;
 static DEFINE_PER_CPU(struct psr_assoc, psr_assoc);
 
-static struct psr_cat_cbm *temp_cos_to_cbm;
-
-static unsigned int get_socket_cpu(unsigned int socket)
-{
-    if ( likely(socket < nr_sockets) )
-        return cpumask_any(socket_cpumask[socket]);
-
-    return nr_cpu_ids;
-}
-
 static void __init parse_psr_bool(char *s, char *value, char *feature,
                                   unsigned int mask)
 {
@@ -107,9 +74,6 @@ static void __init parse_psr_param(char *s)
         if ( val_str && !strcmp(s, "rmid_max") )
             opt_rmid_max = simple_strtoul(val_str, NULL, 0);
 
-        if ( val_str && !strcmp(s, "cos_max") )
-            opt_cos_max = simple_strtoul(val_str, NULL, 0);
-
         s = ss + 1;
     } while ( ss );
 }
@@ -213,16 +177,7 @@ static inline void psr_assoc_init(void)
 {
     struct psr_assoc *psra = &this_cpu(psr_assoc);
 
-    if ( cat_socket_info )
-    {
-        unsigned int socket = cpu_to_socket(smp_processor_id());
-
-        if ( test_bit(socket, cat_socket_enable) )
-            psra->cos_mask = ((1ull << get_count_order(
-                             cat_socket_info[socket].cos_max)) - 1) << 32;
-    }
-
-    if ( psr_cmt_enabled() || psra->cos_mask )
+    if ( psr_cmt_enabled() )
         rdmsrl(MSR_IA32_PSR_ASSOC, psra->val);
 }
 
@@ -231,12 +186,6 @@ static inline void psr_assoc_rmid(uint64_t *reg, unsigned int rmid)
     *reg = (*reg & ~rmid_mask) | (rmid & rmid_mask);
 }
 
-static inline void psr_assoc_cos(uint64_t *reg, unsigned int cos,
-                                 uint64_t cos_mask)
-{
-    *reg = (*reg & ~cos_mask) | (((uint64_t)cos << 32) & cos_mask);
-}
-
 void psr_ctxt_switch_to(struct domain *d)
 {
     struct psr_assoc *psra = &this_cpu(psr_assoc);
@@ -245,459 +194,54 @@ void psr_ctxt_switch_to(struct domain *d)
     if ( psr_cmt_enabled() )
         psr_assoc_rmid(&reg, d->arch.psr_rmid);
 
-    if ( psra->cos_mask )
-        psr_assoc_cos(&reg, d->arch.psr_cos_ids ?
-                      d->arch.psr_cos_ids[cpu_to_socket(smp_processor_id())] :
-                      0, psra->cos_mask);
-
     if ( reg != psra->val )
     {
         wrmsrl(MSR_IA32_PSR_ASSOC, reg);
         psra->val = reg;
     }
 }
-static struct psr_cat_socket_info *get_cat_socket_info(unsigned int socket)
-{
-    if ( !cat_socket_info )
-        return ERR_PTR(-ENODEV);
-
-    if ( socket >= nr_sockets )
-        return ERR_PTR(-ENOTSOCK);
-
-    if ( !test_bit(socket, cat_socket_enable) )
-        return ERR_PTR(-ENOENT);
-
-    return cat_socket_info + socket;
-}
-
-static inline bool_t cdp_is_enabled(unsigned int socket)
-{
-    return cdp_socket_enable && test_bit(socket, cdp_socket_enable);
-}
 
 int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
                         uint32_t *cos_max, uint32_t *flags)
 {
-    struct psr_cat_socket_info *info = get_cat_socket_info(socket);
-
-    if ( IS_ERR(info) )
-        return PTR_ERR(info);
-
-    *cbm_len = info->cbm_len;
-    *cos_max = info->cos_max;
-
-    *flags = 0;
-    if ( cdp_is_enabled(socket) )
-        *flags |= XEN_SYSCTL_PSR_CAT_L3_CDP;
-
     return 0;
 }
 
 int psr_get_l3_cbm(struct domain *d, unsigned int socket,
                    uint64_t *cbm, enum cbm_type type)
 {
-    struct psr_cat_socket_info *info = get_cat_socket_info(socket);
-    bool_t cdp_enabled = cdp_is_enabled(socket);
-
-    if ( IS_ERR(info) )
-        return PTR_ERR(info);
-
-    switch ( type )
-    {
-    case PSR_CBM_TYPE_L3:
-        if ( cdp_enabled )
-            return -EXDEV;
-        *cbm = info->cos_to_cbm[d->arch.psr_cos_ids[socket]].cbm;
-        break;
-
-    case PSR_CBM_TYPE_L3_CODE:
-        if ( !cdp_enabled )
-            *cbm = info->cos_to_cbm[d->arch.psr_cos_ids[socket]].cbm;
-        else
-            *cbm = info->cos_to_cbm[d->arch.psr_cos_ids[socket]].code;
-        break;
-
-    case PSR_CBM_TYPE_L3_DATA:
-        if ( !cdp_enabled )
-            *cbm = info->cos_to_cbm[d->arch.psr_cos_ids[socket]].cbm;
-        else
-            *cbm = info->cos_to_cbm[d->arch.psr_cos_ids[socket]].data;
-        break;
-
-    default:
-        ASSERT_UNREACHABLE();
-    }
-
-    return 0;
-}
-
-static bool_t psr_check_cbm(unsigned int cbm_len, uint64_t cbm)
-{
-    unsigned int first_bit, zero_bit;
-
-    /* Set bits should only in the range of [0, cbm_len). */
-    if ( cbm & (~0ull << cbm_len) )
-        return 0;
-
-    /* At least one bit need to be set. */
-    if ( cbm == 0 )
-        return 0;
-
-    first_bit = find_first_bit(&cbm, cbm_len);
-    zero_bit = find_next_zero_bit(&cbm, cbm_len, first_bit);
-
-    /* Set bits should be contiguous. */
-    if ( zero_bit < cbm_len &&
-         find_next_bit(&cbm, cbm_len, zero_bit) < cbm_len )
-        return 0;
-
-    return 1;
-}
-
-struct cos_cbm_info
-{
-    unsigned int cos;
-    bool_t cdp;
-    uint64_t cbm_code;
-    uint64_t cbm_data;
-};
-
-static void do_write_l3_cbm(void *data)
-{
-    struct cos_cbm_info *info = data;
-
-    if ( info->cdp )
-    {
-        wrmsrl(MSR_IA32_PSR_L3_MASK_CODE(info->cos), info->cbm_code);
-        wrmsrl(MSR_IA32_PSR_L3_MASK_DATA(info->cos), info->cbm_data);
-    }
-    else
-        wrmsrl(MSR_IA32_PSR_L3_MASK(info->cos), info->cbm_code);
-}
-
-static int write_l3_cbm(unsigned int socket, unsigned int cos,
-                        uint64_t cbm_code, uint64_t cbm_data, bool_t cdp)
-{
-    struct cos_cbm_info info =
-    {
-        .cos = cos,
-        .cbm_code = cbm_code,
-        .cbm_data = cbm_data,
-        .cdp = cdp,
-    };
-
-    if ( socket == cpu_to_socket(smp_processor_id()) )
-        do_write_l3_cbm(&info);
-    else
-    {
-        unsigned int cpu = get_socket_cpu(socket);
-
-        if ( cpu >= nr_cpu_ids )
-            return -ENOTSOCK;
-        on_selected_cpus(cpumask_of(cpu), do_write_l3_cbm, &info, 1);
-    }
-
     return 0;
 }
 
-static int find_cos(struct psr_cat_cbm *map, unsigned int cos_max,
-                    uint64_t cbm_code, uint64_t cbm_data, bool_t cdp_enabled)
-{
-    unsigned int cos;
-
-    for ( cos = 0; cos <= cos_max; cos++ )
-    {
-        if ( (map[cos].ref || cos == 0) &&
-             ((!cdp_enabled && map[cos].cbm == cbm_code) ||
-              (cdp_enabled && map[cos].code == cbm_code &&
-                              map[cos].data == cbm_data)) )
-            return cos;
-    }
-
-    return -ENOENT;
-}
-
-static int pick_avail_cos(struct psr_cat_cbm *map, unsigned int cos_max,
-                          unsigned int old_cos)
-{
-    unsigned int cos;
-
-    /* If old cos is referred only by the domain, then use it. */
-    if ( map[old_cos].ref == 1 && old_cos != 0 )
-        return old_cos;
-
-    /* Find an unused one other than cos0. */
-    for ( cos = 1; cos <= cos_max; cos++ )
-        if ( map[cos].ref == 0 )
-            return cos;
-
-    return -ENOENT;
-}
-
 int psr_set_l3_cbm(struct domain *d, unsigned int socket,
                    uint64_t cbm, enum cbm_type type)
 {
-    unsigned int old_cos, cos_max;
-    int cos, ret;
-    uint64_t cbm_data, cbm_code;
-    bool_t cdp_enabled = cdp_is_enabled(socket);
-    struct psr_cat_cbm *map;
-    struct psr_cat_socket_info *info = get_cat_socket_info(socket);
-
-    if ( IS_ERR(info) )
-        return PTR_ERR(info);
-
-    if ( !psr_check_cbm(info->cbm_len, cbm) )
-        return -EINVAL;
-
-    if ( !cdp_enabled && (type == PSR_CBM_TYPE_L3_CODE ||
-                          type == PSR_CBM_TYPE_L3_DATA) )
-        return -ENXIO;
-
-    cos_max = info->cos_max;
-    old_cos = d->arch.psr_cos_ids[socket];
-    map = info->cos_to_cbm;
-
-    switch ( type )
-    {
-    case PSR_CBM_TYPE_L3:
-        cbm_code = cbm;
-        cbm_data = cbm;
-        break;
-
-    case PSR_CBM_TYPE_L3_CODE:
-        cbm_code = cbm;
-        cbm_data = map[old_cos].data;
-        break;
-
-    case PSR_CBM_TYPE_L3_DATA:
-        cbm_code = map[old_cos].code;
-        cbm_data = cbm;
-        break;
-
-    default:
-        ASSERT_UNREACHABLE();
-        return -EINVAL;
-    }
-
-    spin_lock(&info->cbm_lock);
-    cos = find_cos(map, cos_max, cbm_code, cbm_data, cdp_enabled);
-    if ( cos >= 0 )
-    {
-        if ( cos == old_cos )
-        {
-            spin_unlock(&info->cbm_lock);
-            return 0;
-        }
-    }
-    else
-    {
-        cos = pick_avail_cos(map, cos_max, old_cos);
-        if ( cos < 0 )
-        {
-            spin_unlock(&info->cbm_lock);
-            return cos;
-        }
-
-        /* We try to avoid writing MSR. */
-        if ( (cdp_enabled &&
-             (map[cos].code != cbm_code || map[cos].data != cbm_data)) ||
-             (!cdp_enabled && map[cos].cbm != cbm_code) )
-        {
-            ret = write_l3_cbm(socket, cos, cbm_code, cbm_data, cdp_enabled);
-            if ( ret )
-            {
-                spin_unlock(&info->cbm_lock);
-                return ret;
-            }
-            map[cos].code = cbm_code;
-            map[cos].data = cbm_data;
-        }
-    }
-
-    map[cos].ref++;
-    map[old_cos].ref--;
-    spin_unlock(&info->cbm_lock);
-
-    d->arch.psr_cos_ids[socket] = cos;
-
     return 0;
 }
 
-/* Called with domain lock held, no extra lock needed for 'psr_cos_ids' */
-static void psr_free_cos(struct domain *d)
-{
-    unsigned int socket;
-    unsigned int cos;
-    struct psr_cat_socket_info *info;
-
-    if( !d->arch.psr_cos_ids )
-        return;
-
-    for_each_set_bit(socket, cat_socket_enable, nr_sockets)
-    {
-        if ( (cos = d->arch.psr_cos_ids[socket]) == 0 )
-            continue;
-
-        info = cat_socket_info + socket;
-        spin_lock(&info->cbm_lock);
-        info->cos_to_cbm[cos].ref--;
-        spin_unlock(&info->cbm_lock);
-    }
-
-    xfree(d->arch.psr_cos_ids);
-    d->arch.psr_cos_ids = NULL;
-}
-
 int psr_domain_init(struct domain *d)
 {
-    if ( cat_socket_info )
-    {
-        d->arch.psr_cos_ids = xzalloc_array(unsigned int, nr_sockets);
-        if ( !d->arch.psr_cos_ids )
-            return -ENOMEM;
-    }
-
     return 0;
 }
 
 void psr_domain_free(struct domain *d)
 {
     psr_free_rmid(d);
-    psr_free_cos(d);
-}
-
-static int cat_cpu_prepare(unsigned int cpu)
-{
-    if ( !cat_socket_info )
-        return 0;
-
-    if ( temp_cos_to_cbm == NULL &&
-         (temp_cos_to_cbm = xzalloc_array(struct psr_cat_cbm,
-                                          opt_cos_max + 1UL)) == NULL )
-        return -ENOMEM;
-
-    return 0;
-}
-
-static void cat_cpu_init(void)
-{
-    unsigned int eax, ebx, ecx, edx;
-    struct psr_cat_socket_info *info;
-    unsigned int socket;
-    unsigned int cpu = smp_processor_id();
-    uint64_t val;
-    const struct cpuinfo_x86 *c = cpu_data + cpu;
-
-    if ( !cpu_has(c, X86_FEATURE_PQE) || c->cpuid_level < PSR_CPUID_LEVEL_CAT )
-        return;
-
-    socket = cpu_to_socket(cpu);
-    if ( test_bit(socket, cat_socket_enable) )
-        return;
-
-    cpuid_count(PSR_CPUID_LEVEL_CAT, 0, &eax, &ebx, &ecx, &edx);
-    if ( ebx & PSR_RESOURCE_TYPE_L3 )
-    {
-        cpuid_count(PSR_CPUID_LEVEL_CAT, 1, &eax, &ebx, &ecx, &edx);
-        info = cat_socket_info + socket;
-        info->cbm_len = (eax & 0x1f) + 1;
-        info->cos_max = min(opt_cos_max, edx & 0xffff);
-
-        info->cos_to_cbm = temp_cos_to_cbm;
-        temp_cos_to_cbm = NULL;
-        /* cos=0 is reserved as default cbm(all ones). */
-        info->cos_to_cbm[0].cbm = (1ull << info->cbm_len) - 1;
-
-        spin_lock_init(&info->cbm_lock);
-
-        set_bit(socket, cat_socket_enable);
-
-        if ( (ecx & PSR_CAT_CDP_CAPABILITY) && (opt_psr & PSR_CDP) &&
-             cdp_socket_enable && !test_bit(socket, cdp_socket_enable) )
-        {
-            info->cos_to_cbm[0].code = (1ull << info->cbm_len) - 1;
-            info->cos_to_cbm[0].data = (1ull << info->cbm_len) - 1;
-
-            /* We only write mask1 since mask0 is always all ones by default. */
-            wrmsrl(MSR_IA32_PSR_L3_MASK(1), (1ull << info->cbm_len) - 1);
-
-            rdmsrl(MSR_IA32_PSR_L3_QOS_CFG, val);
-            wrmsrl(MSR_IA32_PSR_L3_QOS_CFG, val | (1 << PSR_L3_QOS_CDP_ENABLE_BIT));
-
-            /* Cut half of cos_max when CDP is enabled. */
-            info->cos_max >>= 1;
-
-            set_bit(socket, cdp_socket_enable);
-        }
-        printk(XENLOG_INFO "CAT: enabled on socket %u, cos_max:%u, cbm_len:%u, CDP:%s\n",
-               socket, info->cos_max, info->cbm_len,
-               cdp_is_enabled(socket) ? "on" : "off");
-    }
-}
-
-static void cat_cpu_fini(unsigned int cpu)
-{
-    unsigned int socket = cpu_to_socket(cpu);
-
-    if ( !socket_cpumask[socket] || cpumask_empty(socket_cpumask[socket]) )
-    {
-        struct psr_cat_socket_info *info = cat_socket_info + socket;
-
-        if ( info->cos_to_cbm )
-        {
-            xfree(info->cos_to_cbm);
-            info->cos_to_cbm = NULL;
-        }
-
-        if ( cdp_is_enabled(socket) )
-            clear_bit(socket, cdp_socket_enable);
-
-        clear_bit(socket, cat_socket_enable);
-    }
-}
-
-static void __init psr_cat_free(void)
-{
-    xfree(cat_socket_enable);
-    cat_socket_enable = NULL;
-    xfree(cat_socket_info);
-    cat_socket_info = NULL;
-}
-
-static void __init init_psr_cat(void)
-{
-    if ( opt_cos_max < 1 )
-    {
-        printk(XENLOG_INFO "CAT: disabled, cos_max is too small\n");
-        return;
-    }
-
-    cat_socket_enable = xzalloc_array(unsigned long, BITS_TO_LONGS(nr_sockets));
-    cat_socket_info = xzalloc_array(struct psr_cat_socket_info, nr_sockets);
-    cdp_socket_enable = xzalloc_array(unsigned long, BITS_TO_LONGS(nr_sockets));
-
-    if ( !cat_socket_enable || !cat_socket_info )
-        psr_cat_free();
 }
 
 static int psr_cpu_prepare(unsigned int cpu)
 {
-    return cat_cpu_prepare(cpu);
+    return 0;
 }
 
 static void psr_cpu_init(void)
 {
-    if ( cat_socket_info )
-        cat_cpu_init();
-
     psr_assoc_init();
 }
 
 static void psr_cpu_fini(unsigned int cpu)
 {
-    if ( cat_socket_info )
-        cat_cpu_fini(cpu);
+    return;
 }
 
 static int cpu_callback(
@@ -738,14 +282,10 @@ static int __init psr_presmp_init(void)
     if ( (opt_psr & PSR_CMT) && opt_rmid_max )
         init_psr_cmt(opt_rmid_max);
 
-    if ( opt_psr & PSR_CAT )
-        init_psr_cat();
-
-    if ( psr_cpu_prepare(0) )
-        psr_cat_free();
+    psr_cpu_prepare(0);
 
     psr_cpu_init();
-    if ( psr_cmt_enabled() || cat_socket_info )
+    if ( psr_cmt_enabled() )
         register_cpu_notifier(&cpu_nfb);
 
     return 0;
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 03/25] x86: refactor psr: implement main data structures.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
  2017-04-01 13:53 ` [PATCH v10 01/25] docs: create Cache Allocation Technology (CAT) and Code and Data Prioritization (CDP) feature document Yi Sun
  2017-04-01 13:53 ` [PATCH v10 02/25] x86: refactor psr: remove L3 CAT/CDP codes Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-03 15:50   ` Jan Beulich
  2017-04-01 13:53 ` [PATCH v10 04/25] x86: move cpuid_count_leaf from cpuid.c to processor.h Yi Sun
                   ` (21 subsequent siblings)
  24 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

To construct an extendible framework, we need analyze PSR features
and abstract the common things and feature specific things. Then,
encapsulate them into different data structures.

By analyzing PSR features, we can get below map.
                +------+------+------+
      --------->| Dom0 | Dom1 | ...  |
      |         +------+------+------+
      |            |
      |Dom ID      | cos_id of domain
      |            V
      |        +-----------------------------------------------------------------------------+
User --------->| PSR                                                                         |
     Socket ID |  +--------------+---------------+---------------+                           |
               |  | Socket0 Info | Socket 1 Info |    ...        |                           |
               |  +--------------+---------------+---------------+                           |
               |    |                   cos_id=0               cos_id=1          ...         |
               |    |          +-----------------------+-----------------------+-----------+ |
               |    |->Ref   : |         ref 0         |         ref 1         | ...       | |
               |    |          +-----------------------+-----------------------+-----------+ |
               |    |          +-----------------------+-----------------------+-----------+ |
               |    |->L3 CAT: |         cos 0         |         cos 1         | ...       | |
               |    |          +-----------------------+-----------------------+-----------+ |
               |    |          +-----------------------+-----------------------+-----------+ |
               |    |->L2 CAT: |         cos 0         |         cos 1         | ...       | |
               |    |          +-----------------------+-----------------------+-----------+ |
               |    |          +-----------+-----------+-----------+-----------+-----------+ |
               |    |->CDP   : | cos0 code | cos0 data | cos1 code | cos1 data | ...       | |
               |               +-----------+-----------+-----------+-----------+-----------+ |
               +-----------------------------------------------------------------------------+

So, we need define a socket info data structure, 'struct
psr_socket_info' to manage information per socket. It contains a
reference count array according to COS ID and a feature array to
manage all features enabled. Every entry of the reference count
array is used to record how many domains are using the COS registers
according to the COS ID. For example, L3 CAT and L2 CAT are enabled,
Dom1 uses COS_ID=1 registers of both features to save CBM values, like
below.
        +-------+-------+-------+-----+
        | COS 0 | COS 1 | COS 2 | ... |
        +-------+-------+-------+-----+
L3 CAT  | 0x7ff | 0x1ff | ...   | ... |
        +-------+-------+-------+-----+
L2 CAT  | 0xff  | 0xff  | ...   | ... |
        +-------+-------+-------+-----+

If Dom2 has same CBM values, it can reuse these registers which COS_ID=1.
That means, both Dom1 and Dom2 use same COS registers(ID=1) to keep same
L3/L2 values. So, the value of ref[1] is 2 which means 2 domains are using
COS_ID 1.

To manage a feature, we need define a feature node data structure,
'struct feat_node', to manage feature's specific HW info, its common
properties (callback functions - all feature's specific behaviors are
encapsulated into these callback functions, and generic values - e.g. the
cos_max), the feature independent values, and an array of all COS registers
values of this feature.

CDP is a special feature which uses two entries of the array
for one COS ID. So, the number of CDP COS registers is the half of L3
CAT. E.g. L3 CAT has 16 COS registers, then CDP has 8 COS registers if
it is enabled. CDP uses the COS registers array as below.

                         +-----------+-----------+-----------+-----------+-----------+
CDP cos_reg_val[] index: |     0     |     1     |     2     |     3     |    ...    |
                         +-----------+-----------+-----------+-----------+-----------+
                  value: | cos0 code | cos0 data | cos1 code | cos1 data |    ...    |
                         +-----------+-----------+-----------+-----------+-----------+

For more details, please refer SDM and patches to implement 'get value' and
'set value'.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - remove initialization for 'PSR_SOCKET_L3_CAT'.
      (suggested by Jan Beulich)
    - rename 'feat_ops' to 'feat_props'.
      (suggested by Jan Beulich)
    - move 'cbm_len' to 'feat_props' because it is feature independent so far.
      (suggested by Jan Beulich)
    - move 'cos_max' to 'feat_props' because it is feature independent.
      (suggested by Jan Beulich)
    - move 'cos_num' to 'feat_props' because it is feature independent.
      (suggested by Jan Beulich)
    - remove union 'info' and struct 'psr_cat_hw_info'.
    - remove 'get_cos_max' from 'feat_props'.
      (suggested by Jan Beulich)
    - remove 'feat_mask' from 'psr_socket_info' because we can use 'features[]'
      to check if any feature is initialized.
      (suggested by Jan Beulich)
    - move 'ref_lock' above 'cos_ref'.
      (suggested by Jan Beulich)
    - adjust comments and commit message according to above changes.
v9:
    - replace feature list to a feature pointer array.
      (suggested by Roger Pau)
    - add 'PSR_SOCKET_MAX_FEAT' in 'enum psr_feat_type' to know features
      account.
      (suggested by Roger Pau)
    - move 'feat_ops' declaration into 'feat_node' structure.
      (suggested by Roger Pau)
    - directly use uninon for feature HW info and move its declaration into
      'feat_node' structure.
      (suggested by Roger Pau)
    - remove 'enum psr_feat_type feature' declared in 'feat_ops' because it is
      not useful after using feature pointer array.
      (suggested by Roger Pau)
    - rename 'l3_cat_info' to 'cat_info' to be used by all CAT/CDP features.
    - remove 'nr_feat' which is only for a record.
      (suggested by Jan Beulich)
    - add 'cos_num' to record how many COS registers are used by a feature in
      one time access.
      (suggested by Jan Beulich)
    - replace 'uint64_t' to 'uint32_t' for cbm value because SDM specifies the
      max 32 bits for it.
      (suggested by Jan Beulich)
v7:
    - sort inclusion files position.
      (suggested by Wei Liu)
v6:
    - make commit message be clearer.
      (suggested by Konrad Rzeszutek Wilk)
    - fix wordings.
      (suggested by Konrad Rzeszutek Wilk)
    - add comments to explain relationship between 'feat_mask' and
      'enum psr_feat_type'.
      (suggested by Konrad Rzeszutek Wilk)
v5:
    - remove section number.
      (suggested by Jan Beulich)
    - remove double blank.
      (suggested by Jan Beulich)
v4:
    - create this patch because of removing all old CAT/CDP codes to make
      implementation be more easily understood.
      (suggested by Jan Beulich)
---
 xen/arch/x86/psr.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 85 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 96a8589..cf352d2 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -13,16 +13,100 @@
  * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
  * more details.
  */
-#include <xen/init.h>
 #include <xen/cpu.h>
 #include <xen/err.h>
+#include <xen/init.h>
 #include <xen/sched.h>
 #include <asm/psr.h>
 
+/*
+ * Terminology:
+ * - CAT         Cache Allocation Technology
+ * - CBM         Capacity BitMasks
+ * - CDP         Code and Data Prioritization
+ * - COS/CLOS    Class of Service. Also mean COS registers.
+ * - COS_MAX     Max number of COS for the feature (minus 1)
+ * - MSRs        Machine Specific Registers
+ * - PSR         Intel Platform Shared Resource
+ */
+
 #define PSR_CMT        (1<<0)
 #define PSR_CAT        (1<<1)
 #define PSR_CDP        (1<<2)
 
+/*
+ * Per SDM chapter 'Cache Allocation Technology: Cache Mask Configuration',
+ * the MSRs ranging from 0C90H through 0D0FH (inclusive), enables support for
+ * up to 128 L3 CAT Classes of Service. The COS_ID=[0,127].
+ *
+ * The MSRs ranging from 0D10H through 0D4FH (inclusive), enables support for
+ * up to 64 L2 CAT COS. The COS_ID=[0,63].
+ *
+ * So, the maximum COS register count of one feature is 128.
+ */
+#define MAX_COS_REG_CNT  128
+
+enum psr_feat_type {
+    PSR_SOCKET_L3_CAT,
+    PSR_SOCKET_L3_CDP,
+    PSR_SOCKET_L2_CAT,
+    PSR_SOCKET_MAX_FEAT,
+};
+
+/*
+ * This structure represents one feature.
+ * feat_props  - Feature properties, including operation callback functions
+                 and feature common values.
+ * cos_reg_val - Array to store the values of COS registers. One entry stores
+ *               the value of one COS register.
+ *               For L3 CAT and L2 CAT, one entry corresponds to one COS_ID.
+ *               For CDP, two entries correspond to one COS_ID. E.g.
+ *               COS_ID=0 corresponds to cos_reg_val[0] (Data) and
+ *               cos_reg_val[1] (Code).
+ */
+struct feat_node {
+    /*
+     * This structure defines feature operation callback functions. Every
+     * feature enabled MUST implement such callback functions and register
+     * them to props.
+     *
+     * Feature specific behaviors will be encapsulated into these callback
+     * functions. Then, the main flows will not be changed when introducing
+     * a new feature.
+     *
+     * Feature independent HW info and common values are also defined in it.
+     */
+    const struct feat_props {
+        /*
+         * cos_num, cos_max and cbm_len are common values for all features
+         * so far.
+         * cos_num - COS registers number that feature uses for one COS ID.
+         *           It is defined in SDM.
+         * cos_max - The max COS registers number got through CPUID.
+         * cbm_len - The length of CBM got through CPUID.
+         */
+        unsigned int cos_num;
+        unsigned int cos_max;
+        unsigned int cbm_len;
+    } *props;
+
+    uint32_t cos_reg_val[MAX_COS_REG_CNT];
+};
+
+/*
+ * PSR features are managed per socket. Below structure defines the members
+ * used to manage these features.
+ * features  - A feature node array used to manage all features enabled.
+ * ref_lock  - A lock to protect cos_ref.
+ * cos_ref   - A reference count array to record how many domains are using the
+ *             COS ID. Every entry of cos_ref corresponds to one COS ID.
+ */
+struct psr_socket_info {
+    struct feat_node *features[PSR_SOCKET_MAX_FEAT];
+    spinlock_t ref_lock;
+    unsigned int cos_ref[MAX_COS_REG_CNT];
+};
+
 struct psr_assoc {
     uint64_t val;
     uint64_t cos_mask;
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 04/25] x86: move cpuid_count_leaf from cpuid.c to processor.h.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (2 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 03/25] x86: refactor psr: implement main data structures Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-01 13:53 ` [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow Yi Sun
                   ` (20 subsequent siblings)
  24 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

This patch moves 'cpuid_count_leaf' from cpuid.c to processor.h to
make it available to external codes.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
v10:
    - Acked by Jan.
v9:
    - create this patch alone to move 'cpuid_count_leaf'.
      (suggested by Wei Liu)
v6:
    - use 'struct cpuid_leaf' in psr.c. So we have to access 'cpuid_count_leaf'
      which has to be moved to processor.h.
      (suggested by Andrew Cooper)
---
 xen/arch/x86/cpuid.c            | 6 ------
 xen/include/asm-x86/processor.h | 7 +++++++
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index d6f6b88..13a28ca 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -35,12 +35,6 @@ static void cpuid_leaf(uint32_t leaf, struct cpuid_leaf *data)
     cpuid(leaf, &data->a, &data->b, &data->c, &data->d);
 }
 
-static void cpuid_count_leaf(uint32_t leaf, uint32_t subleaf,
-                             struct cpuid_leaf *data)
-{
-    cpuid_count(leaf, subleaf, &data->a, &data->b, &data->c, &data->d);
-}
-
 static void sanitise_featureset(uint32_t *fs)
 {
     /* for_each_set_bit() uses unsigned longs.  Extend with zeroes. */
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index dda8b83..2588a1b 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -13,6 +13,7 @@
 #include <asm/types.h>
 #include <asm/cpufeature.h>
 #include <asm/desc.h>
+#include <asm/x86_emulate.h>
 #endif
 
 #include <asm/x86-defns.h>
@@ -261,6 +262,12 @@ static always_inline unsigned int cpuid_count_ebx(
     return ebx;
 }
 
+static always_inline void cpuid_count_leaf(uint32_t leaf, uint32_t subleaf,
+                                           struct cpuid_leaf *data)
+{
+    cpuid_count(leaf, subleaf, &data->a, &data->b, &data->c, &data->d);
+}
+
 static inline unsigned long read_cr0(void)
 {
     unsigned long cr0;
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (3 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 04/25] x86: move cpuid_count_leaf from cpuid.c to processor.h Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-05 15:10   ` Jan Beulich
  2017-04-01 13:53 ` [PATCH v10 06/25] x86: refactor psr: L3 CAT: implement Domain init/free and schedule flows Yi Sun
                   ` (19 subsequent siblings)
  24 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

This patch implements the CPU init and free flow including L3 CAT
initialization and some resources free. It includes below flows:
1. presmp init:
    - parse command line parameter.
    - allocate socket info for every socket.
    - allocate feature resource.
    - initialize socket info, get feature info and add feature into feature
      array per cpuid result.
    - free resources allocated if error happens.
    - register cpu notifier to handle cpu events.
2. cpu notifier:
    - handle cpu online events, if initialization work has been done before,
      do nothing.
    - handle cpu offline events, if it is the last cpu offline, free some
      socket resources.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - remove 'asm/x86_emulate.h' inclusion as it has been indirectly included.
      (suggested by Jan Beulich)
    - remove 'CAT_COS_NUM' as it is only used once.
      (suggested by Jan Beulich)
    - remove 'feat_mask'.
      (suggested by Jan Beulich)
    - changes about 'feat_props'.
      (suggested by Jan Beulich)
    - remove 'get_cos_max' hook declaration.
      (suggested by Jan Beulich)
    - modify 'cat_default_val' implementation.
      (suggested by Jan Beulich)
    - modify 'psr_alloc_feat_enabled' implementation to make it simple.
      (suggested by Jan Beulich)
    - rename 'free_feature' to 'free_socket_resources' because it is executed
      when socket is offline. It needs free resources related to the socket.
      (suggested by Jan Beulich)
    - define 'feat_init_done' to iterate feature array to check if any feature
      has been initialized.
      (suggested by Jan Beulich)
    - input 'struct cpuid_leaf' pointer into 'cat_init_feature' to avoid memory
      copy.
      (suggested by Jan Beulich)
    - modify 'cat_init_feature' to use switch and things related to above
      changes.
      (suggested by Jan Beulich)
    - add an indentation for label.
      (suggested by Jan Beulich)
v9:
    - add commit message to explain the flows.
    - handle cpu offline and online again case to read MSRs registers values
      back and save them into cos array to make user can get real data.
    - create a new patch about moving 'cpuid_count_leaf'.
      (suggested by Wei Liu)
    - modify comment to explain why not free some resource in 'free_feature'.
      (suggested by Wei Liu)
    - implement 'psr_alloc_feat_enabled' to check if allocation feature is
      enabled in cmdline and some initialization work done.
      (suggested by Wei Liu)
    - implement 'cat_default_val' to set default value for CAT features.
      (suggested by Wei Liu)
    - replace feature list handling to feature array handling.
      (suggested by Roger Pau)
    - implement a common 'cat_init_feature' to replace L3 CAT/L2 CAT specific
      init functions.
      (suggested by Roger Pau)
    - modify comments for global feature node.
      (suggested by Jan Beulich)
    - remove unnecessary comments.
      (suggested by Jan Beulich)
    - remove unnecessary 'else'.
      (suggested by Jan Beulich)
    - remove 'nr_feat'.
      (suggested by Jan Beulich)
    - modify patch title to indicate 'L3 CAT'.
      (suggested by Jan Beulich)
    - check global flag with boot cpu operations.
      (suggested by Jan Beulich)
    - remove 'cpu_init_work' and move codes into 'psr_cpu_init'.
      (suggested by Jan Beulich)
    - remove 'cpu_fini_work' and move codes into 'psr_cpu_fini'.
      (suggested by Jan Beulich)
    - assign value for 'cos_num'.
      (suggested by Jan Beulich)
    - change about 'uint64_t' to 'uint32_t'.
      (suggested by Jan Beulich)
v8:
    - fix format issue.
      (suggested by Konrad Rzeszutek Wilk)
    - add comments to explain why we care about cpumask_empty when the last
      cpu on socket is offline.
      (suggested by Konrad Rzeszutek Wilk)
v7:
    - initialize structure objects for avoiding surprise.
      (suggested by Konrad Rzeszutek Wilk)
    - fix typo.
      (suggested by Konrad Rzeszutek Wilk)
    - fix a logical mistake when handling the last cpu offline event.
      (suggested by Konrad Rzeszutek Wilk)
v6:
    - use 'struct cpuid_leaf' introduced in Andrew's patch.
      (suggested by Konrad Rzeszutek Wilk)
    - add comments about cpu_add_remove_lock.
      (suggested by Konrad Rzeszutek Wilk)
    - change 'clear_bit' to '__clear_bit'.
      (suggested by Konrad Rzeszutek Wilk)
    - add 'ASSERT' check when setting 'feat_mask'.
      (suggested by Konrad Rzeszutek Wilk)
    - adjust 'printk' position to avoid odd spacing.
      (suggested by Konrad Rzeszutek Wilk)
    - add comment to explain usage of 'feat_l3_cat'.
      (suggested by Konrad Rzeszutek Wilk)
    - fix wording.
      (suggested by Konrad Rzeszutek Wilk)
    - move 'cpuid_count_leaf' helper function to 'asm-x86/processor.h'.
      It cannot be moved to 'cpuid.h' which causes compilation error because
      of header file loop reference.
      (suggested by Andrew Cooper)
v5:
    - add comment to explain the reason to define 'feat_l3_cat'.
      (suggested by Jan Beulich)
    - use 'list_for_each_entry_safe'.
      (suggested by Jan Beulich)
    - remove codes to free 'feat_l3_cat' in 'free_feature' to avoid the need
      for an allocation the next time a CPU comes online.
      (suggested by Jan Beulich)
    - define 'struct cpuid_leaf_regs' to encapsulate eax~edx.
      (suggested by Jan Beulich)
    - print feature info on a socket only when 'opt_cpu_info' is true.
      (suggested by Jan Beulich)
    - declare global variable 'l3_cat_ops' to 'static const'.
      (suggested by Jan Beulich)
    - use 'current_cpu_data'.
      (suggested by Jan Beulich)
    - rename 'feat_tmp' to 'feat'.
      (suggested by Jan Beulich)
    - clear PQE feature bit when the maximum CPUID level is too low.
      (suggested by Jan Beulich)
    - directly call 'l3_cat_init_feature'. No need to make it a callback
      function.
      (suggested by Jan Beulich)
    - remove local variable 'info'.
      (suggested by Jan Beulich)
    - move 'INIT_LIST_HEAD' into 'cpu_init_work' to be together with
      spin_lock_init().
      (suggested by Jan Beulich)
    - remove 'cpu_prepare_work' and move its content into 'psr_cpu_prepare'.
      (suggested by Jan Beulich)
v4:
    - create this patch because of removing all CAT/CDP old codes to make
      implementation be more easily understood.
      (suggested by Jan Beulich)
---
 xen/arch/x86/psr.c | 211 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 205 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index cf352d2..e422a23 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -34,6 +34,9 @@
 #define PSR_CAT        (1<<1)
 #define PSR_CDP        (1<<2)
 
+#define CAT_CBM_LEN_MASK 0x1f
+#define CAT_COS_MAX_MASK 0xffff
+
 /*
  * Per SDM chapter 'Cache Allocation Technology: Cache Mask Configuration',
  * the MSRs ranging from 0C90H through 0D0FH (inclusive), enables support for
@@ -76,7 +79,7 @@ struct feat_node {
      *
      * Feature independent HW info and common values are also defined in it.
      */
-    const struct feat_props {
+    struct feat_props {
         /*
          * cos_num, cos_max and cbm_len are common values for all features
          * so far.
@@ -114,11 +117,124 @@ struct psr_assoc {
 
 struct psr_cmt *__read_mostly psr_cmt;
 
+static struct psr_socket_info *__read_mostly socket_info;
+
 static unsigned int opt_psr;
 static unsigned int __initdata opt_rmid_max = 255;
+static unsigned int __read_mostly opt_cos_max = MAX_COS_REG_CNT;
 static uint64_t rmid_mask;
 static DEFINE_PER_CPU(struct psr_assoc, psr_assoc);
 
+/*
+ * Declare global feature node for every feature to facilitate the feature
+ * array creation. It is used to transiently store a spare node.
+ */
+static struct feat_node *feat_l3_cat;
+
+/* Common functions */
+#define cat_default_val(len) (0xffffffff >> (32 - (len)))
+
+/*
+ * Use this function to check if any allocation feature has been enabled
+ * in cmdline.
+ */
+static bool psr_alloc_feat_enabled(void)
+{
+    return !!socket_info;
+}
+
+static void free_socket_resources(unsigned int socket)
+{
+    unsigned int i;
+    struct psr_socket_info *info = socket_info + socket;
+
+    if ( !info )
+        return;
+
+    /*
+     * Free resources of features. The global feature object, e.g. feat_l3_cat,
+     * may not be freed here if it is not added into array. It is simply being
+     * kept until the next CPU online attempt.
+     */
+    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
+    {
+        if ( !info->features[i] )
+            continue;
+
+        xfree(info->features[i]);
+        info->features[i] = NULL;
+    }
+}
+
+static bool feat_init_done(const struct psr_socket_info *info)
+{
+    unsigned int i;
+
+    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
+    {
+        if ( !info->features[i] )
+            continue;
+
+        return true;
+    }
+
+    return false;
+}
+
+/* CAT common functions implementation. */
+static void cat_init_feature(const struct cpuid_leaf *regs,
+                             struct feat_node *feat,
+                             struct psr_socket_info *info,
+                             enum psr_feat_type type)
+{
+    unsigned int socket, i;
+
+    /* No valid value so do not enable feature. */
+    if ( !regs->a || !regs->d )
+        return;
+
+    feat->props->cbm_len = (regs->a & CAT_CBM_LEN_MASK) + 1;
+    feat->props->cos_max = min(opt_cos_max, regs->d & CAT_COS_MAX_MASK);
+
+    switch ( type )
+    {
+    case PSR_SOCKET_L3_CAT:
+        /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
+        feat->cos_reg_val[0] = cat_default_val(feat->props->cbm_len);
+
+        /*
+         * To handle cpu offline and then online case, we need restore MSRs to
+         * default values.
+         */
+        for ( i = 1; i <= feat->props->cos_max; i++ )
+        {
+            wrmsrl(MSR_IA32_PSR_L3_MASK(i), feat->cos_reg_val[0]);
+            feat->cos_reg_val[i] = feat->cos_reg_val[0];
+        }
+
+        break;
+
+    default:
+        return;
+    }
+
+    /* Add this feature into array. */
+    info->features[type] = feat;
+
+    socket = cpu_to_socket(smp_processor_id());
+    if ( !opt_cpu_info )
+        return;
+
+    printk(XENLOG_INFO "%s CAT: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
+           ((type == PSR_SOCKET_L3_CAT) ? "L3" : "L2"),
+           socket, feat->props->cos_max, feat->props->cbm_len);
+}
+
+/* L3 CAT ops */
+static struct feat_props l3_cat_props = {
+    .cos_num = 1,
+};
+
 static void __init parse_psr_bool(char *s, char *value, char *feature,
                                   unsigned int mask)
 {
@@ -158,6 +274,9 @@ static void __init parse_psr_param(char *s)
         if ( val_str && !strcmp(s, "rmid_max") )
             opt_rmid_max = simple_strtoul(val_str, NULL, 0);
 
+        if ( val_str && !strcmp(s, "cos_max") )
+            opt_cos_max = simple_strtoul(val_str, NULL, 0);
+
         s = ss + 1;
     } while ( ss );
 }
@@ -313,19 +432,95 @@ void psr_domain_free(struct domain *d)
     psr_free_rmid(d);
 }
 
-static int psr_cpu_prepare(unsigned int cpu)
+static void __init init_psr(void)
+{
+    if ( opt_cos_max < 1 )
+    {
+        printk(XENLOG_INFO "CAT: disabled, cos_max is too small\n");
+        return;
+    }
+
+    socket_info = xzalloc_array(struct psr_socket_info, nr_sockets);
+
+    if ( !socket_info )
+    {
+        printk(XENLOG_INFO "Failed to alloc socket_info!\n");
+        return;
+    }
+}
+
+static void __init psr_free(void)
+{
+    xfree(socket_info);
+    socket_info = NULL;
+}
+
+static int psr_cpu_prepare(void)
 {
+    if ( !psr_alloc_feat_enabled() )
+        return 0;
+
+    /* Malloc memory for the global feature node here. */
+    if ( feat_l3_cat == NULL &&
+         (feat_l3_cat = xzalloc(struct feat_node)) == NULL )
+        return -ENOMEM;
+
     return 0;
 }
 
 static void psr_cpu_init(void)
 {
+    struct psr_socket_info *info;
+    unsigned int socket;
+    unsigned int cpu = smp_processor_id();
+    struct feat_node *feat;
+    struct cpuid_leaf regs;
+
+    if ( !psr_alloc_feat_enabled() || !boot_cpu_has(X86_FEATURE_PQE) )
+        goto assoc_init;
+
+    if ( boot_cpu_data.cpuid_level < PSR_CPUID_LEVEL_CAT )
+    {
+        setup_clear_cpu_cap(X86_FEATURE_PQE);
+        goto assoc_init;
+    }
+
+    socket = cpu_to_socket(cpu);
+    info = socket_info + socket;
+    if ( feat_init_done(info) )
+        goto assoc_init;
+
+    spin_lock_init(&info->ref_lock);
+
+    cpuid_count_leaf(PSR_CPUID_LEVEL_CAT, 0, &regs);
+    if ( regs.b & PSR_RESOURCE_TYPE_L3 )
+    {
+        cpuid_count_leaf(PSR_CPUID_LEVEL_CAT, 1, &regs);
+
+        feat = feat_l3_cat;
+        feat_l3_cat = NULL;
+        feat->props = &l3_cat_props;
+
+        cat_init_feature(&regs, feat, info, PSR_SOCKET_L3_CAT);
+    }
+
+ assoc_init:
     psr_assoc_init();
 }
 
 static void psr_cpu_fini(unsigned int cpu)
 {
-    return;
+    unsigned int socket = cpu_to_socket(cpu);
+
+    if ( !psr_alloc_feat_enabled() )
+        return;
+
+    /*
+     * We only free when we are the last CPU in the socket. The socket_cpumask
+     * is cleared prior to this notification code by remove_siblinginfo().
+     */
+    if ( socket_cpumask[socket] && cpumask_empty(socket_cpumask[socket]) )
+        free_socket_resources(socket);
 }
 
 static int cpu_callback(
@@ -337,7 +532,7 @@ static int cpu_callback(
     switch ( action )
     {
     case CPU_UP_PREPARE:
-        rc = psr_cpu_prepare(cpu);
+        rc = psr_cpu_prepare();
         break;
     case CPU_STARTING:
         psr_cpu_init();
@@ -366,10 +561,14 @@ static int __init psr_presmp_init(void)
     if ( (opt_psr & PSR_CMT) && opt_rmid_max )
         init_psr_cmt(opt_rmid_max);
 
-    psr_cpu_prepare(0);
+    if ( opt_psr & PSR_CAT )
+        init_psr();
+
+    if ( psr_cpu_prepare() )
+        psr_free();
 
     psr_cpu_init();
-    if ( psr_cmt_enabled() )
+    if ( psr_cmt_enabled() || psr_alloc_feat_enabled() )
         register_cpu_notifier(&cpu_nfb);
 
     return 0;
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 06/25] x86: refactor psr: L3 CAT: implement Domain init/free and schedule flows.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (4 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-05 15:23   ` Jan Beulich
  2017-04-01 13:53 ` [PATCH v10 07/25] x86: refactor psr: L3 CAT: implement get hw info flow Yi Sun
                   ` (18 subsequent siblings)
  24 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

This patch implements the Domain init/free and schedule flows.
- When domain init, its psr resource should be allocated.
- When domain free, its psr resource should be freed too.
- When domain is scheduled, its COS ID on the socket should be
  set into ASSOC register to make corresponding COS MSR value
  work.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - remove 'cat_get_cos_max' as 'cos_max' is a feature property now which
      can be directly used.
      (suggested by Jan Beulich)
    - replace 'info->feat_mask' check to 'feat_init_done'.
      (suggested by Jan Beulich)
v9:
    - rename 'l3_cat_get_cos_max' to 'cat_get_cos_max' to cover all CAT/CDP
      features.
      (suggested by Roger Pau)
    - replace feature list handling to feature array handling.
      (suggested by Roger Pau)
    - implement 'psr_alloc_cos' to match 'psr_free_cos'.
      (suggested by Wei Liu)
    - use 'psr_alloc_feat_enabled'.
      (suggested by Wei Liu)
    - fix coding style issue.
      (suggested by Wei Liu)
    - remove 'inline'.
      (suggested by Jan Beulich)
    - modify patch title to indicate 'L3 CAT'.
      (suggested by Jan Beulich)
    - remove 'psr_cos_ids' check in 'psr_free_cos'.
      (suggested by Jan Beulich)
v6:
    - change 'PSR_ASSOC_REG_POS' to 'PSR_ASSOC_REG_SHIFT'.
      (suggested by Konrad Rzeszutek Wilk)
v5:
    - rename 'feat_tmp' to 'feat'.
      (suggested by Jan Beulich)
    - define 'PSR_ASSOC_REG_POS'.
      (suggested by Jan Beulich)
v4:
    - create this patch to make codes easier to understand.
      (suggested by Jan Beulich)
---
 xen/arch/x86/psr.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 68 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index e422a23..3421219 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -49,6 +49,8 @@
  */
 #define MAX_COS_REG_CNT  128
 
+#define PSR_ASSOC_REG_SHIFT 32
+
 enum psr_feat_type {
     PSR_SOCKET_L3_CAT,
     PSR_SOCKET_L3_CDP,
@@ -376,11 +378,39 @@ void psr_free_rmid(struct domain *d)
     d->arch.psr_rmid = 0;
 }
 
-static inline void psr_assoc_init(void)
+static unsigned int get_max_cos_max(const struct psr_socket_info *info)
+{
+    const struct feat_node *feat;
+    unsigned int cos_max = 0, i;
+
+    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
+    {
+        feat = info->features[i];
+        if ( !feat )
+            continue;
+
+        cos_max = max(feat->props->cos_max, cos_max);
+    }
+
+    return cos_max;
+}
+
+static void psr_assoc_init(void)
 {
     struct psr_assoc *psra = &this_cpu(psr_assoc);
 
-    if ( psr_cmt_enabled() )
+    if ( psr_alloc_feat_enabled() )
+    {
+        unsigned int socket = cpu_to_socket(smp_processor_id());
+        const struct psr_socket_info *info = socket_info + socket;
+        unsigned int cos_max = get_max_cos_max(info);
+
+        if ( feat_init_done(info) )
+            psra->cos_mask = ((1ull << get_count_order(cos_max)) - 1) <<
+                             PSR_ASSOC_REG_SHIFT;
+    }
+
+    if ( psr_cmt_enabled() || psra->cos_mask )
         rdmsrl(MSR_IA32_PSR_ASSOC, psra->val);
 }
 
@@ -389,6 +419,13 @@ static inline void psr_assoc_rmid(uint64_t *reg, unsigned int rmid)
     *reg = (*reg & ~rmid_mask) | (rmid & rmid_mask);
 }
 
+static void psr_assoc_cos(uint64_t *reg, unsigned int cos,
+                          uint64_t cos_mask)
+{
+    *reg = (*reg & ~cos_mask) |
+            (((uint64_t)cos << PSR_ASSOC_REG_SHIFT) & cos_mask);
+}
+
 void psr_ctxt_switch_to(struct domain *d)
 {
     struct psr_assoc *psra = &this_cpu(psr_assoc);
@@ -397,6 +434,11 @@ void psr_ctxt_switch_to(struct domain *d)
     if ( psr_cmt_enabled() )
         psr_assoc_rmid(&reg, d->arch.psr_rmid);
 
+    if ( psra->cos_mask )
+        psr_assoc_cos(&reg, d->arch.psr_cos_ids ?
+                      d->arch.psr_cos_ids[cpu_to_socket(smp_processor_id())] :
+                      0, psra->cos_mask);
+
     if ( reg != psra->val )
     {
         wrmsrl(MSR_IA32_PSR_ASSOC, reg);
@@ -422,14 +464,37 @@ int psr_set_l3_cbm(struct domain *d, unsigned int socket,
     return 0;
 }
 
-int psr_domain_init(struct domain *d)
+/* Called with domain lock held, no extra lock needed for 'psr_cos_ids' */
+static void psr_free_cos(struct domain *d)
+{
+    xfree(d->arch.psr_cos_ids);
+    d->arch.psr_cos_ids = NULL;
+}
+
+static int psr_alloc_cos(struct domain *d)
 {
+    d->arch.psr_cos_ids = xzalloc_array(unsigned int, nr_sockets);
+    if ( !d->arch.psr_cos_ids )
+        return -ENOMEM;
+
     return 0;
 }
 
+int psr_domain_init(struct domain *d)
+{
+    /* Init to success value */
+    int ret = 0;
+
+    if ( psr_alloc_feat_enabled() )
+        ret = psr_alloc_cos(d);
+
+    return ret;
+}
+
 void psr_domain_free(struct domain *d)
 {
     psr_free_rmid(d);
+    psr_free_cos(d);
 }
 
 static void __init init_psr(void)
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 07/25] x86: refactor psr: L3 CAT: implement get hw info flow.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (5 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 06/25] x86: refactor psr: L3 CAT: implement Domain init/free and schedule flows Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-05 15:37   ` Jan Beulich
  2017-04-01 13:53 ` [PATCH v10 08/25] x86: refactor psr: L3 CAT: implement get value flow Yi Sun
                   ` (17 subsequent siblings)
  24 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

This patch implements get HW info flow including L3 CAT callback
function.

It also changes sysctl interface to make it more general.

With this patch, 'psr-hwinfo' can work for L3 CAT.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - remove 'PSR_SOCKET_UNKNOWN' and use 'ASSERT_UNREACHABLE()' to handle
      this case.
      (suggested by Jan Beulich)
    - check 'feat_type'.
      (suggested by Jan Beulich)
    - adjust macros names and values to make them more appropriate.
      (suggested by Jan Beulich)
    - use 'feat_init_done'.
      (suggested by Jan Beulich)
    - changes about 'cbm_len'.
      (suggested by Jan Beulich)
v9:
    - replace feature list handling to feature array handling.
      (suggested by Roger Pau)
    - define 'PSR_INFO_SIZE'.
      (suggested by Roger Pau)
    - fix coding style issue.
      (suggested by Roger Pau and Jan Beulich)
    - use 'ARRAY_SIZE'.
      (suggested by Roger Pau)
    - rename 'l3_cat_get_feat_info' to 'cat_get_feat_info' to make it a common
      function for both L3/L2 CAT.
      (suggested by Roger Pau)
    - move constant to the right of comparison.
      (suggested by Wei Liu)
    - remove wrong comment.
      (suggested by Jan Beulich)
    - rename macros used by psr_get_info to make them meaningful.
      (suggested by Jan Beulich)
    - remove assignment for 'PSR_SOCKET_UNKNOWN'.
      (suggested by Jan Beulich)
    - retain blank line after 'case XEN_SYSCTL_PSR_CAT_get_l3_info'.
      (suggested by Jan Beulich)
    - modify patch title to indicate 'L3 CAT'.
      (suggested by Jan Beulich)
    - move common data check into common function.
      (suggested by Jan Beulich)
v6:
    - fix coding style issue.
      (suggested by Konrad Rzeszutek Wilk)
    - define 'PSR_SOCKET_UNKNOWN' in 'psr_feat_type'.
      (suggested by Konrad Rzeszutek Wilk)
    - change '-ENOTSOCK' to 'ERANGE'.
      (suggested by Konrad Rzeszutek Wilk)
    - modify position of macros to remove odd spacing in psr.h.
      (suggested by Konrad Rzeszutek Wilk)
v5:
    - change 'dat[]' to 'data[]'.
      (suggested by Jan Beulich)
    - modify parameter type to avoid fixed width type when there is no such
      intention.
      (suggested by Jan Beulich)
    - use 'const' when it is possible.
      (suggested by Jan Beulich)
    - check feature type outside callback function.
      (suggested by Jan Beulich)
    - modify macros names to add prefix 'PSR_' and change 'CDP_FLAG' to
      'PSR_FLAG'.
      (suggested by Jan Beulich)
v4:
    - create this patch to make codes easier to understand.
      (suggested by Jan Beulich)
---
 xen/arch/x86/psr.c        | 75 +++++++++++++++++++++++++++++++++++++++++++++--
 xen/arch/x86/sysctl.c     | 19 +++++++++---
 xen/include/asm-x86/psr.h | 16 ++++++----
 3 files changed, 98 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 3421219..36ade48 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -93,6 +93,10 @@ struct feat_node {
         unsigned int cos_num;
         unsigned int cos_max;
         unsigned int cbm_len;
+
+        /* get_feat_info is used to get feature HW info. */
+        bool (*get_feat_info)(const struct feat_node *feat,
+                              uint32_t data[], unsigned int array_len);
     } *props;
 
     uint32_t cos_reg_val[MAX_COS_REG_CNT];
@@ -183,6 +187,22 @@ static bool feat_init_done(const struct psr_socket_info *info)
     return false;
 }
 
+static enum psr_feat_type psr_cbm_type_to_feat_type(enum cbm_type type)
+{
+    enum psr_feat_type feat_type;
+
+    switch ( type )
+    {
+    case PSR_CBM_TYPE_L3:
+        feat_type = PSR_SOCKET_L3_CAT;
+        break;
+    default:
+        ASSERT_UNREACHABLE();
+    }
+
+    return feat_type;
+}
+
 /* CAT common functions implementation. */
 static void cat_init_feature(const struct cpuid_leaf *regs,
                              struct feat_node *feat,
@@ -232,9 +252,23 @@ static void cat_init_feature(const struct cpuid_leaf *regs,
            socket, feat->props->cos_max, feat->props->cbm_len);
 }
 
+static bool cat_get_feat_info(const struct feat_node *feat,
+                              uint32_t data[], unsigned int array_len)
+{
+    if ( array_len != PSR_INFO_ARRAY_SIZE )
+        return false;
+
+    data[PSR_INFO_IDX_COS_MAX] = feat->props->cos_max;
+    data[PSR_INFO_IDX_CAT_CBM_LEN] = feat->props->cbm_len;
+    data[PSR_INFO_IDX_CAT_FLAG] = 0;
+
+    return true;
+}
+
 /* L3 CAT ops */
 static struct feat_props l3_cat_props = {
     .cos_num = 1,
+    .get_feat_info = cat_get_feat_info,
 };
 
 static void __init parse_psr_bool(char *s, char *value, char *feature,
@@ -446,10 +480,45 @@ void psr_ctxt_switch_to(struct domain *d)
     }
 }
 
-int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
-                        uint32_t *cos_max, uint32_t *flags)
+static struct psr_socket_info *get_socket_info(unsigned int socket)
 {
-    return 0;
+    if ( !socket_info )
+        return ERR_PTR(-ENODEV);
+
+    if ( socket >= nr_sockets )
+        return ERR_PTR(-ERANGE);
+
+    if ( !feat_init_done(socket_info + socket) )
+        return ERR_PTR(-ENOENT);
+
+    return socket_info + socket;
+}
+
+int psr_get_info(unsigned int socket, enum cbm_type type,
+                 uint32_t data[], unsigned int array_len)
+{
+    const struct psr_socket_info *info = get_socket_info(socket);
+    const struct feat_node *feat;
+    enum psr_feat_type feat_type;
+
+    if ( IS_ERR(info) )
+        return PTR_ERR(info);
+
+    if ( !data )
+        return -EINVAL;
+
+    feat_type = psr_cbm_type_to_feat_type(type);
+    if ( feat_type > ARRAY_SIZE(info->features) )
+        return -ENOENT;
+
+    feat = info->features[feat_type];
+    if ( !feat )
+        return -ENOENT;
+
+    if ( feat->props->get_feat_info(feat, data, array_len) )
+        return 0;
+
+    return -EINVAL;
 }
 
 int psr_get_l3_cbm(struct domain *d, unsigned int socket,
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index 2f7056e..c23270d 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -175,14 +175,25 @@ long arch_do_sysctl(
         switch ( sysctl->u.psr_cat_op.cmd )
         {
         case XEN_SYSCTL_PSR_CAT_get_l3_info:
-            ret = psr_get_cat_l3_info(sysctl->u.psr_cat_op.target,
-                                      &sysctl->u.psr_cat_op.u.l3_info.cbm_len,
-                                      &sysctl->u.psr_cat_op.u.l3_info.cos_max,
-                                      &sysctl->u.psr_cat_op.u.l3_info.flags);
+        {
+            uint32_t data[PSR_INFO_ARRAY_SIZE];
+
+            ret = psr_get_info(sysctl->u.psr_cat_op.target,
+                               PSR_CBM_TYPE_L3, data, ARRAY_SIZE(data));
+            if ( ret )
+                break;
+
+            sysctl->u.psr_cat_op.u.l3_info.cos_max =
+                                      data[PSR_INFO_IDX_COS_MAX];
+            sysctl->u.psr_cat_op.u.l3_info.cbm_len =
+                                      data[PSR_INFO_IDX_CAT_CBM_LEN];
+            sysctl->u.psr_cat_op.u.l3_info.flags =
+                                      data[PSR_INFO_IDX_CAT_FLAG];
 
             if ( !ret && __copy_field_to_guest(u_sysctl, sysctl, u.psr_cat_op) )
                 ret = -EFAULT;
             break;
+        }
 
         default:
             ret = -EOPNOTSUPP;
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
index 57f47e9..af3a465 100644
--- a/xen/include/asm-x86/psr.h
+++ b/xen/include/asm-x86/psr.h
@@ -19,20 +19,26 @@
 #include <xen/types.h>
 
 /* CAT cpuid level */
-#define PSR_CPUID_LEVEL_CAT   0x10
+#define PSR_CPUID_LEVEL_CAT             0x10
 
 /* Resource Type Enumeration */
 #define PSR_RESOURCE_TYPE_L3            0x2
 
 /* L3 Monitoring Features */
-#define PSR_CMT_L3_OCCUPANCY           0x1
+#define PSR_CMT_L3_OCCUPANCY            0x1
 
 /* CDP Capability */
-#define PSR_CAT_CDP_CAPABILITY       (1u << 2)
+#define PSR_CAT_CDP_CAPABILITY          (1u << 2)
 
 /* L3 CDP Enable bit*/
 #define PSR_L3_QOS_CDP_ENABLE_BIT       0x0
 
+/* Used by psr_get_info() */
+#define PSR_INFO_IDX_COS_MAX            0
+#define PSR_INFO_IDX_CAT_CBM_LEN        1
+#define PSR_INFO_IDX_CAT_FLAG           2
+#define PSR_INFO_ARRAY_SIZE             3
+
 struct psr_cmt_l3 {
     unsigned int features;
     unsigned int upscaling_factor;
@@ -63,8 +69,8 @@ int psr_alloc_rmid(struct domain *d);
 void psr_free_rmid(struct domain *d);
 void psr_ctxt_switch_to(struct domain *d);
 
-int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
-                        uint32_t *cos_max, uint32_t *flags);
+int psr_get_info(unsigned int socket, enum cbm_type type,
+                 uint32_t data[], unsigned int array_len);
 int psr_get_l3_cbm(struct domain *d, unsigned int socket,
                    uint64_t *cbm, enum cbm_type type);
 int psr_set_l3_cbm(struct domain *d, unsigned int socket,
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 08/25] x86: refactor psr: L3 CAT: implement get value flow.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (6 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 07/25] x86: refactor psr: L3 CAT: implement get hw info flow Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-05 15:51   ` Jan Beulich
  2017-04-01 13:53 ` [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework Yi Sun
                   ` (16 subsequent siblings)
  24 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

There is an interface in user space to show feature value of
domains.

This patch implements get value flow in hypervisor including
L3 CAT callback function.

It also changes domctl interface to make it more general.

With this patch, 'psr-cat-show' can work for L3 CAT but not for
L3 code/data which is implemented in patch "x86: refactor psr:
implement get value flow for CDP.".

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - use an intermediate variable to get value and avoid cast in domctl.
      (suggested by Jan Beulich)
    - remove 'type' in 'get_val' parameters and will add it back when
      implementing CDP.
      (suggested by Jan Beulich)
    - remove unnecessary variable and return error about 'info' in
      'psr_get_feat'.
      (suggested by Jan Beulich)
    - use 'ASSERT' to check input parameter in 'psr_get_val'.
      (suggested by Jan Beulich)
    - changes about 'feat_props'.
      (suggested by Jan Beulich)
v9:
    - add commit message to explain there is an user space interface.
    - rename 'l3_cat_get_val' to 'cat_get_val' to cover all L3/L2 CAT features.
      (suggested by Roger Pau)
    - replace feature list handling to feature array handling.
      (suggested by Roger Pau)
    - change parameter of 'psr_get'. Use 'psr_cos_ids' directly to replace
      domain. Also declare it to 'const'.
      (suggested by Jan Beulich)
    - change code flow to remove 'psr_get' but add 'psr_get_feat' to make codes
      more reasonable.
      (suggested by Jan Beulich)
    - modify patch title to indicate 'L3 CAT'.
      (suggested by Jan Beulich)
    - move cos check into common function because this check is required by all
      features.
      (suggested by Jan Beulich)
    - fix coding style issue.
      (suggested by Jan Beulich)
    - changes about 'uint64_t' to 'uint32_t'.
      (suggested by Jan Beulich)
v7:
    - rename '__psr_get' to 'psr_get'.
      (suggested by Wei Liu)
v6:
    - modify commit message to make it clearer.
      (suggested by Konrad Rzeszutek Wilk)
    - remove one extra space in code.
      (suggested by Konrad Rzeszutek Wilk)
    - remove unnecessary comment.
      (suggested by Konrad Rzeszutek Wilk)
    - write a helper function to move get info and get val functions into
      it. Because most codes of 'get_info' and 'get_val' are same.
      (suggested by Konrad Rzeszutek Wilk)
v5:
    - rename 'dat[]' to 'data[]'
      (suggested by Jan Beulich)
    - modify variables names to make them better, e.g. 'feat_tmp' to 'feat'.
      (suggested by Jan Beulich)
    - check if feature type match in caller of feature callback function.
      (suggested by Jan Beulich)
v4:
    - create this patch to make codes easier to understand.
      (suggested by Jan Beulich)
---
 xen/arch/x86/domctl.c     | 30 ++++++++++++++-------
 xen/arch/x86/psr.c        | 67 ++++++++++++++++++++++++++++++++++++++++-------
 xen/include/asm-x86/psr.h |  4 +--
 3 files changed, 80 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 02b48e8..dc213a7 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1455,25 +1455,37 @@ long arch_do_domctl(
             break;
 
         case XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM:
-            ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target,
-                                 &domctl->u.psr_cat_op.data,
-                                 PSR_CBM_TYPE_L3);
+        {
+            uint32_t val;
+
+            ret = psr_get_val(d, domctl->u.psr_cat_op.target,
+                              &val, PSR_CBM_TYPE_L3);
+            domctl->u.psr_cat_op.data = val;
             copyback = 1;
             break;
+        }
 
         case XEN_DOMCTL_PSR_CAT_OP_GET_L3_CODE:
-            ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target,
-                                 &domctl->u.psr_cat_op.data,
-                                 PSR_CBM_TYPE_L3_CODE);
+        {
+            uint32_t val;
+
+            ret = psr_get_val(d, domctl->u.psr_cat_op.target,
+                              &val, PSR_CBM_TYPE_L3_CODE);
+            domctl->u.psr_cat_op.data = val;
             copyback = 1;
             break;
+        }
 
         case XEN_DOMCTL_PSR_CAT_OP_GET_L3_DATA:
-            ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target,
-                                 &domctl->u.psr_cat_op.data,
-                                 PSR_CBM_TYPE_L3_DATA);
+        {
+            uint32_t val;
+
+            ret = psr_get_val(d, domctl->u.psr_cat_op.target,
+                              &val, PSR_CBM_TYPE_L3_DATA);
+            domctl->u.psr_cat_op.data = val;
             copyback = 1;
             break;
+        }
 
         default:
             ret = -EOPNOTSUPP;
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 36ade48..25fcd21 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -97,6 +97,10 @@ struct feat_node {
         /* get_feat_info is used to get feature HW info. */
         bool (*get_feat_info)(const struct feat_node *feat,
                               uint32_t data[], unsigned int array_len);
+
+        /* get_val is used to get feature COS register value. */
+        void (*get_val)(const struct feat_node *feat, unsigned int cos,
+                        uint32_t *val);
     } *props;
 
     uint32_t cos_reg_val[MAX_COS_REG_CNT];
@@ -265,10 +269,17 @@ static bool cat_get_feat_info(const struct feat_node *feat,
     return true;
 }
 
+static void cat_get_val(const struct feat_node *feat, unsigned int cos,
+                        uint32_t *val)
+{
+    *val = feat->cos_reg_val[cos];
+}
+
 /* L3 CAT ops */
 static struct feat_props l3_cat_props = {
     .cos_num = 1,
     .get_feat_info = cat_get_feat_info,
+    .get_val = cat_get_val,
 };
 
 static void __init parse_psr_bool(char *s, char *value, char *feature,
@@ -494,24 +505,34 @@ static struct psr_socket_info *get_socket_info(unsigned int socket)
     return socket_info + socket;
 }
 
-int psr_get_info(unsigned int socket, enum cbm_type type,
-                 uint32_t data[], unsigned int array_len)
+static struct feat_node * psr_get_feat(unsigned int socket,
+                                       enum cbm_type type)
 {
     const struct psr_socket_info *info = get_socket_info(socket);
-    const struct feat_node *feat;
     enum psr_feat_type feat_type;
 
     if ( IS_ERR(info) )
-        return PTR_ERR(info);
+        return ERR_PTR(PTR_ERR(info));
+
+    feat_type = psr_cbm_type_to_feat_type(type);
+    if ( feat_type > ARRAY_SIZE(info->features) )
+        return NULL;
+
+    return info->features[feat_type];
+}
+
+int psr_get_info(unsigned int socket, enum cbm_type type,
+                 uint32_t data[], unsigned int array_len)
+{
+    const struct feat_node *feat;
 
     if ( !data )
         return -EINVAL;
 
-    feat_type = psr_cbm_type_to_feat_type(type);
-    if ( feat_type > ARRAY_SIZE(info->features) )
-        return -ENOENT;
+    feat = psr_get_feat(socket, type);
+    if ( IS_ERR(feat) )
+        return PTR_ERR(feat);
 
-    feat = info->features[feat_type];
     if ( !feat )
         return -ENOENT;
 
@@ -521,9 +542,35 @@ int psr_get_info(unsigned int socket, enum cbm_type type,
     return -EINVAL;
 }
 
-int psr_get_l3_cbm(struct domain *d, unsigned int socket,
-                   uint64_t *cbm, enum cbm_type type)
+int psr_get_val(struct domain *d, unsigned int socket,
+                uint32_t *val, enum cbm_type type)
 {
+    const struct feat_node *feat;
+    unsigned int cos;
+
+    ASSERT(d && val);
+
+    feat = psr_get_feat(socket, type);
+    if ( IS_ERR(feat) )
+        return PTR_ERR(feat);
+
+    if ( !feat )
+        return -ENOENT;
+
+    cos = d->arch.psr_cos_ids[socket];
+    /*
+     * If input cos exceeds current feature's cos_max, we should return its
+     * default value which is stored in cos 0. This case only happens
+     * when more than two features enabled concurrently and at least one
+     * features's cos_max is bigger than others. When a domain's working cos
+     * id is bigger than some features' cos_max, HW automatically works as
+     * default value for those features which cos_max is smaller.
+     */
+    if ( cos > feat->props->cos_max )
+        cos = 0;
+
+    feat->props->get_val(feat, cos, val);
+
     return 0;
 }
 
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
index af3a465..7c6d38a 100644
--- a/xen/include/asm-x86/psr.h
+++ b/xen/include/asm-x86/psr.h
@@ -71,8 +71,8 @@ void psr_ctxt_switch_to(struct domain *d);
 
 int psr_get_info(unsigned int socket, enum cbm_type type,
                  uint32_t data[], unsigned int array_len);
-int psr_get_l3_cbm(struct domain *d, unsigned int socket,
-                   uint64_t *cbm, enum cbm_type type);
+int psr_get_val(struct domain *d, unsigned int socket,
+                uint32_t *val, enum cbm_type type);
 int psr_set_l3_cbm(struct domain *d, unsigned int socket,
                    uint64_t cbm, enum cbm_type type);
 
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (7 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 08/25] x86: refactor psr: L3 CAT: implement get value flow Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-11 15:01   ` Jan Beulich
  2017-04-20  5:38   ` [PATCH] dom_ids array implementation Yi Sun
  2017-04-01 13:53 ` [PATCH v10 10/25] x86: refactor psr: L3 CAT: set value: assemble features value array Yi Sun
                   ` (15 subsequent siblings)
  24 siblings, 2 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

As set value flow is the most complicated one in psr, it will be
divided to some patches to make things clearer. This patch
implements the set value framework to show a whole picture firstly.

It also changes domctl interface to make it more general.

To make the set value flow be general and can support multiple features
at same time, it includes below steps:
1. Get COS ID that current domain is using.
2. Gather a value array to store all features current value
   into it and replace the current value of the feature which is
   being set to the new input value.
3. Find if there is already a COS ID on which all features'
   values are same as the array. Then, we can reuse this COS
   ID.
4. If fail to find, we need pick an available COS ID. Only COS ID which ref
   is 0 or 1 can be picked.
5. Write the feature's MSR according to the COS ID and cbm_type.
6. Update ref according to COS ID.
7. Save the COS ID into current domain's psr_cos_ids[socket] so that we
   can know which COS the domain is using on the socket.

So, some functions are abstracted and the callback functions will be
implemented in next patches.

Here is an example to understand the process. The CPU supports
two featuers, e.g. L3 CAT and L2 CAT. User wants to set L3 CAT
of Dom1 to 0x1ff.
1. At the initial time, the old_cos of Dom1 is 0. The COS registers values
are below at this time.
        -------------------------------
        | COS 0 | COS 1 | COS 2 | ... |
        -------------------------------
L3 CAT  | 0x7ff | 0x7ff | 0x7ff | ... |
        -------------------------------
L2 CAT  | 0xff  | 0xff  | 0xff  | ... |
        -------------------------------

2. Gather the value array and insert new value into it:
val[0]: 0x1ff
val[1]: 0xff

3. It cannot find a matching COS.

4. Pick COS 1 to store the value set.

5. Write the L3 CAT COS 1 registers. The COS registers values are
changed to below now.
        -------------------------------
        | COS 0 | COS 1 | COS 2 | ... |
        -------------------------------
L3 CAT  | 0x7ff | 0x1ff | ...   | ... |
        -------------------------------
L2 CAT  | 0xff  | 0xff  | ...   | ... |
        -------------------------------

6. The ref[1] is increased to 1 because Dom1 is using it now.

7. Save 1 to Dom1's psr_cos_ids[socket].

Then, user wants to set L3 CAT of Dom2 to 0x1ff too. The old_cos
of Dom2 is 0 too. Repeat above flow.

The val array assembled is:
val[0]: 0x1ff
val[1]: 0xff

So, it can find a matching COS, COS 1. Then, it can reuse COS 1
for Dom2.

The ref[1] is increased to 2 now because both Dom1 and Dom2 are
using this COS ID. Set 1 to Dom2's psr_cos_ids[socket].

Another thing need to emphasize is the context switch. When context
switch happens, 'psr_ctxt_switch_to' is called by system to get
domain's COS ID from 'psr_cos_ids[socket]'. But 'psr_cos_ids[socket]'
is set at step 7 above. So, there are three scenarios, e.g.:
1. User calls domctl interface on Dom0 to set a COS ID 1 for Dom1 into its
   psr_cos_ids[]. Then, Dom1 is scheduled so that 'psr_ctxt_switch_to()' is
   called which makes COS ID 1 work. For this case, no action is needed.

2. Dom1 runs on CPU 1 and COS ID 1 is working. At same time, user calls domctl
   interface on Dom0 to set a new COS ID 2 for Dom1 into psr_cos_ids[]. After
   time slice ends, the Dom1 is scheduled again, the new COS ID 2 will work.

3. When a new COS ID is being set to psr_cos_ids[], 'psr_ctxt_switch_to()'
   is called to access the same psr_cos_ids[] member through 'psr_assoc_cos'.
   The COS ID is constrained by cos_mask so that it cannot exceeds the cos_max.
   So even the COS ID got here is wrong, it is still a workable ID (within
   cos_max). The functionality is still workable, the actual valid CBM will be
   effective for a short time. In next schedule, the correct CBM will take
   effect.

All these cases will not cause race condition and no harm to system. The PSR
features are to set cache capacity for a domain. The setting to cache is
progressively effective. When the cache setting becomes really effective, the
time slice to schedule a domain may have passed. So, even if a wrong COS ID is
used to set ASSOC, only another valid CBM be effective for a short time during
cache preparation time. The correct COS ID will take effect in a short time.
This does not affect cache capacity setting much.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - restore domain cos id to 0 when socket is offline.
      (suggested by Jan Beulich)
    - check 'psr_cat_op.data' to make sure only lower 32 bits are valid.
      (suggested by Jan Beulich)
    - remove unnecessary fixed width type of parameters and variables.
      (suggested by Jan Beulich)
    - rename 'insert_new_val_to_array' to 'insert_val_to_array'.
      (suggested by Jan Beulich)
    - input 'ref_lock' pointer into functions to check if it has been locked.
      (suggested by Jan Beulich)
    - add comment to declare the set process is protected by 'domctl_lock'.
      (suggested by Jan Beulich)
    - check 'feat_type'.
      (suggested by Jan Beulich)
    - remove 'feat_mask'.
      (suggested by Jan Beulich)
    - remove unnecessary criteria of ASSERT.
      (suggested by Jan Beulich)
    - adjust flow of 'psr_set_val' to avoid 'goto' for successful cases.
      (suggested by Jan Beulich)
    - use ASSERT to check 'socket_info' in 'psr_free_cos'.
      (suggested by Jan Beulich)
    - remove unnecessary comment in 'psr_free_cos'.
      (suggested by Jan Beulich)
v9:
    - use goto style error handling in 'psr_set_val'.
      (suggested by Wei Liu)
    - use ASSERT for checking old_cos.
      (suggested by Wei Liu and Jan Beulich)
    - fix coding style issue.
      (suggested by Wei Liu)
    - rename 'assemble_val_array' to 'combine_val_array' in pervious patch.
      (suggested by Wei Liu)
    - use 'spin_is_locked' to check ref_lock.
      (suggested by Roger Pau)
    - add an input parameter 'array_len' for 'write_psr_msr'.
    - check 'socket_info' and 'psr_cos_ids' in this patch.
      (suggested by Jan Beulich)
    - modify patch title to indicate 'L3 CAT'.
      (suggested by Jan Beulich)
    - fix commit message words.
      (suggested by Jan Beulich)
    - change 'assemble_val_array' to 'gather_val_array'.
      (suggested by Jan Beulich)
    - change 'set_new_val_to_array' to 'insert_new_val_to_array'.
      (suggested by Jan Beulich)
    - change parameter 'm' of 'insert_new_val_to_array' to 'new_val'.
      (suggested by Jan Beulich)
    - change 'write_psr_msr' to 'write_psr_msrs'.
      (suggested by Jan Beulich)
    - correct comments.
      (suggested by Jan Beulich)
    - remove unnecessary comments.
      (suggested by Jan Beulich)
    - adjust conditions after 'find_cos' to save a level of indentation.
      (suggested by Jan Beulich)
    - add 'ASSERT(!old_cos || ref[old_cos])'.
      (suggested by Jan Beulich)
    - move ASSERT() check into locked region.
      (suggested by Jan Beulich)
    - replace parameter '*val' to 'val[]' in some functions.
      (suggested by Jan Beulich)
    - change 'write_psr_msr' parameters to prepare to only set one new value
      for one feature.
      (suggested by Jan Beulich)
    - changes about 'uint64_t' to 'uint32_t'.
      (suggested by Jan Beulich)
    - add explanation about context switch.
      (suggested by Jan Beulich)
v5:
    - modify commit message.
      (suggested by Jan Beulich)
    - return an error for all helper functions in set flow.
      (suggested by Jan Beulich)
    - remove unnecessary cast.
      (suggested by Jan Beulich)
    - divide 'get_old_set_new' to two functions, 'assemble_val_array' and
      'set_new_val_to_array'.
      (suggested by Jan Beulich)
    - modify comments.
      (suggested by Jan Beulich)
    - adjust code format.
      (suggested by Jan Beulich)
    - change 'alloc_new_cos' to 'pick_avail_cos' to make name accurate.
      (suggested by Jan Beulich)
    - check feature type when entering 'psr_set_val'.
      (suggested by Jan Beulich)
    - use ASSERT to check ref.
      (suggested by Jan Beulich)
    - rename 'dat[]' to 'data[]'.
      (suggested by Jan Beulich)
v4:
    - create this patch to make codes easier to understand.
      (suggested by Jan Beulich)
---
 xen/arch/x86/domctl.c     |  30 +++++--
 xen/arch/x86/psr.c        | 215 +++++++++++++++++++++++++++++++++++++++++++++-
 xen/include/asm-x86/psr.h |   4 +-
 3 files changed, 236 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index dc213a7..6ed71e2 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1437,21 +1437,33 @@ long arch_do_domctl(
         switch ( domctl->u.psr_cat_op.cmd )
         {
         case XEN_DOMCTL_PSR_CAT_OP_SET_L3_CBM:
-            ret = psr_set_l3_cbm(d, domctl->u.psr_cat_op.target,
-                                 domctl->u.psr_cat_op.data,
-                                 PSR_CBM_TYPE_L3);
+            if ( domctl->u.psr_cat_op.data !=
+                 (uint32_t)domctl->u.psr_cat_op.data )
+                return -EINVAL;
+
+            ret = psr_set_val(d, domctl->u.psr_cat_op.target,
+                              domctl->u.psr_cat_op.data,
+                              PSR_CBM_TYPE_L3);
             break;
 
         case XEN_DOMCTL_PSR_CAT_OP_SET_L3_CODE:
-            ret = psr_set_l3_cbm(d, domctl->u.psr_cat_op.target,
-                                 domctl->u.psr_cat_op.data,
-                                 PSR_CBM_TYPE_L3_CODE);
+            if ( domctl->u.psr_cat_op.data !=
+                 (uint32_t)domctl->u.psr_cat_op.data )
+                return -EINVAL;
+
+            ret = psr_set_val(d, domctl->u.psr_cat_op.target,
+                              domctl->u.psr_cat_op.data,
+                              PSR_CBM_TYPE_L3_CODE);
             break;
 
         case XEN_DOMCTL_PSR_CAT_OP_SET_L3_DATA:
-            ret = psr_set_l3_cbm(d, domctl->u.psr_cat_op.target,
-                                 domctl->u.psr_cat_op.data,
-                                 PSR_CBM_TYPE_L3_DATA);
+            if ( domctl->u.psr_cat_op.data !=
+                 (uint32_t)domctl->u.psr_cat_op.data )
+                return -EINVAL;
+
+            ret = psr_set_val(d, domctl->u.psr_cat_op.target,
+                              domctl->u.psr_cat_op.data,
+                              PSR_CBM_TYPE_L3_DATA);
             break;
 
         case XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM:
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 25fcd21..9d805d6 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -157,10 +157,26 @@ static void free_socket_resources(unsigned int socket)
 {
     unsigned int i;
     struct psr_socket_info *info = socket_info + socket;
+    struct domain *d;
 
     if ( !info )
         return;
 
+    /* Restore domain cos id to 0 when socket is offline. */
+    for_each_domain ( d )
+    {
+        unsigned int cos = d->arch.psr_cos_ids[socket];
+        if ( cos == 0 )
+            continue;
+
+        spin_lock(&info->ref_lock);
+        ASSERT(!cos || info->cos_ref[cos]);
+        info->cos_ref[cos]--;
+        spin_unlock(&info->ref_lock);
+
+        d->arch.psr_cos_ids[socket] = 0;
+    }
+
     /*
      * Free resources of features. The global feature object, e.g. feat_l3_cat,
      * may not be freed here if it is not added into array. It is simply being
@@ -574,15 +590,210 @@ int psr_get_val(struct domain *d, unsigned int socket,
     return 0;
 }
 
-int psr_set_l3_cbm(struct domain *d, unsigned int socket,
-                   uint64_t cbm, enum cbm_type type)
+/* Set value functions */
+static unsigned int get_cos_num(const struct psr_socket_info *info)
 {
     return 0;
 }
 
+static int gather_val_array(uint32_t val[],
+                            unsigned int array_len,
+                            const struct psr_socket_info *info,
+                            unsigned int old_cos)
+{
+    return -EINVAL;
+}
+
+static int insert_val_to_array(uint32_t val[],
+                               unsigned int array_len,
+                               const struct psr_socket_info *info,
+                               enum psr_feat_type feat_type,
+                               enum cbm_type type,
+                               uint32_t new_val)
+{
+    return -EINVAL;
+}
+
+static int find_cos(const uint32_t val[], unsigned int array_len,
+                    enum psr_feat_type feat_type,
+                    const struct psr_socket_info *info,
+                    spinlock_t *ref_lock)
+{
+    ASSERT(spin_is_locked(ref_lock));
+
+    return -ENOENT;
+}
+
+static int pick_avail_cos(const struct psr_socket_info *info,
+                          spinlock_t *ref_lock,
+                          const uint32_t val[], unsigned int array_len,
+                          unsigned int old_cos,
+                          enum psr_feat_type feat_type)
+{
+    ASSERT(spin_is_locked(ref_lock));
+
+    return -ENOENT;
+}
+
+static int write_psr_msr(unsigned int socket, unsigned int cos,
+                         uint32_t val, enum psr_feat_type feat_type)
+{
+    return -ENOENT;
+}
+
+/* The whole set process is protected by domctl_lock. */
+int psr_set_val(struct domain *d, unsigned int socket,
+                uint32_t val, enum cbm_type type)
+{
+    unsigned int old_cos;
+    int cos, ret;
+    unsigned int *ref;
+    uint32_t *val_array;
+    struct psr_socket_info *info = get_socket_info(socket);
+    unsigned int array_len;
+    enum psr_feat_type feat_type;
+
+    if ( IS_ERR(info) )
+        return PTR_ERR(info);
+
+    feat_type = psr_cbm_type_to_feat_type(type);
+    if ( feat_type > ARRAY_SIZE(info->features) ||
+         !info->features[feat_type] )
+        return -ENOENT;
+
+    /*
+     * Step 0:
+     * old_cos means the COS ID current domain is using. By default, it is 0.
+     *
+     * For every COS ID, there is a reference count to record how many domains
+     * are using the COS register corresponding to this COS ID.
+     * - If ref[old_cos] is 0, that means this COS is not used by any domain.
+     * - If ref[old_cos] is 1, that means this COS is only used by current
+     *   domain.
+     * - If ref[old_cos] is more than 1, that mean multiple domains are using
+     *   this COS.
+     */
+    old_cos = d->arch.psr_cos_ids[socket];
+    ASSERT(old_cos < MAX_COS_REG_CNT);
+
+    ref = info->cos_ref;
+
+    /*
+     * Step 1:
+     * Gather a value array to store all features cos_reg_val[old_cos].
+     * And, set the input new val into array according to the feature's
+     * position in array.
+     */
+    array_len = get_cos_num(info);
+    val_array = xzalloc_array(uint32_t, array_len);
+    if ( !val_array )
+        return -ENOMEM;
+
+    if ( (ret = gather_val_array(val_array, array_len, info, old_cos)) != 0 )
+        goto free_array;
+
+    if ( (ret = insert_val_to_array(val_array, array_len, info,
+                                    feat_type, type, val)) != 0 )
+        goto free_array;
+
+    spin_lock(&info->ref_lock);
+
+    /*
+     * Step 2:
+     * Try to find if there is already a COS ID on which all features' values
+     * are same as the array. Then, we can reuse this COS ID.
+     */
+    cos = find_cos(val_array, array_len, feat_type, info, &info->ref_lock);
+    if ( cos == old_cos )
+    {
+        ret = 0;
+        goto unlock_free_array;
+    }
+
+    /*
+     * Step 3:
+     * If fail to find, we need pick an available COS ID.
+     * In fact, only COS ID which ref is 1 or 0 can be picked for current
+     * domain. If old_cos is not 0 and its ref==1, that means only current
+     * domain is using this old_cos ID. So, this old_cos ID certainly can
+     * be reused by current domain. Ref==0 means there is no any domain
+     * using this COS ID. So it can be used for current domain too.
+     */
+    if ( cos < 0 )
+    {
+        cos = pick_avail_cos(info, &info->ref_lock, val_array,
+                             array_len, old_cos, feat_type);
+        if ( cos < 0 )
+        {
+            ret = cos;
+            goto unlock_free_array;
+        }
+
+        /*
+         * Step 4:
+         * Write all features MSRs according to the COS ID.
+         */
+        ret = write_psr_msr(socket, cos, val, feat_type);
+        if ( ret )
+            goto unlock_free_array;
+    }
+
+    /*
+     * Step 5:
+     * Find the COS ID (find_cos result is '>= 0' or an available COS ID is
+     * picked, then update ref according to COS ID.
+     */
+    ref[cos]++;
+    ASSERT(!cos || ref[cos]);
+    ASSERT(!old_cos || ref[old_cos]);
+    ref[old_cos]--;
+    spin_unlock(&info->ref_lock);
+
+    /*
+     * Step 6:
+     * Save the COS ID into current domain's psr_cos_ids[] so that we can know
+     * which COS the domain is using on the socket. One domain can only use
+     * one COS ID at same time on each socket.
+     */
+    d->arch.psr_cos_ids[socket] = cos;
+
+    xfree(val_array);
+    return ret;
+
+ unlock_free_array:
+    spin_unlock(&info->ref_lock);
+ free_array:
+    xfree(val_array);
+    return ret;
+}
+
 /* Called with domain lock held, no extra lock needed for 'psr_cos_ids' */
 static void psr_free_cos(struct domain *d)
 {
+    unsigned int socket, cos;
+
+    ASSERT(socket_info);
+
+    if ( !d->arch.psr_cos_ids )
+        return;
+
+    /* Domain is destroied so its cos_ref should be decreased. */
+    for ( socket = 0; socket < nr_sockets; socket++ )
+    {
+        struct psr_socket_info *info;
+
+        /* cos 0 is default one which does not need be handled. */
+        cos = d->arch.psr_cos_ids[socket];
+        if ( cos == 0 )
+            continue;
+
+        info = socket_info + socket;
+        spin_lock(&info->ref_lock);
+        ASSERT(!cos || info->cos_ref[cos]);
+        info->cos_ref[cos]--;
+        spin_unlock(&info->ref_lock);
+    }
+
     xfree(d->arch.psr_cos_ids);
     d->arch.psr_cos_ids = NULL;
 }
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
index 7c6d38a..66d5218 100644
--- a/xen/include/asm-x86/psr.h
+++ b/xen/include/asm-x86/psr.h
@@ -73,8 +73,8 @@ int psr_get_info(unsigned int socket, enum cbm_type type,
                  uint32_t data[], unsigned int array_len);
 int psr_get_val(struct domain *d, unsigned int socket,
                 uint32_t *val, enum cbm_type type);
-int psr_set_l3_cbm(struct domain *d, unsigned int socket,
-                   uint64_t cbm, enum cbm_type type);
+int psr_set_val(struct domain *d, unsigned int socket,
+                uint32_t val, enum cbm_type type);
 
 int psr_domain_init(struct domain *d);
 void psr_domain_free(struct domain *d);
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 10/25] x86: refactor psr: L3 CAT: set value: assemble features value array.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (8 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-11 15:11   ` Jan Beulich
  2017-04-01 13:53 ` [PATCH v10 11/25] x86: refactor psr: L3 CAT: set value: implement cos finding flow Yi Sun
                   ` (14 subsequent siblings)
  24 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

Only can one COS ID be used by one domain at one time. That means all enabled
features' COS registers at this COS ID are valid for this domain at that time.

When user updates a feature's value, we need make sure all other features'
values are not affected. So, we firstly need gather an array which contains
all features current values and replace the setting feature's value in array
to new value.

Then, we can try to find if there is a COS ID on which all features' COS
registers values are same as the array. If we can find, we just use this COS
ID. If fail to find, we need pick a new COS ID.

This patch implements value array assembling flow.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - remove 'get_old_val' to directly call 'get_val' to get needed val.
      (suggested by Jan Beulich)
    - move 'psr_check_cbm' into 'insert_val_to_array'.
      (suggested by Jan Beulich)
    - change type of 'cbm' in 'psr_check_cbm' to 'unsigned long'.
      (suggested by Jan Beulich)
    - remove 'set_new_val' as it can be handled in generic process.
    - changes related to 'feat_props'.
      (suggested by Jan Beulich)
    - adjust flow in 'gather_val_array' to avoid array cross.
      (suggested by Jan Beulich)
    - adjust flow in 'insert_val_to_array' to avoid array cross.
      (suggested by Jan Beulich)
v9:
    - add comments about boundary checking.
      (suggested by Wei Liu)
    - rename 'assemble_val_array' to 'combine_val_array' in pervious patch.
      (suggested by Wei Liu)
    - rename 'l3_cat_get_cos_num' to 'cat_get_cos_num' to cover all L3/L2 CAT
      features.
      (suggested by Roger Pau)
    - rename 'l3_cat_get_old_val' to 'cat_get_old_val' to cover all L3/L2 CAT
      features and reuse cat_get_val in it.
      (suggested by Roger Pau)
    - replace feature list handling to feature array handling.
      (suggested by Roger Pau)
    - modify patch title to indicate 'L3 CAT'.
      (suggested by Jan Beulich)
    - replace 'm' to 'new_val'.
      (suggested by Jan Beulich)
    - move cos check outside callback function.
      (suggested by Jan Beulich)
    - remove 'get_cos_num' callback function.
      (suggested by Jan Beulich)
    - changes about 'uint64_t' to 'uint32_t'.
      (suggested by Jan Beulich)
v6:
    - change 'assemble_val_array' to 'combine_val_array'.
      (suggested by Konrad Rzeszutek Wilk)
    - check return value of 'get_old_val'.
      (suggested by Konrad Rzeszutek Wilk)
    - replace some 'EINVAL' to 'ENOSPC'.
      (suggested by Konrad Rzeszutek Wilk)
v5:
    - modify comments according to changes of codes.
      (suggested by Jan Beulich)
    - change 'bool_t' to 'bool'.
      (suggested by Jan Beulich)
    - modify return value of callback functions because we do not need them
      to return number of entries the feature uses. In caller, we call
      'get_cos_num' to get the number of entries the feature uses.
      (suggested by Jan Beulich)
    - modify variables names to make them better, e.g. 'feat_tmp' to 'feat'.
      (suggested by Jan Beulich)
v4:
    - create this patch to make codes easier to understand.
      (suggested by Jan Beulich)
---
 xen/arch/x86/psr.c | 107 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 104 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 9d805d6..c912478 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -224,6 +224,29 @@ static enum psr_feat_type psr_cbm_type_to_feat_type(enum cbm_type type)
 }
 
 /* CAT common functions implementation. */
+static bool psr_check_cbm(unsigned int cbm_len, unsigned long cbm)
+{
+    unsigned int first_bit, zero_bit;
+
+    /* Set bits should only in the range of [0, cbm_len]. */
+    if ( cbm & (~0ul << cbm_len) )
+        return false;
+
+    /* At least one bit need to be set. */
+    if ( cbm == 0 )
+        return false;
+
+    first_bit = find_first_bit(&cbm, cbm_len);
+    zero_bit = find_next_zero_bit(&cbm, cbm_len, first_bit);
+
+    /* Set bits should be contiguous. */
+    if ( zero_bit < cbm_len &&
+         find_next_bit(&cbm, cbm_len, zero_bit) < cbm_len )
+        return false;
+
+    return true;
+}
+
 static void cat_init_feature(const struct cpuid_leaf *regs,
                              struct feat_node *feat,
                              struct psr_socket_info *info,
@@ -593,7 +616,21 @@ int psr_get_val(struct domain *d, unsigned int socket,
 /* Set value functions */
 static unsigned int get_cos_num(const struct psr_socket_info *info)
 {
-    return 0;
+    unsigned int num = 0, i;
+
+    /* Get all features total amount. */
+    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
+    {
+        const struct feat_node *feat = info->features[i];
+        if ( !feat )
+            continue;
+
+        feat = info->features[i];
+
+        num += feat->props->cos_num;
+    }
+
+    return num;
 }
 
 static int gather_val_array(uint32_t val[],
@@ -601,7 +638,38 @@ static int gather_val_array(uint32_t val[],
                             const struct psr_socket_info *info,
                             unsigned int old_cos)
 {
-    return -EINVAL;
+    unsigned int i;
+
+    if ( !val )
+        return -EINVAL;
+
+    /* Get all features current values according to old_cos. */
+    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
+    {
+        unsigned int cos = old_cos;
+        const struct feat_node *feat = info->features[i];
+        if ( !feat )
+            continue;
+
+        if ( array_len < feat->props->cos_num )
+            return -ENOSPC;
+
+        /*
+         * If old_cos exceeds current feature's cos_max, we should get
+         * default value. So assign cos to 0 which stores default value.
+         */
+        if ( cos > feat->props->cos_max )
+            cos = 0;
+
+        /* Value getting order is same as feature array. */
+        feat->props->get_val(feat, cos, &val[0]);
+
+        array_len -= feat->props->cos_num;
+
+        val += feat->props->cos_num;
+    }
+
+    return 0;
 }
 
 static int insert_val_to_array(uint32_t val[],
@@ -611,7 +679,40 @@ static int insert_val_to_array(uint32_t val[],
                                enum cbm_type type,
                                uint32_t new_val)
 {
-    return -EINVAL;
+    const struct feat_node *feat;
+    unsigned int i;
+
+    ASSERT(feat_type < PSR_SOCKET_MAX_FEAT);
+
+    /* Insert new value into array according to feature's position in array. */
+    for ( i = 0; i < feat_type; i++ )
+    {
+        feat = info->features[i];
+        if ( !feat )
+            continue;
+
+        if ( array_len <= feat->props->cos_num )
+            return -ENOSPC;
+
+        array_len -= feat->props->cos_num;
+
+        val += feat->props->cos_num;
+    }
+
+    feat = info->features[feat_type];
+    if ( !feat )
+        return -ENOENT;
+
+    if ( array_len < feat->props->cos_num )
+        return -ENOSPC;
+
+    if ( !psr_check_cbm(feat->props->cbm_len, new_val) )
+        return -EINVAL;
+
+    /* Value setting position is same as feature array. */
+    val[0] = new_val;
+
+    return 0;
 }
 
 static int find_cos(const uint32_t val[], unsigned int array_len,
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 11/25] x86: refactor psr: L3 CAT: set value: implement cos finding flow.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (9 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 10/25] x86: refactor psr: L3 CAT: set value: assemble features value array Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-11 15:17   ` Jan Beulich
  2017-04-01 13:53 ` [PATCH v10 12/25] x86: refactor psr: L3 CAT: set value: implement cos id picking flow Yi Sun
                   ` (13 subsequent siblings)
  24 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

Continue from patch:
'x86: refactor psr: L3 CAT: set value: assemble features value array'

We can try to find if there is a COS ID on which all features' COS registers
values are same as the array assembled before.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - remove 'compare_val' hook and its CAT implementation. Make its
      functionality be generic in 'find_cos' flow.
      (suggested by Jan Beulich)
    - changes related to 'props'.
      (suggested by Jan Beulich)
    - rename 'val_array' to 'val_ptr'.
      (suggested by Jan Beulich)
    - rename 'find' to 'found'.
      (suggested by Jan Beulich)
    - move some variables declaration and initialization into loop.
      (suggested by Jan Beulich)
    - adjust codes positions.
      (suggested by Jan Beulich)
v9:
    - modify comments of 'compare_val' to be same as current implementation.
      (suggested by Wei Liu)
    - fix indentation issue.
      (suggested by Wei Liu)
    - rename 'l3_cat_compare_val' to 'cat_compare_val' to cover all L3/L2 CAT
      features.
      (suggested by Roger Pau)
    - remove parameter 'found' from 'cat_compare_val' and modify the return
      values to let caller know if the id is found or not.
      (suggested by Roger Pau)
    - replace feature list handling to feature array handling.
      (suggested by Roger Pau)
    - replace 'get_cos_num' to 'feat->cos_num'.
      (suggested by Jan Beulich)
    - directly use 'cos_reg_val[0]' as default value.
      (suggested by Jan Beulich)
    - modify patch title to indicate 'L3 CAT'.
      (suggested by Jan Beulich)
    - changes about 'uint64_t' to 'uint32_t'.
      (suggested by Jan Beulich)
v5:
    - modify commit message to provide exact patch name to continue from.
      (suggested by Jan Beulich)
    - remove 'get_cos_max_from_type' because it can be replaced by
      'get_cos_max'.
    - move type check out from callback functions to caller.
      (suggested by Jan Beulich)
    - modify variables names to make them better, e.g. 'feat_tmp' to 'feat'.
      (suggested by Jan Beulich)
    - modify comments according to changes of codes.
      (suggested by Jan Beulich)
v4:
    - create this patch to make codes easier to understand.
      (suggested by Jan Beulich)
---
 xen/arch/x86/psr.c | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 75 insertions(+)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index c912478..a6c6f18 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -720,8 +720,83 @@ static int find_cos(const uint32_t val[], unsigned int array_len,
                     const struct psr_socket_info *info,
                     spinlock_t *ref_lock)
 {
+    unsigned int cos, i;
+    const unsigned int *ref = info->cos_ref;
+    const struct feat_node *feat;
+    unsigned int cos_max;
+
     ASSERT(spin_is_locked(ref_lock));
 
+    /* cos_max is the one of the feature which is being set. */
+    feat = info->features[feat_type];
+    if ( !feat )
+        return -ENOENT;
+
+    cos_max = feat->props->cos_max;
+
+    for ( cos = 0; cos <= cos_max; cos++ )
+    {
+        const uint32_t *val_ptr = val;
+        bool found = false;
+
+        if ( cos && !ref[cos] )
+            continue;
+
+        /*
+         * If fail to find cos in below loop, need find whole feature array
+         * again from beginning.
+         */
+        for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
+        {
+            uint32_t default_val = 0;
+
+            feat = info->features[i];
+            if ( !feat )
+                continue;
+
+            /*
+             * COS ID 0 always stores the default value so input 0 to get
+             * default value.
+             */
+            feat->props->get_val(feat, 0, &default_val);
+
+            /*
+             * Compare value according to feature array order.
+             * We must follow this order because value array is assembled
+             * as this order.
+             */
+            if ( cos > feat->props->cos_max )
+            {
+                /*
+                 * If cos is bigger than feature's cos_max, the val should be
+                 * default value. Otherwise, it fails to find a COS ID. So we
+                 * have to exit find flow.
+                 */
+                if ( val[0] != default_val )
+                    return -EINVAL;
+
+                found = true;
+            }
+            else
+            {
+                if ( val[0] == feat->cos_reg_val[cos] )
+                    found = true;
+            }
+
+            /* If fail to match, go to next cos to compare. */
+            if ( !found )
+                break;
+
+            val_ptr += feat->props->cos_num;
+            if ( val_ptr - val > array_len )
+                return -ENOSPC;
+        }
+
+        /* For this COS ID all entries in the values array do match. Use it. */
+        if ( found )
+            return cos;
+    }
+
     return -ENOENT;
 }
 
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 12/25] x86: refactor psr: L3 CAT: set value: implement cos id picking flow.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (10 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 11/25] x86: refactor psr: L3 CAT: set value: implement cos finding flow Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-11 15:20   ` Jan Beulich
  2017-04-01 13:53 ` [PATCH v10 13/25] x86: refactor psr: L3 CAT: set value: implement write msr flow Yi Sun
                   ` (12 subsequent siblings)
  24 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

Continue from previous patch:
'x86: refactor psr: L3 CAT: set value: implement cos finding flow.'

If fail to find a COS ID, we need pick a new COS ID for domain. Only COS ID
that ref[COS_ID] is 1 or 0 can be picked to input a new set feature values.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - remove 'fits_cos_max' hook and CAT implementation. Move the process into
      generic flow.
      (suggested by Jan Beulich)
    - changes about 'props'.
      (suggested by Jan Beulich)
    - adjust codes positions.
      (suggested by Jan Beulich)
v9:
    - modify return value of 'pick_avail_cos' to make it more accurate.
    - rename 'l3_cat_fits_cos_max' to 'cat_fits_cos_max' to cover L3/L2 CAT
      features.
      (suggested by Roger Pau)
    - replace feature list handling to feature array handling.
      (suggested by Roger Pau)
    - fix comment.
      (suggested by Wei Liu)
    - directly use 'cos_reg_val[0]' as default value.
      (suggested by Jan Beulich)
    - replace 'get_cos_num' to 'feat->cos_num'.
      (suggested by Jan Beulich)
    - modify patch title to indicate 'L3 CAT'.
      (suggested by Jan Beulich)
    - changes about 'uint64_t' to 'uint32_t'.
      (suggested by Jan Beulich)
v5:
    - modify commit message to provide exact patch name to continue from.
      (suggested by Jan Beulich)
    - change 'exceeds_cos_max' to 'fits_cos_max' to be accurate.
      (suggested by Jan Beulich)
    - modify comments according to changes of codes.
      (suggested by Jan Beulich)
    - modify return value of callback functions because we do not need them
      to return number of entries the feature uses. In caller, we call
      'get_cos_num' to get the number of entries the feature uses.
      (suggested by Jan Beulich)
    - move type check out from callback functions to caller.
      (suggested by Jan Beulich)
    - modify variables names to make them better, e.g. 'feat_tmp' to 'feat'.
      (suggested by Jan Beulich)
    - modify code format.
      (suggested by Jan Beulich)
v4:
    - create this patch to make codes easier to understand.
      (suggested by Jan Beulich)
---
 xen/arch/x86/psr.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 68 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index a6c6f18..44c9313 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -800,15 +800,82 @@ static int find_cos(const uint32_t val[], unsigned int array_len,
     return -ENOENT;
 }
 
+static bool fits_cos_max(const uint32_t val[],
+                         uint32_t array_len,
+                         const struct psr_socket_info *info,
+                         unsigned int cos)
+{
+    unsigned int i;
+
+    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
+    {
+        uint32_t default_val = 0;
+        const struct feat_node *feat = info->features[i];
+        if ( !feat )
+            continue;
+
+        if ( array_len < feat->props->cos_num )
+            return false;
+
+        if ( cos > feat->props->cos_max )
+        {
+            feat->props->get_val(feat, 0, &default_val);
+            if ( val[0] != default_val )
+                return false;
+        }
+
+        array_len -= feat->props->cos_num;
+
+        val += feat->props->cos_num;
+    }
+
+    return true;
+}
+
 static int pick_avail_cos(const struct psr_socket_info *info,
                           spinlock_t *ref_lock,
                           const uint32_t val[], unsigned int array_len,
                           unsigned int old_cos,
                           enum psr_feat_type feat_type)
 {
+    unsigned int cos;
+    unsigned int cos_max = 0;
+    const struct feat_node *feat;
+    const unsigned int *ref = info->cos_ref;
+
     ASSERT(spin_is_locked(ref_lock));
 
-    return -ENOENT;
+    /* cos_max is the one of the feature which is being set. */
+    feat = info->features[feat_type];
+    if ( !feat )
+        return -ENOENT;
+
+    cos_max = feat->props->cos_max;
+    if ( !cos_max )
+        return -ENOENT;
+
+    /* We cannot use id 0 because it stores the default values. */
+    if ( old_cos && ref[old_cos] == 1 &&
+         fits_cos_max(val, array_len, info, old_cos) )
+            return old_cos;
+
+    /* Find an unused one other than cos0. */
+    for ( cos = 1; cos <= cos_max; cos++ )
+    {
+        /*
+         * ref is 0 means this COS is not used by other domain and
+         * can be used for current setting.
+         */
+        if ( !ref[cos] )
+        {
+            if ( !fits_cos_max(val, array_len, info, cos) )
+                break;
+
+            return cos;
+        }
+    }
+
+    return -EOVERFLOW;
 }
 
 static int write_psr_msr(unsigned int socket, unsigned int cos,
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 13/25] x86: refactor psr: L3 CAT: set value: implement write msr flow.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (11 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 12/25] x86: refactor psr: L3 CAT: set value: implement cos id picking flow Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-11 15:25   ` Jan Beulich
  2017-04-01 13:53 ` [PATCH v10 14/25] x86: refactor psr: CDP: implement CPU init and free flow Yi Sun
                   ` (11 subsequent siblings)
  24 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

Continue from previous patch:
'x86: refactor psr: L3 CAT: set value: implement cos id picking flow.'

We have got the feature value and COS ID to set. Then, we write MSR of the
designated feature.

Till now, set value process is completed.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - remove 'type' from 'write_msr' parameter list. Will add it back when
      implementing CDP.
      (suggested by Jan Beulich)
    - remove unnecessary casts.
      (suggested by Jan Beulich)
    - changes about 'props'.
      (suggested by Jan Beulich)
v9:
    - replace feature list handling to feature array handling.
      (suggested by Roger Pau)
    - add 'array_len' in 'struct cos_write_info' and check if val array
      exceeds it.
    - modify 'write_psr_msr' flow only to set one value a time. No need to
      set whole feature array values.
    - modify patch title to indicate 'L3 CAT'.
      (suggested by Jan Beulich)
    - changes about 'uint64_t' to 'uint32_t'.
      (suggested by Jan Beulich)
v8:
    - modify 'write_msr' callback function to 'void' because we have to set
      all features' cbm. When input cos exceeds some features' cos_max, just
      skip them but not break the iteration.
v5:
    - modify commit message to provide exact patch name to continue from.
      (suggested by Jan Beulich)
    - modify return value of callback functions because we do not need them
      to return number of entries the feature uses. In caller, we call
      'get_cos_num' to get the number of entries the feature uses.
      (suggested by Jan Beulich)
    - move type check out from callback functions to caller.
      (suggested by Jan Beulich)
    - modify variables names to make them better, e.g. 'feat_tmp' to 'feat'.
      (suggested by Jan Beulich)
    - correct code format.
      (suggested by Jan Beulich)
v4:
    - create this patch to make codes easier understand.
      (suggested by Jan Beulich)
---
 xen/arch/x86/psr.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 62 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 44c9313..0f57676 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -101,6 +101,10 @@ struct feat_node {
         /* get_val is used to get feature COS register value. */
         void (*get_val)(const struct feat_node *feat, unsigned int cos,
                         uint32_t *val);
+
+        /* write_msr is used to write out feature MSR register. */
+        void (*write_msr)(unsigned int cos, uint32_t val,
+                          struct feat_node *feat);
     } *props;
 
     uint32_t cos_reg_val[MAX_COS_REG_CNT];
@@ -315,10 +319,21 @@ static void cat_get_val(const struct feat_node *feat, unsigned int cos,
 }
 
 /* L3 CAT ops */
+static void l3_cat_write_msr(unsigned int cos, uint32_t val,
+                             struct feat_node *feat)
+{
+    if ( feat->cos_reg_val[cos] != val )
+    {
+        feat->cos_reg_val[cos] = val;
+        wrmsrl(MSR_IA32_PSR_L3_MASK(cos), val);
+    }
+}
+
 static struct feat_props l3_cat_props = {
     .cos_num = 1,
     .get_feat_info = cat_get_feat_info,
     .get_val = cat_get_val,
+    .write_msr = l3_cat_write_msr,
 };
 
 static void __init parse_psr_bool(char *s, char *value, char *feature,
@@ -878,10 +893,56 @@ static int pick_avail_cos(const struct psr_socket_info *info,
     return -EOVERFLOW;
 }
 
+static unsigned int get_socket_cpu(unsigned int socket)
+{
+    if ( likely(socket < nr_sockets) )
+        return cpumask_any(socket_cpumask[socket]);
+
+    return nr_cpu_ids;
+}
+
+struct cos_write_info
+{
+    unsigned int cos;
+    struct feat_node *feature;
+    uint32_t val;
+};
+
+static void do_write_psr_msr(void *data)
+{
+    struct cos_write_info *info = data;
+    unsigned int cos            = info->cos;
+    struct feat_node *feat      = info->feature;
+
+    if ( cos > feat->props->cos_max )
+        return;
+
+    feat->props->write_msr(cos, info->val, feat);
+}
+
 static int write_psr_msr(unsigned int socket, unsigned int cos,
                          uint32_t val, enum psr_feat_type feat_type)
 {
-    return -ENOENT;
+    struct psr_socket_info *info = get_socket_info(socket);
+    struct cos_write_info data =
+    {
+        .cos = cos,
+        .feature = info->features[feat_type],
+        .val = val,
+    };
+
+    if ( socket == cpu_to_socket(smp_processor_id()) )
+        do_write_psr_msr(&data);
+    else
+    {
+        unsigned int cpu = get_socket_cpu(socket);
+
+        if ( cpu >= nr_cpu_ids )
+            return -ENOTSOCK;
+        on_selected_cpus(cpumask_of(cpu), do_write_psr_msr, &data, 1);
+    }
+
+    return 0;
 }
 
 /* The whole set process is protected by domctl_lock. */
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 14/25] x86: refactor psr: CDP: implement CPU init and free flow.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (12 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 13/25] x86: refactor psr: L3 CAT: set value: implement write msr flow Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-01 13:53 ` [PATCH v10 15/25] x86: refactor psr: CDP: implement get hw info flow Yi Sun
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

This patch implements the CPU init and free flow for CDP including L3 CDP
initialization callback function. The flow is almost same as L3 CAT.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - fix comment.
      (suggested by Jan Beulich)
    - use swith in 'cat_init_feature' to handle different feature types.
      (suggested by Jan Beulich)
    - changes about 'props'.
      (suggested by Jan Beulich)
    - restore MSRs to default value when cpu online.
      (suggested by Jan Beulich)
    - remove feat_mask.
      (suggested by Jan Beulich)
v9:
    - modify commit message to describe flow clearer.
    - handle cpu offline and online again case to read MSRs registers values
      back and save them into cos array to make user can get real data.
    - modify error handling process in 'psr_cpu_prepare' to reduce redundant
      codes.
    - modify 'get_cdp_data' and 'get_cdp_code' to make them standard.
      (suggested by Roger Pau and Jan Beulich)
    - encapsulate CDP operations into 'cat_init_feature' to reduce redundant
      codes.
      (suggested by Roger Pau)
    - reuse 'cat_get_cos_max' for CDP.
      (suggested by Roger Pau)
    - handle 'PSR_CDP' in psr_presmp_init to make init work can be done when
      there is only 'psr=cdp' in cmdline.
    - remove unnecessary comment.
      (suggested by Jan Beulich)
    - move CDP related codes in 'cpu_init_work' into 'psr_cpu_init'.
      (suggested by Jan Beulich)
    - add codes to handle CDP's 'cos_num'.
      (suggested by Jan Beulich)
    - fix coding style issue.
      (suggested by Jan Beulich)
    - do not free resources when allocation fails in 'psr_cpu_prepare'.
      (suggested by Jan Beulich)
    - changes about 'uint64_t' to 'uint32_t'.
      (suggested by Jan Beulich)
v7:
    - initialize 'l3_cdp'.
      (suggested by Konrad Rzeszutek Wilk)
v6:
    - use 'cpuid_leaf'.
      (suggested by Konrad Rzeszutek Wilk and Jan Beulich)
v5:
    - remove codes to free 'feat_l3_cdp' in 'free_feature'.
      (suggested by Jan Beulich)
    - encapsulate cpuid registers into 'struct cpuid_leaf_regs'.
      (suggested by Jan Beulich)
    - print socket info when 'opt_cpu_info' is true.
      (suggested by Jan Beulich)
    - rename 'l3_cdp_get_max_cos_max' to 'l3_cdp_get_cos_max'.
      (suggested by Jan Beulich)
    - rename 'dat[]' to 'data[]'.
      (suggested by Jan Beulich)
    - move 'cpu_prepare_work' contents into 'psr_cpu_prepare'.
      (suggested by Jan Beulich)
v4:
    - create this patch to make codes easier to understand.
      (suggested by Jan Beulich)
---
 xen/arch/x86/psr.c | 84 ++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 76 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 0f57676..58970fa 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -144,11 +144,28 @@ static DEFINE_PER_CPU(struct psr_assoc, psr_assoc);
  * array creation. It is used to transiently store a spare node.
  */
 static struct feat_node *feat_l3_cat;
+static struct feat_node *feat_l3_cdp;
 
 /* Common functions */
 #define cat_default_val(len) (0xffffffff >> (32 - (len)))
 
 /*
+ * get_cdp_data - get DATA COS register value from input COS ID.
+ * @feat:        the feature node.
+ * @cos:         the COS ID.
+ */
+#define get_cdp_data(feat, cos)              \
+            ( (feat)->cos_reg_val[(cos) * 2] )
+
+/*
+ * get_cdp_code - get CODE COS register value from input COS ID.
+ * @feat:        the feature node.
+ * @cos:         the COS ID.
+ */
+#define get_cdp_code(feat, cos)              \
+            ( (feat)->cos_reg_val[(cos) * 2 + 1] )
+
+/*
  * Use this function to check if any allocation feature has been enabled
  * in cmdline.
  */
@@ -283,6 +300,37 @@ static void cat_init_feature(const struct cpuid_leaf *regs,
 
         break;
 
+    case PSR_SOCKET_L3_CDP:
+    {
+        unsigned long val;
+
+        /* Cut half of cos_max when CDP is enabled. */
+        feat->props->cos_max >>= 1;
+
+        /* We only write mask1 since mask0 is always all ones by default. */
+        wrmsrl(MSR_IA32_PSR_L3_MASK(1), cat_default_val(feat->props->cbm_len));
+        rdmsrl(MSR_IA32_PSR_L3_QOS_CFG, val);
+        wrmsrl(MSR_IA32_PSR_L3_QOS_CFG, val | (1 << PSR_L3_QOS_CDP_ENABLE_BIT));
+
+        /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
+        get_cdp_code(feat, 0) = cat_default_val(feat->props->cbm_len);
+        get_cdp_data(feat, 0) = cat_default_val(feat->props->cbm_len);
+
+        /*
+         * To handle cpu offline and then online case, we need restore MSRs to
+         * default values.
+         */
+        for ( i = 1; i <= feat->props->cos_max; i++ )
+        {
+            wrmsrl(MSR_IA32_PSR_L3_MASK_DATA(i), get_cdp_data(feat, 0));
+            wrmsrl(MSR_IA32_PSR_L3_MASK_CODE(i), get_cdp_code(feat, 0));
+            get_cdp_code(feat, i) = get_cdp_code(feat, 0);
+            get_cdp_data(feat, i) = get_cdp_data(feat, 0);
+        }
+
+        break;
+    }
+
     default:
         return;
     }
@@ -294,8 +342,9 @@ static void cat_init_feature(const struct cpuid_leaf *regs,
     if ( !opt_cpu_info )
         return;
 
-    printk(XENLOG_INFO "%s CAT: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
-           ((type == PSR_SOCKET_L3_CAT) ? "L3" : "L2"),
+    printk(XENLOG_INFO "%s: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
+           ((type == PSR_SOCKET_L3_CDP) ? "CDP" :
+            ((type == PSR_SOCKET_L3_CAT) ? "L3 CAT": "L2 CAT")),
            socket, feat->props->cos_max, feat->props->cbm_len);
 }
 
@@ -336,6 +385,11 @@ static struct feat_props l3_cat_props = {
     .write_msr = l3_cat_write_msr,
 };
 
+/* L3 CDP ops */
+static struct feat_props l3_cdp_props = {
+    .cos_num = 2,
+};
+
 static void __init parse_psr_bool(char *s, char *value, char *feature,
                                   unsigned int mask)
 {
@@ -1161,6 +1215,10 @@ static int psr_cpu_prepare(void)
          (feat_l3_cat = xzalloc(struct feat_node)) == NULL )
         return -ENOMEM;
 
+    if ( feat_l3_cdp == NULL &&
+         (feat_l3_cdp = xzalloc(struct feat_node)) == NULL )
+        return -ENOMEM;
+
     return 0;
 }
 
@@ -1193,11 +1251,21 @@ static void psr_cpu_init(void)
     {
         cpuid_count_leaf(PSR_CPUID_LEVEL_CAT, 1, &regs);
 
-        feat = feat_l3_cat;
-        feat_l3_cat = NULL;
-        feat->props = &l3_cat_props;
-
-        cat_init_feature(&regs, feat, info, PSR_SOCKET_L3_CAT);
+        if ( (regs.c & PSR_CAT_CDP_CAPABILITY) && (opt_psr & PSR_CDP) &&
+             !info->features[PSR_SOCKET_L3_CDP] )
+        {
+            feat = feat_l3_cdp;
+            feat_l3_cdp = NULL;
+            feat->props = &l3_cdp_props;
+            cat_init_feature(&regs, feat, info, PSR_SOCKET_L3_CDP);
+        }
+        else
+        {
+            feat = feat_l3_cat;
+            feat_l3_cat = NULL;
+            feat->props = &l3_cat_props;
+            cat_init_feature(&regs, feat, info, PSR_SOCKET_L3_CAT);
+        }
     }
 
  assoc_init:
@@ -1257,7 +1325,7 @@ static int __init psr_presmp_init(void)
     if ( (opt_psr & PSR_CMT) && opt_rmid_max )
         init_psr_cmt(opt_rmid_max);
 
-    if ( opt_psr & PSR_CAT )
+    if ( opt_psr & (PSR_CAT | PSR_CDP) )
         init_psr();
 
     if ( psr_cpu_prepare() )
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 15/25] x86: refactor psr: CDP: implement get hw info flow.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (13 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 14/25] x86: refactor psr: CDP: implement CPU init and free flow Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-01 13:53 ` [PATCH v10 16/25] x86: refactor psr: CDP: implement get value flow Yi Sun
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

This patch implements get HW info flow for CDP including L3 CDP callback
function. The flow is almost same as L3 CAT.

With this patch, 'psr-hwinfo' can work for L3 CDP.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - update renamed macros used by psr_get_info.
      (suggested by Jan Beulich)
    - change 'psr_get_info' flow to cover CDP case to make codes in sysctl
      more simple.
      (suggested by Jan Beulich)
    - remove sysctl redundant codes after applying above changes.
      (suggested by Jan Beulich)
v9:
    - modify commit message to explain flow more clearly.
    - reuse 'cat_get_feat_info' for CDP to reduce redundant codes.
      (suggested by Roger Pau)
    - fix coding style issues.
      (suggested by Wei Liu and Roger Pau)
    - rename macros used by psr_get_info to make them meaningful.
      (suggested by Jan Beulich)
v5:
    - rename 'dat[]' to 'data[]'.
      (suggested by Jan Beulich)
    - remove type check in callback function.
      (suggested by Jan Beulich)
v4:
    - create this patch to make codes easier to understand.
      (suggested by Jan Beulich)
---
 xen/arch/x86/psr.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 58970fa..f0611ad 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -237,6 +237,10 @@ static enum psr_feat_type psr_cbm_type_to_feat_type(enum cbm_type type)
     case PSR_CBM_TYPE_L3:
         feat_type = PSR_SOCKET_L3_CAT;
         break;
+    case PSR_CBM_TYPE_L3_DATA:
+    case PSR_CBM_TYPE_L3_CODE:
+        feat_type = PSR_SOCKET_L3_CDP;
+        break;
     default:
         ASSERT_UNREACHABLE();
     }
@@ -386,8 +390,20 @@ static struct feat_props l3_cat_props = {
 };
 
 /* L3 CDP ops */
+static bool l3_cdp_get_feat_info(const struct feat_node *feat,
+                                 uint32_t data[], uint32_t array_len)
+{
+    if ( !cat_get_feat_info(feat, data, array_len) )
+        return false;
+
+    data[PSR_INFO_IDX_CAT_FLAG] |= XEN_SYSCTL_PSR_CAT_L3_CDP;
+
+    return true;
+}
+
 static struct feat_props l3_cdp_props = {
     .cos_num = 2,
+    .get_feat_info = l3_cdp_get_feat_info,
 };
 
 static void __init parse_psr_bool(char *s, char *value, char *feature,
@@ -641,6 +657,14 @@ int psr_get_info(unsigned int socket, enum cbm_type type,
     if ( IS_ERR(feat) )
         return PTR_ERR(feat);
 
+    /* If type is L3 CAT but we cannot find it in feature array, try CDP. */
+    if ( !feat && type == PSR_CBM_TYPE_L3 )
+    {
+        feat = psr_get_feat(socket, PSR_CBM_TYPE_L3_CODE);
+        if ( IS_ERR(feat) )
+            return PTR_ERR(feat);
+    }
+
     if ( !feat )
         return -ENOENT;
 
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 16/25] x86: refactor psr: CDP: implement get value flow.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (14 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 15/25] x86: refactor psr: CDP: implement get hw info flow Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-11 15:39   ` Jan Beulich
  2017-04-01 13:53 ` [PATCH v10 17/25] x86: refactor psr: CDP: implement set value callback functions Yi Sun
                   ` (8 subsequent siblings)
  24 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

This patch implements L3 CDP get value callback function.

With this patch, 'psr-cat-show' can work for L3 CDP.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - add 'enum cbm_type type' into 'get_val' parameters to handle CDP case.
      (suggested by Jan Beulich)
v9:
    - modify the type of 'l3_cdp_get_val' to 'void'.
    - cos checking has been done in common function so remove related codes
      in CDP callback function.
      (suggested by Jan Beulich)
    - changes about 'uint64_t' to 'uint32_t'.
      (suggested by Jan Beulich)
v5:
    - remove type check in callback function.
      (suggested by Jan Beulich)
v4:
    - create this patch to make codes easier to understand.
      (suggested by Jan Beulich)
---
 xen/arch/x86/psr.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index f0611ad..aced012 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -100,7 +100,7 @@ struct feat_node {
 
         /* get_val is used to get feature COS register value. */
         void (*get_val)(const struct feat_node *feat, unsigned int cos,
-                        uint32_t *val);
+                        enum cbm_type type, uint32_t *val);
 
         /* write_msr is used to write out feature MSR register. */
         void (*write_msr)(unsigned int cos, uint32_t val,
@@ -366,7 +366,7 @@ static bool cat_get_feat_info(const struct feat_node *feat,
 }
 
 static void cat_get_val(const struct feat_node *feat, unsigned int cos,
-                        uint32_t *val)
+                        enum cbm_type type, uint32_t *val)
 {
     *val = feat->cos_reg_val[cos];
 }
@@ -401,9 +401,19 @@ static bool l3_cdp_get_feat_info(const struct feat_node *feat,
     return true;
 }
 
+static void l3_cdp_get_val(const struct feat_node *feat, unsigned int cos,
+                           enum cbm_type type, uint32_t *val)
+{
+    if ( type == PSR_CBM_TYPE_L3_DATA )
+        *val = get_cdp_data(feat, cos);
+    else
+        *val = get_cdp_code(feat, cos);
+}
+
 static struct feat_props l3_cdp_props = {
     .cos_num = 2,
     .get_feat_info = l3_cdp_get_feat_info,
+    .get_val = l3_cdp_get_val,
 };
 
 static void __init parse_psr_bool(char *s, char *value, char *feature,
@@ -701,7 +711,7 @@ int psr_get_val(struct domain *d, unsigned int socket,
     if ( cos > feat->props->cos_max )
         cos = 0;
 
-    feat->props->get_val(feat, cos, val);
+    feat->props->get_val(feat, cos, type, val);
 
     return 0;
 }
@@ -755,7 +765,7 @@ static int gather_val_array(uint32_t val[],
             cos = 0;
 
         /* Value getting order is same as feature array. */
-        feat->props->get_val(feat, cos, &val[0]);
+        feat->props->get_val(feat, cos, 0, &val[0]);
 
         array_len -= feat->props->cos_num;
 
@@ -851,7 +861,7 @@ static int find_cos(const uint32_t val[], unsigned int array_len,
              * COS ID 0 always stores the default value so input 0 to get
              * default value.
              */
-            feat->props->get_val(feat, 0, &default_val);
+            feat->props->get_val(feat, 0, 0, &default_val);
 
             /*
              * Compare value according to feature array order.
@@ -912,7 +922,7 @@ static bool fits_cos_max(const uint32_t val[],
 
         if ( cos > feat->props->cos_max )
         {
-            feat->props->get_val(feat, 0, &default_val);
+            feat->props->get_val(feat, 0, 0, &default_val);
             if ( val[0] != default_val )
                 return false;
         }
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 17/25] x86: refactor psr: CDP: implement set value callback functions.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (15 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 16/25] x86: refactor psr: CDP: implement get value flow Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-11 16:03   ` Jan Beulich
  2017-04-01 13:53 ` [PATCH v10 18/25] x86: L2 CAT: implement CPU init and free flow Yi Sun
                   ` (7 subsequent siblings)
  24 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

This patch implements L3 CDP set value related callback functions.

With this patch, 'psr-cat-cbm-set' command can work for L3 CDP.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - remove 'l3_cdp_get_old_val' and use 'l3_cdp_get_val' to replace it.
      (suggested by Jan Beulich)
    - remove 'l3_cdp_set_new_val'.
    - modify 'insert_val_to_array' flow to handle multiple COSs case.
      (suggested by Jan Beulich)
    - remove 'l3_cdp_compare_val' and implement a generic function
      'comapre_val'.
      (suggested by Jan Beulich)
    - remove 'l3_cdp_fits_cos_max'.
      (suggested by Jan Beulich)
    - introduce macro 'PSR_MAX_COS_NUM'.
    - introduce a new member in 'feat_props', 'type[PSR_MAX_COS_NUM]' to record
      all 'cbm_type' the feature has.
      (suggested by Jan Beulich)
    - modify 'gather_val_array' flow to handle multiple COSs case.
      (suggested by Jan Beulich)
    - modify 'find_cos' flow and implement 'compare_val' to handle multiple
      COSs case.
      (suggested by Jan Beulich)
    - modify 'fits_cos_max' flow to handle multiple COSs case.
      (suggested by Jan Beulich)
    - changes about 'props'.
      (suggested by Jan Beulich)
    - remove cast in 'l3_cdp_write_msr'.
      (suggested by Jan Beulich)
    - implement 'compare_val' function to compare if feature values are what
      we expect in finding flow.
    - implement 'restore_default_val' function to restore feature's COS values
      to default if the feature has multiple COSs. It is called when the COS
      ID is reduced to 0.
v9:
    - add comment to explain why CDP uses 2 COSs.
      (suggested by Wei Liu)
    - use 'cat_default_val'.
      (suggested by Wei Liu)
    - remove 'l3_cdp_get_cos_num' because we can directly get cos_num from
      feat_node now.
      (suggested by Jan Beulich)
    - remove cos checking because it has been moved to common function.
      (suggested by Jan Beulich)
    - l3_cdp_set_new_val parameter 'm' is changed to 'new_val'.
      (suggested by Jan Beulich)
    - directly use get_cdp_data(feat, 0) and get_cdp_code(feat, 0) to get
      default value.
      (suggested by Jan Beulich)
    - modify 'l3_cdp_write_msr' flow to write value into register according
      to input type.
    - changes about 'uint64_t' to 'uint32_t'.
      (suggested by Jan Beulich)
v8:
    - modify 'l3_cdp_write_msr' type to 'void'.
v5:
    - remove type check in callback function.
      (suggested by Jan Beulich)
    - modify return value of callback functions because we do not need them
      to return number of entries the feature uses. In caller, we call
      'get_cos_num' to get the number of entries the feature uses.
      (suggested by Jan Beulich)
    - remove 'l3_cdp_get_cos_max_from_type'.
    - rename 'l3_cdp_exceeds_cos_max' to 'l3_cdp_fits_cos_max'.
      (suggested by Jan Beulich)
v4:
    - create this patch to make codes easier to understand.
      (suggested by Jan Beulich)
---
 xen/arch/x86/psr.c | 226 +++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 183 insertions(+), 43 deletions(-)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index aced012..bfa1777 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -51,6 +51,14 @@
 
 #define PSR_ASSOC_REG_SHIFT 32
 
+/*
+ * Every PSR feature uses some COS registers for each COS ID, e.g. CDP uses 2
+ * COS registers (DATA and CODE) for one COS ID, but CAT uses 1 COS register.
+ * We use below macro as the max number of COS registers used by all features.
+ * So far, it is 2 which means CDP's COS registers number.
+ */
+#define PSR_MAX_COS_NUM 2
+
 enum psr_feat_type {
     PSR_SOCKET_L3_CAT,
     PSR_SOCKET_L3_CDP,
@@ -94,6 +102,13 @@ struct feat_node {
         unsigned int cos_max;
         unsigned int cbm_len;
 
+        /*
+         * An array to save all 'enum cbm_type' values of the feature. It is
+         * used with cos_num together to get/write a feature's COS registers
+         * values one by one.
+         */
+        enum cbm_type type[PSR_MAX_COS_NUM];
+
         /* get_feat_info is used to get feature HW info. */
         bool (*get_feat_info)(const struct feat_node *feat,
                               uint32_t data[], unsigned int array_len);
@@ -104,7 +119,7 @@ struct feat_node {
 
         /* write_msr is used to write out feature MSR register. */
         void (*write_msr)(unsigned int cos, uint32_t val,
-                          struct feat_node *feat);
+                          enum cbm_type type, struct feat_node *feat);
     } *props;
 
     uint32_t cos_reg_val[MAX_COS_REG_CNT];
@@ -292,6 +307,8 @@ static void cat_init_feature(const struct cpuid_leaf *regs,
         /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
         feat->cos_reg_val[0] = cat_default_val(feat->props->cbm_len);
 
+        feat->props->type[0] = PSR_CBM_TYPE_L3;
+
         /*
          * To handle cpu offline and then online case, we need restore MSRs to
          * default values.
@@ -320,6 +337,9 @@ static void cat_init_feature(const struct cpuid_leaf *regs,
         get_cdp_code(feat, 0) = cat_default_val(feat->props->cbm_len);
         get_cdp_data(feat, 0) = cat_default_val(feat->props->cbm_len);
 
+        feat->props->type[0] = PSR_CBM_TYPE_L3_DATA;
+        feat->props->type[1] = PSR_CBM_TYPE_L3_CODE;
+
         /*
          * To handle cpu offline and then online case, we need restore MSRs to
          * default values.
@@ -373,7 +393,7 @@ static void cat_get_val(const struct feat_node *feat, unsigned int cos,
 
 /* L3 CAT ops */
 static void l3_cat_write_msr(unsigned int cos, uint32_t val,
-                             struct feat_node *feat)
+                             enum cbm_type type, struct feat_node *feat)
 {
     if ( feat->cos_reg_val[cos] != val )
     {
@@ -410,10 +430,28 @@ static void l3_cdp_get_val(const struct feat_node *feat, unsigned int cos,
         *val = get_cdp_code(feat, cos);
 }
 
+static void l3_cdp_write_msr(unsigned int cos, uint32_t val,
+                             enum cbm_type type, struct feat_node *feat)
+{
+    /* Data */
+    if ( type == PSR_CBM_TYPE_L3_DATA && get_cdp_data(feat, cos) != val )
+    {
+        get_cdp_data(feat, cos) = val;
+        wrmsrl(MSR_IA32_PSR_L3_MASK_DATA(cos), val);
+    }
+    /* Code */
+    if ( type == PSR_CBM_TYPE_L3_CODE && get_cdp_code(feat, cos) != val )
+    {
+        get_cdp_code(feat, cos) = val;
+        wrmsrl(MSR_IA32_PSR_L3_MASK_CODE(cos), val);
+    }
+}
+
 static struct feat_props l3_cdp_props = {
     .cos_num = 2,
     .get_feat_info = l3_cdp_get_feat_info,
     .get_val = l3_cdp_get_val,
+    .write_msr = l3_cdp_write_msr,
 };
 
 static void __init parse_psr_bool(char *s, char *value, char *feature,
@@ -741,7 +779,7 @@ static int gather_val_array(uint32_t val[],
                             const struct psr_socket_info *info,
                             unsigned int old_cos)
 {
-    unsigned int i;
+    unsigned int i, j;
 
     if ( !val )
         return -EINVAL;
@@ -764,8 +802,13 @@ static int gather_val_array(uint32_t val[],
         if ( cos > feat->props->cos_max )
             cos = 0;
 
-        /* Value getting order is same as feature array. */
-        feat->props->get_val(feat, cos, 0, &val[0]);
+        /*
+         * Value getting order is same as feature array.
+         * For CDP, it has two COS values to get. We get them in this order:
+         * DATA is first, CODE is second.
+         */
+        for ( j = 0; j < feat->props->cos_num; j++ )
+            feat->props->get_val(feat, cos, feat->props->type[j], &val[j]);
 
         array_len -= feat->props->cos_num;
 
@@ -812,12 +855,66 @@ static int insert_val_to_array(uint32_t val[],
     if ( !psr_check_cbm(feat->props->cbm_len, new_val) )
         return -EINVAL;
 
-    /* Value setting position is same as feature array. */
-    val[0] = new_val;
+    /*
+     * Value setting position is same as feature array.
+     * Different features may have different setting behaviors, e.g. CDP
+     * has two values (DATA/CODE) which need us to save input value to
+     * different position in the array according to type.
+     */
+    for ( i = 0; i < feat->props->cos_num; i++ )
+    {
+        if ( type == feat->props->type[i] )
+            val[i] = new_val;
+    }
 
     return 0;
 }
 
+static int compare_val(const uint32_t val[],
+                       const struct feat_node *feat,
+                       unsigned int cos)
+{
+    int rc = 0;
+    unsigned int i;
+
+    for ( i = 0; i < feat->props->cos_num; i++ )
+    {
+        uint32_t feat_val = 0;
+
+        rc = 0;
+
+        /* If cos is bigger than cos_max, we need compare default value. */
+        if ( cos > feat->props->cos_max )
+        {
+            /*
+             * COS ID 0 always stores the default value so input 0 to get
+             * default value.
+             */
+            feat->props->get_val(feat, 0, feat->props->type[i], &feat_val);
+
+            /*
+             * If cos is bigger than feature's cos_max, the val should be
+             * default value. Otherwise, it fails to find a COS ID. So we
+             * have to exit find flow.
+             */
+            if ( val[i] != feat_val )
+                return -EINVAL;
+
+            rc = 1;
+            continue;
+        }
+
+        /* For CDP, DATA is the first item in val[], CODE is the second. */
+        feat->props->get_val(feat, cos, feat->props->type[i], &feat_val);
+        if ( val[i] != feat_val )
+            break;
+
+        rc = 1;
+    }
+
+    return rc;
+}
+
 static int find_cos(const uint32_t val[], unsigned int array_len,
                     enum psr_feat_type feat_type,
                     const struct psr_socket_info *info,
@@ -840,7 +937,7 @@ static int find_cos(const uint32_t val[], unsigned int array_len,
     for ( cos = 0; cos <= cos_max; cos++ )
     {
         const uint32_t *val_ptr = val;
-        bool found = false;
+        int rc = 0;
 
         if ( cos && !ref[cos] )
             continue;
@@ -851,43 +948,21 @@ static int find_cos(const uint32_t val[], unsigned int array_len,
          */
         for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
         {
-            uint32_t default_val = 0;
-
             feat = info->features[i];
             if ( !feat )
                 continue;
 
             /*
-             * COS ID 0 always stores the default value so input 0 to get
-             * default value.
-             */
-            feat->props->get_val(feat, 0, 0, &default_val);
-
-            /*
              * Compare value according to feature array order.
              * We must follow this order because value array is assembled
              * as this order.
              */
-            if ( cos > feat->props->cos_max )
-            {
-                /*
-                 * If cos is bigger than feature's cos_max, the val should be
-                 * default value. Otherwise, it fails to find a COS ID. So we
-                 * have to exit find flow.
-                 */
-                if ( val[0] != default_val )
-                    return -EINVAL;
-
-                found = true;
-            }
-            else
-            {
-                if ( val[0] == feat->cos_reg_val[cos] )
-                    found = true;
-            }
+            rc = compare_val(val, feat, cos);
+            if ( rc < 0 )
+                return rc;
 
             /* If fail to match, go to next cos to compare. */
-            if ( !found )
+            if ( !rc )
                 break;
 
             val_ptr += feat->props->cos_num;
@@ -896,7 +971,7 @@ static int find_cos(const uint32_t val[], unsigned int array_len,
         }
 
         /* For this COS ID all entries in the values array do match. Use it. */
-        if ( found )
+        if ( rc )
             return cos;
     }
 
@@ -908,7 +983,7 @@ static bool fits_cos_max(const uint32_t val[],
                          const struct psr_socket_info *info,
                          unsigned int cos)
 {
-    unsigned int i;
+    unsigned int i, j;
 
     for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
     {
@@ -922,9 +997,14 @@ static bool fits_cos_max(const uint32_t val[],
 
         if ( cos > feat->props->cos_max )
         {
-            feat->props->get_val(feat, 0, 0, &default_val);
-            if ( val[0] != default_val )
-                return false;
+            /* For CDP, DATA is the first item in val[], CODE is the second. */
+            for ( j = 0; j < feat->props->cos_num; j++ )
+            {
+                feat->props->get_val(feat, 0, feat->props->type[j],
+                                     &default_val);
+                if ( val[j] != default_val )
+                    return false;
+            }
         }
 
         array_len -= feat->props->cos_num;
@@ -994,6 +1074,7 @@ struct cos_write_info
     unsigned int cos;
     struct feat_node *feature;
     uint32_t val;
+    enum cbm_type type;
 };
 
 static void do_write_psr_msr(void *data)
@@ -1005,11 +1086,12 @@ static void do_write_psr_msr(void *data)
     if ( cos > feat->props->cos_max )
         return;
 
-    feat->props->write_msr(cos, info->val, feat);
+    feat->props->write_msr(cos, info->val, info->type, feat);
 }
 
 static int write_psr_msr(unsigned int socket, unsigned int cos,
-                         uint32_t val, enum psr_feat_type feat_type)
+                         uint32_t val, enum cbm_type type,
+                         enum psr_feat_type feat_type)
 {
     struct psr_socket_info *info = get_socket_info(socket);
     struct cos_write_info data =
@@ -1017,6 +1099,7 @@ static int write_psr_msr(unsigned int socket, unsigned int cos,
         .cos = cos,
         .feature = info->features[feat_type],
         .val = val,
+        .type = type,
     };
 
     if ( socket == cpu_to_socket(smp_processor_id()) )
@@ -1033,6 +1116,40 @@ static int write_psr_msr(unsigned int socket, unsigned int cos,
     return 0;
 }
 
+static void restore_default_val(unsigned int socket, unsigned int cos,
+                                enum psr_feat_type feat_type)
+{
+    unsigned int i, j;
+    uint32_t default_val;
+    const struct psr_socket_info *info = get_socket_info(socket);
+
+    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
+    {
+        const struct feat_node *feat = info->features[i];
+        /*
+         * There are four judgements:
+         * 1. Input 'feat_type' is valid so we have to get feature according to
+         *    it. If current feature type (i) does not match 'feat_type', we
+         *    need skip it, so continue to check next feature.
+         * 2. Input 'feat_type' is 'PSR_SOCKET_MAX_FEAT' which means we should
+         *    handle all features in this case. So, go to next loop.
+         * 3. Do not need restore the COS value back to default if cos_num is 1,
+         *    e.g. L3 CAT. Because next value setting will overwrite it.
+         * 4. 'feat' we got is NULL, continue.
+         */
+        if ( ( feat_type != PSR_SOCKET_MAX_FEAT && feat_type != i ) ||
+             !feat || feat->props->cos_num == 1 )
+            continue;
+
+        for ( j = 0; j < feat->props->cos_num; j++ )
+        {
+            feat->props->get_val(feat, 0, feat->props->type[j], &default_val);
+            write_psr_msr(socket, cos, default_val,
+                          feat->props->type[j], i);
+        }
+    }
+}
+
 /* The whole set process is protected by domctl_lock. */
 int psr_set_val(struct domain *d, unsigned int socket,
                 uint32_t val, enum cbm_type type)
@@ -1125,7 +1242,7 @@ int psr_set_val(struct domain *d, unsigned int socket,
          * Step 4:
          * Write all features MSRs according to the COS ID.
          */
-        ret = write_psr_msr(socket, cos, val, feat_type);
+        ret = write_psr_msr(socket, cos, val, type, feat_type);
         if ( ret )
             goto unlock_free_array;
     }
@@ -1139,10 +1256,26 @@ int psr_set_val(struct domain *d, unsigned int socket,
     ASSERT(!cos || ref[cos]);
     ASSERT(!old_cos || ref[old_cos]);
     ref[old_cos]--;
-    spin_unlock(&info->ref_lock);
 
     /*
      * Step 6:
+     * For features,  e.g. CDP, which cos_num is more than 1, we have to
+     * restore the old_cos value back to default when ref[old_cos] is 0.
+     * Otherwise, user will see wrong values when this COS ID is reused. E.g.
+     * user wants to set DATA to 0x3ff for a new domain. He hopes to see the
+     * DATA is set to 0x3ff and CODE should be the default value, 0x7ff. But
+     * if the COS ID picked for this action is the one that has been used by
+     * other domain and the CODE has been set to 0x1ff. Then, user will see
+     * DATA: 0x3ff, CODE: 0x1ff. So, we have to restore COS values for features
+     * using multiple COSs.
+     */
+    if ( old_cos && !ref[old_cos] )
+        restore_default_val(socket, old_cos, feat_type);
+
+    spin_unlock(&info->ref_lock);
+
+    /*
+     * Step 7:
      * Save the COS ID into current domain's psr_cos_ids[] so that we can know
      * which COS the domain is using on the socket. One domain can only use
      * one COS ID at same time on each socket.
@@ -1183,6 +1316,13 @@ static void psr_free_cos(struct domain *d)
         spin_lock(&info->ref_lock);
         ASSERT(!cos || info->cos_ref[cos]);
         info->cos_ref[cos]--;
+        /*
+         * The 'cos_ref[cos]' of 'd' is 0 now so we need restore corresponding
+         * COS registers to default value. Because this case happens when a
+         * domain is destroied, we need restore all features.
+         */
+        if ( !info->cos_ref[cos] )
+            restore_default_val(socket, cos, PSR_SOCKET_MAX_FEAT);
         spin_unlock(&info->ref_lock);
     }
 
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 18/25] x86: L2 CAT: implement CPU init and free flow.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (16 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 17/25] x86: refactor psr: CDP: implement set value callback functions Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-12 15:18   ` Jan Beulich
  2017-04-01 13:53 ` [PATCH v10 19/25] x86: L2 CAT: implement get hw info flow Yi Sun
                   ` (6 subsequent siblings)
  24 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

This patch implements the CPU init and free flow for L2 CAT.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - implement L2 CAT case in 'cat_init_feature'.
      (suggested by Jan Beulich)
    - changes about 'props'.
      (suggested by Jan Beulich)
    - introduce 'PSR_CBM_TYPE_L2'.
v9:
    - modify error handling process in 'psr_cpu_prepare' to reduce redundant
      codes.
    - reuse 'cat_init_feature' and 'cat_get_cos_max' for L2 CAT to reduce
      redundant codes.
      (suggested by Roger Pau)
    - remove unnecessary comment.
      (suggested by Jan Beulich)
    - move L2 CAT related codes from 'cpu_init_work' into 'psr_cpu_init'.
      (suggested by Jan Beulich)
    - do not free resource when allocation fails in 'psr_cpu_prepare'.
      (suggested by Jan Beulich)
v7:
    - initialize 'l2_cat'.
      (suggested by Konrad Rzeszutek Wilk)
v6:
    - use 'struct cpuid_leaf'.
      (suggested by Konrad Rzeszutek Wilk and Jan Beulich)
v5:
    - remove 'feat_l2_cat' free in 'free_feature'.
      (suggested by Jan Beulich)
    - encapsulate cpuid registers into 'struct cpuid_leaf_regs'.
      (suggested by Jan Beulich)
    - print socket info when 'opt_cpu_info' is true.
      (suggested by Jan Beulich)
    - rename 'l2_cat_get_max_cos_max' to 'l2_cat_get_cos_max'.
      (suggested by Jan Beulich)
    - rename 'dat[]' to 'data[]'
      (suggested by Jan Beulich)
    - move 'cpu_prepare_work' contents into 'psr_cpu_prepare'.
      (suggested by Jan Beulich)
v4:
    - create this patch because of codes architecture change.
      (suggested by Jan Beulich)
---
 xen/arch/x86/psr.c              | 33 +++++++++++++++++++++++++++++++--
 xen/include/asm-x86/msr-index.h |  1 +
 xen/include/asm-x86/psr.h       |  2 ++
 3 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index bfa1777..6a9cd88 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -160,6 +160,7 @@ static DEFINE_PER_CPU(struct psr_assoc, psr_assoc);
  */
 static struct feat_node *feat_l3_cat;
 static struct feat_node *feat_l3_cdp;
+static struct feat_node *feat_l2_cat;
 
 /* Common functions */
 #define cat_default_val(len) (0xffffffff >> (32 - (len)))
@@ -304,10 +305,14 @@ static void cat_init_feature(const struct cpuid_leaf *regs,
     switch ( type )
     {
     case PSR_SOCKET_L3_CAT:
+    case PSR_SOCKET_L2_CAT:
         /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
         feat->cos_reg_val[0] = cat_default_val(feat->props->cbm_len);
 
-        feat->props->type[0] = PSR_CBM_TYPE_L3;
+        if ( type == PSR_SOCKET_L3_CAT )
+            feat->props->type[0] = PSR_CBM_TYPE_L3;
+        else
+            feat->props->type[0] = PSR_CBM_TYPE_L2;
 
         /*
          * To handle cpu offline and then online case, we need restore MSRs to
@@ -315,7 +320,11 @@ static void cat_init_feature(const struct cpuid_leaf *regs,
          */
         for ( i = 1; i <= feat->props->cos_max; i++ )
         {
-            wrmsrl(MSR_IA32_PSR_L3_MASK(i), feat->cos_reg_val[0]);
+            if ( type == PSR_SOCKET_L3_CAT )
+                wrmsrl(MSR_IA32_PSR_L3_MASK(i), feat->cos_reg_val[0]);
+            else
+                wrmsrl(MSR_IA32_PSR_L2_MASK(i), feat->cos_reg_val[0]);
+
             feat->cos_reg_val[i] = feat->cos_reg_val[0];
         }
 
@@ -454,6 +463,11 @@ static struct feat_props l3_cdp_props = {
     .write_msr = l3_cdp_write_msr,
 };
 
+/* L2 CAT ops */
+static struct feat_props l2_cat_props = {
+    .cos_num = 1,
+};
+
 static void __init parse_psr_bool(char *s, char *value, char *feature,
                                   unsigned int mask)
 {
@@ -1393,6 +1407,10 @@ static int psr_cpu_prepare(void)
          (feat_l3_cdp = xzalloc(struct feat_node)) == NULL )
         return -ENOMEM;
 
+    if ( feat_l2_cat == NULL &&
+         (feat_l2_cat = xzalloc(struct feat_node)) == NULL )
+        return -ENOMEM;
+
     return 0;
 }
 
@@ -1442,6 +1460,17 @@ static void psr_cpu_init(void)
         }
     }
 
+    cpuid_count_leaf(PSR_CPUID_LEVEL_CAT, 0, &regs);
+    if ( regs.b & PSR_RESOURCE_TYPE_L2 )
+    {
+        cpuid_count_leaf(PSR_CPUID_LEVEL_CAT, 2, &regs);
+
+        feat = feat_l2_cat;
+        feat_l2_cat = NULL;
+        feat->props = &l2_cat_props;
+        cat_init_feature(&regs, feat, info, PSR_SOCKET_L2_CAT);
+    }
+
  assoc_init:
     psr_assoc_init();
 }
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 771e750..6c49c6d 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -345,6 +345,7 @@
 #define MSR_IA32_PSR_L3_MASK(n)	(0x00000c90 + (n))
 #define MSR_IA32_PSR_L3_MASK_CODE(n)	(0x00000c90 + (n) * 2 + 1)
 #define MSR_IA32_PSR_L3_MASK_DATA(n)	(0x00000c90 + (n) * 2)
+#define MSR_IA32_PSR_L2_MASK(n)		(0x00000d10 + (n))
 
 /* Intel Model 6 */
 #define MSR_P6_PERFCTR(n)		(0x000000c1 + (n))
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
index 66d5218..e576f27 100644
--- a/xen/include/asm-x86/psr.h
+++ b/xen/include/asm-x86/psr.h
@@ -23,6 +23,7 @@
 
 /* Resource Type Enumeration */
 #define PSR_RESOURCE_TYPE_L3            0x2
+#define PSR_RESOURCE_TYPE_L2            0x4
 
 /* L3 Monitoring Features */
 #define PSR_CMT_L3_OCCUPANCY            0x1
@@ -56,6 +57,7 @@ enum cbm_type {
     PSR_CBM_TYPE_L3,
     PSR_CBM_TYPE_L3_CODE,
     PSR_CBM_TYPE_L3_DATA,
+    PSR_CBM_TYPE_L2,
 };
 
 extern struct psr_cmt *psr_cmt;
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 19/25] x86: L2 CAT: implement get hw info flow.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (17 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 18/25] x86: L2 CAT: implement CPU init and free flow Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-01 13:53 ` [PATCH v10 20/25] x86: L2 CAT: implement get value flow Yi Sun
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

This patch implements L2 CAT get HW info flow and interface in sysctl.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - modify macro name according to previous patch change.
      (suggested by Jan Beulich)
    - modify commit message.
v9:
    - reuse 'cat_get_feat_info' for L2 CAT to reduce redundant codes.
      (suggested by Roger Pau)
    - modify sysctl implementation of L2 CAT to input data[3] to use
      'cat_get_feat_info'.
      (suggested by Roger Pau)
    - modify macros names to newly defined ones.
      (suggested by Jan Beulich)
    - remove 'l2_info' to reuse 'l3_info'.
      (suggested by Jan Beulich)
    - modify macro name according to previous patch change.
      (suggested by Jan Beulich)
v5:
    - rename 'dat[]' to 'data[]'
      (suggested by Jan Beulich)
    - remove type check in callback function.
      (suggested by Jan Beulich)
v4:
    - create this patch because of codes architecture change.
      (suggested by Jan Beulich)
---
 xen/arch/x86/psr.c          |  4 ++++
 xen/arch/x86/sysctl.c       | 21 +++++++++++++++++++++
 xen/include/public/sysctl.h |  1 +
 3 files changed, 26 insertions(+)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 6a9cd88..8114bed 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -257,6 +257,9 @@ static enum psr_feat_type psr_cbm_type_to_feat_type(enum cbm_type type)
     case PSR_CBM_TYPE_L3_CODE:
         feat_type = PSR_SOCKET_L3_CDP;
         break;
+    case PSR_CBM_TYPE_L2:
+        feat_type = PSR_SOCKET_L2_CAT;
+        break;
     default:
         ASSERT_UNREACHABLE();
     }
@@ -466,6 +469,7 @@ static struct feat_props l3_cdp_props = {
 /* L2 CAT ops */
 static struct feat_props l2_cat_props = {
     .cos_num = 1,
+    .get_feat_info = cat_get_feat_info,
 };
 
 static void __init parse_psr_bool(char *s, char *value, char *feature,
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index c23270d..ba6b6a6 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -195,6 +195,27 @@ long arch_do_sysctl(
             break;
         }
 
+        case XEN_SYSCTL_PSR_CAT_get_l2_info:
+        {
+            uint32_t data[PSR_INFO_ARRAY_SIZE];
+
+            ret = psr_get_info(sysctl->u.psr_cat_op.target,
+                               PSR_CBM_TYPE_L2, data, ARRAY_SIZE(data));
+            if ( ret )
+                break;
+
+            sysctl->u.psr_cat_op.u.l3_info.cos_max =
+                                      data[PSR_INFO_IDX_COS_MAX];
+            sysctl->u.psr_cat_op.u.l3_info.cbm_len =
+                                      data[PSR_INFO_IDX_CAT_CBM_LEN];
+            sysctl->u.psr_cat_op.u.l3_info.flags =
+                                      data[PSR_INFO_IDX_CAT_FLAG];
+
+            if ( !ret && __copy_field_to_guest(u_sysctl, sysctl, u.psr_cat_op) )
+                ret = -EFAULT;
+            break;
+        }
+
         default:
             ret = -EOPNOTSUPP;
             break;
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 00f5e77..1fe8fe4 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -744,6 +744,7 @@ typedef struct xen_sysctl_pcitopoinfo xen_sysctl_pcitopoinfo_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopoinfo_t);
 
 #define XEN_SYSCTL_PSR_CAT_get_l3_info               0
+#define XEN_SYSCTL_PSR_CAT_get_l2_info               1
 struct xen_sysctl_psr_cat_op {
     uint32_t cmd;       /* IN: XEN_SYSCTL_PSR_CAT_* */
     uint32_t target;    /* IN */
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 20/25] x86: L2 CAT: implement get value flow.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (18 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 19/25] x86: L2 CAT: implement get hw info flow Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-01 13:53 ` [PATCH v10 21/25] x86: L2 CAT: implement set " Yi Sun
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

This patch implements L2 CAT get value flow and interface in domctl.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - remove cast in domctl.
      (suggested by Jan Beulich)
v9:
    - reuse 'cat_get_val' for L2 CAT to reduce redundant codes
      (suggested by Roger Pau)
    - changes about 'uint64_t' to 'uint32_t'.
      (suggested by Jan Beulich)
v5:
    - remove type check in callback function.
      (suggested by Jan Beulich)
v4:
    - create this patch because of codes architecture change.
      (suggested by Jan Beulich)
---
 xen/arch/x86/domctl.c       | 11 +++++++++++
 xen/arch/x86/psr.c          |  1 +
 xen/include/public/domctl.h |  1 +
 3 files changed, 13 insertions(+)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 6ed71e2..59d472c 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1499,6 +1499,17 @@ long arch_do_domctl(
             break;
         }
 
+        case XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM:
+        {
+            uint32_t val;
+
+            ret = psr_get_val(d, domctl->u.psr_cat_op.target,
+                              &val, PSR_CBM_TYPE_L2);
+            domctl->u.psr_cat_op.data = val;
+            copyback = 1;
+            break;
+        }
+
         default:
             ret = -EOPNOTSUPP;
             break;
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 8114bed..426d725 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -470,6 +470,7 @@ static struct feat_props l3_cdp_props = {
 static struct feat_props l2_cat_props = {
     .cos_num = 1,
     .get_feat_info = cat_get_feat_info,
+    .get_val = cat_get_val,
 };
 
 static void __init parse_psr_bool(char *s, char *value, char *feature,
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 85cbb7c..8c183ba 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -1138,6 +1138,7 @@ struct xen_domctl_psr_cat_op {
 #define XEN_DOMCTL_PSR_CAT_OP_SET_L3_DATA    3
 #define XEN_DOMCTL_PSR_CAT_OP_GET_L3_CODE    4
 #define XEN_DOMCTL_PSR_CAT_OP_GET_L3_DATA    5
+#define XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM     7
     uint32_t cmd;       /* IN: XEN_DOMCTL_PSR_CAT_OP_* */
     uint32_t target;    /* IN */
     uint64_t data;      /* IN/OUT */
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 21/25] x86: L2 CAT: implement set value flow.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (19 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 20/25] x86: L2 CAT: implement get value flow Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-12 15:23   ` Jan Beulich
  2017-04-01 13:53 ` [PATCH v10 22/25] tools: L2 CAT: support get HW info for L2 CAT Yi Sun
                   ` (3 subsequent siblings)
  24 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

This patch implements L2 CAT set value related callback functions
and domctl interface.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - check input data and remove cast in domctl.
      (suggested by Jan Beulich)
    - remove some hooks assignment due to previous patches changes.
      (suggested by Jan Beulich)
    - remove cast in 'l2_cat_write_msr'.
      (suggested by Jan Beulich)
    - remove 'return in 'l2_cat_write_msr'.
      (suggested by Jan Beulich)
v9:
    - reuse some CAT common functions for L2 CAT to reduce redundant codes.
      (suggested by Roger Pau)
    - remove parameter 'found' from 'cat_compare_val' and modify the return
      values to let caller know if the id is found or not. These things are
      done in patch "x86: refactor psr: set value: implement cos finding flow."
      (suggested by Roger Pau and Dario Faggioli)
    - remove 'get_cos_num' related codes.
      (suggested by Jan Beulich)
    - modify 'l2_cat_write_msr' according to previous patch change.
    - changes about 'uint64_t' to 'uint32_t'.
      (suggested by Jan Beulich)
v8:
    - modify 'l2_cat_write_msr' to 'void'.
v5:
    - remove type check in callback function.
      (suggested by Jan Beulich)
    - modify return value of callback functions because we do not need them
      to return number of entries the feature uses. In caller, we call
      'get_cos_num' to get the number of entries the feature uses.
      (suggested by Jan Beulich)
    - remove 'l2_cat_get_cos_max_from_type'.
      (suggested by Jan Beulich)
    - rename 'l2_cat_exceeds_cos_max' to 'l2_cat_fits_cos_max'.
      (suggested by Jan Beulich)
v4:
    - create this patch because of codes architecture change.
      (suggested by Jan Beulich)
---
 xen/arch/x86/domctl.c       | 10 ++++++++++
 xen/arch/x86/psr.c          | 11 +++++++++++
 xen/include/public/domctl.h |  1 +
 3 files changed, 22 insertions(+)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 59d472c..7eb5983 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1466,6 +1466,16 @@ long arch_do_domctl(
                               PSR_CBM_TYPE_L3_DATA);
             break;
 
+        case XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM:
+            if ( domctl->u.psr_cat_op.data !=
+                 (uint32_t)domctl->u.psr_cat_op.data )
+                return -EINVAL;
+
+            ret = psr_set_val(d, domctl->u.psr_cat_op.target,
+                              domctl->u.psr_cat_op.data,
+                              PSR_CBM_TYPE_L2);
+            break;
+
         case XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM:
         {
             uint32_t val;
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 426d725..a85ea99 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -467,10 +467,21 @@ static struct feat_props l3_cdp_props = {
 };
 
 /* L2 CAT ops */
+static void l2_cat_write_msr(unsigned int cos, uint32_t val,
+                             enum cbm_type type, struct feat_node *feat)
+{
+    if ( feat->cos_reg_val[cos] != val )
+    {
+        feat->cos_reg_val[cos] = val;
+        wrmsrl(MSR_IA32_PSR_L2_MASK(cos), val);
+    }
+}
+
 static struct feat_props l2_cat_props = {
     .cos_num = 1,
     .get_feat_info = cat_get_feat_info,
     .get_val = cat_get_val,
+    .write_msr = l2_cat_write_msr,
 };
 
 static void __init parse_psr_bool(char *s, char *value, char *feature,
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 8c183ba..523a2cd 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -1138,6 +1138,7 @@ struct xen_domctl_psr_cat_op {
 #define XEN_DOMCTL_PSR_CAT_OP_SET_L3_DATA    3
 #define XEN_DOMCTL_PSR_CAT_OP_GET_L3_CODE    4
 #define XEN_DOMCTL_PSR_CAT_OP_GET_L3_DATA    5
+#define XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM     6
 #define XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM     7
     uint32_t cmd;       /* IN: XEN_DOMCTL_PSR_CAT_OP_* */
     uint32_t target;    /* IN */
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 22/25] tools: L2 CAT: support get HW info for L2 CAT.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (20 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 21/25] x86: L2 CAT: implement set " Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-12 15:24   ` Jan Beulich
  2017-04-01 13:53 ` [PATCH v10 23/25] tools: L2 CAT: support show cbm " Yi Sun
                   ` (2 subsequent siblings)
  24 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

This patch implements xl/xc changes to support get HW info
for L2 CAT.

'xl psr-hwinfo' is updated to show both L3 CAT and L2 CAT
info.

Example(on machine which only supports L2 CAT):
Cache Monitoring Technology (CMT):
Enabled         : 0
Cache Allocation Technology (CAT): L2
Socket ID       : 0
Maximum COS     : 3
CBM length      : 8
Default CBM     : 0xff

Signed-off-by: He Chen <he.chen@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - change macros names according to previous changes.
      (suggested by Jan Beulich)
v9:
    - add some cases to handle return error no.
    - move xl_cmdimpl.c codes into xl/xl_psr.c.
    - change 'l3_info' to 'cat_info' to cover both L3 and L2 CAT.
v6:
    - adjust '{' position for 'switch'.
      (suggested by Wei Liu)
    - modify commit message to remove error log.
      (suggested by Dario Faggioli)
v5:
    - modify commit message to remove error log.
      (suggested by Wei Liu and Jan Beulich)
    - replace unnecessary 'return' to 'break'.
      (suggested by Wei Liu)
    - restore 'libxl_psr_cat_get_l3_info' to keep interface backward compatible
      but change codes in it to call new function to get hw info.
      (suggested by Wei Liu)
    - add 'L2_CBM' into 'psr_cbm_type' because it is interface change which
      should be in same patch with new 'LIBXL_HAVE_' macro.
      (suggested by Wei Liu)
    - addjust logs sentence to make unnecessary error logs not show.
      (suggested by Wei Liu and Jan Beulich)
v4:
    - create this patch to help reviewers better understand the codes.
---
 tools/libxc/include/xenctrl.h |  6 ++---
 tools/libxc/xc_psr.c          | 39 +++++++++++++++++++++++---------
 tools/libxl/libxl.h           |  9 ++++++++
 tools/libxl/libxl_psr.c       | 28 ++++++++++++++++++-----
 tools/libxl/libxl_types.idl   |  1 +
 tools/xl/xl_psr.c             | 52 +++++++++++++++++++++++++++++++++----------
 xen/arch/x86/sysctl.c         | 12 +++++-----
 xen/include/public/sysctl.h   |  2 +-
 8 files changed, 111 insertions(+), 38 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index a48981a..99c6fa5 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2442,9 +2442,9 @@ int xc_psr_cat_set_domain_data(xc_interface *xch, uint32_t domid,
 int xc_psr_cat_get_domain_data(xc_interface *xch, uint32_t domid,
                                xc_psr_cat_type type, uint32_t target,
                                uint64_t *data);
-int xc_psr_cat_get_l3_info(xc_interface *xch, uint32_t socket,
-                           uint32_t *cos_max, uint32_t *cbm_len,
-                           bool *cdp_enabled);
+int xc_psr_cat_get_info(xc_interface *xch, uint32_t socket, unsigned int lvl,
+                        uint32_t *cos_max, uint32_t *cbm_len,
+                        bool *cdp_enabled);
 
 int xc_get_cpu_levelling_caps(xc_interface *xch, uint32_t *caps);
 int xc_get_cpu_featureset(xc_interface *xch, uint32_t index,
diff --git a/tools/libxc/xc_psr.c b/tools/libxc/xc_psr.c
index 43b3286..84a08c4 100644
--- a/tools/libxc/xc_psr.c
+++ b/tools/libxc/xc_psr.c
@@ -317,24 +317,41 @@ int xc_psr_cat_get_domain_data(xc_interface *xch, uint32_t domid,
     return rc;
 }
 
-int xc_psr_cat_get_l3_info(xc_interface *xch, uint32_t socket,
-                           uint32_t *cos_max, uint32_t *cbm_len,
-                           bool *cdp_enabled)
+int xc_psr_cat_get_info(xc_interface *xch, uint32_t socket, unsigned int lvl,
+                        uint32_t *cos_max, uint32_t *cbm_len, bool *cdp_enabled)
 {
-    int rc;
+    int rc = -1;
     DECLARE_SYSCTL;
 
     sysctl.cmd = XEN_SYSCTL_psr_cat_op;
-    sysctl.u.psr_cat_op.cmd = XEN_SYSCTL_PSR_CAT_get_l3_info;
     sysctl.u.psr_cat_op.target = socket;
 
-    rc = xc_sysctl(xch, &sysctl);
-    if ( !rc )
+    switch ( lvl )
     {
-        *cos_max = sysctl.u.psr_cat_op.u.l3_info.cos_max;
-        *cbm_len = sysctl.u.psr_cat_op.u.l3_info.cbm_len;
-        *cdp_enabled = sysctl.u.psr_cat_op.u.l3_info.flags &
-                       XEN_SYSCTL_PSR_CAT_L3_CDP;
+    case 2:
+        sysctl.u.psr_cat_op.cmd = XEN_SYSCTL_PSR_CAT_get_l2_info;
+        rc = xc_sysctl(xch, &sysctl);
+        if ( !rc )
+        {
+            *cos_max = sysctl.u.psr_cat_op.u.cat_info.cos_max;
+            *cbm_len = sysctl.u.psr_cat_op.u.cat_info.cbm_len;
+            *cdp_enabled = false;
+        }
+        break;
+    case 3:
+        sysctl.u.psr_cat_op.cmd = XEN_SYSCTL_PSR_CAT_get_l3_info;
+        rc = xc_sysctl(xch, &sysctl);
+        if ( !rc )
+        {
+            *cos_max = sysctl.u.psr_cat_op.u.cat_info.cos_max;
+            *cbm_len = sysctl.u.psr_cat_op.u.cat_info.cbm_len;
+            *cdp_enabled = sysctl.u.psr_cat_op.u.cat_info.flags &
+                           XEN_SYSCTL_PSR_CAT_L3_CDP;
+        }
+        break;
+    default:
+        errno = EOPNOTSUPP;
+        break;
     }
 
     return rc;
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 92f1751..6c6fb01 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -904,6 +904,13 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, const libxl_mac *src);
  * If this is defined, the Code and Data Prioritization feature is supported.
  */
 #define LIBXL_HAVE_PSR_CDP 1
+
+/*
+ * LIBXL_HAVE_PSR_L2_CAT
+ *
+ * If this is defined, the L2 Cache Allocation Technology feature is supported.
+ */
+#define LIBXL_HAVE_PSR_L2_CAT 1
 #endif
 
 /*
@@ -2172,6 +2179,8 @@ int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t domid,
  * On success, the function returns an array of elements in 'info',
  * and the length in 'nr'.
  */
+int libxl_psr_cat_get_info(libxl_ctx *ctx, libxl_psr_cat_info **info,
+                           int *nr, unsigned int lvl);
 int libxl_psr_cat_get_l3_info(libxl_ctx *ctx, libxl_psr_cat_info **info,
                               int *nr);
 void libxl_psr_cat_info_list_free(libxl_psr_cat_info *list, int nr);
diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
index ec5c79d..f55ba1e 100644
--- a/tools/libxl/libxl_psr.c
+++ b/tools/libxl/libxl_psr.c
@@ -91,6 +91,15 @@ static void libxl__psr_cat_log_err_msg(libxl__gc *gc, int err)
     case ENXIO:
         msg = "Unable to set code or data CBM when CDP is disabled";
         break;
+    case EINVAL:
+        msg = "Invalid input or some internal values are not expected";
+        break;
+    case ERANGE:
+        msg = "Socket number is wrong";
+        break;
+    case ENOSPC:
+        msg = "Value array exceeds the range";
+        break;
 
     default:
         libxl__psr_log_err_msg(gc, err);
@@ -352,8 +361,8 @@ int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t domid,
     return rc;
 }
 
-int libxl_psr_cat_get_l3_info(libxl_ctx *ctx, libxl_psr_cat_info **info,
-                              int *nr)
+int libxl_psr_cat_get_info(libxl_ctx *ctx, libxl_psr_cat_info **info,
+                           int *nr, unsigned int lvl)
 {
     GC_INIT(ctx);
     int rc;
@@ -380,9 +389,8 @@ int libxl_psr_cat_get_l3_info(libxl_ctx *ctx, libxl_psr_cat_info **info,
 
     libxl_for_each_set_bit(socketid, socketmap) {
         ptr[i].id = socketid;
-        if (xc_psr_cat_get_l3_info(ctx->xch, socketid, &ptr[i].cos_max,
-                                   &ptr[i].cbm_len, &ptr[i].cdp_enabled)) {
-            libxl__psr_cat_log_err_msg(gc, errno);
+        if (xc_psr_cat_get_info(ctx->xch, socketid, lvl, &ptr[i].cos_max,
+                                &ptr[i].cbm_len, &ptr[i].cdp_enabled)) {
             rc = ERROR_FAIL;
             free(ptr);
             goto out;
@@ -398,6 +406,16 @@ out:
     return rc;
 }
 
+int libxl_psr_cat_get_l3_info(libxl_ctx *ctx, libxl_psr_cat_info **info,
+                              int *nr)
+{
+    int rc;
+
+    rc = libxl_psr_cat_get_info(ctx, info, nr, 3);
+
+    return rc;
+}
+
 void libxl_psr_cat_info_list_free(libxl_psr_cat_info *list, int nr)
 {
     int i;
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index a612d1f..5a401b8 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -899,6 +899,7 @@ libxl_psr_cbm_type = Enumeration("psr_cbm_type", [
     (1, "L3_CBM"),
     (2, "L3_CBM_CODE"),
     (3, "L3_CBM_DATA"),
+    (4, "L2_CBM"),
     ])
 
 libxl_psr_cat_info = Struct("psr_cat_info", [
diff --git a/tools/xl/xl_psr.c b/tools/xl/xl_psr.c
index c061b29..271b88f 100644
--- a/tools/xl/xl_psr.c
+++ b/tools/xl/xl_psr.c
@@ -294,21 +294,19 @@ int main_psr_cmt_show(int argc, char **argv)
 }
 #endif
 
-#ifdef LIBXL_HAVE_PSR_CAT
-static int psr_cat_hwinfo(void)
+#if defined(LIBXL_HAVE_PSR_CAT) || defined(LIBXL_HAVE_PSR_L2_CAT)
+static int psr_l3_cat_hwinfo(void)
 {
-    int rc;
-    int i, nr;
+    int rc, nr;
+    unsigned int i;
     uint32_t l3_cache_size;
     libxl_psr_cat_info *info;
 
-    printf("Cache Allocation Technology (CAT):\n");
-
-    rc = libxl_psr_cat_get_l3_info(ctx, &info, &nr);
-    if (rc) {
-        fprintf(stderr, "Failed to get cat info\n");
+    rc = libxl_psr_cat_get_info(ctx, &info, &nr, 3);
+    if (rc)
         return rc;
-    }
+
+    printf("Cache Allocation Technology (CAT):\n");
 
     for (i = 0; i < nr; i++) {
         rc = libxl_psr_cmt_get_l3_cache_size(ctx, info[i].id, &l3_cache_size);
@@ -417,7 +415,7 @@ static int psr_cat_show(uint32_t domid)
     int rc;
     libxl_psr_cat_info *info;
 
-    rc = libxl_psr_cat_get_l3_info(ctx, &info, &nr);
+    rc = libxl_psr_cat_get_info(ctx, &info, &nr, 3);
     if (rc) {
         fprintf(stderr, "Failed to get cat info\n");
         return rc;
@@ -434,6 +432,32 @@ out:
     return rc;
 }
 
+static int psr_l2_cat_hwinfo(void)
+{
+    int rc;
+    unsigned int i;
+    int nr;
+    libxl_psr_cat_info *info;
+
+    rc = libxl_psr_cat_get_info(ctx, &info, &nr, 2);
+    if (rc)
+        return rc;
+
+    printf("Cache Allocation Technology (CAT): L2\n");
+
+    for (i = 0; i < nr; i++) {
+        /* There is no CMT on L2 cache so far. */
+        printf("%-16s: %u\n", "Socket ID", info[i].id);
+        printf("%-16s: %u\n", "Maximum COS", info[i].cos_max);
+        printf("%-16s: %u\n", "CBM length", info[i].cbm_len);
+        printf("%-16s: %#llx\n", "Default CBM",
+               (1ull << info[i].cbm_len) - 1);
+    }
+
+    libxl_psr_cat_info_list_free(info, nr);
+    return rc;
+}
+
 int main_psr_cat_cbm_set(int argc, char **argv)
 {
     uint32_t domid;
@@ -551,7 +575,11 @@ int main_psr_hwinfo(int argc, char **argv)
         ret = psr_cmt_hwinfo();
 
     if (!ret && (all || cat))
-        ret = psr_cat_hwinfo();
+        ret = psr_l3_cat_hwinfo();
+
+    /* L2 CAT is independent of CMT and L3 CAT */
+    if (all || cat)
+        ret = psr_l2_cat_hwinfo();
 
     return ret;
 }
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index ba6b6a6..130a5cb 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -183,11 +183,11 @@ long arch_do_sysctl(
             if ( ret )
                 break;
 
-            sysctl->u.psr_cat_op.u.l3_info.cos_max =
+            sysctl->u.psr_cat_op.u.cat_info.cos_max =
                                       data[PSR_INFO_IDX_COS_MAX];
-            sysctl->u.psr_cat_op.u.l3_info.cbm_len =
+            sysctl->u.psr_cat_op.u.cat_info.cbm_len =
                                       data[PSR_INFO_IDX_CAT_CBM_LEN];
-            sysctl->u.psr_cat_op.u.l3_info.flags =
+            sysctl->u.psr_cat_op.u.cat_info.flags =
                                       data[PSR_INFO_IDX_CAT_FLAG];
 
             if ( !ret && __copy_field_to_guest(u_sysctl, sysctl, u.psr_cat_op) )
@@ -204,11 +204,11 @@ long arch_do_sysctl(
             if ( ret )
                 break;
 
-            sysctl->u.psr_cat_op.u.l3_info.cos_max =
+            sysctl->u.psr_cat_op.u.cat_info.cos_max =
                                       data[PSR_INFO_IDX_COS_MAX];
-            sysctl->u.psr_cat_op.u.l3_info.cbm_len =
+            sysctl->u.psr_cat_op.u.cat_info.cbm_len =
                                       data[PSR_INFO_IDX_CAT_CBM_LEN];
-            sysctl->u.psr_cat_op.u.l3_info.flags =
+            sysctl->u.psr_cat_op.u.cat_info.flags =
                                       data[PSR_INFO_IDX_CAT_FLAG];
 
             if ( !ret && __copy_field_to_guest(u_sysctl, sysctl, u.psr_cat_op) )
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 1fe8fe4..a3998c6 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -754,7 +754,7 @@ struct xen_sysctl_psr_cat_op {
             uint32_t cos_max;   /* OUT: Maximum COS */
 #define XEN_SYSCTL_PSR_CAT_L3_CDP       (1u << 0)
             uint32_t flags;     /* OUT: CAT flags */
-        } l3_info;
+        } cat_info;
     } u;
 };
 typedef struct xen_sysctl_psr_cat_op xen_sysctl_psr_cat_op_t;
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 23/25] tools: L2 CAT: support show cbm for L2 CAT.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (21 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 22/25] tools: L2 CAT: support get HW info for L2 CAT Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-01 13:53 ` [PATCH v10 24/25] tools: L2 CAT: support set " Yi Sun
  2017-04-01 13:53 ` [PATCH v10 25/25] docs: add L2 CAT description in docs Yi Sun
  24 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

This patch implements changes in xl/xc changes to support
showing CBM of L2 CAT.

The new level option is introduced to original CAT showing
command in order to show CBM for specified level CAT.
- 'xl psr-cat-show' is updated to show CBM of a domain
  according to input cache level.

Examples:
root@:~$ xl psr-cat-show -l2 1
Socket ID       : 0
Default CBM     : 0xff
   ID                     NAME             CBM
    1                 ubuntu14            0x7f

Signed-off-by: He Chen <he.chen@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v9:
    - move xl_cmdimpl.c changes into xl/xl_psr.c.
    - move xl_cmdtable.c changes into xl/xl_cmdtable.c.
v6:
    - check if input level is correct.
    - adjust '{' postion for 'if'.
      (suggested by Wei Liu)
v5:
    - remove 'L2_CBM' in idl because it has been moved to patch 21:
      "tools: L2 CAT: support get HW info for L2 CAT".
      (suggested by Wei Liu)
v4:
    - create this patch because of codes architecture change.
---
 tools/libxc/include/xenctrl.h |  1 +
 tools/libxc/xc_psr.c          |  3 ++
 tools/xl/xl_cmdtable.c        |  3 +-
 tools/xl/xl_psr.c             | 85 +++++++++++++++++++++++++++++--------------
 4 files changed, 63 insertions(+), 29 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 99c6fa5..0fd9326 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2418,6 +2418,7 @@ enum xc_psr_cat_type {
     XC_PSR_CAT_L3_CBM      = 1,
     XC_PSR_CAT_L3_CBM_CODE = 2,
     XC_PSR_CAT_L3_CBM_DATA = 3,
+    XC_PSR_CAT_L2_CBM      = 4,
 };
 typedef enum xc_psr_cat_type xc_psr_cat_type;
 
diff --git a/tools/libxc/xc_psr.c b/tools/libxc/xc_psr.c
index 84a08c4..04f5927 100644
--- a/tools/libxc/xc_psr.c
+++ b/tools/libxc/xc_psr.c
@@ -299,6 +299,9 @@ int xc_psr_cat_get_domain_data(xc_interface *xch, uint32_t domid,
     case XC_PSR_CAT_L3_CBM_DATA:
         cmd = XEN_DOMCTL_PSR_CAT_OP_GET_L3_DATA;
         break;
+    case XC_PSR_CAT_L2_CBM:
+        cmd = XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM;
+        break;
     default:
         errno = EINVAL;
         return -1;
diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
index 1219b33..ab7ad60 100644
--- a/tools/xl/xl_cmdtable.c
+++ b/tools/xl/xl_cmdtable.c
@@ -556,7 +556,8 @@ struct cmd_spec cmd_table[] = {
     { "psr-cat-show",
       &main_psr_cat_show, 0, 1,
       "Show Cache Allocation Technology information",
-      "<Domain>",
+      "[options] <Domain>",
+      "-l <level>        Specify the cache level to process, otherwise L3 cache is processed\n"
     },
 
 #endif
diff --git a/tools/xl/xl_psr.c b/tools/xl/xl_psr.c
index 271b88f..575f4a0 100644
--- a/tools/xl/xl_psr.c
+++ b/tools/xl/xl_psr.c
@@ -342,7 +342,7 @@ static void psr_cat_print_one_domain_cbm_type(uint32_t domid, uint32_t socketid,
 }
 
 static void psr_cat_print_one_domain_cbm(uint32_t domid, uint32_t socketid,
-                                         bool cdp_enabled)
+                                         bool cdp_enabled, unsigned int lvl)
 {
     char *domain_name;
 
@@ -350,27 +350,38 @@ static void psr_cat_print_one_domain_cbm(uint32_t domid, uint32_t socketid,
     printf("%5d%25s", domid, domain_name);
     free(domain_name);
 
-    if (!cdp_enabled) {
-        psr_cat_print_one_domain_cbm_type(domid, socketid,
-                                          LIBXL_PSR_CBM_TYPE_L3_CBM);
-    } else {
-        psr_cat_print_one_domain_cbm_type(domid, socketid,
-                                          LIBXL_PSR_CBM_TYPE_L3_CBM_CODE);
+    switch (lvl) {
+    case 3:
+        if (!cdp_enabled) {
+            psr_cat_print_one_domain_cbm_type(domid, socketid,
+                                              LIBXL_PSR_CBM_TYPE_L3_CBM);
+        } else {
+            psr_cat_print_one_domain_cbm_type(domid, socketid,
+                                              LIBXL_PSR_CBM_TYPE_L3_CBM_CODE);
+            psr_cat_print_one_domain_cbm_type(domid, socketid,
+                                              LIBXL_PSR_CBM_TYPE_L3_CBM_DATA);
+        }
+        break;
+    case 2:
         psr_cat_print_one_domain_cbm_type(domid, socketid,
-                                          LIBXL_PSR_CBM_TYPE_L3_CBM_DATA);
+                                          LIBXL_PSR_CBM_TYPE_L2_CBM);
+        break;
+    default:
+        printf("Input lvl %d is wrong!", lvl);
+        break;
     }
 
     printf("\n");
 }
 
 static int psr_cat_print_domain_cbm(uint32_t domid, uint32_t socketid,
-                                    bool cdp_enabled)
+                                    bool cdp_enabled, unsigned int lvl)
 {
     int i, nr_domains;
     libxl_dominfo *list;
 
     if (domid != INVALID_DOMID) {
-        psr_cat_print_one_domain_cbm(domid, socketid, cdp_enabled);
+        psr_cat_print_one_domain_cbm(domid, socketid, cdp_enabled, lvl);
         return 0;
     }
 
@@ -380,49 +391,59 @@ static int psr_cat_print_domain_cbm(uint32_t domid, uint32_t socketid,
     }
 
     for (i = 0; i < nr_domains; i++)
-        psr_cat_print_one_domain_cbm(list[i].domid, socketid, cdp_enabled);
+        psr_cat_print_one_domain_cbm(list[i].domid, socketid, cdp_enabled, lvl);
     libxl_dominfo_list_free(list, nr_domains);
 
     return 0;
 }
 
-static int psr_cat_print_socket(uint32_t domid, libxl_psr_cat_info *info)
+static int psr_cat_print_socket(uint32_t domid, libxl_psr_cat_info *info,
+                                unsigned int lvl)
 {
     int rc;
     uint32_t l3_cache_size;
 
-    rc = libxl_psr_cmt_get_l3_cache_size(ctx, info->id, &l3_cache_size);
-    if (rc) {
-        fprintf(stderr, "Failed to get l3 cache size for socket:%d\n",
-                info->id);
-        return -1;
+    printf("%-16s: %u\n", "Socket ID", info->id);
+
+    /* So far, CMT only supports L3 cache. */
+    if (lvl == 3) {
+        rc = libxl_psr_cmt_get_l3_cache_size(ctx, info->id, &l3_cache_size);
+        if (rc) {
+            fprintf(stderr, "Failed to get l3 cache size for socket:%d\n",
+                    info->id);
+            return -1;
+        }
+        printf("%-16s: %uKB\n", "L3 Cache", l3_cache_size);
     }
 
-    printf("%-16s: %u\n", "Socket ID", info->id);
-    printf("%-16s: %uKB\n", "L3 Cache", l3_cache_size);
     printf("%-16s: %#llx\n", "Default CBM", (1ull << info->cbm_len) - 1);
     if (info->cdp_enabled)
         printf("%5s%25s%16s%16s\n", "ID", "NAME", "CBM (code)", "CBM (data)");
     else
         printf("%5s%25s%16s\n", "ID", "NAME", "CBM");
 
-    return psr_cat_print_domain_cbm(domid, info->id, info->cdp_enabled);
+    return psr_cat_print_domain_cbm(domid, info->id, info->cdp_enabled, lvl);
 }
 
-static int psr_cat_show(uint32_t domid)
+static int psr_cat_show(uint32_t domid, unsigned int lvl)
 {
     int i, nr;
     int rc;
     libxl_psr_cat_info *info;
 
-    rc = libxl_psr_cat_get_info(ctx, &info, &nr, 3);
+    if (lvl != 2 && lvl != 3) {
+        fprintf(stderr, "Input lvl %d is wrong\n", lvl);
+        return EXIT_FAILURE;
+    }
+
+    rc = libxl_psr_cat_get_info(ctx, &info, &nr, lvl);
     if (rc) {
-        fprintf(stderr, "Failed to get cat info\n");
+        fprintf(stderr, "Failed to get %s cat info\n", (lvl == 3)?"L3":"L2");
         return rc;
     }
 
     for (i = 0; i < nr; i++) {
-        rc = psr_cat_print_socket(domid, info + i);
+        rc = psr_cat_print_socket(domid, info + i, lvl);
         if (rc)
             goto out;
     }
@@ -533,11 +554,19 @@ int main_psr_cat_cbm_set(int argc, char **argv)
 
 int main_psr_cat_show(int argc, char **argv)
 {
-    int opt;
+    int opt = 0;
     uint32_t domid;
+    unsigned int lvl = 3;
 
-    SWITCH_FOREACH_OPT(opt, "", NULL, "psr-cat-show", 0) {
-        /* No options */
+    static struct option opts[] = {
+        {"level", 1, 0, 'l'},
+        COMMON_LONG_OPTS
+    };
+
+    SWITCH_FOREACH_OPT(opt, "l:", opts, "psr-cat-show", 0) {
+    case 'l':
+        lvl = atoi(optarg);
+        break;
     }
 
     if (optind >= argc)
@@ -549,7 +578,7 @@ int main_psr_cat_show(int argc, char **argv)
         return 2;
     }
 
-    return psr_cat_show(domid);
+    return psr_cat_show(domid, lvl);
 }
 
 int main_psr_hwinfo(int argc, char **argv)
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 24/25] tools: L2 CAT: support set cbm for L2 CAT.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (22 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 23/25] tools: L2 CAT: support show cbm " Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  2017-04-01 13:53 ` [PATCH v10 25/25] docs: add L2 CAT description in docs Yi Sun
  24 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

This patch implements the xl/xc changes to support set CBM
for L2 CAT.

The new level option is introduced to original CAT setting
command in order to set CBM for specified level CAT.
- 'xl psr-cat-set' is updated to set cache capacity bitmasks(CBM)
  for a domain according to input cache level.

root@:~$ xl psr-cat-set -l2 1 0x7f

root@:~$ xl psr-cat-show -l2 1
Socket ID       : 0
Default CBM     : 0xff
   ID                     NAME             CBM
    1                 ubuntu14            0x7f

Signed-off-by: He Chen <he.chen@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v10:
    - fix comments.
      (suggested by Wei Liu)
v9:
    - handle the case to set both CODE and DATA for CDP at same time.
      For such case, user does not input '-c' or '-d' to set CDP cbm.
    - move xl_cmdimpl.c changes into xl/xl_psr.c.
    - move xl_cmdtable.c changes into xl/xl_cmdtable.c.
v6:
    - rename 'psr-cat-cbm-set' to 'psr-cat-set'.
      (suggested by Kevin Tian)
    - return 'EXIT_FAILURE' for error case.
      (suggested by Dario Faggioli)
    - print error info when input level is wrong.
v4:
    - create this patch because of codes architecture change.
---
 tools/libxc/xc_psr.c    |  3 +++
 tools/libxl/libxl_psr.c | 40 ++++++++++++++++++++++++++++++++++++----
 tools/xl/xl_cmdtable.c  |  3 ++-
 tools/xl/xl_psr.c       | 33 +++++++++++++++++++++++----------
 4 files changed, 64 insertions(+), 15 deletions(-)

diff --git a/tools/libxc/xc_psr.c b/tools/libxc/xc_psr.c
index 04f5927..039b920 100644
--- a/tools/libxc/xc_psr.c
+++ b/tools/libxc/xc_psr.c
@@ -266,6 +266,9 @@ int xc_psr_cat_set_domain_data(xc_interface *xch, uint32_t domid,
     case XC_PSR_CAT_L3_CBM_DATA:
         cmd = XEN_DOMCTL_PSR_CAT_OP_SET_L3_DATA;
         break;
+    case XC_PSR_CAT_L2_CBM:
+        cmd = XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM;
+        break;
     default:
         errno = EINVAL;
         return -1;
diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
index f55ba1e..3598d84 100644
--- a/tools/libxl/libxl_psr.c
+++ b/tools/libxl/libxl_psr.c
@@ -317,6 +317,7 @@ int libxl_psr_cat_set_cbm(libxl_ctx *ctx, uint32_t domid,
     GC_INIT(ctx);
     int rc;
     int socketid, nr_sockets;
+    libxl_psr_cat_info cat_info;
 
     rc = libxl__count_physical_sockets(gc, &nr_sockets);
     if (rc) {
@@ -331,10 +332,41 @@ int libxl_psr_cat_set_cbm(libxl_ctx *ctx, uint32_t domid,
             break;
 
         xc_type = libxl__psr_cbm_type_to_libxc_psr_cat_type(type);
-        if (xc_psr_cat_set_domain_data(ctx->xch, domid, xc_type,
-                                       socketid, cbm)) {
-            libxl__psr_cat_log_err_msg(gc, errno);
-            rc = ERROR_FAIL;
+
+        if (xc_type == XC_PSR_CAT_L3_CBM) {
+            if (xc_psr_cat_get_info(ctx->xch, socketid, 3, &cat_info.cos_max,
+                                    &cat_info.cbm_len, &cat_info.cdp_enabled)) {
+                libxl__psr_cat_log_err_msg(gc, errno);
+                rc = ERROR_FAIL;
+                goto out;
+            }
+        }
+
+        /*
+         * If cdp_enabled is true and type is XC_PSR_CAT_L3_CBM,  we need set
+         * both CODE and DATA.
+         */
+        if (xc_type == XC_PSR_CAT_L3_CBM && cat_info.cdp_enabled) {
+            xc_type = XC_PSR_CAT_L3_CBM_CODE;
+            if (xc_psr_cat_set_domain_data(ctx->xch, domid, xc_type,
+                                           socketid, cbm)) {
+                libxl__psr_cat_log_err_msg(gc, errno);
+                rc = ERROR_FAIL;
+            }
+
+            xc_type = XC_PSR_CAT_L3_CBM_DATA;
+            if (rc != ERROR_FAIL &&
+                xc_psr_cat_set_domain_data(ctx->xch, domid, xc_type,
+                                           socketid, cbm)) {
+                libxl__psr_cat_log_err_msg(gc, errno);
+                rc = ERROR_FAIL;
+            }
+        } else {
+            if (xc_psr_cat_set_domain_data(ctx->xch, domid, xc_type,
+                                           socketid, cbm)) {
+                libxl__psr_cat_log_err_msg(gc, errno);
+                rc = ERROR_FAIL;
+            }
         }
     }
 
diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
index ab7ad60..d332e1a 100644
--- a/tools/xl/xl_cmdtable.c
+++ b/tools/xl/xl_cmdtable.c
@@ -545,11 +545,12 @@ struct cmd_spec cmd_table[] = {
     },
 #endif
 #ifdef LIBXL_HAVE_PSR_CAT
-    { "psr-cat-cbm-set",
+    { "psr-cat-set",
       &main_psr_cat_cbm_set, 0, 1,
       "Set cache capacity bitmasks(CBM) for a domain",
       "[options] <Domain> <CBM>",
       "-s <socket>       Specify the socket to process, otherwise all sockets are processed\n"
+      "-l <level>        Specify the cache level to process, otherwise L3 cache is processed\n"
       "-c                Set code CBM if CDP is supported\n"
       "-d                Set data CBM if CDP is supported\n"
     },
diff --git a/tools/xl/xl_psr.c b/tools/xl/xl_psr.c
index 575f4a0..7309d4f 100644
--- a/tools/xl/xl_psr.c
+++ b/tools/xl/xl_psr.c
@@ -490,19 +490,21 @@ int main_psr_cat_cbm_set(int argc, char **argv)
     char *value;
     libxl_string_list socket_list;
     unsigned long start, end;
-    int i, j, len;
+    unsigned int i, j, len;
+    unsigned int lvl = 3;
 
     static struct option opts[] = {
         {"socket", 1, 0, 's'},
         {"data", 0, 0, 'd'},
         {"code", 0, 0, 'c'},
+        {"level", 1, 0, 'l'},
         COMMON_LONG_OPTS
     };
 
     libxl_socket_bitmap_alloc(ctx, &target_map, 0);
     libxl_bitmap_set_none(&target_map);
 
-    SWITCH_FOREACH_OPT(opt, "s:cd", opts, "psr-cat-cbm-set", 2) {
+    SWITCH_FOREACH_OPT(opt, "s:l:cd", opts, "psr-cat-set", 2) {
     case 's':
         trim(isspace, optarg, &value);
         split_string_into_string_list(value, ",", &socket_list);
@@ -522,24 +524,35 @@ int main_psr_cat_cbm_set(int argc, char **argv)
     case 'c':
         opt_code = 1;
         break;
+    case 'l':
+        lvl = atoi(optarg);
+        break;
     }
 
-    if (opt_data && opt_code) {
-        fprintf(stderr, "Cannot handle -c and -d at the same time\n");
-        return -1;
-    } else if (opt_data) {
-        type = LIBXL_PSR_CBM_TYPE_L3_CBM_DATA;
-    } else if (opt_code) {
-        type = LIBXL_PSR_CBM_TYPE_L3_CBM_CODE;
+    if (lvl == 2)
+        type = LIBXL_PSR_CBM_TYPE_L2_CBM;
+    else if (lvl == 3) {
+        if (opt_data && opt_code) {
+            fprintf(stderr, "Cannot handle -c and -d at the same time\n");
+            return EXIT_FAILURE;
+        } else if (opt_data) {
+            type = LIBXL_PSR_CBM_TYPE_L3_CBM_DATA;
+        } else if (opt_code) {
+            type = LIBXL_PSR_CBM_TYPE_L3_CBM_CODE;
+        } else {
+            type = LIBXL_PSR_CBM_TYPE_L3_CBM;
+        }
     } else {
         type = LIBXL_PSR_CBM_TYPE_L3_CBM;
+        fprintf(stderr, "Input lvl %d is wrong\n", lvl);
+        return EXIT_FAILURE;
     }
 
     if (libxl_bitmap_is_empty(&target_map))
         libxl_bitmap_set_any(&target_map);
 
     if (argc != optind + 2) {
-        help("psr-cat-cbm-set");
+        help("psr-cat-set");
         return 2;
     }
 
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v10 25/25] docs: add L2 CAT description in docs.
  2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
                   ` (23 preceding siblings ...)
  2017-04-01 13:53 ` [PATCH v10 24/25] tools: L2 CAT: support set " Yi Sun
@ 2017-04-01 13:53 ` Yi Sun
  24 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-01 13:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, andrew.cooper3, dario.faggioli, he.chen,
	ian.jackson, Yi Sun, mengxu, jbeulich, chao.p.peng, roger.pau

This patch adds L2 CAT description in related documents.

Signed-off-by: He Chen <he.chen@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 docs/man/xl.pod.1.in      | 25 ++++++++++++++++++++++---
 docs/misc/xl-psr.markdown | 18 ++++++++++++------
 2 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/docs/man/xl.pod.1.in b/docs/man/xl.pod.1.in
index 7caed08..5e7676e 100644
--- a/docs/man/xl.pod.1.in
+++ b/docs/man/xl.pod.1.in
@@ -1711,6 +1711,9 @@ occupancy monitoring share the same set of underlying monitoring service. Once
 a domain is attached to the monitoring service, monitoring data can be shown
 for any of these monitoring types.
 
+There is no cache monitoring and memory bandwidth monitoring on L2 cache so
+far.
+
 =over 4
 
 =item B<psr-cmt-attach> [I<domain-id>]
@@ -1735,7 +1738,7 @@ monitor types are:
 
 Intel Broadwell and later server platforms offer capabilities to configure and
 make use of the Cache Allocation Technology (CAT) mechanisms, which enable more
-cache resources (i.e. L3 cache) to be made available for high priority
+cache resources (i.e. L3/L2 cache) to be made available for high priority
 applications. In the Xen implementation, CAT is used to control cache allocation
 on VM basis. To enforce cache on a specific domain, just set capacity bitmasks
 (CBM) for the domain.
@@ -1745,7 +1748,7 @@ Intel Broadwell and later server platforms also offer Code/Data Prioritization
 applications. CDP is used on a per VM basis in the Xen implementation. To
 specify code or data CBM for the domain, CDP feature must be enabled and CBM
 type options need to be specified when setting CBM, and the type options (code
-and data) are mutually exclusive.
+and data) are mutually exclusive. There is no CDP support on L2 so far.
 
 =over 4
 
@@ -1762,6 +1765,11 @@ B<OPTIONS>
 
 Specify the socket to process, otherwise all sockets are processed.
 
+=item B<-l LEVEL>, B<--level=LEVEL>
+
+Specify the cache level to process, otherwise the last level cache (L3) is
+processed.
+
 =item B<-c>, B<--code>
 
 Set code CBM when CDP is enabled.
@@ -1772,10 +1780,21 @@ Set data CBM when CDP is enabled.
 
 =back
 
-=item B<psr-cat-show> [I<domain-id>]
+=item B<psr-cat-show> [I<OPTIONS>] [I<domain-id>]
 
 Show CAT settings for a certain domain or all domains.
 
+B<OPTIONS>
+
+=over 4
+
+=item B<-l LEVEL>, B<--level=LEVEL>
+
+Specify the cache level to process, otherwise the last level cache (L3) is
+processed.
+
+=back
+
 =back
 
 =head1 IGNORED FOR COMPATIBILITY WITH XM
diff --git a/docs/misc/xl-psr.markdown b/docs/misc/xl-psr.markdown
index c3c1e8e..04dd957 100644
--- a/docs/misc/xl-psr.markdown
+++ b/docs/misc/xl-psr.markdown
@@ -70,7 +70,7 @@ total-mem-bandwidth instead of cache-occupancy). E.g. after a `xl psr-cmt-attach
 
 Cache Allocation Technology (CAT) is a new feature available on Intel
 Broadwell and later server platforms that allows an OS or Hypervisor/VMM to
-partition cache allocation (i.e. L3 cache) based on application priority or
+partition cache allocation (i.e. L3/L2 cache) based on application priority or
 Class of Service (COS). Each COS is configured using capacity bitmasks (CBM)
 which represent cache capacity and indicate the degree of overlap and
 isolation between classes. System cache resource is divided into numbers of
@@ -107,7 +107,7 @@ System CAT information such as maximum COS and CBM length can be obtained by:
 
 The simplest way to change a domain's CBM from its default is running:
 
-`xl psr-cat-cbm-set  [OPTIONS] <domid> <cbm>`
+`xl psr-cat-set  [OPTIONS] <domid> <cbm>`
 
 where cbm is a number to represent the corresponding cache subset can be used.
 A cbm is valid only when:
@@ -119,13 +119,19 @@ A cbm is valid only when:
 In a multi-socket system, the same cbm will be set on each socket by default.
 Per socket cbm can be specified with the `--socket SOCKET` option.
 
+In different systems, the different cache level is supported, e.g. L3 cache or
+L2 cache. Per cache level cbm can be specified with the `--level LEVEL` option.
+
 Setting the CBM may not be successful if insufficient COS is available. In
 such case unused COS(es) may be freed by setting CBM of all related domains to
 its default value(all-ones).
 
 Per domain CBM settings can be shown by:
 
-`xl psr-cat-show`
+`xl psr-cat-show [OPTIONS] <domid>`
+
+In different systems, the different cache level is supported, e.g. L3 cache or
+L2 cache. Per cache level cbm can be specified with the `--level LEVEL` option.
 
 ## Code and Data Prioritization (CDP)
 
@@ -172,13 +178,13 @@ options is invalid.
 Example:
 
 Setting code CBM for a domain:
-`xl psr-cat-cbm-set -c <domid> <cbm>`
+`xl psr-cat-set -c <domid> <cbm>`
 
 Setting data CBM for a domain:
-`xl psr-cat-cbm-set -d <domid> <cbm>`
+`xl psr-cat-set -d <domid> <cbm>`
 
 Setting the same code and data CBM for a domain:
-`xl psr-cat-cbm-set <domid> <cbm>`
+`xl psr-cat-set <domid> <cbm>`
 
 ## Reference
 
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 03/25] x86: refactor psr: implement main data structures.
  2017-04-01 13:53 ` [PATCH v10 03/25] x86: refactor psr: implement main data structures Yi Sun
@ 2017-04-03 15:50   ` Jan Beulich
  2017-04-05  3:12     ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-03 15:50 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:

I was about to ack this, but there are a few more things which bother
me.

> ---
> v10:
>     - remove initialization for 'PSR_SOCKET_L3_CAT'.
>       (suggested by Jan Beulich)
>     - rename 'feat_ops' to 'feat_props'.
>       (suggested by Jan Beulich)
>     - move 'cbm_len' to 'feat_props' because it is feature independent so far.
>       (suggested by Jan Beulich)
>     - move 'cos_max' to 'feat_props' because it is feature independent.
>       (suggested by Jan Beulich)
>     - move 'cos_num' to 'feat_props' because it is feature independent.
>       (suggested by Jan Beulich)
>     - remove union 'info' and struct 'psr_cat_hw_info'.
>     - remove 'get_cos_max' from 'feat_props'.
>       (suggested by Jan Beulich)
>     - remove 'feat_mask' from 'psr_socket_info' because we can use 'features[]'
>       to check if any feature is initialized.
>       (suggested by Jan Beulich)
>     - move 'ref_lock' above 'cos_ref'.
>       (suggested by Jan Beulich)

The movement done is fine for the moment, but it would have been
even better if you had moved it ahead of the other array too.

> +enum psr_feat_type {
> +    PSR_SOCKET_L3_CAT,
> +    PSR_SOCKET_L3_CDP,
> +    PSR_SOCKET_L2_CAT,
> +    PSR_SOCKET_MAX_FEAT,
> +};

It's not really logical to have the first three here - they should have
been added by their respective patches. Which gets me back to
the question of the usefulness of a patch like this one: Without the
following patches, everything being added here is unused. Iirc the
original version of this patch was quite a bit larger, better justifying
to break out all of this. The size it has shrunk to by now is a pretty
good indication that it should have been folded altogether.

Also I think we've had the discussion about the difference between
"max" and "num" already: The former normally indicates an inclusive
upper bound, while the latter would usually be an exclusive one.
Clearly the above naming doesn't match up with this.

> +/*
> + * This structure represents one feature.
> + * feat_props  - Feature properties, including operation callback functions
> +                 and feature common values.
> + * cos_reg_val - Array to store the values of COS registers. One entry stores
> + *               the value of one COS register.
> + *               For L3 CAT and L2 CAT, one entry corresponds to one COS_ID.
> + *               For CDP, two entries correspond to one COS_ID. E.g.
> + *               COS_ID=0 corresponds to cos_reg_val[0] (Data) and
> + *               cos_reg_val[1] (Code).
> + */
> +struct feat_node {
> +    /*
> +     * This structure defines feature operation callback functions. Every
> +     * feature enabled MUST implement such callback functions and register
> +     * them to props.
> +     *
> +     * Feature specific behaviors will be encapsulated into these callback
> +     * functions. Then, the main flows will not be changed when introducing
> +     * a new feature.
> +     *
> +     * Feature independent HW info and common values are also defined in it.
> +     */
> +    const struct feat_props {
> +        /*
> +         * cos_num, cos_max and cbm_len are common values for all features
> +         * so far.
> +         * cos_num - COS registers number that feature uses for one COS ID.
> +         *           It is defined in SDM.
> +         * cos_max - The max COS registers number got through CPUID.
> +         * cbm_len - The length of CBM got through CPUID.
> +         */
> +        unsigned int cos_num;
> +        unsigned int cos_max;
> +        unsigned int cbm_len;
> +    } *props;

Let's think the data arrangement changes done so far a little further:
Why do we need this pointer per-node (i.e. once per socket)? Now
that we have a well established order of features used to index
struct psr_socket_info's features[], why can't the same indexing be
used to obtain the props pointer from a static (single instance) array
of props pointers?

Otoh I'm not sure you really meant to move all three data members
into there, if you still have reason to believe that different sockets
may have different values for cos_max and/or cbm_len. I.e. these
might better be members of struct feat_node (just not placed in a
union, as you had them in v9).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 03/25] x86: refactor psr: implement main data structures.
  2017-04-03 15:50   ` Jan Beulich
@ 2017-04-05  3:12     ` Yi Sun
  2017-04-05  8:20       ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-05  3:12 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-03 09:50:34, Jan Beulich wrote:
> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> 
> I was about to ack this, but there are a few more things which bother
> me.
> 
Do you mean ack to this patch 3 or whole patch set? Thanks!

> > ---
> > v10:
> >     - remove initialization for 'PSR_SOCKET_L3_CAT'.
> >       (suggested by Jan Beulich)
> >     - rename 'feat_ops' to 'feat_props'.
> >       (suggested by Jan Beulich)
> >     - move 'cbm_len' to 'feat_props' because it is feature independent so far.
> >       (suggested by Jan Beulich)
> >     - move 'cos_max' to 'feat_props' because it is feature independent.
> >       (suggested by Jan Beulich)
> >     - move 'cos_num' to 'feat_props' because it is feature independent.
> >       (suggested by Jan Beulich)
> >     - remove union 'info' and struct 'psr_cat_hw_info'.
> >     - remove 'get_cos_max' from 'feat_props'.
> >       (suggested by Jan Beulich)
> >     - remove 'feat_mask' from 'psr_socket_info' because we can use 'features[]'
> >       to check if any feature is initialized.
> >       (suggested by Jan Beulich)
> >     - move 'ref_lock' above 'cos_ref'.
> >       (suggested by Jan Beulich)
> 
> The movement done is fine for the moment, but it would have been
> even better if you had moved it ahead of the other array too.
> 
Got it.

> > +enum psr_feat_type {
> > +    PSR_SOCKET_L3_CAT,
> > +    PSR_SOCKET_L3_CDP,
> > +    PSR_SOCKET_L2_CAT,
> > +    PSR_SOCKET_MAX_FEAT,
> > +};
> 
> It's not really logical to have the first three here - they should have
> been added by their respective patches. Which gets me back to
> the question of the usefulness of a patch like this one: Without the
> following patches, everything being added here is unused. Iirc the
> original version of this patch was quite a bit larger, better justifying
> to break out all of this. The size it has shrunk to by now is a pretty
> good indication that it should have been folded altogether.
> 
Ok, I will add item one by one in related feature's patch. But can I keep
this patch 3? This patch outlines the infrastructure and I demonstrated how
I analyze the data structures in commit message. If I split these data
structures into pieces and implement them into different patches, I am
afraid to lose the completeness.

> Also I think we've had the discussion about the difference between
> "max" and "num" already: The former normally indicates an inclusive
> upper bound, while the latter would usually be an exclusive one.
> Clearly the above naming doesn't match up with this.
> 
Ok, may change it to 'PSR_SOCKET_FEAT_NUM'.

> > +/*
> > + * This structure represents one feature.
> > + * feat_props  - Feature properties, including operation callback functions
> > +                 and feature common values.
> > + * cos_reg_val - Array to store the values of COS registers. One entry stores
> > + *               the value of one COS register.
> > + *               For L3 CAT and L2 CAT, one entry corresponds to one COS_ID.
> > + *               For CDP, two entries correspond to one COS_ID. E.g.
> > + *               COS_ID=0 corresponds to cos_reg_val[0] (Data) and
> > + *               cos_reg_val[1] (Code).
> > + */
> > +struct feat_node {
> > +    /*
> > +     * This structure defines feature operation callback functions. Every
> > +     * feature enabled MUST implement such callback functions and register
> > +     * them to props.
> > +     *
> > +     * Feature specific behaviors will be encapsulated into these callback
> > +     * functions. Then, the main flows will not be changed when introducing
> > +     * a new feature.
> > +     *
> > +     * Feature independent HW info and common values are also defined in it.
> > +     */
> > +    const struct feat_props {
> > +        /*
> > +         * cos_num, cos_max and cbm_len are common values for all features
> > +         * so far.
> > +         * cos_num - COS registers number that feature uses for one COS ID.
> > +         *           It is defined in SDM.
> > +         * cos_max - The max COS registers number got through CPUID.
> > +         * cbm_len - The length of CBM got through CPUID.
> > +         */
> > +        unsigned int cos_num;
> > +        unsigned int cos_max;
> > +        unsigned int cbm_len;
> > +    } *props;
> 
> Let's think the data arrangement changes done so far a little further:
> Why do we need this pointer per-node (i.e. once per socket)? Now
> that we have a well established order of features used to index
> struct psr_socket_info's features[], why can't the same indexing be
> used to obtain the props pointer from a static (single instance) array
> of props pointers?
> 
Hmm, yes, we can declare a static standalone array of props pointers for all
features and all sockets. It does not belong to 'feat_node' or 'socket_info'.

> Otoh I'm not sure you really meant to move all three data members
> into there, if you still have reason to believe that different sockets
> may have different values for cos_max and/or cbm_len. I.e. these
> might better be members of struct feat_node (just not placed in a
> union, as you had them in v9).
> 
We still believe there may be chances that different sockets may have different
configurations. So, I would prefer to keep cos_max/cbm_len in 'feat_node'.

Because this Friday is the code freeze date, can I quickly make the changes
according to above comments and submit a new version if you do not have
further oppinion? Thanks!

BRs,
Sun Yi

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 03/25] x86: refactor psr: implement main data structures.
  2017-04-05  3:12     ` Yi Sun
@ 2017-04-05  8:20       ` Jan Beulich
  2017-04-05  8:45         ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-05  8:20 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 05.04.17 at 05:12, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-03 09:50:34, Jan Beulich wrote:
>> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
>> 
>> I was about to ack this, but there are a few more things which bother
>> me.
>> 
> Do you mean ack to this patch 3 or whole patch set? Thanks!

Well, this one patch only - I didn't even get to look at the others yet.

>> > +enum psr_feat_type {
>> > +    PSR_SOCKET_L3_CAT,
>> > +    PSR_SOCKET_L3_CDP,
>> > +    PSR_SOCKET_L2_CAT,
>> > +    PSR_SOCKET_MAX_FEAT,
>> > +};
>> 
>> It's not really logical to have the first three here - they should have
>> been added by their respective patches. Which gets me back to
>> the question of the usefulness of a patch like this one: Without the
>> following patches, everything being added here is unused. Iirc the
>> original version of this patch was quite a bit larger, better justifying
>> to break out all of this. The size it has shrunk to by now is a pretty
>> good indication that it should have been folded altogether.
>> 
> Ok, I will add item one by one in related feature's patch. But can I keep
> this patch 3? This patch outlines the infrastructure and I demonstrated how
> I analyze the data structures in commit message. If I split these data
> structures into pieces and implement them into different patches, I am
> afraid to lose the completeness.

Well, in the interest of not needlessly delaying forward progress
I'm fine with you keeping this patch for now. Should the series
miss 4.9, though, I'd prefer if you eliminated it.

>> > +/*
>> > + * This structure represents one feature.
>> > + * feat_props  - Feature properties, including operation callback functions
>> > +                 and feature common values.
>> > + * cos_reg_val - Array to store the values of COS registers. One entry stores
>> > + *               the value of one COS register.
>> > + *               For L3 CAT and L2 CAT, one entry corresponds to one COS_ID.
>> > + *               For CDP, two entries correspond to one COS_ID. E.g.
>> > + *               COS_ID=0 corresponds to cos_reg_val[0] (Data) and
>> > + *               cos_reg_val[1] (Code).
>> > + */
>> > +struct feat_node {
>> > +    /*
>> > +     * This structure defines feature operation callback functions. Every
>> > +     * feature enabled MUST implement such callback functions and register
>> > +     * them to props.
>> > +     *
>> > +     * Feature specific behaviors will be encapsulated into these callback
>> > +     * functions. Then, the main flows will not be changed when introducing
>> > +     * a new feature.
>> > +     *
>> > +     * Feature independent HW info and common values are also defined in it.
>> > +     */
>> > +    const struct feat_props {
>> > +        /*
>> > +         * cos_num, cos_max and cbm_len are common values for all features
>> > +         * so far.
>> > +         * cos_num - COS registers number that feature uses for one COS ID.
>> > +         *           It is defined in SDM.
>> > +         * cos_max - The max COS registers number got through CPUID.
>> > +         * cbm_len - The length of CBM got through CPUID.
>> > +         */
>> > +        unsigned int cos_num;
>> > +        unsigned int cos_max;
>> > +        unsigned int cbm_len;
>> > +    } *props;
>> 
>> Let's think the data arrangement changes done so far a little further:
>> Why do we need this pointer per-node (i.e. once per socket)? Now
>> that we have a well established order of features used to index
>> struct psr_socket_info's features[], why can't the same indexing be
>> used to obtain the props pointer from a static (single instance) array
>> of props pointers?
>> 
> Hmm, yes, we can declare a static standalone array of props pointers for all
> features and all sockets. It does not belong to 'feat_node' or 
> 'socket_info'.
> 
>> Otoh I'm not sure you really meant to move all three data members
>> into there, if you still have reason to believe that different sockets
>> may have different values for cos_max and/or cbm_len. I.e. these
>> might better be members of struct feat_node (just not placed in a
>> union, as you had them in v9).
>> 
> We still believe there may be chances that different sockets may have different
> configurations. So, I would prefer to keep cos_max/cbm_len in 'feat_node'.

This is contradictory: The two fields aren't in struct feat_node in
the current version of the patch, so do you mean "keep in struct
feat_props" or "move back to struct feat_node"?

> Because this Friday is the code freeze date, can I quickly make the changes
> according to above comments and submit a new version if you do not have
> further oppinion? Thanks!

As said above, I didn't get to look at the other patches yet. It's
up to you whether to resubmit with the adjustments done here
(and carried through the rest of the series), or whether you wait.
As it's Wednesday already, don't have too much hope for the
series to make 4.9 - I'm sorry for this, but I also can't do much
about it.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 03/25] x86: refactor psr: implement main data structures.
  2017-04-05  8:20       ` Jan Beulich
@ 2017-04-05  8:45         ` Yi Sun
  0 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-05  8:45 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, chao.p.peng, xen-devel, roger.pau

On 17-04-05 02:20:53, Jan Beulich wrote:
> >>> On 05.04.17 at 05:12, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-03 09:50:34, Jan Beulich wrote:
> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> >> > +enum psr_feat_type {
> >> > +    PSR_SOCKET_L3_CAT,
> >> > +    PSR_SOCKET_L3_CDP,
> >> > +    PSR_SOCKET_L2_CAT,
> >> > +    PSR_SOCKET_MAX_FEAT,
> >> > +};
> >> 
> >> It's not really logical to have the first three here - they should have
> >> been added by their respective patches. Which gets me back to
> >> the question of the usefulness of a patch like this one: Without the
> >> following patches, everything being added here is unused. Iirc the
> >> original version of this patch was quite a bit larger, better justifying
> >> to break out all of this. The size it has shrunk to by now is a pretty
> >> good indication that it should have been folded altogether.
> >> 
> > Ok, I will add item one by one in related feature's patch. But can I keep
> > this patch 3? This patch outlines the infrastructure and I demonstrated how
> > I analyze the data structures in commit message. If I split these data
> > structures into pieces and implement them into different patches, I am
> > afraid to lose the completeness.
> 
> Well, in the interest of not needlessly delaying forward progress
> I'm fine with you keeping this patch for now. Should the series
> miss 4.9, though, I'd prefer if you eliminated it.
> 
As other patches are still not reviewed yet, I am afraid the 4.9 will be missed.
Then, I will consider to eliminate this patch 3.

> >> > +/*
> >> > + * This structure represents one feature.
> >> > + * feat_props  - Feature properties, including operation callback functions
> >> > +                 and feature common values.
> >> > + * cos_reg_val - Array to store the values of COS registers. One entry stores
> >> > + *               the value of one COS register.
> >> > + *               For L3 CAT and L2 CAT, one entry corresponds to one COS_ID.
> >> > + *               For CDP, two entries correspond to one COS_ID. E.g.
> >> > + *               COS_ID=0 corresponds to cos_reg_val[0] (Data) and
> >> > + *               cos_reg_val[1] (Code).
> >> > + */
> >> > +struct feat_node {
> >> > +    /*
> >> > +     * This structure defines feature operation callback functions. Every
> >> > +     * feature enabled MUST implement such callback functions and register
> >> > +     * them to props.
> >> > +     *
> >> > +     * Feature specific behaviors will be encapsulated into these callback
> >> > +     * functions. Then, the main flows will not be changed when introducing
> >> > +     * a new feature.
> >> > +     *
> >> > +     * Feature independent HW info and common values are also defined in it.
> >> > +     */
> >> > +    const struct feat_props {
> >> > +        /*
> >> > +         * cos_num, cos_max and cbm_len are common values for all features
> >> > +         * so far.
> >> > +         * cos_num - COS registers number that feature uses for one COS ID.
> >> > +         *           It is defined in SDM.
> >> > +         * cos_max - The max COS registers number got through CPUID.
> >> > +         * cbm_len - The length of CBM got through CPUID.
> >> > +         */
> >> > +        unsigned int cos_num;
> >> > +        unsigned int cos_max;
> >> > +        unsigned int cbm_len;
> >> > +    } *props;
> >> 
> >> Let's think the data arrangement changes done so far a little further:
> >> Why do we need this pointer per-node (i.e. once per socket)? Now
> >> that we have a well established order of features used to index
> >> struct psr_socket_info's features[], why can't the same indexing be
> >> used to obtain the props pointer from a static (single instance) array
> >> of props pointers?
> >> 
> > Hmm, yes, we can declare a static standalone array of props pointers for all
> > features and all sockets. It does not belong to 'feat_node' or 
> > 'socket_info'.
> > 
> >> Otoh I'm not sure you really meant to move all three data members
> >> into there, if you still have reason to believe that different sockets
> >> may have different values for cos_max and/or cbm_len. I.e. these
> >> might better be members of struct feat_node (just not placed in a
> >> union, as you had them in v9).
> >> 
> > We still believe there may be chances that different sockets may have different
> > configurations. So, I would prefer to keep cos_max/cbm_len in 'feat_node'.
> 
> This is contradictory: The two fields aren't in struct feat_node in
> the current version of the patch, so do you mean "keep in struct
> feat_props" or "move back to struct feat_node"?
> 
"move back to struct feat_node".

> > Because this Friday is the code freeze date, can I quickly make the changes
> > according to above comments and submit a new version if you do not have
> > further oppinion? Thanks!
> 
> As said above, I didn't get to look at the other patches yet. It's
> up to you whether to resubmit with the adjustments done here
> (and carried through the rest of the series), or whether you wait.
> As it's Wednesday already, don't have too much hope for the
> series to make 4.9 - I'm sorry for this, but I also can't do much
> about it.
> 
Per current status, I will wait for all your review comments. Thanks!

BRs,
Sun Yi

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow.
  2017-04-01 13:53 ` [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow Yi Sun
@ 2017-04-05 15:10   ` Jan Beulich
  2017-04-06  5:49     ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-05 15:10 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> @@ -76,7 +79,7 @@ struct feat_node {
>       *
>       * Feature independent HW info and common values are also defined in it.
>       */
> -    const struct feat_props {
> +    struct feat_props {

As said before, the const here should stay. The init-time writing
to the structure can be done without going through this pointer.

> +static void free_socket_resources(unsigned int socket)
> +{
> +    unsigned int i;
> +    struct psr_socket_info *info = socket_info + socket;
> +
> +    if ( !info )
> +        return;
> +
> +    /*
> +     * Free resources of features. The global feature object, e.g. feat_l3_cat,
> +     * may not be freed here if it is not added into array. It is simply being
> +     * kept until the next CPU online attempt.
> +     */
> +    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> +    {
> +        if ( !info->features[i] )
> +            continue;
> +
> +        xfree(info->features[i]);
> +        info->features[i] = NULL;

There's no need for the if() here. And I'm sure this was pointed out
already (perhaps in a different context).

> +static bool feat_init_done(const struct psr_socket_info *info)
> +{
> +    unsigned int i;
> +
> +    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> +    {
> +        if ( !info->features[i] )
> +            continue;
> +
> +        return true;
> +    }
> +
> +    return false;
> +}

At the first glance this is a very strange function.

> +/* CAT common functions implementation. */
> +static void cat_init_feature(const struct cpuid_leaf *regs,
> +                             struct feat_node *feat,
> +                             struct psr_socket_info *info,
> +                             enum psr_feat_type type)
> +{
> +    unsigned int socket, i;
> +
> +    /* No valid value so do not enable feature. */
> +    if ( !regs->a || !regs->d )
> +        return;
> +
> +    feat->props->cbm_len = (regs->a & CAT_CBM_LEN_MASK) + 1;
> +    feat->props->cos_max = min(opt_cos_max, regs->d & CAT_COS_MAX_MASK);
> +
> +    switch ( type )
> +    {
> +    case PSR_SOCKET_L3_CAT:
> +        /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
> +        feat->cos_reg_val[0] = cat_default_val(feat->props->cbm_len);
> +
> +        /*
> +         * To handle cpu offline and then online case, we need restore MSRs to
> +         * default values.
> +         */
> +        for ( i = 1; i <= feat->props->cos_max; i++ )
> +        {
> +            wrmsrl(MSR_IA32_PSR_L3_MASK(i), feat->cos_reg_val[0]);
> +            feat->cos_reg_val[i] = feat->cos_reg_val[0];
> +        }

I continue to have difficulty with this: Why is offline-then-online
any different from first-time-online? Why wouldn't setting the
registers to their intended values not be taken care of by
context switch code, once vCPU-s get scheduled onto the newly
onlined CPU?

> +        break;
> +
> +    default:
> +        return;
> +    }
> +
> +    /* Add this feature into array. */
> +    info->features[type] = feat;
> +
> +    socket = cpu_to_socket(smp_processor_id());

No need for this variable, and definitely no need to do the
assignment ahead of ...

> +    if ( !opt_cpu_info )
> +        return;

... this.

>  static void psr_cpu_init(void)
>  {
> +    struct psr_socket_info *info;
> +    unsigned int socket;
> +    unsigned int cpu = smp_processor_id();
> +    struct feat_node *feat;
> +    struct cpuid_leaf regs;
> +
> +    if ( !psr_alloc_feat_enabled() || !boot_cpu_has(X86_FEATURE_PQE) )
> +        goto assoc_init;
> +
> +    if ( boot_cpu_data.cpuid_level < PSR_CPUID_LEVEL_CAT )
> +    {
> +        setup_clear_cpu_cap(X86_FEATURE_PQE);
> +        goto assoc_init;
> +    }
> +
> +    socket = cpu_to_socket(cpu);
> +    info = socket_info + socket;
> +    if ( feat_init_done(info) )
> +        goto assoc_init;

Hmm, so you bail here if any of the features was already set up.
But you don't bail if none of the features were available as the
reason for the setup not having been done before. I think this
can be solved in a better way once we have the static props
array: You need to do anything here only if the props slot is not
NULL, but the feature slot is NULL.

In any event you intentions are likely easier to understand for
a reader if this single-use function was inlined here.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 06/25] x86: refactor psr: L3 CAT: implement Domain init/free and schedule flows.
  2017-04-01 13:53 ` [PATCH v10 06/25] x86: refactor psr: L3 CAT: implement Domain init/free and schedule flows Yi Sun
@ 2017-04-05 15:23   ` Jan Beulich
  2017-04-06  6:01     ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-05 15:23 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> @@ -376,11 +378,39 @@ void psr_free_rmid(struct domain *d)
>      d->arch.psr_rmid = 0;
>  }
>  
> -static inline void psr_assoc_init(void)
> +static unsigned int get_max_cos_max(const struct psr_socket_info *info)
> +{
> +    const struct feat_node *feat;
> +    unsigned int cos_max = 0, i;
> +
> +    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> +    {
> +        feat = info->features[i];
> +        if ( !feat )
> +            continue;
> +
> +        cos_max = max(feat->props->cos_max, cos_max);
> +    }
> +
> +    return cos_max;
> +}
> +
> +static void psr_assoc_init(void)
>  {
>      struct psr_assoc *psra = &this_cpu(psr_assoc);
>  
> -    if ( psr_cmt_enabled() )
> +    if ( psr_alloc_feat_enabled() )
> +    {
> +        unsigned int socket = cpu_to_socket(smp_processor_id());
> +        const struct psr_socket_info *info = socket_info + socket;
> +        unsigned int cos_max = get_max_cos_max(info);
> +
> +        if ( feat_init_done(info) )

I think the use here is different from the one in the earlier patch:
While there looking at props[] appears to be desirable, I think here
you indeed only want to look at features[]. And btw, I wouldn't
mind a simple flag or counter in info telling whether any (or how
many) features have been enabled, to avoid such iterations. It's
just that the original feature mask was fully redundant with
features[].

> @@ -397,6 +434,11 @@ void psr_ctxt_switch_to(struct domain *d)
>      if ( psr_cmt_enabled() )
>          psr_assoc_rmid(&reg, d->arch.psr_rmid);
>  
> +    if ( psra->cos_mask )
> +        psr_assoc_cos(&reg, d->arch.psr_cos_ids ?
> +                      d->arch.psr_cos_ids[cpu_to_socket(smp_processor_id())] :
> +                      0, psra->cos_mask);

I may have asked this question before, but if so you can see that
the code above continues puzzling me: Under what conditions
would psra->cos_mask be non-zero, but d->arch.psr_cos_ids be
NULL? And why is zero the right value in that case?

Also you need to deal with alignment issues here: Part of an
expression at equal rank should align with one another. This
implies that you want to move the 2nd argument on a new line
(and the 3rd one would then better be moved to its own line
too).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 07/25] x86: refactor psr: L3 CAT: implement get hw info flow.
  2017-04-01 13:53 ` [PATCH v10 07/25] x86: refactor psr: L3 CAT: implement get hw info flow Yi Sun
@ 2017-04-05 15:37   ` Jan Beulich
  2017-04-06  6:05     ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-05 15:37 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -93,6 +93,10 @@ struct feat_node {
>          unsigned int cos_num;
>          unsigned int cos_max;
>          unsigned int cbm_len;
> +
> +        /* get_feat_info is used to get feature HW info. */
> +        bool (*get_feat_info)(const struct feat_node *feat,
> +                              uint32_t data[], unsigned int array_len);

The comment isn't very helpful in its current shape. You really want
to make clear that this is being used to return HW info via sysctl.
Without this (and without seeing the rest of this patch), despite
having seen previous versions my first thought was that this
retrieves data from MSRs at initialization time.

> @@ -183,6 +187,22 @@ static bool feat_init_done(const struct psr_socket_info *info)
>      return false;
>  }
>  
> +static enum psr_feat_type psr_cbm_type_to_feat_type(enum cbm_type type)
> +{
> +    enum psr_feat_type feat_type;
> +
> +    switch ( type )
> +    {
> +    case PSR_CBM_TYPE_L3:
> +        feat_type = PSR_SOCKET_L3_CAT;
> +        break;
> +    default:
> +        ASSERT_UNREACHABLE();
> +    }
> +
> +    return feat_type;

I'm pretty certain this will (validly) produce an uninitialized variable
warning at least in a non-debug build. Not how I did say "add
ASSERT_UNREACHABLE()" in the v9 review.

> +int psr_get_info(unsigned int socket, enum cbm_type type,
> +                 uint32_t data[], unsigned int array_len)
> +{
> +    const struct psr_socket_info *info = get_socket_info(socket);
> +    const struct feat_node *feat;
> +    enum psr_feat_type feat_type;
> +
> +    if ( IS_ERR(info) )
> +        return PTR_ERR(info);
> +
> +    if ( !data )
> +        return -EINVAL;

I think I've asked this before - what does this check guard against?
A bad caller? This could be an ASSERT() then. Returning an error to
the sysctl caller because of some implementation bug seems pretty
odd to me.

> +    feat_type = psr_cbm_type_to_feat_type(type);
> +    if ( feat_type > ARRAY_SIZE(info->features) )
> +        return -ENOENT;
> +
> +    feat = info->features[feat_type];

Please can you be more careful when adding checks like the above
one? You still allow overrunning the array, when feat_type
== ARRAY_SIZE(info->features).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 08/25] x86: refactor psr: L3 CAT: implement get value flow.
  2017-04-01 13:53 ` [PATCH v10 08/25] x86: refactor psr: L3 CAT: implement get value flow Yi Sun
@ 2017-04-05 15:51   ` Jan Beulich
  2017-04-06  6:10     ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-05 15:51 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -1455,25 +1455,37 @@ long arch_do_domctl(
>              break;
>  
>          case XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM:
> -            ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target,
> -                                 &domctl->u.psr_cat_op.data,
> -                                 PSR_CBM_TYPE_L3);
> +        {
> +            uint32_t val;
> +
> +            ret = psr_get_val(d, domctl->u.psr_cat_op.target,
> +                              &val, PSR_CBM_TYPE_L3);
> +            domctl->u.psr_cat_op.data = val;
>              copyback = 1;
>              break;
> +        }
>  
>          case XEN_DOMCTL_PSR_CAT_OP_GET_L3_CODE:
> -            ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target,
> -                                 &domctl->u.psr_cat_op.data,
> -                                 PSR_CBM_TYPE_L3_CODE);
> +        {
> +            uint32_t val;
> +
> +            ret = psr_get_val(d, domctl->u.psr_cat_op.target,
> +                              &val, PSR_CBM_TYPE_L3_CODE);
> +            domctl->u.psr_cat_op.data = val;
>              copyback = 1;
>              break;
> +        }
>  
>          case XEN_DOMCTL_PSR_CAT_OP_GET_L3_DATA:
> -            ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target,
> -                                 &domctl->u.psr_cat_op.data,
> -                                 PSR_CBM_TYPE_L3_DATA);
> +        {
> +            uint32_t val;
> +
> +            ret = psr_get_val(d, domctl->u.psr_cat_op.target,
> +                              &val, PSR_CBM_TYPE_L3_DATA);
> +            domctl->u.psr_cat_op.data = val;
>              copyback = 1;
>              break;
> +        }

I think code would read better overall if you had a switch()-wide
variable (then probably encoding its width in its name, e.g. val32).

> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -97,6 +97,10 @@ struct feat_node {
>          /* get_feat_info is used to get feature HW info. */
>          bool (*get_feat_info)(const struct feat_node *feat,
>                                uint32_t data[], unsigned int array_len);
> +
> +        /* get_val is used to get feature COS register value. */
> +        void (*get_val)(const struct feat_node *feat, unsigned int cos,
> +                        uint32_t *val);
>      } *props;
>  
>      uint32_t cos_reg_val[MAX_COS_REG_CNT];
> @@ -265,10 +269,17 @@ static bool cat_get_feat_info(const struct feat_node *feat,
>      return true;
>  }
>  
> +static void cat_get_val(const struct feat_node *feat, unsigned int cos,
> +                        uint32_t *val)
> +{
> +    *val = feat->cos_reg_val[cos];
> +}

This can be done by the caller - there's nothing feature specific in
here, so there's no need for a hook.

> @@ -494,24 +505,34 @@ static struct psr_socket_info *get_socket_info(unsigned int socket)
>      return socket_info + socket;
>  }
>  
> -int psr_get_info(unsigned int socket, enum cbm_type type,
> -                 uint32_t data[], unsigned int array_len)
> +static struct feat_node * psr_get_feat(unsigned int socket,

Stray blank after *.

> +                                       enum cbm_type type)
>  {
>      const struct psr_socket_info *info = get_socket_info(socket);
> -    const struct feat_node *feat;
>      enum psr_feat_type feat_type;
>  
>      if ( IS_ERR(info) )
> -        return PTR_ERR(info);
> +        return ERR_PTR(PTR_ERR(info));

Urgh. But yes, a cast would seem to be the worse alternative.

> @@ -521,9 +542,35 @@ int psr_get_info(unsigned int socket, enum cbm_type type,
>      return -EINVAL;
>  }
>  
> -int psr_get_l3_cbm(struct domain *d, unsigned int socket,
> -                   uint64_t *cbm, enum cbm_type type)
> +int psr_get_val(struct domain *d, unsigned int socket,
> +                uint32_t *val, enum cbm_type type)
>  {
> +    const struct feat_node *feat;
> +    unsigned int cos;
> +
> +    ASSERT(d && val);

I don't think we ever ASSERT() domain pointers to be non-NULL.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow.
  2017-04-05 15:10   ` Jan Beulich
@ 2017-04-06  5:49     ` Yi Sun
  2017-04-06  8:32       ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-06  5:49 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-05 09:10:58, Jan Beulich wrote:
> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> > @@ -76,7 +79,7 @@ struct feat_node {
> >       *
> >       * Feature independent HW info and common values are also defined in it.
> >       */
> > -    const struct feat_props {
> > +    struct feat_props {
> 
> As said before, the const here should stay. The init-time writing
> to the structure can be done without going through this pointer.
> 
'feat_props' contains 'cos_max' and 'cbm_len' in this version. I have to assign
values to them in 'cat_init_feature'. So, I removed the 'const'.

Anyway, this structure will be moved out as a standalone struct to define a
static pointer array of 'props'. The 'cos_max' and 'cbm_len' will be moved to
'struct feat_node'.

> > +static void free_socket_resources(unsigned int socket)
> > +{
> > +    unsigned int i;
> > +    struct psr_socket_info *info = socket_info + socket;
> > +
> > +    if ( !info )
> > +        return;
> > +
> > +    /*
> > +     * Free resources of features. The global feature object, e.g. feat_l3_cat,
> > +     * may not be freed here if it is not added into array. It is simply being
> > +     * kept until the next CPU online attempt.
> > +     */
> > +    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> > +    {
> > +        if ( !info->features[i] )
> > +            continue;
> > +
> > +        xfree(info->features[i]);
> > +        info->features[i] = NULL;
> 
> There's no need for the if() here. And I'm sure this was pointed out
> already (perhaps in a different context).
> 
There may be NULL member in features array. Features array contains all
features, including L3 CAT, CDP and L2 CAT. But on some machines, they
may only support L3 CAT but do not support CDP and L2 CAT. So, the features
array only has L3 CAT member in it and all other members are all NULL. That
is the reason we must check if the member is NULL or not.

> > +static bool feat_init_done(const struct psr_socket_info *info)
> > +{
> > +    unsigned int i;
> > +
> > +    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> > +    {
> > +        if ( !info->features[i] )
> > +            continue;
> > +
> > +        return true;
> > +    }
> > +
> > +    return false;
> > +}
> 
> At the first glance this is a very strange function.
> 
I used 'feat_mask' before to check if any feature has been initialized.
Per your comment in later patch, I want to define a flag to represent it.
Is that acceptable to you?

> > +/* CAT common functions implementation. */
> > +static void cat_init_feature(const struct cpuid_leaf *regs,
> > +                             struct feat_node *feat,
> > +                             struct psr_socket_info *info,
> > +                             enum psr_feat_type type)
> > +{
> > +    unsigned int socket, i;
> > +
> > +    /* No valid value so do not enable feature. */
> > +    if ( !regs->a || !regs->d )
> > +        return;
> > +
> > +    feat->props->cbm_len = (regs->a & CAT_CBM_LEN_MASK) + 1;
> > +    feat->props->cos_max = min(opt_cos_max, regs->d & CAT_COS_MAX_MASK);
> > +
> > +    switch ( type )
> > +    {
> > +    case PSR_SOCKET_L3_CAT:
> > +        /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
> > +        feat->cos_reg_val[0] = cat_default_val(feat->props->cbm_len);
> > +
> > +        /*
> > +         * To handle cpu offline and then online case, we need restore MSRs to
> > +         * default values.
> > +         */
> > +        for ( i = 1; i <= feat->props->cos_max; i++ )
> > +        {
> > +            wrmsrl(MSR_IA32_PSR_L3_MASK(i), feat->cos_reg_val[0]);
> > +            feat->cos_reg_val[i] = feat->cos_reg_val[0];
> > +        }
> 
> I continue to have difficulty with this: Why is offline-then-online
> any different from first-time-online? Why wouldn't setting the

May remove this comment. Per current codes, the MSRs are written to default
values no matter first time or not.

> registers to their intended values not be taken care of by
> context switch code, once vCPU-s get scheduled onto the newly
> onlined CPU?
> 
cat_init_feature is only called when the first CPU on a socket is online.
The MSRs to set are per socket. So, we only need set it once when socket
is online.

> > +        break;
> > +
> > +    default:
> > +        return;
> > +    }
> > +
> > +    /* Add this feature into array. */
> > +    info->features[type] = feat;
> > +
> > +    socket = cpu_to_socket(smp_processor_id());
> 
> No need for this variable, and definitely no need to do the
> assignment ahead of ...
> > +    if ( !opt_cpu_info )
> > +        return;
> 
> ... this.

Ok, will remove socket to directly use 'cpu_to_socket(smp_processor_id())'.

> 
> >  static void psr_cpu_init(void)
> >  {
> > +    struct psr_socket_info *info;
> > +    unsigned int socket;
> > +    unsigned int cpu = smp_processor_id();
> > +    struct feat_node *feat;
> > +    struct cpuid_leaf regs;
> > +
> > +    if ( !psr_alloc_feat_enabled() || !boot_cpu_has(X86_FEATURE_PQE) )
> > +        goto assoc_init;
> > +
> > +    if ( boot_cpu_data.cpuid_level < PSR_CPUID_LEVEL_CAT )
> > +    {
> > +        setup_clear_cpu_cap(X86_FEATURE_PQE);
> > +        goto assoc_init;
> > +    }
> > +
> > +    socket = cpu_to_socket(cpu);
> > +    info = socket_info + socket;
> > +    if ( feat_init_done(info) )
> > +        goto assoc_init;
> 
> Hmm, so you bail here if any of the features was already set up.
> But you don't bail if none of the features were available as the
> reason for the setup not having been done before. I think this
> can be solved in a better way once we have the static props
> array: You need to do anything here only if the props slot is not
> NULL, but the feature slot is NULL.
> 
As above comment, this check will be changed to check a flag if you have
no opinion. But, what is your conern here? Do you mind the 'goto'?

> In any event you intentions are likely easier to understand for
> a reader if this single-use function was inlined here.
> 
As you have observed, this 'feat_init_done' is used many times in later
patches.

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 06/25] x86: refactor psr: L3 CAT: implement Domain init/free and schedule flows.
  2017-04-05 15:23   ` Jan Beulich
@ 2017-04-06  6:01     ` Yi Sun
  2017-04-06  8:34       ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-06  6:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-05 09:23:50, Jan Beulich wrote:
> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> > +static void psr_assoc_init(void)
> >  {
> >      struct psr_assoc *psra = &this_cpu(psr_assoc);
> >  
> > -    if ( psr_cmt_enabled() )
> > +    if ( psr_alloc_feat_enabled() )
> > +    {
> > +        unsigned int socket = cpu_to_socket(smp_processor_id());
> > +        const struct psr_socket_info *info = socket_info + socket;
> > +        unsigned int cos_max = get_max_cos_max(info);
> > +
> > +        if ( feat_init_done(info) )
> 
> I think the use here is different from the one in the earlier patch:
> While there looking at props[] appears to be desirable, I think here
> you indeed only want to look at features[]. And btw, I wouldn't
> mind a simple flag or counter in info telling whether any (or how
> many) features have been enabled, to avoid such iterations. It's
> just that the original feature mask was fully redundant with
> features[].
> 
Per comment in previous patch, will add a 'flag' to indicate if any
feature has been initialized.

> > @@ -397,6 +434,11 @@ void psr_ctxt_switch_to(struct domain *d)
> >      if ( psr_cmt_enabled() )
> >          psr_assoc_rmid(&reg, d->arch.psr_rmid);
> >  
> > +    if ( psra->cos_mask )
> > +        psr_assoc_cos(&reg, d->arch.psr_cos_ids ?
> > +                      d->arch.psr_cos_ids[cpu_to_socket(smp_processor_id())] :
> > +                      0, psra->cos_mask);
> 
> I may have asked this question before, but if so you can see that
> the code above continues puzzling me: Under what conditions
> would psra->cos_mask be non-zero, but d->arch.psr_cos_ids be
> NULL? And why is zero the right value in that case?
> 
'cos_mask' is initialized in 'psr_assoc_init' during cpu starting. The
'psr_cos_ids' is allocated during domain init. Here is soft a protection
to handle abnormal case. Of course, we can use ASSERT to check it.

> Also you need to deal with alignment issues here: Part of an
> expression at equal rank should align with one another. This
> implies that you want to move the 2nd argument on a new line
> (and the 3rd one would then better be moved to its own line
> too).
> 
Ok, thanks!

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 07/25] x86: refactor psr: L3 CAT: implement get hw info flow.
  2017-04-05 15:37   ` Jan Beulich
@ 2017-04-06  6:05     ` Yi Sun
  2017-04-06  8:36       ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-06  6:05 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-05 09:37:44, Jan Beulich wrote:
> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> > --- a/xen/arch/x86/psr.c
> > +++ b/xen/arch/x86/psr.c
> > @@ -93,6 +93,10 @@ struct feat_node {
> >          unsigned int cos_num;
> >          unsigned int cos_max;
> >          unsigned int cbm_len;
> > +
> > +        /* get_feat_info is used to get feature HW info. */
> > +        bool (*get_feat_info)(const struct feat_node *feat,
> > +                              uint32_t data[], unsigned int array_len);
> 
> The comment isn't very helpful in its current shape. You really want
> to make clear that this is being used to return HW info via sysctl.
> Without this (and without seeing the rest of this patch), despite
> having seen previous versions my first thought was that this
> retrieves data from MSRs at initialization time.
> 
Will modify the comment to make it clearer.

> > @@ -183,6 +187,22 @@ static bool feat_init_done(const struct psr_socket_info *info)
> >      return false;
> >  }
> >  
> > +static enum psr_feat_type psr_cbm_type_to_feat_type(enum cbm_type type)
> > +{
> > +    enum psr_feat_type feat_type;
> > +
> > +    switch ( type )
> > +    {
> > +    case PSR_CBM_TYPE_L3:
> > +        feat_type = PSR_SOCKET_L3_CAT;
> > +        break;
> > +    default:
> > +        ASSERT_UNREACHABLE();
> > +    }
> > +
> > +    return feat_type;
> 
> I'm pretty certain this will (validly) produce an uninitialized variable
> warning at least in a non-debug build. Not how I did say "add
> ASSERT_UNREACHABLE()" in the v9 review.
> 
Do you mean to init feat_type to 'PSR_SOCKET_MAX_FEAT' and then check it
at the end of function using ASSERT?

> > +int psr_get_info(unsigned int socket, enum cbm_type type,
> > +                 uint32_t data[], unsigned int array_len)
> > +{
> > +    const struct psr_socket_info *info = get_socket_info(socket);
> > +    const struct feat_node *feat;
> > +    enum psr_feat_type feat_type;
> > +
> > +    if ( IS_ERR(info) )
> > +        return PTR_ERR(info);
> > +
> > +    if ( !data )
> > +        return -EINVAL;
> 
> I think I've asked this before - what does this check guard against?
> A bad caller? This could be an ASSERT() then. Returning an error to
> the sysctl caller because of some implementation bug seems pretty
> odd to me.
> 
Sorry, will use ASSERT for such cases.

> > +    feat_type = psr_cbm_type_to_feat_type(type);
> > +    if ( feat_type > ARRAY_SIZE(info->features) )
> > +        return -ENOENT;
> > +
> > +    feat = info->features[feat_type];
> 
> Please can you be more careful when adding checks like the above
> one? You still allow overrunning the array, when feat_type
> == ARRAY_SIZE(info->features).
> 
Oh, sorry.

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 08/25] x86: refactor psr: L3 CAT: implement get value flow.
  2017-04-05 15:51   ` Jan Beulich
@ 2017-04-06  6:10     ` Yi Sun
  2017-04-06  8:40       ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-06  6:10 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, chao.p.peng, xen-devel, roger.pau

On 17-04-05 09:51:44, Jan Beulich wrote:
> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> > --- a/xen/arch/x86/domctl.c
> > +++ b/xen/arch/x86/domctl.c
> > @@ -1455,25 +1455,37 @@ long arch_do_domctl(
> >              break;
> >  
> >          case XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM:
> > -            ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target,
> > -                                 &domctl->u.psr_cat_op.data,
> > -                                 PSR_CBM_TYPE_L3);
> > +        {
> > +            uint32_t val;
> > +
> > +            ret = psr_get_val(d, domctl->u.psr_cat_op.target,
> > +                              &val, PSR_CBM_TYPE_L3);
> > +            domctl->u.psr_cat_op.data = val;
> >              copyback = 1;
> >              break;
> > +        }
> >  
> >          case XEN_DOMCTL_PSR_CAT_OP_GET_L3_CODE:
> > -            ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target,
> > -                                 &domctl->u.psr_cat_op.data,
> > -                                 PSR_CBM_TYPE_L3_CODE);
> > +        {
> > +            uint32_t val;
> > +
> > +            ret = psr_get_val(d, domctl->u.psr_cat_op.target,
> > +                              &val, PSR_CBM_TYPE_L3_CODE);
> > +            domctl->u.psr_cat_op.data = val;
> >              copyback = 1;
> >              break;
> > +        }
> >  
> >          case XEN_DOMCTL_PSR_CAT_OP_GET_L3_DATA:
> > -            ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target,
> > -                                 &domctl->u.psr_cat_op.data,
> > -                                 PSR_CBM_TYPE_L3_DATA);
> > +        {
> > +            uint32_t val;
> > +
> > +            ret = psr_get_val(d, domctl->u.psr_cat_op.target,
> > +                              &val, PSR_CBM_TYPE_L3_DATA);
> > +            domctl->u.psr_cat_op.data = val;
> >              copyback = 1;
> >              break;
> > +        }
> 
> I think code would read better overall if you had a switch()-wide
> variable (then probably encoding its width in its name, e.g. val32).
> 
I thought this but the switch() also covers 'set' cases. Is that appropriate
to define a wide range variable but some cases do not use it?

> > --- a/xen/arch/x86/psr.c
> > +++ b/xen/arch/x86/psr.c
> > @@ -97,6 +97,10 @@ struct feat_node {
> >          /* get_feat_info is used to get feature HW info. */
> >          bool (*get_feat_info)(const struct feat_node *feat,
> >                                uint32_t data[], unsigned int array_len);
> > +
> > +        /* get_val is used to get feature COS register value. */
> > +        void (*get_val)(const struct feat_node *feat, unsigned int cos,
> > +                        uint32_t *val);
> >      } *props;
> >  
> >      uint32_t cos_reg_val[MAX_COS_REG_CNT];
> > @@ -265,10 +269,17 @@ static bool cat_get_feat_info(const struct feat_node *feat,
> >      return true;
> >  }
> >  
> > +static void cat_get_val(const struct feat_node *feat, unsigned int cos,
> > +                        uint32_t *val)
> > +{
> > +    *val = feat->cos_reg_val[cos];
> > +}
> 
> This can be done by the caller - there's nothing feature specific in
> here, so there's no need for a hook.
> 
Hmm, CDP's 'get_val' is different so that we need this hook. Do you mean I
should create this CAT's 'get_val' hook when implementing CDP patch?

> > @@ -494,24 +505,34 @@ static struct psr_socket_info *get_socket_info(unsigned int socket)
> >      return socket_info + socket;
> >  }
> >  
> > -int psr_get_info(unsigned int socket, enum cbm_type type,
> > -                 uint32_t data[], unsigned int array_len)
> > +static struct feat_node * psr_get_feat(unsigned int socket,
> 
> Stray blank after *.
> 
> > +                                       enum cbm_type type)
> >  {
> >      const struct psr_socket_info *info = get_socket_info(socket);
> > -    const struct feat_node *feat;
> >      enum psr_feat_type feat_type;
> >  
> >      if ( IS_ERR(info) )
> > -        return PTR_ERR(info);
> > +        return ERR_PTR(PTR_ERR(info));
> 
> Urgh. But yes, a cast would seem to be the worse alternative.
> 
Then, any suggestion for this? Shall I add a parameter into the function to
get this error number back?

> > @@ -521,9 +542,35 @@ int psr_get_info(unsigned int socket, enum cbm_type type,
> >      return -EINVAL;
> >  }
> >  
> > -int psr_get_l3_cbm(struct domain *d, unsigned int socket,
> > -                   uint64_t *cbm, enum cbm_type type)
> > +int psr_get_val(struct domain *d, unsigned int socket,
> > +                uint32_t *val, enum cbm_type type)
> >  {
> > +    const struct feat_node *feat;
> > +    unsigned int cos;
> > +
> > +    ASSERT(d && val);
> 
> I don't think we ever ASSERT() domain pointers to be non-NULL.
> 
Ok, will remove check to domain.

> Jan
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow.
  2017-04-06  5:49     ` Yi Sun
@ 2017-04-06  8:32       ` Jan Beulich
  2017-04-06  9:22         ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-06  8:32 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 06.04.17 at 07:49, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-05 09:10:58, Jan Beulich wrote:
>> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
>> > +static void free_socket_resources(unsigned int socket)
>> > +{
>> > +    unsigned int i;
>> > +    struct psr_socket_info *info = socket_info + socket;
>> > +
>> > +    if ( !info )
>> > +        return;
>> > +
>> > +    /*
>> > +     * Free resources of features. The global feature object, e.g. feat_l3_cat,
>> > +     * may not be freed here if it is not added into array. It is simply being
>> > +     * kept until the next CPU online attempt.
>> > +     */
>> > +    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
>> > +    {
>> > +        if ( !info->features[i] )
>> > +            continue;
>> > +
>> > +        xfree(info->features[i]);
>> > +        info->features[i] = NULL;
>> 
>> There's no need for the if() here. And I'm sure this was pointed out
>> already (perhaps in a different context).
>> 
> There may be NULL member in features array. Features array contains all
> features, including L3 CAT, CDP and L2 CAT. But on some machines, they
> may only support L3 CAT but do not support CDP and L2 CAT. So, the features
> array only has L3 CAT member in it and all other members are all NULL. That
> is the reason we must check if the member is NULL or not.

No, and this has been explained before: xfree() happily accepts
NULL pointers.

>> > +static bool feat_init_done(const struct psr_socket_info *info)
>> > +{
>> > +    unsigned int i;
>> > +
>> > +    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
>> > +    {
>> > +        if ( !info->features[i] )
>> > +            continue;
>> > +
>> > +        return true;
>> > +    }
>> > +
>> > +    return false;
>> > +}
>> 
>> At the first glance this is a very strange function.
>> 
> I used 'feat_mask' before to check if any feature has been initialized.
> Per your comment in later patch, I want to define a flag to represent it.
> Is that acceptable to you?

Excuse me, but if I suggested it there, how can it not be acceptable
to me?

>> > +/* CAT common functions implementation. */
>> > +static void cat_init_feature(const struct cpuid_leaf *regs,
>> > +                             struct feat_node *feat,
>> > +                             struct psr_socket_info *info,
>> > +                             enum psr_feat_type type)
>> > +{
>> > +    unsigned int socket, i;
>> > +
>> > +    /* No valid value so do not enable feature. */
>> > +    if ( !regs->a || !regs->d )
>> > +        return;
>> > +
>> > +    feat->props->cbm_len = (regs->a & CAT_CBM_LEN_MASK) + 1;
>> > +    feat->props->cos_max = min(opt_cos_max, regs->d & CAT_COS_MAX_MASK);
>> > +
>> > +    switch ( type )
>> > +    {
>> > +    case PSR_SOCKET_L3_CAT:
>> > +        /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
>> > +        feat->cos_reg_val[0] = cat_default_val(feat->props->cbm_len);
>> > +
>> > +        /*
>> > +         * To handle cpu offline and then online case, we need restore MSRs to
>> > +         * default values.
>> > +         */
>> > +        for ( i = 1; i <= feat->props->cos_max; i++ )
>> > +        {
>> > +            wrmsrl(MSR_IA32_PSR_L3_MASK(i), feat->cos_reg_val[0]);
>> > +            feat->cos_reg_val[i] = feat->cos_reg_val[0];
>> > +        }
>> 
>> I continue to have difficulty with this: Why is offline-then-online
>> any different from first-time-online? Why wouldn't setting the
> 
> May remove this comment. Per current codes, the MSRs are written to default
> values no matter first time or not.
> 
>> registers to their intended values not be taken care of by
>> context switch code, once vCPU-s get scheduled onto the newly
>> onlined CPU?
>> 
> cat_init_feature is only called when the first CPU on a socket is online.
> The MSRs to set are per socket. So, we only need set it once when socket
> is online.

This does not answer my question. Once again - why does this need
doing here explicitly, rather than relying on the needed values being
loaded when the first vCPU gets scheduled onto one of the pCPU-s
of this socket?

>> >  static void psr_cpu_init(void)
>> >  {
>> > +    struct psr_socket_info *info;
>> > +    unsigned int socket;
>> > +    unsigned int cpu = smp_processor_id();
>> > +    struct feat_node *feat;
>> > +    struct cpuid_leaf regs;
>> > +
>> > +    if ( !psr_alloc_feat_enabled() || !boot_cpu_has(X86_FEATURE_PQE) )
>> > +        goto assoc_init;
>> > +
>> > +    if ( boot_cpu_data.cpuid_level < PSR_CPUID_LEVEL_CAT )
>> > +    {
>> > +        setup_clear_cpu_cap(X86_FEATURE_PQE);
>> > +        goto assoc_init;
>> > +    }
>> > +
>> > +    socket = cpu_to_socket(cpu);
>> > +    info = socket_info + socket;
>> > +    if ( feat_init_done(info) )
>> > +        goto assoc_init;
>> 
>> Hmm, so you bail here if any of the features was already set up.
>> But you don't bail if none of the features were available as the
>> reason for the setup not having been done before. I think this
>> can be solved in a better way once we have the static props
>> array: You need to do anything here only if the props slot is not
>> NULL, but the feature slot is NULL.
>> 
> As above comment, this check will be changed to check a flag if you have
> no opinion. But, what is your conern here? Do you mind the 'goto'?

Well, while with the intended flag introduction this discussion is
mostly moot now - no, it's not the goto, it's the way of checking
done.

>> In any event you intentions are likely easier to understand for
>> a reader if this single-use function was inlined here.
>> 
> As you have observed, this 'feat_init_done' is used many times in later
> patches.

And (again mostly moot now) as expressed there, the further uses
appear to want checks different from the one here.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 06/25] x86: refactor psr: L3 CAT: implement Domain init/free and schedule flows.
  2017-04-06  6:01     ` Yi Sun
@ 2017-04-06  8:34       ` Jan Beulich
  0 siblings, 0 replies; 114+ messages in thread
From: Jan Beulich @ 2017-04-06  8:34 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 06.04.17 at 08:01, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-05 09:23:50, Jan Beulich wrote:
>> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
>> > @@ -397,6 +434,11 @@ void psr_ctxt_switch_to(struct domain *d)
>> >      if ( psr_cmt_enabled() )
>> >          psr_assoc_rmid(&reg, d->arch.psr_rmid);
>> >  
>> > +    if ( psra->cos_mask )
>> > +        psr_assoc_cos(&reg, d->arch.psr_cos_ids ?
>> > +                      d->arch.psr_cos_ids[cpu_to_socket(smp_processor_id())] :
>> > +                      0, psra->cos_mask);
>> 
>> I may have asked this question before, but if so you can see that
>> the code above continues puzzling me: Under what conditions
>> would psra->cos_mask be non-zero, but d->arch.psr_cos_ids be
>> NULL? And why is zero the right value in that case?
>> 
> 'cos_mask' is initialized in 'psr_assoc_init' during cpu starting. The
> 'psr_cos_ids' is allocated during domain init. Here is soft a protection
> to handle abnormal case. Of course, we can use ASSERT to check it.

Obviously a domain failing initialization won't ever make it here,
so an ASSERT() is the maximum I'd consider reasonable here.
We should try to not go overboard with assertions - I appreciate
useful ones, but there is a line beyond which they end up being
clutter.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 07/25] x86: refactor psr: L3 CAT: implement get hw info flow.
  2017-04-06  6:05     ` Yi Sun
@ 2017-04-06  8:36       ` Jan Beulich
  2017-04-06 11:16         ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-06  8:36 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 06.04.17 at 08:05, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-05 09:37:44, Jan Beulich wrote:
>> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
>> > @@ -183,6 +187,22 @@ static bool feat_init_done(const struct psr_socket_info *info)
>> >      return false;
>> >  }
>> >  
>> > +static enum psr_feat_type psr_cbm_type_to_feat_type(enum cbm_type type)
>> > +{
>> > +    enum psr_feat_type feat_type;
>> > +
>> > +    switch ( type )
>> > +    {
>> > +    case PSR_CBM_TYPE_L3:
>> > +        feat_type = PSR_SOCKET_L3_CAT;
>> > +        break;
>> > +    default:
>> > +        ASSERT_UNREACHABLE();
>> > +    }
>> > +
>> > +    return feat_type;
>> 
>> I'm pretty certain this will (validly) produce an uninitialized variable
>> warning at least in a non-debug build. Not how I did say "add
>> ASSERT_UNREACHABLE()" in the v9 review.
>> 
> Do you mean to init feat_type to 'PSR_SOCKET_MAX_FEAT' and then check it
> at the end of function using ASSERT?

That's a (less desirable) option, but what I really mean is take v9
code and _add_ ASSERT_UNREACHABLE() first thing in the default
case.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 08/25] x86: refactor psr: L3 CAT: implement get value flow.
  2017-04-06  6:10     ` Yi Sun
@ 2017-04-06  8:40       ` Jan Beulich
  2017-04-06 11:13         ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-06  8:40 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 06.04.17 at 08:10, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-05 09:51:44, Jan Beulich wrote:
>> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
>> > --- a/xen/arch/x86/domctl.c
>> > +++ b/xen/arch/x86/domctl.c
>> > @@ -1455,25 +1455,37 @@ long arch_do_domctl(
>> >              break;
>> >  
>> >          case XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM:
>> > -            ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target,
>> > -                                 &domctl->u.psr_cat_op.data,
>> > -                                 PSR_CBM_TYPE_L3);
>> > +        {
>> > +            uint32_t val;
>> > +
>> > +            ret = psr_get_val(d, domctl->u.psr_cat_op.target,
>> > +                              &val, PSR_CBM_TYPE_L3);
>> > +            domctl->u.psr_cat_op.data = val;
>> >              copyback = 1;
>> >              break;
>> > +        }
>> >  
>> >          case XEN_DOMCTL_PSR_CAT_OP_GET_L3_CODE:
>> > -            ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target,
>> > -                                 &domctl->u.psr_cat_op.data,
>> > -                                 PSR_CBM_TYPE_L3_CODE);
>> > +        {
>> > +            uint32_t val;
>> > +
>> > +            ret = psr_get_val(d, domctl->u.psr_cat_op.target,
>> > +                              &val, PSR_CBM_TYPE_L3_CODE);
>> > +            domctl->u.psr_cat_op.data = val;
>> >              copyback = 1;
>> >              break;
>> > +        }
>> >  
>> >          case XEN_DOMCTL_PSR_CAT_OP_GET_L3_DATA:
>> > -            ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target,
>> > -                                 &domctl->u.psr_cat_op.data,
>> > -                                 PSR_CBM_TYPE_L3_DATA);
>> > +        {
>> > +            uint32_t val;
>> > +
>> > +            ret = psr_get_val(d, domctl->u.psr_cat_op.target,
>> > +                              &val, PSR_CBM_TYPE_L3_DATA);
>> > +            domctl->u.psr_cat_op.data = val;
>> >              copyback = 1;
>> >              break;
>> > +        }
>> 
>> I think code would read better overall if you had a switch()-wide
>> variable (then probably encoding its width in its name, e.g. val32).
>> 
> I thought this but the switch() also covers 'set' cases. Is that appropriate
> to define a wide range variable but some cases do not use it?

Yes of course - why would it not be? We also do so elsewhere.

>> > --- a/xen/arch/x86/psr.c
>> > +++ b/xen/arch/x86/psr.c
>> > @@ -97,6 +97,10 @@ struct feat_node {
>> >          /* get_feat_info is used to get feature HW info. */
>> >          bool (*get_feat_info)(const struct feat_node *feat,
>> >                                uint32_t data[], unsigned int array_len);
>> > +
>> > +        /* get_val is used to get feature COS register value. */
>> > +        void (*get_val)(const struct feat_node *feat, unsigned int cos,
>> > +                        uint32_t *val);
>> >      } *props;
>> >  
>> >      uint32_t cos_reg_val[MAX_COS_REG_CNT];
>> > @@ -265,10 +269,17 @@ static bool cat_get_feat_info(const struct feat_node *feat,
>> >      return true;
>> >  }
>> >  
>> > +static void cat_get_val(const struct feat_node *feat, unsigned int cos,
>> > +                        uint32_t *val)
>> > +{
>> > +    *val = feat->cos_reg_val[cos];
>> > +}
>> 
>> This can be done by the caller - there's nothing feature specific in
>> here, so there's no need for a hook.
>> 
> Hmm, CDP's 'get_val' is different so that we need this hook. Do you mean I
> should create this CAT's 'get_val' hook when implementing CDP patch?

No, not really - doesn't the type-to-index mapping array (or
whichever way it ends up being) all take care of the feature
specific aspects here?

>> > @@ -494,24 +505,34 @@ static struct psr_socket_info *get_socket_info(unsigned int socket)
>> >      return socket_info + socket;
>> >  }
>> >  
>> > -int psr_get_info(unsigned int socket, enum cbm_type type,
>> > -                 uint32_t data[], unsigned int array_len)
>> > +static struct feat_node * psr_get_feat(unsigned int socket,
>> 
>> Stray blank after *.
>> 
>> > +                                       enum cbm_type type)
>> >  {
>> >      const struct psr_socket_info *info = get_socket_info(socket);
>> > -    const struct feat_node *feat;
>> >      enum psr_feat_type feat_type;
>> >  
>> >      if ( IS_ERR(info) )
>> > -        return PTR_ERR(info);
>> > +        return ERR_PTR(PTR_ERR(info));
>> 
>> Urgh. But yes, a cast would seem to be the worse alternative.
>> 
> Then, any suggestion for this? Shall I add a parameter into the function to
> get this error number back?

Once again, excuse me: Didn't my previous reply make clear that
while this looks ugly, I can't see a better alternative, and hence
it can remain as is unless someone comes up with a better
suggestion?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow.
  2017-04-06  8:32       ` Jan Beulich
@ 2017-04-06  9:22         ` Yi Sun
  2017-04-06  9:34           ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-06  9:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-06 02:32:04, Jan Beulich wrote:
> >>> On 06.04.17 at 07:49, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-05 09:10:58, Jan Beulich wrote:
> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> >> > +static void free_socket_resources(unsigned int socket)
> >> > +{
> >> > +    unsigned int i;
> >> > +    struct psr_socket_info *info = socket_info + socket;
> >> > +
> >> > +    if ( !info )
> >> > +        return;
> >> > +
> >> > +    /*
> >> > +     * Free resources of features. The global feature object, e.g. feat_l3_cat,
> >> > +     * may not be freed here if it is not added into array. It is simply being
> >> > +     * kept until the next CPU online attempt.
> >> > +     */
> >> > +    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> >> > +    {
> >> > +        if ( !info->features[i] )
> >> > +            continue;
> >> > +
> >> > +        xfree(info->features[i]);
> >> > +        info->features[i] = NULL;
> >> 
> >> There's no need for the if() here. And I'm sure this was pointed out
> >> already (perhaps in a different context).
> >> 
> > There may be NULL member in features array. Features array contains all
> > features, including L3 CAT, CDP and L2 CAT. But on some machines, they
> > may only support L3 CAT but do not support CDP and L2 CAT. So, the features
> > array only has L3 CAT member in it and all other members are all NULL. That
> > is the reason we must check if the member is NULL or not.
> 
> No, and this has been explained before: xfree() happily accepts
> NULL pointers.
> 
Ok, got it.

[...]
> >> > +/* CAT common functions implementation. */
> >> > +static void cat_init_feature(const struct cpuid_leaf *regs,
> >> > +                             struct feat_node *feat,
> >> > +                             struct psr_socket_info *info,
> >> > +                             enum psr_feat_type type)
> >> > +{
> >> > +    unsigned int socket, i;
> >> > +
> >> > +    /* No valid value so do not enable feature. */
> >> > +    if ( !regs->a || !regs->d )
> >> > +        return;
> >> > +
> >> > +    feat->props->cbm_len = (regs->a & CAT_CBM_LEN_MASK) + 1;
> >> > +    feat->props->cos_max = min(opt_cos_max, regs->d & CAT_COS_MAX_MASK);
> >> > +
> >> > +    switch ( type )
> >> > +    {
> >> > +    case PSR_SOCKET_L3_CAT:
> >> > +        /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
> >> > +        feat->cos_reg_val[0] = cat_default_val(feat->props->cbm_len);
> >> > +
> >> > +        /*
> >> > +         * To handle cpu offline and then online case, we need restore MSRs to
> >> > +         * default values.
> >> > +         */
> >> > +        for ( i = 1; i <= feat->props->cos_max; i++ )
> >> > +        {
> >> > +            wrmsrl(MSR_IA32_PSR_L3_MASK(i), feat->cos_reg_val[0]);
> >> > +            feat->cos_reg_val[i] = feat->cos_reg_val[0];
> >> > +        }
> >> 
> >> I continue to have difficulty with this: Why is offline-then-online
> >> any different from first-time-online? Why wouldn't setting the
> > 
> > May remove this comment. Per current codes, the MSRs are written to default
> > values no matter first time or not.
> > 
> >> registers to their intended values not be taken care of by
> >> context switch code, once vCPU-s get scheduled onto the newly
> >> onlined CPU?
> >> 
> > cat_init_feature is only called when the first CPU on a socket is online.
> > The MSRs to set are per socket. So, we only need set it once when socket
> > is online.
> 
> This does not answer my question. Once again - why does this need
> doing here explicitly, rather than relying on the needed values being
> loaded when the first vCPU gets scheduled onto one of the pCPU-s
> of this socket?
> 
I do not know if I understand your question correctly. Let me try to explain
again. As we discussed in v9, the MSRs values may be wrong values when socket
is online. That is the reason we have to restore them. 

The MSRs are per socket. That means only one group of MSRs on one socket. So
the setting on one CPU can make it valid on whole socket. The 'cat_init_feature'
is executed when the first CPU on a socket is online so we restore them here.

[...]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow.
  2017-04-06  9:22         ` Yi Sun
@ 2017-04-06  9:34           ` Jan Beulich
  2017-04-06 10:02             ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-06  9:34 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 06.04.17 at 11:22, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-06 02:32:04, Jan Beulich wrote:
>> >>> On 06.04.17 at 07:49, <yi.y.sun@linux.intel.com> wrote:
>> > On 17-04-05 09:10:58, Jan Beulich wrote:
>> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
>> >> > +static void cat_init_feature(const struct cpuid_leaf *regs,
>> >> > +                             struct feat_node *feat,
>> >> > +                             struct psr_socket_info *info,
>> >> > +                             enum psr_feat_type type)
>> >> > +{
>> >> > +    unsigned int socket, i;
>> >> > +
>> >> > +    /* No valid value so do not enable feature. */
>> >> > +    if ( !regs->a || !regs->d )
>> >> > +        return;
>> >> > +
>> >> > +    feat->props->cbm_len = (regs->a & CAT_CBM_LEN_MASK) + 1;
>> >> > +    feat->props->cos_max = min(opt_cos_max, regs->d & CAT_COS_MAX_MASK);
>> >> > +
>> >> > +    switch ( type )
>> >> > +    {
>> >> > +    case PSR_SOCKET_L3_CAT:
>> >> > +        /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
>> >> > +        feat->cos_reg_val[0] = cat_default_val(feat->props->cbm_len);
>> >> > +
>> >> > +        /*
>> >> > +         * To handle cpu offline and then online case, we need restore MSRs to
>> >> > +         * default values.
>> >> > +         */
>> >> > +        for ( i = 1; i <= feat->props->cos_max; i++ )
>> >> > +        {
>> >> > +            wrmsrl(MSR_IA32_PSR_L3_MASK(i), feat->cos_reg_val[0]);
>> >> > +            feat->cos_reg_val[i] = feat->cos_reg_val[0];
>> >> > +        }
>> >> 
>> >> I continue to have difficulty with this: Why is offline-then-online
>> >> any different from first-time-online? Why wouldn't setting the
>> > 
>> > May remove this comment. Per current codes, the MSRs are written to default
>> > values no matter first time or not.
>> > 
>> >> registers to their intended values not be taken care of by
>> >> context switch code, once vCPU-s get scheduled onto the newly
>> >> onlined CPU?
>> >> 
>> > cat_init_feature is only called when the first CPU on a socket is online.
>> > The MSRs to set are per socket. So, we only need set it once when socket
>> > is online.
>> 
>> This does not answer my question. Once again - why does this need
>> doing here explicitly, rather than relying on the needed values being
>> loaded when the first vCPU gets scheduled onto one of the pCPU-s
>> of this socket?
>> 
> I do not know if I understand your question correctly. Let me try to explain
> again. As we discussed in v9, the MSRs values may be wrong values when socket
> is online. That is the reason we have to restore them. 
> 
> The MSRs are per socket. That means only one group of MSRs on one socket. So
> the setting on one CPU can make it valid on whole socket. The 'cat_init_feature'
> is executed when the first CPU on a socket is online so we restore them here.

All understood. But you write the MSRs with the needed values in
the context switch path, don't you? Why is that writing not
sufficient?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow.
  2017-04-06  9:34           ` Jan Beulich
@ 2017-04-06 10:02             ` Yi Sun
  2017-04-06 14:02               ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-06 10:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, chao.p.peng, xen-devel, roger.pau

On 17-04-06 03:34:27, Jan Beulich wrote:
> >>> On 06.04.17 at 11:22, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-06 02:32:04, Jan Beulich wrote:
> >> >>> On 06.04.17 at 07:49, <yi.y.sun@linux.intel.com> wrote:
> >> > On 17-04-05 09:10:58, Jan Beulich wrote:
> >> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> >> >> > +static void cat_init_feature(const struct cpuid_leaf *regs,
> >> >> > +                             struct feat_node *feat,
> >> >> > +                             struct psr_socket_info *info,
> >> >> > +                             enum psr_feat_type type)
> >> >> > +{
> >> >> > +    unsigned int socket, i;
> >> >> > +
> >> >> > +    /* No valid value so do not enable feature. */
> >> >> > +    if ( !regs->a || !regs->d )
> >> >> > +        return;
> >> >> > +
> >> >> > +    feat->props->cbm_len = (regs->a & CAT_CBM_LEN_MASK) + 1;
> >> >> > +    feat->props->cos_max = min(opt_cos_max, regs->d & CAT_COS_MAX_MASK);
> >> >> > +
> >> >> > +    switch ( type )
> >> >> > +    {
> >> >> > +    case PSR_SOCKET_L3_CAT:
> >> >> > +        /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
> >> >> > +        feat->cos_reg_val[0] = cat_default_val(feat->props->cbm_len);
> >> >> > +
> >> >> > +        /*
> >> >> > +         * To handle cpu offline and then online case, we need restore MSRs to
> >> >> > +         * default values.
> >> >> > +         */
> >> >> > +        for ( i = 1; i <= feat->props->cos_max; i++ )
> >> >> > +        {
> >> >> > +            wrmsrl(MSR_IA32_PSR_L3_MASK(i), feat->cos_reg_val[0]);
> >> >> > +            feat->cos_reg_val[i] = feat->cos_reg_val[0];
> >> >> > +        }
> >> >> 
> >> >> I continue to have difficulty with this: Why is offline-then-online
> >> >> any different from first-time-online? Why wouldn't setting the
> >> > 
> >> > May remove this comment. Per current codes, the MSRs are written to default
> >> > values no matter first time or not.
> >> > 
> >> >> registers to their intended values not be taken care of by
> >> >> context switch code, once vCPU-s get scheduled onto the newly
> >> >> onlined CPU?
> >> >> 
> >> > cat_init_feature is only called when the first CPU on a socket is online.
> >> > The MSRs to set are per socket. So, we only need set it once when socket
> >> > is online.
> >> 
> >> This does not answer my question. Once again - why does this need
> >> doing here explicitly, rather than relying on the needed values being
> >> loaded when the first vCPU gets scheduled onto one of the pCPU-s
> >> of this socket?
> >> 
> > I do not know if I understand your question correctly. Let me try to explain
> > again. As we discussed in v9, the MSRs values may be wrong values when socket
> > is online. That is the reason we have to restore them. 
> > 
> > The MSRs are per socket. That means only one group of MSRs on one socket. So
> > the setting on one CPU can make it valid on whole socket. The 'cat_init_feature'
> > is executed when the first CPU on a socket is online so we restore them here.
> 
> All understood. But you write the MSRs with the needed values in
> the context switch path, don't you? Why is that writing not
> sufficient?
> 
No, in context switch path, we only set ASSOC register to set COS ID into it so
that the corresponding COS MSR value (CBM) can work.

Here, we set default CBM value into COS MSRs.

> Jan
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 08/25] x86: refactor psr: L3 CAT: implement get value flow.
  2017-04-06  8:40       ` Jan Beulich
@ 2017-04-06 11:13         ` Yi Sun
  2017-04-06 14:08           ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-06 11:13 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, chao.p.peng, xen-devel, roger.pau

On 17-04-06 02:40:01, Jan Beulich wrote:
> >>> On 06.04.17 at 08:10, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-05 09:51:44, Jan Beulich wrote:
> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:

[...]
> >> > --- a/xen/arch/x86/psr.c
> >> > +++ b/xen/arch/x86/psr.c
> >> > @@ -97,6 +97,10 @@ struct feat_node {
> >> >          /* get_feat_info is used to get feature HW info. */
> >> >          bool (*get_feat_info)(const struct feat_node *feat,
> >> >                                uint32_t data[], unsigned int array_len);
> >> > +
> >> > +        /* get_val is used to get feature COS register value. */
> >> > +        void (*get_val)(const struct feat_node *feat, unsigned int cos,
> >> > +                        uint32_t *val);
> >> >      } *props;
> >> >  
> >> >      uint32_t cos_reg_val[MAX_COS_REG_CNT];
> >> > @@ -265,10 +269,17 @@ static bool cat_get_feat_info(const struct feat_node *feat,
> >> >      return true;
> >> >  }
> >> >  
> >> > +static void cat_get_val(const struct feat_node *feat, unsigned int cos,
> >> > +                        uint32_t *val)
> >> > +{
> >> > +    *val = feat->cos_reg_val[cos];
> >> > +}
> >> 
> >> This can be done by the caller - there's nothing feature specific in
> >> here, so there's no need for a hook.
> >> 
> > Hmm, CDP's 'get_val' is different so that we need this hook. Do you mean I
> > should create this CAT's 'get_val' hook when implementing CDP patch?
> 
> No, not really - doesn't the type-to-index mapping array (or
> whichever way it ends up being) all take care of the feature
> specific aspects here?
>
For CDP case, the value getting depends on type. If we don't have this hook,
we have to do special handling in main flow.

Still use 'fits_cos_max' as example:
    /* For CDP, DATA is the first item in val[], CODE is the second. */
    for ( j = 0; j < feat->props->cos_num; j++ )                       
    {                                                                  
        feat->props->get_val(feat, 0, feat->props->type[j],            
                             &default_val);                          
        if ( val[j] != default_val )                              
            return false;                                       
    } 

We want to get CDP DATA and CODE one by one to compare with val[] as the order.
If we do not have 'get_val', how can we handle this case? Getting DATA is
different with getting CODE which is shown below. Even we have type-to-index,
we still need the hook to help I think. So far, I cannot figure out a generic
way.
Get data: (feat)->cos_reg_val[(cos) * 2]
Get code: (feat)->cos_reg_val[(cos) * 2 + 1]

Furthermore, 'get_val' is straightforward. I think it loses the pithiness if we
remove the function.

> >> > @@ -494,24 +505,34 @@ static struct psr_socket_info *get_socket_info(unsigned int socket)
> >> >      return socket_info + socket;
> >> >  }
> >> >  
> >> > -int psr_get_info(unsigned int socket, enum cbm_type type,
> >> > -                 uint32_t data[], unsigned int array_len)
> >> > +static struct feat_node * psr_get_feat(unsigned int socket,
> >> 
> >> Stray blank after *.
> >> 
> >> > +                                       enum cbm_type type)
> >> >  {
> >> >      const struct psr_socket_info *info = get_socket_info(socket);
> >> > -    const struct feat_node *feat;
> >> >      enum psr_feat_type feat_type;
> >> >  
> >> >      if ( IS_ERR(info) )
> >> > -        return PTR_ERR(info);
> >> > +        return ERR_PTR(PTR_ERR(info));
> >> 
> >> Urgh. But yes, a cast would seem to be the worse alternative.
> >> 
> > Then, any suggestion for this? Shall I add a parameter into the function to
> > get this error number back?
> 
> Once again, excuse me: Didn't my previous reply make clear that
> while this looks ugly, I can't see a better alternative, and hence
> it can remain as is unless someone comes up with a better
> suggestion?
> 
Oh, sorry, I mis-understood you. I thought you cannot accept current codes.
But I was wrong.

> Jan
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 07/25] x86: refactor psr: L3 CAT: implement get hw info flow.
  2017-04-06  8:36       ` Jan Beulich
@ 2017-04-06 11:16         ` Yi Sun
  2017-04-06 14:04           ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-06 11:16 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-06 02:36:19, Jan Beulich wrote:
> >>> On 06.04.17 at 08:05, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-05 09:37:44, Jan Beulich wrote:
> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> >> > @@ -183,6 +187,22 @@ static bool feat_init_done(const struct psr_socket_info *info)
> >> >      return false;
> >> >  }
> >> >  
> >> > +static enum psr_feat_type psr_cbm_type_to_feat_type(enum cbm_type type)
> >> > +{
> >> > +    enum psr_feat_type feat_type;
> >> > +
> >> > +    switch ( type )
> >> > +    {
> >> > +    case PSR_CBM_TYPE_L3:
> >> > +        feat_type = PSR_SOCKET_L3_CAT;
> >> > +        break;
> >> > +    default:
> >> > +        ASSERT_UNREACHABLE();
> >> > +    }
> >> > +
> >> > +    return feat_type;
> >> 
> >> I'm pretty certain this will (validly) produce an uninitialized variable
> >> warning at least in a non-debug build. Not how I did say "add
> >> ASSERT_UNREACHABLE()" in the v9 review.
> >> 
> > Do you mean to init feat_type to 'PSR_SOCKET_MAX_FEAT' and then check it
> > at the end of function using ASSERT?
> 
> That's a (less desirable) option, but what I really mean is take v9
> code and _add_ ASSERT_UNREACHABLE() first thing in the default
> case.
> 

DYM we should initialize 'feat_type' to a valid value, e.g. PSR_SOCKET_L3_CAT
and keep ASSERT_UNREACHABLE() in default case?

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow.
  2017-04-06 10:02             ` Yi Sun
@ 2017-04-06 14:02               ` Jan Beulich
  2017-04-07  5:17                 ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-06 14:02 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 06.04.17 at 12:02, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-06 03:34:27, Jan Beulich wrote:
>> >>> On 06.04.17 at 11:22, <yi.y.sun@linux.intel.com> wrote:
>> > On 17-04-06 02:32:04, Jan Beulich wrote:
>> >> >>> On 06.04.17 at 07:49, <yi.y.sun@linux.intel.com> wrote:
>> >> > On 17-04-05 09:10:58, Jan Beulich wrote:
>> >> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
>> >> >> > +static void cat_init_feature(const struct cpuid_leaf *regs,
>> >> >> > +                             struct feat_node *feat,
>> >> >> > +                             struct psr_socket_info *info,
>> >> >> > +                             enum psr_feat_type type)
>> >> >> > +{
>> >> >> > +    unsigned int socket, i;
>> >> >> > +
>> >> >> > +    /* No valid value so do not enable feature. */
>> >> >> > +    if ( !regs->a || !regs->d )
>> >> >> > +        return;
>> >> >> > +
>> >> >> > +    feat->props->cbm_len = (regs->a & CAT_CBM_LEN_MASK) + 1;
>> >> >> > +    feat->props->cos_max = min(opt_cos_max, regs->d & CAT_COS_MAX_MASK);
>> >> >> > +
>> >> >> > +    switch ( type )
>> >> >> > +    {
>> >> >> > +    case PSR_SOCKET_L3_CAT:
>> >> >> > +        /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
>> >> >> > +        feat->cos_reg_val[0] = cat_default_val(feat->props->cbm_len);
>> >> >> > +
>> >> >> > +        /*
>> >> >> > +         * To handle cpu offline and then online case, we need restore MSRs to
>> >> >> > +         * default values.
>> >> >> > +         */
>> >> >> > +        for ( i = 1; i <= feat->props->cos_max; i++ )
>> >> >> > +        {
>> >> >> > +            wrmsrl(MSR_IA32_PSR_L3_MASK(i), feat->cos_reg_val[0]);
>> >> >> > +            feat->cos_reg_val[i] = feat->cos_reg_val[0];
>> >> >> > +        }
>> >> >> 
>> >> >> I continue to have difficulty with this: Why is offline-then-online
>> >> >> any different from first-time-online? Why wouldn't setting the
>> >> > 
>> >> > May remove this comment. Per current codes, the MSRs are written to default
>> >> > values no matter first time or not.
>> >> > 
>> >> >> registers to their intended values not be taken care of by
>> >> >> context switch code, once vCPU-s get scheduled onto the newly
>> >> >> onlined CPU?
>> >> >> 
>> >> > cat_init_feature is only called when the first CPU on a socket is online.
>> >> > The MSRs to set are per socket. So, we only need set it once when socket
>> >> > is online.
>> >> 
>> >> This does not answer my question. Once again - why does this need
>> >> doing here explicitly, rather than relying on the needed values being
>> >> loaded when the first vCPU gets scheduled onto one of the pCPU-s
>> >> of this socket?
>> >> 
>> > I do not know if I understand your question correctly. Let me try to explain
>> > again. As we discussed in v9, the MSRs values may be wrong values when socket
>> > is online. That is the reason we have to restore them. 
>> > 
>> > The MSRs are per socket. That means only one group of MSRs on one socket. So
>> > the setting on one CPU can make it valid on whole socket. The 'cat_init_feature'
>> > is executed when the first CPU on a socket is online so we restore them here.
>> 
>> All understood. But you write the MSRs with the needed values in
>> the context switch path, don't you? Why is that writing not
>> sufficient?
>> 
> No, in context switch path, we only set ASSOC register to set COS ID into it so
> that the corresponding COS MSR value (CBM) can work.

Okay, so not the context switch path then, But you must be
changing the MSRs _somewhere_, and the question is why this
somewhere isn't sufficient.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 07/25] x86: refactor psr: L3 CAT: implement get hw info flow.
  2017-04-06 11:16         ` Yi Sun
@ 2017-04-06 14:04           ` Jan Beulich
  2017-04-07  5:39             ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-06 14:04 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 06.04.17 at 13:16, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-06 02:36:19, Jan Beulich wrote:
>> >>> On 06.04.17 at 08:05, <yi.y.sun@linux.intel.com> wrote:
>> > On 17-04-05 09:37:44, Jan Beulich wrote:
>> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
>> >> > @@ -183,6 +187,22 @@ static bool feat_init_done(const struct 
> psr_socket_info *info)
>> >> >      return false;
>> >> >  }
>> >> >  
>> >> > +static enum psr_feat_type psr_cbm_type_to_feat_type(enum cbm_type type)
>> >> > +{
>> >> > +    enum psr_feat_type feat_type;
>> >> > +
>> >> > +    switch ( type )
>> >> > +    {
>> >> > +    case PSR_CBM_TYPE_L3:
>> >> > +        feat_type = PSR_SOCKET_L3_CAT;
>> >> > +        break;
>> >> > +    default:
>> >> > +        ASSERT_UNREACHABLE();
>> >> > +    }
>> >> > +
>> >> > +    return feat_type;
>> >> 
>> >> I'm pretty certain this will (validly) produce an uninitialized variable
>> >> warning at least in a non-debug build. Not how I did say "add
>> >> ASSERT_UNREACHABLE()" in the v9 review.
>> >> 
>> > Do you mean to init feat_type to 'PSR_SOCKET_MAX_FEAT' and then check it
>> > at the end of function using ASSERT?
>> 
>> That's a (less desirable) option, but what I really mean is take v9
>> code and _add_ ASSERT_UNREACHABLE() first thing in the default
>> case.
> 
> DYM we should initialize 'feat_type' to a valid value, e.g. PSR_SOCKET_L3_CAT
> and keep ASSERT_UNREACHABLE() in default case?

Yi, please. Did you read my previous reply, where I think I did say
very precisely what I mean? Why would you pick some random
valid type here?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 08/25] x86: refactor psr: L3 CAT: implement get value flow.
  2017-04-06 11:13         ` Yi Sun
@ 2017-04-06 14:08           ` Jan Beulich
  2017-04-07  5:40             ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-06 14:08 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 06.04.17 at 13:13, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-06 02:40:01, Jan Beulich wrote:
>> >>> On 06.04.17 at 08:10, <yi.y.sun@linux.intel.com> wrote:
>> > On 17-04-05 09:51:44, Jan Beulich wrote:
>> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> 
> [...]
>> >> > --- a/xen/arch/x86/psr.c
>> >> > +++ b/xen/arch/x86/psr.c
>> >> > @@ -97,6 +97,10 @@ struct feat_node {
>> >> >          /* get_feat_info is used to get feature HW info. */
>> >> >          bool (*get_feat_info)(const struct feat_node *feat,
>> >> >                                uint32_t data[], unsigned int array_len);
>> >> > +
>> >> > +        /* get_val is used to get feature COS register value. */
>> >> > +        void (*get_val)(const struct feat_node *feat, unsigned int cos,
>> >> > +                        uint32_t *val);
>> >> >      } *props;
>> >> >  
>> >> >      uint32_t cos_reg_val[MAX_COS_REG_CNT];
>> >> > @@ -265,10 +269,17 @@ static bool cat_get_feat_info(const struct feat_node 
> *feat,
>> >> >      return true;
>> >> >  }
>> >> >  
>> >> > +static void cat_get_val(const struct feat_node *feat, unsigned int cos,
>> >> > +                        uint32_t *val)
>> >> > +{
>> >> > +    *val = feat->cos_reg_val[cos];
>> >> > +}
>> >> 
>> >> This can be done by the caller - there's nothing feature specific in
>> >> here, so there's no need for a hook.
>> >> 
>> > Hmm, CDP's 'get_val' is different so that we need this hook. Do you mean I
>> > should create this CAT's 'get_val' hook when implementing CDP patch?
>> 
>> No, not really - doesn't the type-to-index mapping array (or
>> whichever way it ends up being) all take care of the feature
>> specific aspects here?
>>
> For CDP case, the value getting depends on type. If we don't have this hook,
> we have to do special handling in main flow.
> 
> Still use 'fits_cos_max' as example:
>     /* For CDP, DATA is the first item in val[], CODE is the second. */
>     for ( j = 0; j < feat->props->cos_num; j++ )                       
>     {                                                                  
>         feat->props->get_val(feat, 0, feat->props->type[j],            
>                              &default_val);                          
>         if ( val[j] != default_val )                              
>             return false;                                       
>     } 
> 
> We want to get CDP DATA and CODE one by one to compare with val[] as the 
> order.
> If we do not have 'get_val', how can we handle this case? Getting DATA is
> different with getting CODE which is shown below. Even we have 
> type-to-index,
> we still need the hook to help I think. So far, I cannot figure out a 
> generic
> way.
> Get data: (feat)->cos_reg_val[(cos) * 2]
> Get code: (feat)->cos_reg_val[(cos) * 2 + 1]

Which makes it

    for ( i = 0; i < props->cos_num; ++i )
        val[i] = feat->cos_reg_val[cos * props->cos_num + i];

?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow.
  2017-04-06 14:02               ` Jan Beulich
@ 2017-04-07  5:17                 ` Yi Sun
  2017-04-07  8:48                   ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-07  5:17 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-06 08:02:15, Jan Beulich wrote:
> >>> On 06.04.17 at 12:02, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-06 03:34:27, Jan Beulich wrote:
> >> >>> On 06.04.17 at 11:22, <yi.y.sun@linux.intel.com> wrote:
> >> > On 17-04-06 02:32:04, Jan Beulich wrote:
> >> >> >>> On 06.04.17 at 07:49, <yi.y.sun@linux.intel.com> wrote:
> >> >> > On 17-04-05 09:10:58, Jan Beulich wrote:
> >> >> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> >> >> >> > +static void cat_init_feature(const struct cpuid_leaf *regs,
> >> >> >> > +                             struct feat_node *feat,
> >> >> >> > +                             struct psr_socket_info *info,
> >> >> >> > +                             enum psr_feat_type type)
> >> >> >> > +{
> >> >> >> > +    unsigned int socket, i;
> >> >> >> > +
> >> >> >> > +    /* No valid value so do not enable feature. */
> >> >> >> > +    if ( !regs->a || !regs->d )
> >> >> >> > +        return;
> >> >> >> > +
> >> >> >> > +    feat->props->cbm_len = (regs->a & CAT_CBM_LEN_MASK) + 1;
> >> >> >> > +    feat->props->cos_max = min(opt_cos_max, regs->d & CAT_COS_MAX_MASK);
> >> >> >> > +
> >> >> >> > +    switch ( type )
> >> >> >> > +    {
> >> >> >> > +    case PSR_SOCKET_L3_CAT:
> >> >> >> > +        /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
> >> >> >> > +        feat->cos_reg_val[0] = cat_default_val(feat->props->cbm_len);
> >> >> >> > +
> >> >> >> > +        /*
> >> >> >> > +         * To handle cpu offline and then online case, we need restore MSRs to
> >> >> >> > +         * default values.
> >> >> >> > +         */
> >> >> >> > +        for ( i = 1; i <= feat->props->cos_max; i++ )
> >> >> >> > +        {
> >> >> >> > +            wrmsrl(MSR_IA32_PSR_L3_MASK(i), feat->cos_reg_val[0]);
> >> >> >> > +            feat->cos_reg_val[i] = feat->cos_reg_val[0];
> >> >> >> > +        }
> >> >> >> 
> >> >> >> I continue to have difficulty with this: Why is offline-then-online
> >> >> >> any different from first-time-online? Why wouldn't setting the
> >> >> > 
> >> >> > May remove this comment. Per current codes, the MSRs are written to default
> >> >> > values no matter first time or not.
> >> >> > 
> >> >> >> registers to their intended values not be taken care of by
> >> >> >> context switch code, once vCPU-s get scheduled onto the newly
> >> >> >> onlined CPU?
> >> >> >> 
> >> >> > cat_init_feature is only called when the first CPU on a socket is online.
> >> >> > The MSRs to set are per socket. So, we only need set it once when socket
> >> >> > is online.
> >> >> 
> >> >> This does not answer my question. Once again - why does this need
> >> >> doing here explicitly, rather than relying on the needed values being
> >> >> loaded when the first vCPU gets scheduled onto one of the pCPU-s
> >> >> of this socket?
> >> >> 
> >> > I do not know if I understand your question correctly. Let me try to explain
> >> > again. As we discussed in v9, the MSRs values may be wrong values when socket
> >> > is online. That is the reason we have to restore them. 
> >> > 
> >> > The MSRs are per socket. That means only one group of MSRs on one socket. So
> >> > the setting on one CPU can make it valid on whole socket. The 'cat_init_feature'
> >> > is executed when the first CPU on a socket is online so we restore them here.
> >> 
> >> All understood. But you write the MSRs with the needed values in
> >> the context switch path, don't you? Why is that writing not
> >> sufficient?
> >> 
> > No, in context switch path, we only set ASSOC register to set COS ID into it so
> > that the corresponding COS MSR value (CBM) can work.
> 
> Okay, so not the context switch path then, But you must be
> changing the MSRs _somewhere_, and the question is why this
> somewhere isn't sufficient.
> 
Besides the restore behavior in init process, I restore the MSRs when ref[cos]
is reduced to 0. This behavior happens in two scenarios:
1. In a value setting process, restore MSR if the ref[old_cos] is reduced to 0.
2. When a domain is destroyed, its ref[cos] is reduced too.

Reason to restore is below:
  For features,  e.g. CDP, which cos_num is more than 1, we have to
  restore the old_cos value back to default when ref[old_cos] is 0.
  Otherwise, user will see wrong values when this COS ID is reused. E.g.
  user wants to set DATA to 0x3ff for a new domain. He hopes to see the
  DATA is set to 0x3ff and CODE should be the default value, 0x7ff. But
  if the COS ID picked for this action is the one that has been used by
  other domain and the CODE has been set to 0x1ff. Then, user will see
  DATA: 0x3ff, CODE: 0x1ff. So, we have to restore COS values for features
  using multiple COSs.

BRs,
Sun Yi

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 07/25] x86: refactor psr: L3 CAT: implement get hw info flow.
  2017-04-06 14:04           ` Jan Beulich
@ 2017-04-07  5:39             ` Yi Sun
  0 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-07  5:39 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-06 08:04:16, Jan Beulich wrote:
> >>> On 06.04.17 at 13:16, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-06 02:36:19, Jan Beulich wrote:
> >> >>> On 06.04.17 at 08:05, <yi.y.sun@linux.intel.com> wrote:
> >> > On 17-04-05 09:37:44, Jan Beulich wrote:
> >> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> >> >> > @@ -183,6 +187,22 @@ static bool feat_init_done(const struct 
> > psr_socket_info *info)
> >> >> >      return false;
> >> >> >  }
> >> >> >  
> >> >> > +static enum psr_feat_type psr_cbm_type_to_feat_type(enum cbm_type type)
> >> >> > +{
> >> >> > +    enum psr_feat_type feat_type;
> >> >> > +
> >> >> > +    switch ( type )
> >> >> > +    {
> >> >> > +    case PSR_CBM_TYPE_L3:
> >> >> > +        feat_type = PSR_SOCKET_L3_CAT;
> >> >> > +        break;
> >> >> > +    default:
> >> >> > +        ASSERT_UNREACHABLE();
> >> >> > +    }
> >> >> > +
> >> >> > +    return feat_type;
> >> >> 
> >> >> I'm pretty certain this will (validly) produce an uninitialized variable
> >> >> warning at least in a non-debug build. Not how I did say "add
> >> >> ASSERT_UNREACHABLE()" in the v9 review.
> >> >> 
> >> > Do you mean to init feat_type to 'PSR_SOCKET_MAX_FEAT' and then check it
> >> > at the end of function using ASSERT?
> >> 
> >> That's a (less desirable) option, but what I really mean is take v9
> >> code and _add_ ASSERT_UNREACHABLE() first thing in the default
> >> case.
> > 
> > DYM we should initialize 'feat_type' to a valid value, e.g. PSR_SOCKET_L3_CAT
> > and keep ASSERT_UNREACHABLE() in default case?
> 
> Yi, please. Did you read my previous reply, where I think I did say
> very precisely what I mean? Why would you pick some random
> valid type here?
> 
Sorry, I mis-understood your intention before. In v9 codes, I defined
'PSR_SOCKET_UNKNOWN' which is returned in default case. I thought you
wanted me to remove it.

I will add ASSERT_UNREACHABLE() and keep 'feat_type = PSR_SOCKET_UNKNOWN;'
in default case.

BRs,
Sun Yi

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 08/25] x86: refactor psr: L3 CAT: implement get value flow.
  2017-04-06 14:08           ` Jan Beulich
@ 2017-04-07  5:40             ` Yi Sun
  0 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-07  5:40 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-06 08:08:11, Jan Beulich wrote:
> >>> On 06.04.17 at 13:13, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-06 02:40:01, Jan Beulich wrote:
> >> >>> On 06.04.17 at 08:10, <yi.y.sun@linux.intel.com> wrote:
> >> > On 17-04-05 09:51:44, Jan Beulich wrote:
> >> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> > 
> > [...]
> >> >> > --- a/xen/arch/x86/psr.c
> >> >> > +++ b/xen/arch/x86/psr.c
> >> >> > @@ -97,6 +97,10 @@ struct feat_node {
> >> >> >          /* get_feat_info is used to get feature HW info. */
> >> >> >          bool (*get_feat_info)(const struct feat_node *feat,
> >> >> >                                uint32_t data[], unsigned int array_len);
> >> >> > +
> >> >> > +        /* get_val is used to get feature COS register value. */
> >> >> > +        void (*get_val)(const struct feat_node *feat, unsigned int cos,
> >> >> > +                        uint32_t *val);
> >> >> >      } *props;
> >> >> >  
> >> >> >      uint32_t cos_reg_val[MAX_COS_REG_CNT];
> >> >> > @@ -265,10 +269,17 @@ static bool cat_get_feat_info(const struct feat_node 
> > *feat,
> >> >> >      return true;
> >> >> >  }
> >> >> >  
> >> >> > +static void cat_get_val(const struct feat_node *feat, unsigned int cos,
> >> >> > +                        uint32_t *val)
> >> >> > +{
> >> >> > +    *val = feat->cos_reg_val[cos];
> >> >> > +}
> >> >> 
> >> >> This can be done by the caller - there's nothing feature specific in
> >> >> here, so there's no need for a hook.
> >> >> 
> >> > Hmm, CDP's 'get_val' is different so that we need this hook. Do you mean I
> >> > should create this CAT's 'get_val' hook when implementing CDP patch?
> >> 
> >> No, not really - doesn't the type-to-index mapping array (or
> >> whichever way it ends up being) all take care of the feature
> >> specific aspects here?
> >>
> > For CDP case, the value getting depends on type. If we don't have this hook,
> > we have to do special handling in main flow.
> > 
> > Still use 'fits_cos_max' as example:
> >     /* For CDP, DATA is the first item in val[], CODE is the second. */
> >     for ( j = 0; j < feat->props->cos_num; j++ )                       
> >     {                                                                  
> >         feat->props->get_val(feat, 0, feat->props->type[j],            
> >                              &default_val);                          
> >         if ( val[j] != default_val )                              
> >             return false;                                       
> >     } 
> > 
> > We want to get CDP DATA and CODE one by one to compare with val[] as the 
> > order.
> > If we do not have 'get_val', how can we handle this case? Getting DATA is
> > different with getting CODE which is shown below. Even we have 
> > type-to-index,
> > we still need the hook to help I think. So far, I cannot figure out a 
> > generic
> > way.
> > Get data: (feat)->cos_reg_val[(cos) * 2]
> > Get code: (feat)->cos_reg_val[(cos) * 2 + 1]
> 
> Which makes it
> 
>     for ( i = 0; i < props->cos_num; ++i )
>         val[i] = feat->cos_reg_val[cos * props->cos_num + i];
> 
> ?
> 
Great, thanks! Then, I will remove 'get_val' hook.

BRs,
Sun Yi

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow.
  2017-04-07  5:17                 ` Yi Sun
@ 2017-04-07  8:48                   ` Jan Beulich
  2017-04-07  9:08                     ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-07  8:48 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 07.04.17 at 07:17, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-06 08:02:15, Jan Beulich wrote:
>> Okay, so not the context switch path then, But you must be
>> changing the MSRs _somewhere_, and the question is why this
>> somewhere isn't sufficient.
>> 
> Besides the restore behavior in init process, I restore the MSRs when 
> ref[cos]
> is reduced to 0. This behavior happens in two scenarios:
> 1. In a value setting process, restore MSR if the ref[old_cos] is reduced to 0.
> 2. When a domain is destroyed, its ref[cos] is reduced too.
> 
> Reason to restore is below:
>   For features,  e.g. CDP, which cos_num is more than 1, we have to
>   restore the old_cos value back to default when ref[old_cos] is 0.
>   Otherwise, user will see wrong values when this COS ID is reused. E.g.
>   user wants to set DATA to 0x3ff for a new domain. He hopes to see the
>   DATA is set to 0x3ff and CODE should be the default value, 0x7ff. But
>   if the COS ID picked for this action is the one that has been used by
>   other domain and the CODE has been set to 0x1ff. Then, user will see
>   DATA: 0x3ff, CODE: 0x1ff. So, we have to restore COS values for features
>   using multiple COSs.

I still don't feel my question being answered. Without a COS
ever allocated on a socket, how can values in MSRs other than
the first (index 0) one be used by anyone? I ask because if they
can't be used, their values don't matter (all you need to make
sure is to write them regardless of their currently cached value).
If syncing MSR and cached values at init time is to make sure
those writes won#t be bypassed, that would be a legitimate
explanation (if the alternatives, like introducing a separate flag,
would be overall more expensive or uglier). But the reason(s)
need(s) to be properly explained (and by "properly" I don't
mean a _long_ code comment, but a _precise_ one).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow.
  2017-04-07  8:48                   ` Jan Beulich
@ 2017-04-07  9:08                     ` Yi Sun
  2017-04-07  9:46                       ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-07  9:08 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, chao.p.peng, xen-devel, roger.pau

On 17-04-07 02:48:40, Jan Beulich wrote:
> >>> On 07.04.17 at 07:17, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-06 08:02:15, Jan Beulich wrote:
> >> Okay, so not the context switch path then, But you must be
> >> changing the MSRs _somewhere_, and the question is why this
> >> somewhere isn't sufficient.
> >> 
> > Besides the restore behavior in init process, I restore the MSRs when 
> > ref[cos]
> > is reduced to 0. This behavior happens in two scenarios:
> > 1. In a value setting process, restore MSR if the ref[old_cos] is reduced to 0.
> > 2. When a domain is destroyed, its ref[cos] is reduced too.
> > 
> > Reason to restore is below:
> >   For features,  e.g. CDP, which cos_num is more than 1, we have to
> >   restore the old_cos value back to default when ref[old_cos] is 0.
> >   Otherwise, user will see wrong values when this COS ID is reused. E.g.
> >   user wants to set DATA to 0x3ff for a new domain. He hopes to see the
> >   DATA is set to 0x3ff and CODE should be the default value, 0x7ff. But
> >   if the COS ID picked for this action is the one that has been used by
> >   other domain and the CODE has been set to 0x1ff. Then, user will see
> >   DATA: 0x3ff, CODE: 0x1ff. So, we have to restore COS values for features
> >   using multiple COSs.
> 
> I still don't feel my question being answered. Without a COS
> ever allocated on a socket, how can values in MSRs other than
> the first (index 0) one be used by anyone? I ask because if they
> can't be used, their values don't matter (all you need to make
> sure is to write them regardless of their currently cached value).

The COS ID using is managed by domain (d->arch.psr_cos_ids[socket]). Even if a
socket is offline, the COS ID saved in domain is still valid (domain is alive).
When this socket is online again, the domain may be scheduled onto it to run.
Then, the COS ID (e.g 2) will be used to get/set value for this domain. If we
don't restore MSRs on the socket, we may get an unknown value. This the reason
we have to restore MSRs in 'cat_init_feature' which is called when socket is
online.

Per explanation above (in previous mail), we have to restore MSRs when ref[cos]
is reduced to 0.

Those are all sencarios and reasons to restore MSRs. I don't know if the
explanations are precise enough. Any unclear, please let me know. Thanks!

> If syncing MSR and cached values at init time is to make sure
> those writes won#t be bypassed, that would be a legitimate
> explanation (if the alternatives, like introducing a separate flag,
> would be overall more expensive or uglier). But the reason(s)
> need(s) to be properly explained (and by "properly" I don't
> mean a _long_ code comment, but a _precise_ one).
> 
> Jan
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow.
  2017-04-07  9:08                     ` Yi Sun
@ 2017-04-07  9:46                       ` Jan Beulich
  2017-04-10  3:27                         ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-07  9:46 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 07.04.17 at 11:08, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-07 02:48:40, Jan Beulich wrote:
>> >>> On 07.04.17 at 07:17, <yi.y.sun@linux.intel.com> wrote:
>> > On 17-04-06 08:02:15, Jan Beulich wrote:
>> >> Okay, so not the context switch path then, But you must be
>> >> changing the MSRs _somewhere_, and the question is why this
>> >> somewhere isn't sufficient.
>> >> 
>> > Besides the restore behavior in init process, I restore the MSRs when 
>> > ref[cos]
>> > is reduced to 0. This behavior happens in two scenarios:
>> > 1. In a value setting process, restore MSR if the ref[old_cos] is reduced to 0.
>> > 2. When a domain is destroyed, its ref[cos] is reduced too.
>> > 
>> > Reason to restore is below:
>> >   For features,  e.g. CDP, which cos_num is more than 1, we have to
>> >   restore the old_cos value back to default when ref[old_cos] is 0.
>> >   Otherwise, user will see wrong values when this COS ID is reused. E.g.
>> >   user wants to set DATA to 0x3ff for a new domain. He hopes to see the
>> >   DATA is set to 0x3ff and CODE should be the default value, 0x7ff. But
>> >   if the COS ID picked for this action is the one that has been used by
>> >   other domain and the CODE has been set to 0x1ff. Then, user will see
>> >   DATA: 0x3ff, CODE: 0x1ff. So, we have to restore COS values for features
>> >   using multiple COSs.
>> 
>> I still don't feel my question being answered. Without a COS
>> ever allocated on a socket, how can values in MSRs other than
>> the first (index 0) one be used by anyone? I ask because if they
>> can't be used, their values don't matter (all you need to make
>> sure is to write them regardless of their currently cached value).
> 
> The COS ID using is managed by domain (d->arch.psr_cos_ids[socket]). Even if a
> socket is offline, the COS ID saved in domain is still valid (domain is alive).
> When this socket is online again, the domain may be scheduled onto it to run.
> Then, the COS ID (e.g 2) will be used to get/set value for this domain. If we
> don't restore MSRs on the socket, we may get an unknown value. This the reason
> we have to restore MSRs in 'cat_init_feature' which is called when socket is
> online.

No, it's not. At least that's not the only way to solve the problem:
As said, you could equally well make sure you always write the MSR
the first time a vCPU using a particular COS is being scheduled onto
the newly onlined pCPU. In fact most of the time it is better if
generic code can also deal with special cases than to introduce
special purpose code. Hence my insisting on you properly explaining
why generic code can't deal with the post-online situation here.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow.
  2017-04-07  9:46                       ` Jan Beulich
@ 2017-04-10  3:27                         ` Yi Sun
  2017-04-10 12:43                           ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-10  3:27 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-07 03:46:42, Jan Beulich wrote:
> >>> On 07.04.17 at 11:08, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-07 02:48:40, Jan Beulich wrote:
> >> >>> On 07.04.17 at 07:17, <yi.y.sun@linux.intel.com> wrote:
> >> > On 17-04-06 08:02:15, Jan Beulich wrote:
> >> >> Okay, so not the context switch path then, But you must be
> >> >> changing the MSRs _somewhere_, and the question is why this
> >> >> somewhere isn't sufficient.
> >> >> 
> >> > Besides the restore behavior in init process, I restore the MSRs when 
> >> > ref[cos]
> >> > is reduced to 0. This behavior happens in two scenarios:
> >> > 1. In a value setting process, restore MSR if the ref[old_cos] is reduced to 0.
> >> > 2. When a domain is destroyed, its ref[cos] is reduced too.
> >> > 
> >> > Reason to restore is below:
> >> >   For features,  e.g. CDP, which cos_num is more than 1, we have to
> >> >   restore the old_cos value back to default when ref[old_cos] is 0.
> >> >   Otherwise, user will see wrong values when this COS ID is reused. E.g.
> >> >   user wants to set DATA to 0x3ff for a new domain. He hopes to see the
> >> >   DATA is set to 0x3ff and CODE should be the default value, 0x7ff. But
> >> >   if the COS ID picked for this action is the one that has been used by
> >> >   other domain and the CODE has been set to 0x1ff. Then, user will see
> >> >   DATA: 0x3ff, CODE: 0x1ff. So, we have to restore COS values for features
> >> >   using multiple COSs.
> >> 
> >> I still don't feel my question being answered. Without a COS
> >> ever allocated on a socket, how can values in MSRs other than
> >> the first (index 0) one be used by anyone? I ask because if they
> >> can't be used, their values don't matter (all you need to make
> >> sure is to write them regardless of their currently cached value).
> > 
> > The COS ID using is managed by domain (d->arch.psr_cos_ids[socket]). Even if a
> > socket is offline, the COS ID saved in domain is still valid (domain is alive).
> > When this socket is online again, the domain may be scheduled onto it to run.
> > Then, the COS ID (e.g 2) will be used to get/set value for this domain. If we
> > don't restore MSRs on the socket, we may get an unknown value. This the reason
> > we have to restore MSRs in 'cat_init_feature' which is called when socket is
> > online.
> 
> No, it's not. At least that's not the only way to solve the problem:
> As said, you could equally well make sure you always write the MSR
> the first time a vCPU using a particular COS is being scheduled onto
> the newly onlined pCPU. In fact most of the time it is better if
> generic code can also deal with special cases than to introduce
> special purpose code. Hence my insisting on you properly explaining
> why generic code can't deal with the post-online situation here.
> 
Then, I propose another simple solution to handle this case. When writing
MSRs, check if all values in val_array are same as old values (kept in
'cos_reg_val[]'). If a value is different, the MSR will be written and the
corresponding member in 'cos_reg_val[]' will be updated. Still use above
sample, user wants to set DATA to 0x3ff for a new domain. After value array
assembling, the val_array will be val[DATA]=0x3ff, val[CODE]=0x7ff. The
cos_reg_val[DATA]=0x7ff and cos_reg_val[DATA]=0x1ff. So, both of them will
be updated.

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow.
  2017-04-10  3:27                         ` Yi Sun
@ 2017-04-10 12:43                           ` Yi Sun
  0 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-10 12:43 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, chao.p.peng, xen-devel, roger.pau

On 17-04-10 11:27:04, Yi Sun wrote:
> On 17-04-07 03:46:42, Jan Beulich wrote:
> > >>> On 07.04.17 at 11:08, <yi.y.sun@linux.intel.com> wrote:
> > > On 17-04-07 02:48:40, Jan Beulich wrote:
> > >> >>> On 07.04.17 at 07:17, <yi.y.sun@linux.intel.com> wrote:
> > >> > On 17-04-06 08:02:15, Jan Beulich wrote:
> > >> >> Okay, so not the context switch path then, But you must be
> > >> >> changing the MSRs _somewhere_, and the question is why this
> > >> >> somewhere isn't sufficient.
> > >> >> 
> > >> > Besides the restore behavior in init process, I restore the MSRs when 
> > >> > ref[cos]
> > >> > is reduced to 0. This behavior happens in two scenarios:
> > >> > 1. In a value setting process, restore MSR if the ref[old_cos] is reduced to 0.
> > >> > 2. When a domain is destroyed, its ref[cos] is reduced too.
> > >> > 
> > >> > Reason to restore is below:
> > >> >   For features,  e.g. CDP, which cos_num is more than 1, we have to
> > >> >   restore the old_cos value back to default when ref[old_cos] is 0.
> > >> >   Otherwise, user will see wrong values when this COS ID is reused. E.g.
> > >> >   user wants to set DATA to 0x3ff for a new domain. He hopes to see the
> > >> >   DATA is set to 0x3ff and CODE should be the default value, 0x7ff. But
> > >> >   if the COS ID picked for this action is the one that has been used by
> > >> >   other domain and the CODE has been set to 0x1ff. Then, user will see
> > >> >   DATA: 0x3ff, CODE: 0x1ff. So, we have to restore COS values for features
> > >> >   using multiple COSs.
> > >> 
> > >> I still don't feel my question being answered. Without a COS
> > >> ever allocated on a socket, how can values in MSRs other than
> > >> the first (index 0) one be used by anyone? I ask because if they
> > >> can't be used, their values don't matter (all you need to make
> > >> sure is to write them regardless of their currently cached value).
> > > 
> > > The COS ID using is managed by domain (d->arch.psr_cos_ids[socket]). Even if a
> > > socket is offline, the COS ID saved in domain is still valid (domain is alive).
> > > When this socket is online again, the domain may be scheduled onto it to run.
> > > Then, the COS ID (e.g 2) will be used to get/set value for this domain. If we
> > > don't restore MSRs on the socket, we may get an unknown value. This the reason
> > > we have to restore MSRs in 'cat_init_feature' which is called when socket is
> > > online.
> > 
> > No, it's not. At least that's not the only way to solve the problem:
> > As said, you could equally well make sure you always write the MSR
> > the first time a vCPU using a particular COS is being scheduled onto
> > the newly onlined pCPU. In fact most of the time it is better if
> > generic code can also deal with special cases than to introduce
> > special purpose code. Hence my insisting on you properly explaining
> > why generic code can't deal with the post-online situation here.
> > 
> Then, I propose another simple solution to handle this case. When writing
> MSRs, check if all values in val_array are same as old values (kept in
> 'cos_reg_val[]'). If a value is different, the MSR will be written and the
> corresponding member in 'cos_reg_val[]' will be updated. Still use above
> sample, user wants to set DATA to 0x3ff for a new domain. After value array
> assembling, the val_array will be val[DATA]=0x3ff, val[CODE]=0x7ff. The
> cos_reg_val[DATA]=0x7ff and cos_reg_val[DATA]=0x1ff. So, both of them will
> be updated.
> 
I want to explain more to make it clearer. The original codes are to solve the
issues when a COS ID is free and reused by other domain. With the newly
proposed method above, we can solve this issue too. E.g:
1. COS ID 1 is used by domain 1. Its CODE=0x1ff, DATA=0x7ff. They are kept in
   cos_reg_val[1 - CODE] and cos_reg_val[1 - DATA].
2. For some reason, COS ID 1 is free.
3. We want to set DATA of a new domain (8) to 0x3ff. Its old_cos is 0 so that
   val_array assembled is CODE=0x7ff, DATA=0x3ff. Then COS ID 1 is picked.
4. In write flow, iterate the feature's type array according to cos_num to
   set both CODE and DATA if val_array member is different with cos_reg_val
   member.

Below are codes to explain it clearly:

static void l3_cdp_write_msr(unsigned int cos, uint32_t val, ...)
{
    /* Data. For above case, get_cdp_data(..., cos) returns 0x7ff */
    if ( type == PSR_CBM_TYPE_L3_DATA && get_cdp_data(feat, cos) != val )
    {
        get_cdp_data(feat, cos) = val;
        wrmsrl(MSR_IA32_PSR_L3_MASK_DATA(cos), val);
    }
    /* Code. For above case, get_cdp_code(..., cos) returns 0x1ff */
    if ( type == PSR_CBM_TYPE_L3_CODE && get_cdp_code(feat, cos) != val )
    {
        get_cdp_code(feat, cos) = val;
        wrmsrl(MSR_IA32_PSR_L3_MASK_CODE(cos), val);
    }
}

struct cos_write_info
{
    unsigned int cos;
    struct feat_node *feature;
    /* All COS MSRs values of the feature. */
    uint32_t *val;
    unsigned int array_len;
    enum psr_feat_type feat_type;
};

static void do_write_psr_msr(void *data)
{
...
    /* Write all COS MSRs of the feature. */
    for ( i = 0; i < props[info->feat_type]->cos_num; i++ )
        props[info->feat_type]->write_msr(cos, info->val[i],
                                          props[info->feat_type]->type[i], feat);
}

static int write_psr_msr(unsigned int socket, unsigned int cos, ...)
{
...
    /* Skip to the feature's value head. */
    for ( i = 0; i < feat_type; i++ )
    {
        if ( !info->features[i] )
            continue;

        if ( array_len <= props[i].cos_num )
            return -ENOSPC;

        array_len -= props[i].cos_num;

        val += props[i].cos_num;
    }

    data.val = val;
    data.array_len = array_len;
...
}

int psr_set_val(struct domain *d, unsigned int socket, ...)
{
...
    /* Gather and insert value array. Array should be val[DATA]=0x3ff, val[CODE]=0x7ff. */
    gather_val_array(val_array,...);
    insert_val_to_array(val_array,...);
 
    /* Input val_array which has all COS MSRs values of all features. */
    ret = write_psr_msr(socket, cos, val_array, array_len, feat_type);
...
}

> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-01 13:53 ` [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework Yi Sun
@ 2017-04-11 15:01   ` Jan Beulich
  2017-04-12  5:53     ` Yi Sun
  2017-04-20  5:38   ` [PATCH] dom_ids array implementation Yi Sun
  1 sibling, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-11 15:01 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -157,10 +157,26 @@ static void free_socket_resources(unsigned int socket)
>  {
>      unsigned int i;
>      struct psr_socket_info *info = socket_info + socket;
> +    struct domain *d;
>  
>      if ( !info )
>          return;
>  
> +    /* Restore domain cos id to 0 when socket is offline. */
> +    for_each_domain ( d )
> +    {
> +        unsigned int cos = d->arch.psr_cos_ids[socket];
> +        if ( cos == 0 )

Blank line between declaration(s) and statement(s) please.

> +            continue;
> +
> +        spin_lock(&info->ref_lock);
> +        ASSERT(!cos || info->cos_ref[cos]);

You've checked cos to be non-zero already above.

> +        info->cos_ref[cos]--;
> +        spin_unlock(&info->ref_lock);
> +
> +        d->arch.psr_cos_ids[socket] = 0;
> +    }

Overall, while you say in the revision log that this was a suggestion of
mine, I don't recall any such (and I've just checked the v9 thread of
this patch without finding anything), and hence it's not really clear to
me why this is needed. After all you should be freeing info anyway
(albeit I can't see this happening, which would look to be a bug in
patch 5), so getting the refcounts adjusted seems pointless in any
event. Whether d->arch.psr_cos_ids[socket] needs clearing I'm not
certain - this may indeed by unavoidable, to match up with
psr_alloc_cos() using xzalloc.

Furthermore I'm not at all convinced this is appropriate to do in the
context of a CPU_UP_CANCELED / CPU_DEAD notification: If you
have a few thousand VMs, the loop above may take a while.

> @@ -574,15 +590,210 @@ int psr_get_val(struct domain *d, unsigned int socket,
>      return 0;
>  }
>  
> -int psr_set_l3_cbm(struct domain *d, unsigned int socket,
> -                   uint64_t cbm, enum cbm_type type)
> +/* Set value functions */
> +static unsigned int get_cos_num(const struct psr_socket_info *info)
>  {
>      return 0;
>  }
>  
> +static int gather_val_array(uint32_t val[],
> +                            unsigned int array_len,
> +                            const struct psr_socket_info *info,
> +                            unsigned int old_cos)
> +{
> +    return -EINVAL;
> +}
> +
> +static int insert_val_to_array(uint32_t val[],

As indicated before, I'm pretty convinced "into" would be more
natural than "to".

> +                               unsigned int array_len,
> +                               const struct psr_socket_info *info,
> +                               enum psr_feat_type feat_type,
> +                               enum cbm_type type,
> +                               uint32_t new_val)
> +{
> +    return -EINVAL;
> +}
> +
> +static int find_cos(const uint32_t val[], unsigned int array_len,
> +                    enum psr_feat_type feat_type,
> +                    const struct psr_socket_info *info,
> +                    spinlock_t *ref_lock)

I don't think I did suggest adding another parameter here. The lock
is accessible from info, isn't it? In which case, as I _did_ suggest,
you should drop const from it instead. But ...

> +{
> +    ASSERT(spin_is_locked(ref_lock));

... the assertion seems rather pointless anyway with there just
being a single caller, which very visibly acquires the lock up front.

> +static int pick_avail_cos(const struct psr_socket_info *info,
> +                          spinlock_t *ref_lock,

Same here then.

> +int psr_set_val(struct domain *d, unsigned int socket,
> +                uint32_t val, enum cbm_type type)
> +{
> +    unsigned int old_cos;
> +    int cos, ret;
> +    unsigned int *ref;
> +    uint32_t *val_array;
> +    struct psr_socket_info *info = get_socket_info(socket);
> +    unsigned int array_len;
> +    enum psr_feat_type feat_type;
> +
> +    if ( IS_ERR(info) )
> +        return PTR_ERR(info);
> +
> +    feat_type = psr_cbm_type_to_feat_type(type);
> +    if ( feat_type > ARRAY_SIZE(info->features) ||

I think I did point out the off-by-one mistake here in an earlier patch
already.

> +         !info->features[feat_type] )
> +        return -ENOENT;
> +
> +    /*
> +     * Step 0:
> +     * old_cos means the COS ID current domain is using. By default, it is 0.
> +     *
> +     * For every COS ID, there is a reference count to record how many domains
> +     * are using the COS register corresponding to this COS ID.
> +     * - If ref[old_cos] is 0, that means this COS is not used by any domain.
> +     * - If ref[old_cos] is 1, that means this COS is only used by current
> +     *   domain.
> +     * - If ref[old_cos] is more than 1, that mean multiple domains are using
> +     *   this COS.
> +     */
> +    old_cos = d->arch.psr_cos_ids[socket];
> +    ASSERT(old_cos < MAX_COS_REG_CNT);
> +
> +    ref = info->cos_ref;
> +
> +    /*
> +     * Step 1:
> +     * Gather a value array to store all features cos_reg_val[old_cos].
> +     * And, set the input new val into array according to the feature's
> +     * position in array.
> +     */
> +    array_len = get_cos_num(info);
> +    val_array = xzalloc_array(uint32_t, array_len);
> +    if ( !val_array )
> +        return -ENOMEM;
> +
> +    if ( (ret = gather_val_array(val_array, array_len, info, old_cos)) != 0 )
> +        goto free_array;
> +
> +    if ( (ret = insert_val_to_array(val_array, array_len, info,
> +                                    feat_type, type, val)) != 0 )
> +        goto free_array;
> +
> +    spin_lock(&info->ref_lock);
> +
> +    /*
> +     * Step 2:
> +     * Try to find if there is already a COS ID on which all features' values
> +     * are same as the array. Then, we can reuse this COS ID.
> +     */
> +    cos = find_cos(val_array, array_len, feat_type, info, &info->ref_lock);
> +    if ( cos == old_cos )
> +    {
> +        ret = 0;
> +        goto unlock_free_array;
> +    }
> +
> +    /*
> +     * Step 3:
> +     * If fail to find, we need pick an available COS ID.
> +     * In fact, only COS ID which ref is 1 or 0 can be picked for current
> +     * domain. If old_cos is not 0 and its ref==1, that means only current
> +     * domain is using this old_cos ID. So, this old_cos ID certainly can
> +     * be reused by current domain. Ref==0 means there is no any domain
> +     * using this COS ID. So it can be used for current domain too.
> +     */
> +    if ( cos < 0 )
> +    {
> +        cos = pick_avail_cos(info, &info->ref_lock, val_array,
> +                             array_len, old_cos, feat_type);
> +        if ( cos < 0 )
> +        {
> +            ret = cos;
> +            goto unlock_free_array;
> +        }
> +
> +        /*
> +         * Step 4:
> +         * Write all features MSRs according to the COS ID.
> +         */
> +        ret = write_psr_msr(socket, cos, val, feat_type);
> +        if ( ret )
> +            goto unlock_free_array;
> +    }
> +
> +    /*
> +     * Step 5:
> +     * Find the COS ID (find_cos result is '>= 0' or an available COS ID is
> +     * picked, then update ref according to COS ID.
> +     */
> +    ref[cos]++;
> +    ASSERT(!cos || ref[cos]);
> +    ASSERT(!old_cos || ref[old_cos]);
> +    ref[old_cos]--;
> +    spin_unlock(&info->ref_lock);
> +
> +    /*
> +     * Step 6:
> +     * Save the COS ID into current domain's psr_cos_ids[] so that we can know
> +     * which COS the domain is using on the socket. One domain can only use
> +     * one COS ID at same time on each socket.
> +     */
> +    d->arch.psr_cos_ids[socket] = cos;
> +

Please put the "free_array" label here instead of duplicating the code
below.

> +    xfree(val_array);
> +    return ret;
> +
> + unlock_free_array:
> +    spin_unlock(&info->ref_lock);
> + free_array:
> +    xfree(val_array);
> +    return ret;
> +}
> +
>  /* Called with domain lock held, no extra lock needed for 'psr_cos_ids' */
>  static void psr_free_cos(struct domain *d)
>  {
> +    unsigned int socket, cos;
> +
> +    ASSERT(socket_info);
> +
> +    if ( !d->arch.psr_cos_ids )
> +        return;
> +
> +    /* Domain is destroied so its cos_ref should be decreased. */
> +    for ( socket = 0; socket < nr_sockets; socket++ )
> +    {
> +        struct psr_socket_info *info;
> +
> +        /* cos 0 is default one which does not need be handled. */
> +        cos = d->arch.psr_cos_ids[socket];
> +        if ( cos == 0 )
> +            continue;
> +
> +        info = socket_info + socket;
> +        spin_lock(&info->ref_lock);
> +        ASSERT(!cos || info->cos_ref[cos]);
> +        info->cos_ref[cos]--;

This recurring pattern of assertion and decrement could surely be
put in a helper function (and the for symmetry also for increment).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 10/25] x86: refactor psr: L3 CAT: set value: assemble features value array.
  2017-04-01 13:53 ` [PATCH v10 10/25] x86: refactor psr: L3 CAT: set value: assemble features value array Yi Sun
@ 2017-04-11 15:11   ` Jan Beulich
  2017-04-12  5:55     ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-11 15:11 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> @@ -593,7 +616,21 @@ int psr_get_val(struct domain *d, unsigned int socket,
>  /* Set value functions */
>  static unsigned int get_cos_num(const struct psr_socket_info *info)
>  {
> -    return 0;
> +    unsigned int num = 0, i;
> +
> +    /* Get all features total amount. */
> +    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> +    {
> +        const struct feat_node *feat = info->features[i];
> +        if ( !feat )

Blank line between ... (and I likely won't repeat this any further)

> +            continue;
> +
> +        feat = info->features[i];

???

> @@ -611,7 +679,40 @@ static int insert_val_to_array(uint32_t val[],
>                                 enum cbm_type type,
>                                 uint32_t new_val)
>  {
> -    return -EINVAL;
> +    const struct feat_node *feat;
> +    unsigned int i;
> +
> +    ASSERT(feat_type < PSR_SOCKET_MAX_FEAT);
> +
> +    /* Insert new value into array according to feature's position in array. */
> +    for ( i = 0; i < feat_type; i++ )
> +    {
> +        feat = info->features[i];
> +        if ( !feat )
> +            continue;
> +
> +        if ( array_len <= feat->props->cos_num )
> +            return -ENOSPC;
> +
> +        array_len -= feat->props->cos_num;
> +
> +        val += feat->props->cos_num;
> +    }
> +
> +    feat = info->features[feat_type];
> +    if ( !feat )
> +        return -ENOENT;
> +
> +    if ( array_len < feat->props->cos_num )
> +        return -ENOSPC;
> +
> +    if ( !psr_check_cbm(feat->props->cbm_len, new_val) )
> +        return -EINVAL;
> +
> +    /* Value setting position is same as feature array. */
> +    val[0] = new_val;

How come this is array index 0 unconditionally, when cos_num
may be greater than 1?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 11/25] x86: refactor psr: L3 CAT: set value: implement cos finding flow.
  2017-04-01 13:53 ` [PATCH v10 11/25] x86: refactor psr: L3 CAT: set value: implement cos finding flow Yi Sun
@ 2017-04-11 15:17   ` Jan Beulich
  0 siblings, 0 replies; 114+ messages in thread
From: Jan Beulich @ 2017-04-11 15:17 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -720,8 +720,83 @@ static int find_cos(const uint32_t val[], unsigned int array_len,
>                      const struct psr_socket_info *info,
>                      spinlock_t *ref_lock)
>  {
> +    unsigned int cos, i;
> +    const unsigned int *ref = info->cos_ref;
> +    const struct feat_node *feat;
> +    unsigned int cos_max;
> +
>      ASSERT(spin_is_locked(ref_lock));
>  
> +    /* cos_max is the one of the feature which is being set. */
> +    feat = info->features[feat_type];
> +    if ( !feat )
> +        return -ENOENT;
> +
> +    cos_max = feat->props->cos_max;
> +
> +    for ( cos = 0; cos <= cos_max; cos++ )
> +    {
> +        const uint32_t *val_ptr = val;
> +        bool found = false;
> +
> +        if ( cos && !ref[cos] )
> +            continue;
> +
> +        /*
> +         * If fail to find cos in below loop, need find whole feature array
> +         * again from beginning.
> +         */

This comment is unrelated to any of the immediately surrounding
code. Either move it, or get rid of it altogether.

> +        for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> +        {
> +            uint32_t default_val = 0;

Pointless initializer as it seems.

> +            feat = info->features[i];
> +            if ( !feat )
> +                continue;
> +
> +            /*
> +             * COS ID 0 always stores the default value so input 0 to get
> +             * default value.
> +             */
> +            feat->props->get_val(feat, 0, &default_val);
> +
> +            /*
> +             * Compare value according to feature array order.
> +             * We must follow this order because value array is assembled
> +             * as this order.
> +             */
> +            if ( cos > feat->props->cos_max )
> +            {
> +                /*
> +                 * If cos is bigger than feature's cos_max, the val should be
> +                 * default value. Otherwise, it fails to find a COS ID. So we
> +                 * have to exit find flow.
> +                 */
> +                if ( val[0] != default_val )
> +                    return -EINVAL;
> +
> +                found = true;
> +            }
> +            else
> +            {
> +                if ( val[0] == feat->cos_reg_val[cos] )
> +                    found = true;
> +            }

Same question as on previous patch- why val[0] twice above,
despite cos_num possibly being larger than 1? And wouldn't this
need to be val_ptr anyway?

> +            /* If fail to match, go to next cos to compare. */
> +            if ( !found )
> +                break;
> +
> +            val_ptr += feat->props->cos_num;
> +            if ( val_ptr - val > array_len )
> +                return -ENOSPC;

This again looks suspicious - the check should once again be done
before accessing the respective array element(s).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 12/25] x86: refactor psr: L3 CAT: set value: implement cos id picking flow.
  2017-04-01 13:53 ` [PATCH v10 12/25] x86: refactor psr: L3 CAT: set value: implement cos id picking flow Yi Sun
@ 2017-04-11 15:20   ` Jan Beulich
  0 siblings, 0 replies; 114+ messages in thread
From: Jan Beulich @ 2017-04-11 15:20 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -800,15 +800,82 @@ static int find_cos(const uint32_t val[], unsigned int array_len,
>      return -ENOENT;
>  }
>  
> +static bool fits_cos_max(const uint32_t val[],
> +                         uint32_t array_len,
> +                         const struct psr_socket_info *info,
> +                         unsigned int cos)
> +{
> +    unsigned int i;
> +
> +    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> +    {
> +        uint32_t default_val = 0;

Move this into the most narrow scope it's needed in, which will make
pretty clear that the initializer again isn't needed.

> +        const struct feat_node *feat = info->features[i];
> +        if ( !feat )
> +            continue;
> +
> +        if ( array_len < feat->props->cos_num )
> +            return false;
> +
> +        if ( cos > feat->props->cos_max )
> +        {
> +            feat->props->get_val(feat, 0, &default_val);
> +            if ( val[0] != default_val )

Same question as before.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 13/25] x86: refactor psr: L3 CAT: set value: implement write msr flow.
  2017-04-01 13:53 ` [PATCH v10 13/25] x86: refactor psr: L3 CAT: set value: implement write msr flow Yi Sun
@ 2017-04-11 15:25   ` Jan Beulich
  2017-04-12  6:04     ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-11 15:25 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> +static void do_write_psr_msr(void *data)
> +{
> +    struct cos_write_info *info = data;
> +    unsigned int cos            = info->cos;
> +    struct feat_node *feat      = info->feature;
> +
> +    if ( cos > feat->props->cos_max )
> +        return;

This check can as well be done in the caller, allowing to avoid the IPI
in case it's true.

> +    feat->props->write_msr(cos, info->val, feat);

Once the function consists of just this I wonder if it wasn't better
to invoke the hook directly from write_psr_msr().

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 16/25] x86: refactor psr: CDP: implement get value flow.
  2017-04-01 13:53 ` [PATCH v10 16/25] x86: refactor psr: CDP: implement get value flow Yi Sun
@ 2017-04-11 15:39   ` Jan Beulich
  2017-04-12  6:05     ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-11 15:39 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> @@ -755,7 +765,7 @@ static int gather_val_array(uint32_t val[],
>              cos = 0;
>  
>          /* Value getting order is same as feature array. */
> -        feat->props->get_val(feat, cos, &val[0]);
> +        feat->props->get_val(feat, cos, 0, &val[0]);

How can this be literal zero here (and in further cases below)?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 17/25] x86: refactor psr: CDP: implement set value callback functions.
  2017-04-01 13:53 ` [PATCH v10 17/25] x86: refactor psr: CDP: implement set value callback functions Yi Sun
@ 2017-04-11 16:03   ` Jan Beulich
  2017-04-12  6:14     ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-11 16:03 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> @@ -94,6 +102,13 @@ struct feat_node {
>          unsigned int cos_max;
>          unsigned int cbm_len;
>  
> +        /*
> +         * An array to save all 'enum cbm_type' values of the feature. It is
> +         * used with cos_num together to get/write a feature's COS registers
> +         * values one by one.
> +         */
> +        enum cbm_type type[PSR_MAX_COS_NUM];

So here it finally comes. But it's needed the latest in the first CDP
patch, quite possibly even earlier.

> +static int compare_val(const uint32_t val[],
> +                       const struct feat_node *feat,
> +                       unsigned int cos)
> +{
> +    int rc = 0;

The variable deserves a better name and type bool, according to its
usage. But I'm unconvinced the variable is needed at all - I think
without it the intention of the function would be more clear. See
remarks further down.

> +    unsigned int i;
> +
> +    for ( i = 0; i < feat->props->cos_num; i++ )
> +    {
> +        uint32_t feat_val = 0;

Pointless initializer again.

> +        rc = 0;
> +
> +        /* If cos is bigger than cos_max, we need compare default value. */
> +        if ( cos > feat->props->cos_max )
> +        {
> +            /*
> +             * COS ID 0 always stores the default value so input 0 to get
> +             * default value.
> +             */
> +            feat->props->get_val(feat, 0, feat->props->type[i], &feat_val);
> +
> +            /*
> +             * If cos is bigger than feature's cos_max, the val should be
> +             * default value. Otherwise, it fails to find a COS ID. So we
> +             * have to exit find flow.
> +             */
> +            if ( val[i] != feat_val )
> +                return -EINVAL;
> +
> +            rc = 1;
> +            continue;

Drop these two.

> +        }

else
{

> +
> +        /* For CDP, DATA is the first item in val[], CODE is the second. */
> +        feat->props->get_val(feat, cos, feat->props->type[i], &feat_val);
> +        if ( val[i] != feat_val )
> +            break;

return 0;
}

> +
> +        rc = 1;

Drop.

> +    }
> +
> +    return rc;

return 1;

> @@ -851,43 +948,21 @@ static int find_cos(const uint32_t val[], unsigned int array_len,
>           */
>          for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
>          {
> -            uint32_t default_val = 0;
> -
>              feat = info->features[i];
>              if ( !feat )
>                  continue;
>  
>              /*
> -             * COS ID 0 always stores the default value so input 0 to get
> -             * default value.
> -             */
> -            feat->props->get_val(feat, 0, 0, &default_val);
> -
> -            /*
>               * Compare value according to feature array order.
>               * We must follow this order because value array is assembled
>               * as this order.
>               */
> -            if ( cos > feat->props->cos_max )
> -            {
> -                /*
> -                 * If cos is bigger than feature's cos_max, the val should be
> -                 * default value. Otherwise, it fails to find a COS ID. So we
> -                 * have to exit find flow.
> -                 */
> -                if ( val[0] != default_val )
> -                    return -EINVAL;
> -
> -                found = true;
> -            }
> -            else
> -            {
> -                if ( val[0] == feat->cos_reg_val[cos] )
> -                    found = true;
> -            }
> +            rc = compare_val(val, feat, cos);

Why is this being moved into a function here rather than being
introduced as a function right away?

> @@ -922,9 +997,14 @@ static bool fits_cos_max(const uint32_t val[],
>  
>          if ( cos > feat->props->cos_max )
>          {
> -            feat->props->get_val(feat, 0, 0, &default_val);
> -            if ( val[0] != default_val )
> -                return false;
> +            /* For CDP, DATA is the first item in val[], CODE is the second. */

This CDP specific comment doesn't belong into a generic function.

> @@ -1033,6 +1116,40 @@ static int write_psr_msr(unsigned int socket, unsigned int cos,
>      return 0;
>  }
>  
> +static void restore_default_val(unsigned int socket, unsigned int cos,
> +                                enum psr_feat_type feat_type)
> +{
> +    unsigned int i, j;
> +    uint32_t default_val;
> +    const struct psr_socket_info *info = get_socket_info(socket);
> +
> +    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> +    {
> +        const struct feat_node *feat = info->features[i];

Blank line here.

> +        /*
> +         * There are four judgements:
> +         * 1. Input 'feat_type' is valid so we have to get feature according to
> +         *    it. If current feature type (i) does not match 'feat_type', we
> +         *    need skip it, so continue to check next feature.
> +         * 2. Input 'feat_type' is 'PSR_SOCKET_MAX_FEAT' which means we should
> +         *    handle all features in this case. So, go to next loop.
> +         * 3. Do not need restore the COS value back to default if cos_num is 1,
> +         *    e.g. L3 CAT. Because next value setting will overwrite it.
> +         * 4. 'feat' we got is NULL, continue.
> +         */
> +        if ( ( feat_type != PSR_SOCKET_MAX_FEAT && feat_type != i ) ||

Stray blanks inside inner parentheses.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-11 15:01   ` Jan Beulich
@ 2017-04-12  5:53     ` Yi Sun
  2017-04-12  9:09       ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-12  5:53 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-11 09:01:53, Jan Beulich wrote:
> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> > --- a/xen/arch/x86/psr.c
> > +++ b/xen/arch/x86/psr.c
> > @@ -157,10 +157,26 @@ static void free_socket_resources(unsigned int socket)
> >  {
> >      unsigned int i;
> >      struct psr_socket_info *info = socket_info + socket;
> > +    struct domain *d;
> >  
> >      if ( !info )
> >          return;
> >  
> > +    /* Restore domain cos id to 0 when socket is offline. */
> > +    for_each_domain ( d )
> > +    {
> > +        unsigned int cos = d->arch.psr_cos_ids[socket];
> > +        if ( cos == 0 )
> 
> Blank line between declaration(s) and statement(s) please.
> 
Ok, will modify other places where have same issue.

> > +            continue;
> > +
> > +        spin_lock(&info->ref_lock);
> > +        ASSERT(!cos || info->cos_ref[cos]);
> 
> You've checked cos to be non-zero already above.
> 
Yep. Thanks!

> > +        info->cos_ref[cos]--;
> > +        spin_unlock(&info->ref_lock);
> > +
> > +        d->arch.psr_cos_ids[socket] = 0;
> > +    }
> 
> Overall, while you say in the revision log that this was a suggestion of
> mine, I don't recall any such (and I've just checked the v9 thread of
> this patch without finding anything), and hence it's not really clear to
> me why this is needed. After all you should be freeing info anyway

We discussed this in v9 05 patch. I paste it below for your convenience to
check.
[Sun Yi]:
  > So, you think the MSRs values may not be valid after such process and 
  > reloading (write MSRs to default value) is needed. If so, I would like 
  > to do more operations in 'free_feature()':
  > 1. Iterate all domains working on the offline socket to change
  >    'd->arch.psr_cos_ids[socket]' to COS 0, i.e restore it back to init
  >    status.
  > 2. Restore 'socket_info[socket].cos_ref[]' to all 0.
  > 
  > These can make the socket's info be totally restored back to init status.

[Jan]
  Yes, that's what I think is needed.

> (albeit I can't see this happening, which would look to be a bug in
> patch 5), so getting the refcounts adjusted seems pointless in any
> event. Whether d->arch.psr_cos_ids[socket] needs clearing I'm not

We only free resources in 'socket_info[socekt]' but do not free itself.
Below is how we allocate 'socket_info'. So, the 'socket_info[socekt]'
is not a pointer that can be directly freed.
  socket_info = xzalloc_array(struct psr_socket_info, nr_sockets);

That is the reason to reduce the 'info->cos_ref[cos]'.

> certain - this may indeed by unavoidable, to match up with
> psr_alloc_cos() using xzalloc.
> 
> Furthermore I'm not at all convinced this is appropriate to do in the
> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you
> have a few thousand VMs, the loop above may take a while.
> 
Hmm, that may be a potential issue. I have two proposals below. Could you
please help to check which one you prefer? Or provide another solution?

1. Start a tasklet in free_socket_resources() to restore 'psr_cos_ids[socket]'
   of all domains. The action is protected by 'ref_lock' to avoid confliction
   in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in tasklet or memset
   the array to 0 in free_socket_resources().

2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and change index
   from 'socket' to 'domain_id'. So we keep all domains' COS IDs per socket
   and can memset the array to 0 when socket is offline. But here is an issue
   that we do not know how many members this array should have. I cannot find
   a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to use reallocation
   in 'psr_alloc_cos' if the newly created domain's id is bigger than current
   array number.

> > @@ -574,15 +590,210 @@ int psr_get_val(struct domain *d, unsigned int socket,
> >      return 0;
> >  }
> >  
> > -int psr_set_l3_cbm(struct domain *d, unsigned int socket,
> > -                   uint64_t cbm, enum cbm_type type)
> > +/* Set value functions */
> > +static unsigned int get_cos_num(const struct psr_socket_info *info)
> >  {
> >      return 0;
> >  }
> >  
> > +static int gather_val_array(uint32_t val[],
> > +                            unsigned int array_len,
> > +                            const struct psr_socket_info *info,
> > +                            unsigned int old_cos)
> > +{
> > +    return -EINVAL;
> > +}
> > +
> > +static int insert_val_to_array(uint32_t val[],
> 
> As indicated before, I'm pretty convinced "into" would be more
> natural than "to".
> 
Got it. Thanks!

> > +                               unsigned int array_len,
> > +                               const struct psr_socket_info *info,
> > +                               enum psr_feat_type feat_type,
> > +                               enum cbm_type type,
> > +                               uint32_t new_val)
> > +{
> > +    return -EINVAL;
> > +}
> > +
> > +static int find_cos(const uint32_t val[], unsigned int array_len,
> > +                    enum psr_feat_type feat_type,
> > +                    const struct psr_socket_info *info,
> > +                    spinlock_t *ref_lock)
> 
> I don't think I did suggest adding another parameter here. The lock
> is accessible from info, isn't it? In which case, as I _did_ suggest,
> you should drop const from it instead. But ...
> 
> > +{
> > +    ASSERT(spin_is_locked(ref_lock));
> 
> ... the assertion seems rather pointless anyway with there just
> being a single caller, which very visibly acquires the lock up front.
> 
> > +static int pick_avail_cos(const struct psr_socket_info *info,
> > +                          spinlock_t *ref_lock,
> 
> Same here then.
> 
Ok, I will drop this assertion and the parameter 'ref_lock'.

> > +int psr_set_val(struct domain *d, unsigned int socket,
> > +                uint32_t val, enum cbm_type type)
> > +{
> > +    unsigned int old_cos;
> > +    int cos, ret;
> > +    unsigned int *ref;
> > +    uint32_t *val_array;
> > +    struct psr_socket_info *info = get_socket_info(socket);
> > +    unsigned int array_len;
> > +    enum psr_feat_type feat_type;
> > +
> > +    if ( IS_ERR(info) )
> > +        return PTR_ERR(info);
> > +
> > +    feat_type = psr_cbm_type_to_feat_type(type);
> > +    if ( feat_type > ARRAY_SIZE(info->features) ||
> 
> I think I did point out the off-by-one mistake here in an earlier patch
> already.
> 
Sorry, I did not notice it.

[...]
> > +    /*
> > +     * Step 5:
> > +     * Find the COS ID (find_cos result is '>= 0' or an available COS ID is
> > +     * picked, then update ref according to COS ID.
> > +     */
> > +    ref[cos]++;
> > +    ASSERT(!cos || ref[cos]);
> > +    ASSERT(!old_cos || ref[old_cos]);
> > +    ref[old_cos]--;
> > +    spin_unlock(&info->ref_lock);
> > +
> > +    /*
> > +     * Step 6:
> > +     * Save the COS ID into current domain's psr_cos_ids[] so that we can know
> > +     * which COS the domain is using on the socket. One domain can only use
> > +     * one COS ID at same time on each socket.
> > +     */
> > +    d->arch.psr_cos_ids[socket] = cos;
> > +
> 
> Please put the "free_array" label here instead of duplicating the code
> below.
> 
Got it. Thx!

> > +    xfree(val_array);
> > +    return ret;
> > +
> > + unlock_free_array:
> > +    spin_unlock(&info->ref_lock);
> > + free_array:
> > +    xfree(val_array);
> > +    return ret;
> > +}
> > +
> >  /* Called with domain lock held, no extra lock needed for 'psr_cos_ids' */
> >  static void psr_free_cos(struct domain *d)
> >  {
> > +    unsigned int socket, cos;
> > +
> > +    ASSERT(socket_info);
> > +
> > +    if ( !d->arch.psr_cos_ids )
> > +        return;
> > +
> > +    /* Domain is destroied so its cos_ref should be decreased. */
> > +    for ( socket = 0; socket < nr_sockets; socket++ )
> > +    {
> > +        struct psr_socket_info *info;
> > +
> > +        /* cos 0 is default one which does not need be handled. */
> > +        cos = d->arch.psr_cos_ids[socket];
> > +        if ( cos == 0 )
> > +            continue;
> > +
> > +        info = socket_info + socket;
> > +        spin_lock(&info->ref_lock);
> > +        ASSERT(!cos || info->cos_ref[cos]);
> > +        info->cos_ref[cos]--;
> 
> This recurring pattern of assertion and decrement could surely be
> put in a helper function (and the for symmetry also for increment).
> 
Ok, but if we use memset to restore 'cos_ref[]' per above comment, I think it
is unnecessary.

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 10/25] x86: refactor psr: L3 CAT: set value: assemble features value array.
  2017-04-11 15:11   ` Jan Beulich
@ 2017-04-12  5:55     ` Yi Sun
  2017-04-12  9:13       ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-12  5:55 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-11 09:11:20, Jan Beulich wrote:
> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> > @@ -593,7 +616,21 @@ int psr_get_val(struct domain *d, unsigned int socket,
> >  /* Set value functions */
> >  static unsigned int get_cos_num(const struct psr_socket_info *info)
> >  {
> > -    return 0;
> > +    unsigned int num = 0, i;
> > +
> > +    /* Get all features total amount. */
> > +    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> > +    {
> > +        const struct feat_node *feat = info->features[i];
> > +        if ( !feat )
> 
> Blank line between ... (and I likely won't repeat this any further)
> 
Got it, will fix all of them.

> > +            continue;
> > +
> > +        feat = info->features[i];
> 
> ???
> 
Sorry, missed it.

> > @@ -611,7 +679,40 @@ static int insert_val_to_array(uint32_t val[],
> >                                 enum cbm_type type,
> >                                 uint32_t new_val)
> >  {
> > -    return -EINVAL;
> > +    const struct feat_node *feat;
> > +    unsigned int i;
> > +
> > +    ASSERT(feat_type < PSR_SOCKET_MAX_FEAT);
> > +
> > +    /* Insert new value into array according to feature's position in array. */
> > +    for ( i = 0; i < feat_type; i++ )
> > +    {
> > +        feat = info->features[i];
> > +        if ( !feat )
> > +            continue;
> > +
> > +        if ( array_len <= feat->props->cos_num )
> > +            return -ENOSPC;
> > +
> > +        array_len -= feat->props->cos_num;
> > +
> > +        val += feat->props->cos_num;
> > +    }
> > +
> > +    feat = info->features[feat_type];
> > +    if ( !feat )
> > +        return -ENOENT;
> > +
> > +    if ( array_len < feat->props->cos_num )
> > +        return -ENOSPC;
> > +
> > +    if ( !psr_check_cbm(feat->props->cbm_len, new_val) )
> > +        return -EINVAL;
> > +
> > +    /* Value setting position is same as feature array. */
> > +    val[0] = new_val;
> 
> How come this is array index 0 unconditionally, when cos_num
> may be greater than 1?
> 
This patch is to implement L3 CAT so I do not introduce 'type[]' yet. The
mechanism will be introduced in CDP patch.

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 13/25] x86: refactor psr: L3 CAT: set value: implement write msr flow.
  2017-04-11 15:25   ` Jan Beulich
@ 2017-04-12  6:04     ` Yi Sun
  0 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-12  6:04 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-11 09:25:28, Jan Beulich wrote:
> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> > +static void do_write_psr_msr(void *data)
> > +{
> > +    struct cos_write_info *info = data;
> > +    unsigned int cos            = info->cos;
> > +    struct feat_node *feat      = info->feature;
> > +
> > +    if ( cos > feat->props->cos_max )
> > +        return;
> 
> This check can as well be done in the caller, allowing to avoid the IPI
> in case it's true.
> 
Yes, thanks!

> > +    feat->props->write_msr(cos, info->val, feat);
> 
> Once the function consists of just this I wonder if it wasn't better
> to invoke the hook directly from write_psr_msr().
> 
To solve the issue we discussed in patch 5, I propsed a solution which may
change here. If you agree that solution, I think we still need this helper
function.

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 16/25] x86: refactor psr: CDP: implement get value flow.
  2017-04-11 15:39   ` Jan Beulich
@ 2017-04-12  6:05     ` Yi Sun
  2017-04-12  9:14       ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-12  6:05 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-11 09:39:55, Jan Beulich wrote:
> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> > @@ -755,7 +765,7 @@ static int gather_val_array(uint32_t val[],
> >              cos = 0;
> >  
> >          /* Value getting order is same as feature array. */
> > -        feat->props->get_val(feat, cos, &val[0]);
> > +        feat->props->get_val(feat, cos, 0, &val[0]);
> 
> How can this be literal zero here (and in further cases below)?
> 
Because the 'type[]' is not introduced yet. Please refer CDP patch
which implements this.

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 17/25] x86: refactor psr: CDP: implement set value callback functions.
  2017-04-11 16:03   ` Jan Beulich
@ 2017-04-12  6:14     ` Yi Sun
  0 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-12  6:14 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-11 10:03:21, Jan Beulich wrote:
> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> > @@ -94,6 +102,13 @@ struct feat_node {
> >          unsigned int cos_max;
> >          unsigned int cbm_len;
> >  
> > +        /*
> > +         * An array to save all 'enum cbm_type' values of the feature. It is
> > +         * used with cos_num together to get/write a feature's COS registers
> > +         * values one by one.
> > +         */
> > +        enum cbm_type type[PSR_MAX_COS_NUM];
> 
> So here it finally comes. But it's needed the latest in the first CDP
> patch, quite possibly even earlier.
> 
Hmm, ok, as the previous patches make you have some confusions. I will move
this to the first CDP patch.

> > +static int compare_val(const uint32_t val[],
> > +                       const struct feat_node *feat,
> > +                       unsigned int cos)
> > +{
> > +    int rc = 0;
> 
> The variable deserves a better name and type bool, according to its
> usage. But I'm unconvinced the variable is needed at all - I think
> without it the intention of the function would be more clear. See
> remarks further down.
> 
> > +    unsigned int i;
> > +
> > +    for ( i = 0; i < feat->props->cos_num; i++ )
> > +    {
> > +        uint32_t feat_val = 0;
> 
> Pointless initializer again.
> 
> > +        rc = 0;
> > +
> > +        /* If cos is bigger than cos_max, we need compare default value. */
> > +        if ( cos > feat->props->cos_max )
> > +        {
> > +            /*
> > +             * COS ID 0 always stores the default value so input 0 to get
> > +             * default value.
> > +             */
> > +            feat->props->get_val(feat, 0, feat->props->type[i], &feat_val);
> > +
> > +            /*
> > +             * If cos is bigger than feature's cos_max, the val should be
> > +             * default value. Otherwise, it fails to find a COS ID. So we
> > +             * have to exit find flow.
> > +             */
> > +            if ( val[i] != feat_val )
> > +                return -EINVAL;
> > +
> > +            rc = 1;
> > +            continue;
> 
> Drop these two.
> 
> > +        }
> 
> else
> {
> 
> > +
> > +        /* For CDP, DATA is the first item in val[], CODE is the second. */
> > +        feat->props->get_val(feat, cos, feat->props->type[i], &feat_val);
> > +        if ( val[i] != feat_val )
> > +            break;
> 
> return 0;
> }
> 
> > +
> > +        rc = 1;
> 
> Drop.
> 
> > +    }
> > +
> > +    return rc;
> 
> return 1;
> 
Thanks! My original intention is to avoid the 'else' so that no indentations.
But it seems 'else' is good to you so I will change it to above codes.

> > @@ -851,43 +948,21 @@ static int find_cos(const uint32_t val[], unsigned int array_len,
> >           */
> >          for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> >          {
> > -            uint32_t default_val = 0;
> > -
> >              feat = info->features[i];
> >              if ( !feat )
> >                  continue;
> >  
> >              /*
> > -             * COS ID 0 always stores the default value so input 0 to get
> > -             * default value.
> > -             */
> > -            feat->props->get_val(feat, 0, 0, &default_val);
> > -
> > -            /*
> >               * Compare value according to feature array order.
> >               * We must follow this order because value array is assembled
> >               * as this order.
> >               */
> > -            if ( cos > feat->props->cos_max )
> > -            {
> > -                /*
> > -                 * If cos is bigger than feature's cos_max, the val should be
> > -                 * default value. Otherwise, it fails to find a COS ID. So we
> > -                 * have to exit find flow.
> > -                 */
> > -                if ( val[0] != default_val )
> > -                    return -EINVAL;
> > -
> > -                found = true;
> > -            }
> > -            else
> > -            {
> > -                if ( val[0] == feat->cos_reg_val[cos] )
> > -                    found = true;
> > -            }
> > +            rc = compare_val(val, feat, cos);
> 
> Why is this being moved into a function here rather than being
> introduced as a function right away?
> 
In L3 CAT patch, this part looks simple so I did not encapulate them into a
function. If you prefer a function here, I will do it at the beginning.

> > @@ -922,9 +997,14 @@ static bool fits_cos_max(const uint32_t val[],
> >  
> >          if ( cos > feat->props->cos_max )
> >          {
> > -            feat->props->get_val(feat, 0, 0, &default_val);
> > -            if ( val[0] != default_val )
> > -                return false;
> > +            /* For CDP, DATA is the first item in val[], CODE is the second. */
> 
> This CDP specific comment doesn't belong into a generic function.
> 
Ok, will remove it.

> > @@ -1033,6 +1116,40 @@ static int write_psr_msr(unsigned int socket, unsigned int cos,
> >      return 0;
> >  }
> >  
> > +static void restore_default_val(unsigned int socket, unsigned int cos,
> > +                                enum psr_feat_type feat_type)
> > +{
> > +    unsigned int i, j;
> > +    uint32_t default_val;
> > +    const struct psr_socket_info *info = get_socket_info(socket);
> > +
> > +    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> > +    {
> > +        const struct feat_node *feat = info->features[i];
> 
> Blank line here.
> 
Got it.

> > +        /*
> > +         * There are four judgements:
> > +         * 1. Input 'feat_type' is valid so we have to get feature according to
> > +         *    it. If current feature type (i) does not match 'feat_type', we
> > +         *    need skip it, so continue to check next feature.
> > +         * 2. Input 'feat_type' is 'PSR_SOCKET_MAX_FEAT' which means we should
> > +         *    handle all features in this case. So, go to next loop.
> > +         * 3. Do not need restore the COS value back to default if cos_num is 1,
> > +         *    e.g. L3 CAT. Because next value setting will overwrite it.
> > +         * 4. 'feat' we got is NULL, continue.
> > +         */
> > +        if ( ( feat_type != PSR_SOCKET_MAX_FEAT && feat_type != i ) ||
> 
> Stray blanks inside inner parentheses.
> 
Ok, will remove them.

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-12  5:53     ` Yi Sun
@ 2017-04-12  9:09       ` Jan Beulich
  2017-04-12 12:23         ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-12  9:09 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 12.04.17 at 07:53, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-11 09:01:53, Jan Beulich wrote:
>> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
>> > +        info->cos_ref[cos]--;
>> > +        spin_unlock(&info->ref_lock);
>> > +
>> > +        d->arch.psr_cos_ids[socket] = 0;
>> > +    }
>> 
>> Overall, while you say in the revision log that this was a suggestion of
>> mine, I don't recall any such (and I've just checked the v9 thread of
>> this patch without finding anything), and hence it's not really clear to
>> me why this is needed. After all you should be freeing info anyway
> 
> We discussed this in v9 05 patch.

Ah, that's why I didn't find it.

> I paste it below for your convenience to
> check.
> [Sun Yi]:
>   > So, you think the MSRs values may not be valid after such process and 
>   > reloading (write MSRs to default value) is needed. If so, I would like 
>   > to do more operations in 'free_feature()':
>   > 1. Iterate all domains working on the offline socket to change
>   >    'd->arch.psr_cos_ids[socket]' to COS 0, i.e restore it back to init
>   >    status.
>   > 2. Restore 'socket_info[socket].cos_ref[]' to all 0.
>   > 
>   > These can make the socket's info be totally restored back to init status.
> 
> [Jan]
>   Yes, that's what I think is needed.
> 
>> (albeit I can't see this happening, which would look to be a bug in
>> patch 5), so getting the refcounts adjusted seems pointless in any
>> event. Whether d->arch.psr_cos_ids[socket] needs clearing I'm not
> 
> We only free resources in 'socket_info[socekt]' but do not free itself.
> Below is how we allocate 'socket_info'. So, the 'socket_info[socekt]'
> is not a pointer that can be directly freed.
>   socket_info = xzalloc_array(struct psr_socket_info, nr_sockets);
> 
> That is the reason to reduce the 'info->cos_ref[cos]'.

I see. But then there's no need to decrement it for each domain
using it, you could simply flush it to zero.

>> certain - this may indeed by unavoidable, to match up with
>> psr_alloc_cos() using xzalloc.
>> 
>> Furthermore I'm not at all convinced this is appropriate to do in the
>> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you
>> have a few thousand VMs, the loop above may take a while.
>> 
> Hmm, that may be a potential issue. I have two proposals below. Could you
> please help to check which one you prefer? Or provide another solution?
> 
> 1. Start a tasklet in free_socket_resources() to restore 'psr_cos_ids[socket]'
>    of all domains. The action is protected by 'ref_lock' to avoid confliction
>    in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in tasklet or memset
>    the array to 0 in free_socket_resources().
> 
> 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and change index
>    from 'socket' to 'domain_id'. So we keep all domains' COS IDs per socket
>    and can memset the array to 0 when socket is offline. But here is an issue
>    that we do not know how many members this array should have. I cannot find
>    a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to use reallocation
>    in 'psr_alloc_cos' if the newly created domain's id is bigger than current
>    array number.

The number of domains is limited by the special DOMID_* values.
However, allocating an array with 32k entries doesn't sound very
reasonable. Sadly the other solution doesn't look very attractive
either, as there'll be quite a bit of synchronization needed (you'd
have to defer the same socket being re-used by a CPU being
onlined until you've done the cleanup). This may need some more
thought, but I can't see myself finding time for this any time soon,
so I'm afraid I'll have to leave it to you for now.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 10/25] x86: refactor psr: L3 CAT: set value: assemble features value array.
  2017-04-12  5:55     ` Yi Sun
@ 2017-04-12  9:13       ` Jan Beulich
  2017-04-12 12:26         ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-12  9:13 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 12.04.17 at 07:55, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-11 09:11:20, Jan Beulich wrote:
>> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
>> > @@ -611,7 +679,40 @@ static int insert_val_to_array(uint32_t val[],
>> >                                 enum cbm_type type,
>> >                                 uint32_t new_val)
>> >  {
>> > -    return -EINVAL;
>> > +    const struct feat_node *feat;
>> > +    unsigned int i;
>> > +
>> > +    ASSERT(feat_type < PSR_SOCKET_MAX_FEAT);
>> > +
>> > +    /* Insert new value into array according to feature's position in array. */
>> > +    for ( i = 0; i < feat_type; i++ )
>> > +    {
>> > +        feat = info->features[i];
>> > +        if ( !feat )
>> > +            continue;
>> > +
>> > +        if ( array_len <= feat->props->cos_num )
>> > +            return -ENOSPC;
>> > +
>> > +        array_len -= feat->props->cos_num;
>> > +
>> > +        val += feat->props->cos_num;
>> > +    }
>> > +
>> > +    feat = info->features[feat_type];
>> > +    if ( !feat )
>> > +        return -ENOENT;
>> > +
>> > +    if ( array_len < feat->props->cos_num )
>> > +        return -ENOSPC;
>> > +
>> > +    if ( !psr_check_cbm(feat->props->cbm_len, new_val) )
>> > +        return -EINVAL;
>> > +
>> > +    /* Value setting position is same as feature array. */
>> > +    val[0] = new_val;
>> 
>> How come this is array index 0 unconditionally, when cos_num
>> may be greater than 1?
>> 
> This patch is to implement L3 CAT so I do not introduce 'type[]' yet. The
> mechanism will be introduced in CDP patch.

Which imo is wrong, as I've pointed out in a later patch. The
moment you have the "cos_num" field (instead of implying 1
everywhere) you ought to be dealing with the field having a
value other than 1. Otherwise, as that later patch shows, you'd
then re-write _part_ of the logic without in fact _newly_
introducing "cos_num" being a variable.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 16/25] x86: refactor psr: CDP: implement get value flow.
  2017-04-12  6:05     ` Yi Sun
@ 2017-04-12  9:14       ` Jan Beulich
  0 siblings, 0 replies; 114+ messages in thread
From: Jan Beulich @ 2017-04-12  9:14 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 12.04.17 at 08:05, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-11 09:39:55, Jan Beulich wrote:
>> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
>> > @@ -755,7 +765,7 @@ static int gather_val_array(uint32_t val[],
>> >              cos = 0;
>> >  
>> >          /* Value getting order is same as feature array. */
>> > -        feat->props->get_val(feat, cos, &val[0]);
>> > +        feat->props->get_val(feat, cos, 0, &val[0]);
>> 
>> How can this be literal zero here (and in further cases below)?
>> 
> Because the 'type[]' is not introduced yet. Please refer CDP patch
> which implements this.

Again, type[] should be introduced together with the cos_num
field.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-12  9:09       ` Jan Beulich
@ 2017-04-12 12:23         ` Yi Sun
  2017-04-12 12:42           ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-12 12:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-12 03:09:56, Jan Beulich wrote:
> >>> On 12.04.17 at 07:53, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-11 09:01:53, Jan Beulich wrote:
> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> >> > +        info->cos_ref[cos]--;
> >> > +        spin_unlock(&info->ref_lock);
> >> > +
> >> > +        d->arch.psr_cos_ids[socket] = 0;
> >> > +    }
> >> 
> >> Overall, while you say in the revision log that this was a suggestion of
> >> mine, I don't recall any such (and I've just checked the v9 thread of
> >> this patch without finding anything), and hence it's not really clear to
> >> me why this is needed. After all you should be freeing info anyway
> > 
> > We discussed this in v9 05 patch.
> 
> Ah, that's why I didn't find it.
> 
> > I paste it below for your convenience to
> > check.
> > [Sun Yi]:
> >   > So, you think the MSRs values may not be valid after such process and 
> >   > reloading (write MSRs to default value) is needed. If so, I would like 
> >   > to do more operations in 'free_feature()':
> >   > 1. Iterate all domains working on the offline socket to change
> >   >    'd->arch.psr_cos_ids[socket]' to COS 0, i.e restore it back to init
> >   >    status.
> >   > 2. Restore 'socket_info[socket].cos_ref[]' to all 0.
> >   > 
> >   > These can make the socket's info be totally restored back to init status.
> > 
> > [Jan]
> >   Yes, that's what I think is needed.
> > 
> >> (albeit I can't see this happening, which would look to be a bug in
> >> patch 5), so getting the refcounts adjusted seems pointless in any
> >> event. Whether d->arch.psr_cos_ids[socket] needs clearing I'm not
> > 
> > We only free resources in 'socket_info[socekt]' but do not free itself.
> > Below is how we allocate 'socket_info'. So, the 'socket_info[socekt]'
> > is not a pointer that can be directly freed.
> >   socket_info = xzalloc_array(struct psr_socket_info, nr_sockets);
> > 
> > That is the reason to reduce the 'info->cos_ref[cos]'.
> 
> I see. But then there's no need to decrement it for each domain
> using it, you could simply flush it to zero.
> 
> >> certain - this may indeed by unavoidable, to match up with
> >> psr_alloc_cos() using xzalloc.
> >> 
> >> Furthermore I'm not at all convinced this is appropriate to do in the
> >> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you
> >> have a few thousand VMs, the loop above may take a while.
> >> 
> > Hmm, that may be a potential issue. I have two proposals below. Could you
> > please help to check which one you prefer? Or provide another solution?
> > 
> > 1. Start a tasklet in free_socket_resources() to restore 'psr_cos_ids[socket]'
> >    of all domains. The action is protected by 'ref_lock' to avoid confliction
> >    in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in tasklet or memset
> >    the array to 0 in free_socket_resources().
> > 
> > 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and change index
> >    from 'socket' to 'domain_id'. So we keep all domains' COS IDs per socket
> >    and can memset the array to 0 when socket is offline. But here is an issue
> >    that we do not know how many members this array should have. I cannot find
> >    a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to use reallocation
> >    in 'psr_alloc_cos' if the newly created domain's id is bigger than current
> >    array number.
> 
> The number of domains is limited by the special DOMID_* values.
> However, allocating an array with 32k entries doesn't sound very
> reasonable.

I think 32K entries should be the extreme case. I can allocate e.g. 100 entries
when the first domain is created. If a new domain's id exceeds 100, reallocate
another 100 entries. The total number of entries allocated should be less than
32K. This is a functional requirement which cannot be avoided. How do you think?

> Sadly the other solution doesn't look very attractive
> either, as there'll be quite a bit of synchronization needed (you'd
> have to defer the same socket being re-used by a CPU being
> onlined until you've done the cleanup). This may need some more
> thought, but I can't see myself finding time for this any time soon,
> so I'm afraid I'll have to leave it to you for now.
> 
Yes, the first option need carefully consider the synchronization which is more
complex than second option.

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 10/25] x86: refactor psr: L3 CAT: set value: assemble features value array.
  2017-04-12  9:13       ` Jan Beulich
@ 2017-04-12 12:26         ` Yi Sun
  0 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-12 12:26 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-12 03:13:12, Jan Beulich wrote:
> >>> On 12.04.17 at 07:55, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-11 09:11:20, Jan Beulich wrote:
> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> >> > @@ -611,7 +679,40 @@ static int insert_val_to_array(uint32_t val[],
> >> >                                 enum cbm_type type,
> >> >                                 uint32_t new_val)
> >> >  {
> >> > -    return -EINVAL;
> >> > +    const struct feat_node *feat;
> >> > +    unsigned int i;
> >> > +
> >> > +    ASSERT(feat_type < PSR_SOCKET_MAX_FEAT);
> >> > +
> >> > +    /* Insert new value into array according to feature's position in array. */
> >> > +    for ( i = 0; i < feat_type; i++ )
> >> > +    {
> >> > +        feat = info->features[i];
> >> > +        if ( !feat )
> >> > +            continue;
> >> > +
> >> > +        if ( array_len <= feat->props->cos_num )
> >> > +            return -ENOSPC;
> >> > +
> >> > +        array_len -= feat->props->cos_num;
> >> > +
> >> > +        val += feat->props->cos_num;
> >> > +    }
> >> > +
> >> > +    feat = info->features[feat_type];
> >> > +    if ( !feat )
> >> > +        return -ENOENT;
> >> > +
> >> > +    if ( array_len < feat->props->cos_num )
> >> > +        return -ENOSPC;
> >> > +
> >> > +    if ( !psr_check_cbm(feat->props->cbm_len, new_val) )
> >> > +        return -EINVAL;
> >> > +
> >> > +    /* Value setting position is same as feature array. */
> >> > +    val[0] = new_val;
> >> 
> >> How come this is array index 0 unconditionally, when cos_num
> >> may be greater than 1?
> >> 
> > This patch is to implement L3 CAT so I do not introduce 'type[]' yet. The
> > mechanism will be introduced in CDP patch.
> 
> Which imo is wrong, as I've pointed out in a later patch. The
> moment you have the "cos_num" field (instead of implying 1
> everywhere) you ought to be dealing with the field having a
> value other than 1. Otherwise, as that later patch shows, you'd
> then re-write _part_ of the logic without in fact _newly_
> introducing "cos_num" being a variable.
> 
Ok, I will introduce the 'type[]' with 'cos_num' together from the beginning.

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-12 12:23         ` Yi Sun
@ 2017-04-12 12:42           ` Jan Beulich
  2017-04-13  8:11             ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-12 12:42 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 12.04.17 at 14:23, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-12 03:09:56, Jan Beulich wrote:
>> >>> On 12.04.17 at 07:53, <yi.y.sun@linux.intel.com> wrote:
>> > On 17-04-11 09:01:53, Jan Beulich wrote:
>> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
>> >> > +        info->cos_ref[cos]--;
>> >> > +        spin_unlock(&info->ref_lock);
>> >> > +
>> >> > +        d->arch.psr_cos_ids[socket] = 0;
>> >> > +    }
>> >> 
>> >> Overall, while you say in the revision log that this was a suggestion of
>> >> mine, I don't recall any such (and I've just checked the v9 thread of
>> >> this patch without finding anything), and hence it's not really clear to
>> >> me why this is needed. After all you should be freeing info anyway
>> > 
>> > We discussed this in v9 05 patch.
>> 
>> Ah, that's why I didn't find it.
>> 
>> > I paste it below for your convenience to
>> > check.
>> > [Sun Yi]:
>> >   > So, you think the MSRs values may not be valid after such process and 
>> >   > reloading (write MSRs to default value) is needed. If so, I would like 
>> >   > to do more operations in 'free_feature()':
>> >   > 1. Iterate all domains working on the offline socket to change
>> >   >    'd->arch.psr_cos_ids[socket]' to COS 0, i.e restore it back to init
>> >   >    status.
>> >   > 2. Restore 'socket_info[socket].cos_ref[]' to all 0.
>> >   > 
>> >   > These can make the socket's info be totally restored back to init 
> status.
>> > 
>> > [Jan]
>> >   Yes, that's what I think is needed.
>> > 
>> >> (albeit I can't see this happening, which would look to be a bug in
>> >> patch 5), so getting the refcounts adjusted seems pointless in any
>> >> event. Whether d->arch.psr_cos_ids[socket] needs clearing I'm not
>> > 
>> > We only free resources in 'socket_info[socekt]' but do not free itself.
>> > Below is how we allocate 'socket_info'. So, the 'socket_info[socekt]'
>> > is not a pointer that can be directly freed.
>> >   socket_info = xzalloc_array(struct psr_socket_info, nr_sockets);
>> > 
>> > That is the reason to reduce the 'info->cos_ref[cos]'.
>> 
>> I see. But then there's no need to decrement it for each domain
>> using it, you could simply flush it to zero.
>> 
>> >> certain - this may indeed by unavoidable, to match up with
>> >> psr_alloc_cos() using xzalloc.
>> >> 
>> >> Furthermore I'm not at all convinced this is appropriate to do in the
>> >> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you
>> >> have a few thousand VMs, the loop above may take a while.
>> >> 
>> > Hmm, that may be a potential issue. I have two proposals below. Could you
>> > please help to check which one you prefer? Or provide another solution?
>> > 
>> > 1. Start a tasklet in free_socket_resources() to restore 
> 'psr_cos_ids[socket]'
>> >    of all domains. The action is protected by 'ref_lock' to avoid 
> confliction
>> >    in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in tasklet or memset
>> >    the array to 0 in free_socket_resources().
>> > 
>> > 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and change index
>> >    from 'socket' to 'domain_id'. So we keep all domains' COS IDs per socket
>> >    and can memset the array to 0 when socket is offline. But here is an 
> issue
>> >    that we do not know how many members this array should have. I cannot 
> find
>> >    a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to use 
> reallocation
>> >    in 'psr_alloc_cos' if the newly created domain's id is bigger than 
> current
>> >    array number.
>> 
>> The number of domains is limited by the special DOMID_* values.
>> However, allocating an array with 32k entries doesn't sound very
>> reasonable.
> 
> I think 32K entries should be the extreme case. I can allocate e.g. 100 entries
> when the first domain is created. If a new domain's id exceeds 100, reallocate
> another 100 entries. The total number of entries allocated should be less than
> 32K. This is a functional requirement which cannot be avoided. How do you 
> think?

So how many entries would your array have once I start the 32,000th
domain (having at any one time at most a single one running, besides
Dom0)?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 18/25] x86: L2 CAT: implement CPU init and free flow.
  2017-04-01 13:53 ` [PATCH v10 18/25] x86: L2 CAT: implement CPU init and free flow Yi Sun
@ 2017-04-12 15:18   ` Jan Beulich
  2017-04-13  8:12     ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-12 15:18 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> @@ -304,10 +305,14 @@ static void cat_init_feature(const struct cpuid_leaf *regs,
>      switch ( type )
>      {
>      case PSR_SOCKET_L3_CAT:
> +    case PSR_SOCKET_L2_CAT:
>          /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
>          feat->cos_reg_val[0] = cat_default_val(feat->props->cbm_len);
>  
> -        feat->props->type[0] = PSR_CBM_TYPE_L3;
> +        if ( type == PSR_SOCKET_L3_CAT )
> +            feat->props->type[0] = PSR_CBM_TYPE_L3;
> +        else
> +            feat->props->type[0] = PSR_CBM_TYPE_L2;

Can I talk you into preferring conditional expressions in case like
this or ...

> @@ -315,7 +320,11 @@ static void cat_init_feature(const struct cpuid_leaf *regs,
>           */
>          for ( i = 1; i <= feat->props->cos_max; i++ )
>          {
> -            wrmsrl(MSR_IA32_PSR_L3_MASK(i), feat->cos_reg_val[0]);
> +            if ( type == PSR_SOCKET_L3_CAT )
> +                wrmsrl(MSR_IA32_PSR_L3_MASK(i), feat->cos_reg_val[0]);
> +            else
> +                wrmsrl(MSR_IA32_PSR_L2_MASK(i), feat->cos_reg_val[0]);

... this?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 21/25] x86: L2 CAT: implement set value flow.
  2017-04-01 13:53 ` [PATCH v10 21/25] x86: L2 CAT: implement set " Yi Sun
@ 2017-04-12 15:23   ` Jan Beulich
  0 siblings, 0 replies; 114+ messages in thread
From: Jan Beulich @ 2017-04-12 15:23 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -1466,6 +1466,16 @@ long arch_do_domctl(
>                                PSR_CBM_TYPE_L3_DATA);
>              break;
>  
> +        case XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM:
> +            if ( domctl->u.psr_cat_op.data !=
> +                 (uint32_t)domctl->u.psr_cat_op.data )
> +                return -EINVAL;

Considering this recurring pattern I'd like to suggest to do the
check in a single place early in ...

> +            ret = psr_set_val(d, domctl->u.psr_cat_op.target,
> +                              domctl->u.psr_cat_op.data,
> +                              PSR_CBM_TYPE_L2);

... the function being called here.

> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -467,10 +467,21 @@ static struct feat_props l3_cdp_props = {
>  };
>  
>  /* L2 CAT ops */
> +static void l2_cat_write_msr(unsigned int cos, uint32_t val,
> +                             enum cbm_type type, struct feat_node *feat)
> +{
> +    if ( feat->cos_reg_val[cos] != val )
> +    {
> +        feat->cos_reg_val[cos] = val;

It's not the first time I see this pattern, so it looks like this again
would be a candidate for further code movement into generic
logic.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 22/25] tools: L2 CAT: support get HW info for L2 CAT.
  2017-04-01 13:53 ` [PATCH v10 22/25] tools: L2 CAT: support get HW info for L2 CAT Yi Sun
@ 2017-04-12 15:24   ` Jan Beulich
  0 siblings, 0 replies; 114+ messages in thread
From: Jan Beulich @ 2017-04-12 15:24 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> This patch implements xl/xc changes to support get HW info
> for L2 CAT.
> 
> 'xl psr-hwinfo' is updated to show both L3 CAT and L2 CAT
> info.
> 
> Example(on machine which only supports L2 CAT):
> Cache Monitoring Technology (CMT):
> Enabled         : 0
> Cache Allocation Technology (CAT): L2
> Socket ID       : 0
> Maximum COS     : 3
> CBM length      : 8
> Default CBM     : 0xff
> 
> Signed-off-by: He Chen <he.chen@linux.intel.com>
> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>

Hypervisor side
Acked-by: Jan Beulich <jbeulich@suse.com>



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-12 12:42           ` Jan Beulich
@ 2017-04-13  8:11             ` Yi Sun
  2017-04-13  9:41               ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-13  8:11 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, chao.p.peng, xen-devel, roger.pau

On 17-04-12 06:42:01, Jan Beulich wrote:
> >>> On 12.04.17 at 14:23, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-12 03:09:56, Jan Beulich wrote:
> >> >>> On 12.04.17 at 07:53, <yi.y.sun@linux.intel.com> wrote:
> >> > On 17-04-11 09:01:53, Jan Beulich wrote:
> >> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> >> >> > +        info->cos_ref[cos]--;
> >> >> > +        spin_unlock(&info->ref_lock);
> >> >> > +
> >> >> > +        d->arch.psr_cos_ids[socket] = 0;
> >> >> > +    }
> >> >> 
> >> >> Overall, while you say in the revision log that this was a suggestion of
> >> >> mine, I don't recall any such (and I've just checked the v9 thread of
> >> >> this patch without finding anything), and hence it's not really clear to
> >> >> me why this is needed. After all you should be freeing info anyway
> >> > 
> >> > We discussed this in v9 05 patch.
> >> 
> >> Ah, that's why I didn't find it.
> >> 
> >> > I paste it below for your convenience to
> >> > check.
> >> > [Sun Yi]:
> >> >   > So, you think the MSRs values may not be valid after such process and 
> >> >   > reloading (write MSRs to default value) is needed. If so, I would like 
> >> >   > to do more operations in 'free_feature()':
> >> >   > 1. Iterate all domains working on the offline socket to change
> >> >   >    'd->arch.psr_cos_ids[socket]' to COS 0, i.e restore it back to init
> >> >   >    status.
> >> >   > 2. Restore 'socket_info[socket].cos_ref[]' to all 0.
> >> >   > 
> >> >   > These can make the socket's info be totally restored back to init 
> > status.
> >> > 
> >> > [Jan]
> >> >   Yes, that's what I think is needed.
> >> > 
> >> >> (albeit I can't see this happening, which would look to be a bug in
> >> >> patch 5), so getting the refcounts adjusted seems pointless in any
> >> >> event. Whether d->arch.psr_cos_ids[socket] needs clearing I'm not
> >> > 
> >> > We only free resources in 'socket_info[socekt]' but do not free itself.
> >> > Below is how we allocate 'socket_info'. So, the 'socket_info[socekt]'
> >> > is not a pointer that can be directly freed.
> >> >   socket_info = xzalloc_array(struct psr_socket_info, nr_sockets);
> >> > 
> >> > That is the reason to reduce the 'info->cos_ref[cos]'.
> >> 
> >> I see. But then there's no need to decrement it for each domain
> >> using it, you could simply flush it to zero.
> >> 
> >> >> certain - this may indeed by unavoidable, to match up with
> >> >> psr_alloc_cos() using xzalloc.
> >> >> 
> >> >> Furthermore I'm not at all convinced this is appropriate to do in the
> >> >> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you
> >> >> have a few thousand VMs, the loop above may take a while.
> >> >> 
> >> > Hmm, that may be a potential issue. I have two proposals below. Could you
> >> > please help to check which one you prefer? Or provide another solution?
> >> > 
> >> > 1. Start a tasklet in free_socket_resources() to restore 
> > 'psr_cos_ids[socket]'
> >> >    of all domains. The action is protected by 'ref_lock' to avoid 
> > confliction
> >> >    in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in tasklet or memset
> >> >    the array to 0 in free_socket_resources().
> >> > 
> >> > 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and change index
> >> >    from 'socket' to 'domain_id'. So we keep all domains' COS IDs per socket
> >> >    and can memset the array to 0 when socket is offline. But here is an 
> > issue
> >> >    that we do not know how many members this array should have. I cannot 
> > find
> >> >    a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to use 
> > reallocation
> >> >    in 'psr_alloc_cos' if the newly created domain's id is bigger than 
> > current
> >> >    array number.
> >> 
> >> The number of domains is limited by the special DOMID_* values.
> >> However, allocating an array with 32k entries doesn't sound very
> >> reasonable.
> > 
> > I think 32K entries should be the extreme case. I can allocate e.g. 100 entries
> > when the first domain is created. If a new domain's id exceeds 100, reallocate
> > another 100 entries. The total number of entries allocated should be less than
> > 32K. This is a functional requirement which cannot be avoided. How do you 
> > think?
> 
> So how many entries would your array have once I start the 32,000th
> domain (having at any one time at most a single one running, besides
> Dom0)?
> 
In such case, we have to keep a 32K array because the domain_id is the index to
access the array. But this array is per socket so the whole memory used should
not be too much.

After considering this issue more, I think the original codes might not be
so unacceptable. Per my knowledge, Intel Xeon Phi chip can support at most
288 CPUs. So, I think the domains running at same time in reality may not be
so many (no efficient resources). If this hypothesis is right, a loop to write
'psr_cos_ids[socket]' of every domain to 0 may not take much time. If I am
wrong, please correct me. Thanks!

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 18/25] x86: L2 CAT: implement CPU init and free flow.
  2017-04-12 15:18   ` Jan Beulich
@ 2017-04-13  8:12     ` Yi Sun
  2017-04-13  8:16       ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-13  8:12 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-12 09:18:51, Jan Beulich wrote:
> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> > @@ -304,10 +305,14 @@ static void cat_init_feature(const struct cpuid_leaf *regs,
> >      switch ( type )
> >      {
> >      case PSR_SOCKET_L3_CAT:
> > +    case PSR_SOCKET_L2_CAT:
> >          /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
> >          feat->cos_reg_val[0] = cat_default_val(feat->props->cbm_len);
> >  
> > -        feat->props->type[0] = PSR_CBM_TYPE_L3;
> > +        if ( type == PSR_SOCKET_L3_CAT )
> > +            feat->props->type[0] = PSR_CBM_TYPE_L3;
> > +        else
> > +            feat->props->type[0] = PSR_CBM_TYPE_L2;
> 
> Can I talk you into preferring conditional expressions in case like
> this or ...
> 
> > @@ -315,7 +320,11 @@ static void cat_init_feature(const struct cpuid_leaf *regs,
> >           */
> >          for ( i = 1; i <= feat->props->cos_max; i++ )
> >          {
> > -            wrmsrl(MSR_IA32_PSR_L3_MASK(i), feat->cos_reg_val[0]);
> > +            if ( type == PSR_SOCKET_L3_CAT )
> > +                wrmsrl(MSR_IA32_PSR_L3_MASK(i), feat->cos_reg_val[0]);
> > +            else
> > +                wrmsrl(MSR_IA32_PSR_L2_MASK(i), feat->cos_reg_val[0]);
> 
> ... this?
> 
I think you mean '?:', right?

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 18/25] x86: L2 CAT: implement CPU init and free flow.
  2017-04-13  8:12     ` Yi Sun
@ 2017-04-13  8:16       ` Jan Beulich
  0 siblings, 0 replies; 114+ messages in thread
From: Jan Beulich @ 2017-04-13  8:16 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 13.04.17 at 10:12, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-12 09:18:51, Jan Beulich wrote:
>> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
>> > @@ -304,10 +305,14 @@ static void cat_init_feature(const struct cpuid_leaf 
> *regs,
>> >      switch ( type )
>> >      {
>> >      case PSR_SOCKET_L3_CAT:
>> > +    case PSR_SOCKET_L2_CAT:
>> >          /* cos=0 is reserved as default cbm(all bits within cbm_len are 
> 1). */
>> >          feat->cos_reg_val[0] = cat_default_val(feat->props->cbm_len);
>> >  
>> > -        feat->props->type[0] = PSR_CBM_TYPE_L3;
>> > +        if ( type == PSR_SOCKET_L3_CAT )
>> > +            feat->props->type[0] = PSR_CBM_TYPE_L3;
>> > +        else
>> > +            feat->props->type[0] = PSR_CBM_TYPE_L2;
>> 
>> Can I talk you into preferring conditional expressions in case like
>> this or ...
>> 
>> > @@ -315,7 +320,11 @@ static void cat_init_feature(const struct cpuid_leaf *regs,
>> >           */
>> >          for ( i = 1; i <= feat->props->cos_max; i++ )
>> >          {
>> > -            wrmsrl(MSR_IA32_PSR_L3_MASK(i), feat->cos_reg_val[0]);
>> > +            if ( type == PSR_SOCKET_L3_CAT )
>> > +                wrmsrl(MSR_IA32_PSR_L3_MASK(i), feat->cos_reg_val[0]);
>> > +            else
>> > +                wrmsrl(MSR_IA32_PSR_L2_MASK(i), feat->cos_reg_val[0]);
>> 
>> ... this?
>> 
> I think you mean '?:', right?

Yes, that's what "conditional expression" is according to the C
standard.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-13  8:11             ` Yi Sun
@ 2017-04-13  9:41               ` Jan Beulich
  2017-04-13 10:49                 ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-13  9:41 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 13.04.17 at 10:11, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-12 06:42:01, Jan Beulich wrote:
>> >>> On 12.04.17 at 14:23, <yi.y.sun@linux.intel.com> wrote:
>> > On 17-04-12 03:09:56, Jan Beulich wrote:
>> >> >>> On 12.04.17 at 07:53, <yi.y.sun@linux.intel.com> wrote:
>> >> > On 17-04-11 09:01:53, Jan Beulich wrote:
>> >> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
>> >> >> Furthermore I'm not at all convinced this is appropriate to do in the
>> >> >> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you
>> >> >> have a few thousand VMs, the loop above may take a while.
>> >> >> 
>> >> > Hmm, that may be a potential issue. I have two proposals below. Could you
>> >> > please help to check which one you prefer? Or provide another solution?
>> >> > 
>> >> > 1. Start a tasklet in free_socket_resources() to restore 
>> > 'psr_cos_ids[socket]'
>> >> >    of all domains. The action is protected by 'ref_lock' to avoid 
>> > confliction
>> >> >    in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in tasklet or memset
>> >> >    the array to 0 in free_socket_resources().
>> >> > 
>> >> > 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and change index
>> >> >    from 'socket' to 'domain_id'. So we keep all domains' COS IDs per socket
>> >> >    and can memset the array to 0 when socket is offline. But here is an 
>> > issue
>> >> >    that we do not know how many members this array should have. I cannot 
>> > find
>> >> >    a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to use 
>> > reallocation
>> >> >    in 'psr_alloc_cos' if the newly created domain's id is bigger than 
>> > current
>> >> >    array number.
>> >> 
>> >> The number of domains is limited by the special DOMID_* values.
>> >> However, allocating an array with 32k entries doesn't sound very
>> >> reasonable.
>> > 
>> > I think 32K entries should be the extreme case. I can allocate e.g. 100 entries
>> > when the first domain is created. If a new domain's id exceeds 100, reallocate
>> > another 100 entries. The total number of entries allocated should be less than
>> > 32K. This is a functional requirement which cannot be avoided. How do you 
>> > think?
>> 
>> So how many entries would your array have once I start the 32,000th
>> domain (having at any one time at most a single one running, besides
>> Dom0)?
>> 
> In such case, we have to keep a 32K array because the domain_id is the index to
> access the array. But this array is per socket so the whole memory used should
> not be too much.

We carefully avoid any runtime allocations of order > 0, so if you
were to set up such an array, you'd need to use vmalloc()/vzalloc().
But I continue to be unconvinced that we want such a large array
in the first place.

> After considering this issue more, I think the original codes might not be
> so unacceptable. Per my knowledge, Intel Xeon Phi chip can support at most
> 288 CPUs. So, I think the domains running at same time in reality may not be
> so many (no efficient resources). If this hypothesis is right, a loop to write
> 'psr_cos_ids[socket]' of every domain to 0 may not take much time. If I am
> wrong, please correct me. Thanks!

What relationship does the number of CPUs have to the number of
domains on a host? There could be thousands with just a few dozen
CPUs, provided none or very few of them have high demands on
CPU resources. Additionally please never forget that system sizes
basically only ever grow. Plus we wouldn't want a latent issue here
in case we ever end up needing to widen domain IDs beyond 16 bits.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-13  9:41               ` Jan Beulich
@ 2017-04-13 10:49                 ` Yi Sun
  2017-04-13 10:58                   ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-13 10:49 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-13 03:41:44, Jan Beulich wrote:
> >>> On 13.04.17 at 10:11, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-12 06:42:01, Jan Beulich wrote:
> >> >>> On 12.04.17 at 14:23, <yi.y.sun@linux.intel.com> wrote:
> >> > On 17-04-12 03:09:56, Jan Beulich wrote:
> >> >> >>> On 12.04.17 at 07:53, <yi.y.sun@linux.intel.com> wrote:
> >> >> > On 17-04-11 09:01:53, Jan Beulich wrote:
> >> >> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> >> >> >> Furthermore I'm not at all convinced this is appropriate to do in the
> >> >> >> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you
> >> >> >> have a few thousand VMs, the loop above may take a while.
> >> >> >> 
> >> >> > Hmm, that may be a potential issue. I have two proposals below. Could you
> >> >> > please help to check which one you prefer? Or provide another solution?
> >> >> > 
> >> >> > 1. Start a tasklet in free_socket_resources() to restore 
> >> > 'psr_cos_ids[socket]'
> >> >> >    of all domains. The action is protected by 'ref_lock' to avoid 
> >> > confliction
> >> >> >    in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in tasklet or memset
> >> >> >    the array to 0 in free_socket_resources().
> >> >> > 
> >> >> > 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and change index
> >> >> >    from 'socket' to 'domain_id'. So we keep all domains' COS IDs per socket
> >> >> >    and can memset the array to 0 when socket is offline. But here is an 
> >> > issue
> >> >> >    that we do not know how many members this array should have. I cannot 
> >> > find
> >> >> >    a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to use 
> >> > reallocation
> >> >> >    in 'psr_alloc_cos' if the newly created domain's id is bigger than 
> >> > current
> >> >> >    array number.
> >> >> 
> >> >> The number of domains is limited by the special DOMID_* values.
> >> >> However, allocating an array with 32k entries doesn't sound very
> >> >> reasonable.
> >> > 
> >> > I think 32K entries should be the extreme case. I can allocate e.g. 100 entries
> >> > when the first domain is created. If a new domain's id exceeds 100, reallocate
> >> > another 100 entries. The total number of entries allocated should be less than
> >> > 32K. This is a functional requirement which cannot be avoided. How do you 
> >> > think?
> >> 
> >> So how many entries would your array have once I start the 32,000th
> >> domain (having at any one time at most a single one running, besides
> >> Dom0)?
> >> 
> > In such case, we have to keep a 32K array because the domain_id is the index to
> > access the array. But this array is per socket so the whole memory used should
> > not be too much.
> 
> We carefully avoid any runtime allocations of order > 0, so if you
> were to set up such an array, you'd need to use vmalloc()/vzalloc().
> But I continue to be unconvinced that we want such a large array
> in the first place.
> 
> > After considering this issue more, I think the original codes might not be
> > so unacceptable. Per my knowledge, Intel Xeon Phi chip can support at most
> > 288 CPUs. So, I think the domains running at same time in reality may not be
> > so many (no efficient resources). If this hypothesis is right, a loop to write
> > 'psr_cos_ids[socket]' of every domain to 0 may not take much time. If I am
> > wrong, please correct me. Thanks!
> 
> What relationship does the number of CPUs have to the number of
> domains on a host? There could be thousands with just a few dozen
> CPUs, provided none or very few of them have high demands on
> CPU resources. Additionally please never forget that system sizes
> basically only ever grow. Plus we wouldn't want a latent issue here
> in case we ever end up needing to widen domain IDs beyond 16 bits.
> 
How about a per socket array like this:
uint32_t domain_switch[1024];

Every bit represents a domain id. Then, we can handle this case as below:
1. In 'psr_cpu_init()', clear the array to be 0. I think this place is enough to
   cover socket offline case. We do not need to clear it in 'free_socket_resources'.

2. In 'psr_ctxt_switch_to()', test_and_set_bit(domain_id, domain_switch) to set
   the bit to 1 according to domain_id. If the old value is 0 and the 
   'psr_cos_ids[socket]' is not 0, restore 'psr_cos_ids[socket]' to be 0.

3. In 'psr_set_val()', test_and_set_bit(domain_id, domain_switch) to set the bit
   to 1 too. Then, update 'psr_cos_ids[socket]' according to find/pick flow.

Then, we only use 4KB for one socket.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-13 10:49                 ` Yi Sun
@ 2017-04-13 10:58                   ` Jan Beulich
  2017-04-13 11:11                     ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-13 10:58 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 13.04.17 at 12:49, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-13 03:41:44, Jan Beulich wrote:
>> >>> On 13.04.17 at 10:11, <yi.y.sun@linux.intel.com> wrote:
>> > On 17-04-12 06:42:01, Jan Beulich wrote:
>> >> >>> On 12.04.17 at 14:23, <yi.y.sun@linux.intel.com> wrote:
>> >> > On 17-04-12 03:09:56, Jan Beulich wrote:
>> >> >> >>> On 12.04.17 at 07:53, <yi.y.sun@linux.intel.com> wrote:
>> >> >> > On 17-04-11 09:01:53, Jan Beulich wrote:
>> >> >> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
>> >> >> >> Furthermore I'm not at all convinced this is appropriate to do in the
>> >> >> >> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you
>> >> >> >> have a few thousand VMs, the loop above may take a while.
>> >> >> >> 
>> >> >> > Hmm, that may be a potential issue. I have two proposals below. Could you
>> >> >> > please help to check which one you prefer? Or provide another solution?
>> >> >> > 
>> >> >> > 1. Start a tasklet in free_socket_resources() to restore 
>> >> > 'psr_cos_ids[socket]'
>> >> >> >    of all domains. The action is protected by 'ref_lock' to avoid 
>> >> > confliction
>> >> >> >    in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in tasklet or 
> memset
>> >> >> >    the array to 0 in free_socket_resources().
>> >> >> > 
>> >> >> > 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and change 
> index
>> >> >> >    from 'socket' to 'domain_id'. So we keep all domains' COS IDs per 
> socket
>> >> >> >    and can memset the array to 0 when socket is offline. But here is an 
>> >> > issue
>> >> >> >    that we do not know how many members this array should have. I cannot 
>> >> > find
>> >> >> >    a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to use 
>> >> > reallocation
>> >> >> >    in 'psr_alloc_cos' if the newly created domain's id is bigger than 
>> >> > current
>> >> >> >    array number.
>> >> >> 
>> >> >> The number of domains is limited by the special DOMID_* values.
>> >> >> However, allocating an array with 32k entries doesn't sound very
>> >> >> reasonable.
>> >> > 
>> >> > I think 32K entries should be the extreme case. I can allocate e.g. 100 
> entries
>> >> > when the first domain is created. If a new domain's id exceeds 100, 
> reallocate
>> >> > another 100 entries. The total number of entries allocated should be less 
> than
>> >> > 32K. This is a functional requirement which cannot be avoided. How do you 
>> >> > think?
>> >> 
>> >> So how many entries would your array have once I start the 32,000th
>> >> domain (having at any one time at most a single one running, besides
>> >> Dom0)?
>> >> 
>> > In such case, we have to keep a 32K array because the domain_id is the 
> index to
>> > access the array. But this array is per socket so the whole memory used 
> should
>> > not be too much.
>> 
>> We carefully avoid any runtime allocations of order > 0, so if you
>> were to set up such an array, you'd need to use vmalloc()/vzalloc().
>> But I continue to be unconvinced that we want such a large array
>> in the first place.
>> 
>> > After considering this issue more, I think the original codes might not be
>> > so unacceptable. Per my knowledge, Intel Xeon Phi chip can support at most
>> > 288 CPUs. So, I think the domains running at same time in reality may not 
> be
>> > so many (no efficient resources). If this hypothesis is right, a loop to 
> write
>> > 'psr_cos_ids[socket]' of every domain to 0 may not take much time. If I am
>> > wrong, please correct me. Thanks!
>> 
>> What relationship does the number of CPUs have to the number of
>> domains on a host? There could be thousands with just a few dozen
>> CPUs, provided none or very few of them have high demands on
>> CPU resources. Additionally please never forget that system sizes
>> basically only ever grow. Plus we wouldn't want a latent issue here
>> in case we ever end up needing to widen domain IDs beyond 16 bits.
>> 
> How about a per socket array like this:
> uint32_t domain_switch[1024];
> 
> Every bit represents a domain id. Then, we can handle this case as below:
> 1. In 'psr_cpu_init()', clear the array to be 0. I think this place is enough to
>    cover socket offline case. We do not need to clear it in 
> 'free_socket_resources'.
> 
> 2. In 'psr_ctxt_switch_to()', test_and_set_bit(domain_id, domain_switch) to set
>    the bit to 1 according to domain_id. If the old value is 0 and the 
>    'psr_cos_ids[socket]' is not 0, restore 'psr_cos_ids[socket]' to be 0.
> 
> 3. In 'psr_set_val()', test_and_set_bit(domain_id, domain_switch) to set the bit
>    to 1 too. Then, update 'psr_cos_ids[socket]' according to find/pick flow.
> 
> Then, we only use 4KB for one socket.

This looks to come closer to something I'd consider acceptable, but
I may not understand your intentions in full yet: For one, there's
nowhere you clear the bit (other than presumably during socket
cleanup). And then I don't understand the test_and_ parts of the
constructs above, i.e. you don't clarify what the return values
would be used/needed for.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-13 10:58                   ` Jan Beulich
@ 2017-04-13 11:11                     ` Yi Sun
  2017-04-13 11:26                       ` Yi Sun
  2017-04-13 11:31                       ` Jan Beulich
  0 siblings, 2 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-13 11:11 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-13 04:58:06, Jan Beulich wrote:
> >>> On 13.04.17 at 12:49, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-13 03:41:44, Jan Beulich wrote:
> >> >>> On 13.04.17 at 10:11, <yi.y.sun@linux.intel.com> wrote:
> >> > On 17-04-12 06:42:01, Jan Beulich wrote:
> >> >> >>> On 12.04.17 at 14:23, <yi.y.sun@linux.intel.com> wrote:
> >> >> > On 17-04-12 03:09:56, Jan Beulich wrote:
> >> >> >> >>> On 12.04.17 at 07:53, <yi.y.sun@linux.intel.com> wrote:
> >> >> >> > On 17-04-11 09:01:53, Jan Beulich wrote:
> >> >> >> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> >> >> >> >> Furthermore I'm not at all convinced this is appropriate to do in the
> >> >> >> >> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you
> >> >> >> >> have a few thousand VMs, the loop above may take a while.
> >> >> >> >> 
> >> >> >> > Hmm, that may be a potential issue. I have two proposals below. Could you
> >> >> >> > please help to check which one you prefer? Or provide another solution?
> >> >> >> > 
> >> >> >> > 1. Start a tasklet in free_socket_resources() to restore 
> >> >> > 'psr_cos_ids[socket]'
> >> >> >> >    of all domains. The action is protected by 'ref_lock' to avoid 
> >> >> > confliction
> >> >> >> >    in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in tasklet or 
> > memset
> >> >> >> >    the array to 0 in free_socket_resources().
> >> >> >> > 
> >> >> >> > 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and change 
> > index
> >> >> >> >    from 'socket' to 'domain_id'. So we keep all domains' COS IDs per 
> > socket
> >> >> >> >    and can memset the array to 0 when socket is offline. But here is an 
> >> >> > issue
> >> >> >> >    that we do not know how many members this array should have. I cannot 
> >> >> > find
> >> >> >> >    a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to use 
> >> >> > reallocation
> >> >> >> >    in 'psr_alloc_cos' if the newly created domain's id is bigger than 
> >> >> > current
> >> >> >> >    array number.
> >> >> >> 
> >> >> >> The number of domains is limited by the special DOMID_* values.
> >> >> >> However, allocating an array with 32k entries doesn't sound very
> >> >> >> reasonable.
> >> >> > 
> >> >> > I think 32K entries should be the extreme case. I can allocate e.g. 100 
> > entries
> >> >> > when the first domain is created. If a new domain's id exceeds 100, 
> > reallocate
> >> >> > another 100 entries. The total number of entries allocated should be less 
> > than
> >> >> > 32K. This is a functional requirement which cannot be avoided. How do you 
> >> >> > think?
> >> >> 
> >> >> So how many entries would your array have once I start the 32,000th
> >> >> domain (having at any one time at most a single one running, besides
> >> >> Dom0)?
> >> >> 
> >> > In such case, we have to keep a 32K array because the domain_id is the 
> > index to
> >> > access the array. But this array is per socket so the whole memory used 
> > should
> >> > not be too much.
> >> 
> >> We carefully avoid any runtime allocations of order > 0, so if you
> >> were to set up such an array, you'd need to use vmalloc()/vzalloc().
> >> But I continue to be unconvinced that we want such a large array
> >> in the first place.
> >> 
> >> > After considering this issue more, I think the original codes might not be
> >> > so unacceptable. Per my knowledge, Intel Xeon Phi chip can support at most
> >> > 288 CPUs. So, I think the domains running at same time in reality may not 
> > be
> >> > so many (no efficient resources). If this hypothesis is right, a loop to 
> > write
> >> > 'psr_cos_ids[socket]' of every domain to 0 may not take much time. If I am
> >> > wrong, please correct me. Thanks!
> >> 
> >> What relationship does the number of CPUs have to the number of
> >> domains on a host? There could be thousands with just a few dozen
> >> CPUs, provided none or very few of them have high demands on
> >> CPU resources. Additionally please never forget that system sizes
> >> basically only ever grow. Plus we wouldn't want a latent issue here
> >> in case we ever end up needing to widen domain IDs beyond 16 bits.
> >> 
> > How about a per socket array like this:
> > uint32_t domain_switch[1024];
> > 
> > Every bit represents a domain id. Then, we can handle this case as below:
> > 1. In 'psr_cpu_init()', clear the array to be 0. I think this place is enough to
> >    cover socket offline case. We do not need to clear it in 
> > 'free_socket_resources'.
> > 
> > 2. In 'psr_ctxt_switch_to()', test_and_set_bit(domain_id, domain_switch) to set
> >    the bit to 1 according to domain_id. If the old value is 0 and the 
> >    'psr_cos_ids[socket]' is not 0, restore 'psr_cos_ids[socket]' to be 0.
> > 
> > 3. In 'psr_set_val()', test_and_set_bit(domain_id, domain_switch) to set the bit
> >    to 1 too. Then, update 'psr_cos_ids[socket]' according to find/pick flow.
> > 
> > Then, we only use 4KB for one socket.
> 
> This looks to come closer to something I'd consider acceptable, but
> I may not understand your intentions in full yet: For one, there's
> nowhere you clear the bit (other than presumably during socket
> cleanup). 

Actually, clear the array in 'free_socket_resources' has same effect. I can
move clear action into it.

> And then I don't understand the test_and_ parts of the
> constructs above, i.e. you don't clarify what the return values
> would be used/needed for.
> 
Sorry, 0 means this domain has not been scheduled to the socket yet. If
test_and_ returns 0, that is the first time the domain runs on the socket
(the first time the socket is online). So, we need restore 'psr_cos_ids[socket]'
to 0 in 'psr_ctxt_switch_to()'.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-13 11:11                     ` Yi Sun
@ 2017-04-13 11:26                       ` Yi Sun
  2017-04-13 11:31                       ` Jan Beulich
  1 sibling, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-13 11:26 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, chao.p.peng, xen-devel, roger.pau

On 17-04-13 19:11:54, Yi Sun wrote:
> On 17-04-13 04:58:06, Jan Beulich wrote:
> > >>> On 13.04.17 at 12:49, <yi.y.sun@linux.intel.com> wrote:
> > > On 17-04-13 03:41:44, Jan Beulich wrote:
> > >> >>> On 13.04.17 at 10:11, <yi.y.sun@linux.intel.com> wrote:
> > >> > On 17-04-12 06:42:01, Jan Beulich wrote:
> > >> >> >>> On 12.04.17 at 14:23, <yi.y.sun@linux.intel.com> wrote:
> > >> >> > On 17-04-12 03:09:56, Jan Beulich wrote:
> > >> >> >> >>> On 12.04.17 at 07:53, <yi.y.sun@linux.intel.com> wrote:
> > >> >> >> > On 17-04-11 09:01:53, Jan Beulich wrote:
> > >> >> >> >> >>> On 01.04.17 at 15:53, <yi.y.sun@linux.intel.com> wrote:
> > >> >> >> >> Furthermore I'm not at all convinced this is appropriate to do in the
> > >> >> >> >> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you
> > >> >> >> >> have a few thousand VMs, the loop above may take a while.
> > >> >> >> >> 
> > >> >> >> > Hmm, that may be a potential issue. I have two proposals below. Could you
> > >> >> >> > please help to check which one you prefer? Or provide another solution?
> > >> >> >> > 
> > >> >> >> > 1. Start a tasklet in free_socket_resources() to restore 
> > >> >> > 'psr_cos_ids[socket]'
> > >> >> >> >    of all domains. The action is protected by 'ref_lock' to avoid 
> > >> >> > confliction
> > >> >> >> >    in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in tasklet or 
> > > memset
> > >> >> >> >    the array to 0 in free_socket_resources().
> > >> >> >> > 
> > >> >> >> > 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and change 
> > > index
> > >> >> >> >    from 'socket' to 'domain_id'. So we keep all domains' COS IDs per 
> > > socket
> > >> >> >> >    and can memset the array to 0 when socket is offline. But here is an 
> > >> >> > issue
> > >> >> >> >    that we do not know how many members this array should have. I cannot 
> > >> >> > find
> > >> >> >> >    a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to use 
> > >> >> > reallocation
> > >> >> >> >    in 'psr_alloc_cos' if the newly created domain's id is bigger than 
> > >> >> > current
> > >> >> >> >    array number.
> > >> >> >> 
> > >> >> >> The number of domains is limited by the special DOMID_* values.
> > >> >> >> However, allocating an array with 32k entries doesn't sound very
> > >> >> >> reasonable.
> > >> >> > 
> > >> >> > I think 32K entries should be the extreme case. I can allocate e.g. 100 
> > > entries
> > >> >> > when the first domain is created. If a new domain's id exceeds 100, 
> > > reallocate
> > >> >> > another 100 entries. The total number of entries allocated should be less 
> > > than
> > >> >> > 32K. This is a functional requirement which cannot be avoided. How do you 
> > >> >> > think?
> > >> >> 
> > >> >> So how many entries would your array have once I start the 32,000th
> > >> >> domain (having at any one time at most a single one running, besides
> > >> >> Dom0)?
> > >> >> 
> > >> > In such case, we have to keep a 32K array because the domain_id is the 
> > > index to
> > >> > access the array. But this array is per socket so the whole memory used 
> > > should
> > >> > not be too much.
> > >> 
> > >> We carefully avoid any runtime allocations of order > 0, so if you
> > >> were to set up such an array, you'd need to use vmalloc()/vzalloc().
> > >> But I continue to be unconvinced that we want such a large array
> > >> in the first place.
> > >> 
> > >> > After considering this issue more, I think the original codes might not be
> > >> > so unacceptable. Per my knowledge, Intel Xeon Phi chip can support at most
> > >> > 288 CPUs. So, I think the domains running at same time in reality may not 
> > > be
> > >> > so many (no efficient resources). If this hypothesis is right, a loop to 
> > > write
> > >> > 'psr_cos_ids[socket]' of every domain to 0 may not take much time. If I am
> > >> > wrong, please correct me. Thanks!
> > >> 
> > >> What relationship does the number of CPUs have to the number of
> > >> domains on a host? There could be thousands with just a few dozen
> > >> CPUs, provided none or very few of them have high demands on
> > >> CPU resources. Additionally please never forget that system sizes
> > >> basically only ever grow. Plus we wouldn't want a latent issue here
> > >> in case we ever end up needing to widen domain IDs beyond 16 bits.
> > >> 
> > > How about a per socket array like this:
> > > uint32_t domain_switch[1024];
> > > 
> > > Every bit represents a domain id. Then, we can handle this case as below:
> > > 1. In 'psr_cpu_init()', clear the array to be 0. I think this place is enough to
> > >    cover socket offline case. We do not need to clear it in 
> > > 'free_socket_resources'.
> > > 
> > > 2. In 'psr_ctxt_switch_to()', test_and_set_bit(domain_id, domain_switch) to set
> > >    the bit to 1 according to domain_id. If the old value is 0 and the 
> > >    'psr_cos_ids[socket]' is not 0, restore 'psr_cos_ids[socket]' to be 0.
> > > 
> > > 3. In 'psr_set_val()', test_and_set_bit(domain_id, domain_switch) to set the bit
> > >    to 1 too. Then, update 'psr_cos_ids[socket]' according to find/pick flow.
> > > 
> > > Then, we only use 4KB for one socket.
> > 
> > This looks to come closer to something I'd consider acceptable, but
> > I may not understand your intentions in full yet: For one, there's
> > nowhere you clear the bit (other than presumably during socket
> > cleanup). 
> 
> Actually, clear the array in 'free_socket_resources' has same effect. I can
> move clear action into it.
> 
> > And then I don't understand the test_and_ parts of the
> > constructs above, i.e. you don't clarify what the return values
> > would be used/needed for.
> > 
> Sorry, 0 means this domain has not been scheduled to the socket yet. If
> test_and_ returns 0, that is the first time the domain runs on the socket
> (the first time the socket is online). So, we need restore 'psr_cos_ids[socket]'
                 ^ missed 'after'. I mean the first time the domain is scheduled
                                   to the socket after the socket is online.
> to 0 in 'psr_ctxt_switch_to()'.
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-13 11:11                     ` Yi Sun
  2017-04-13 11:26                       ` Yi Sun
@ 2017-04-13 11:31                       ` Jan Beulich
  2017-04-13 11:44                         ` Yi Sun
  1 sibling, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-13 11:31 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 13.04.17 at 13:11, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-13 04:58:06, Jan Beulich wrote:
>> >>> On 13.04.17 at 12:49, <yi.y.sun@linux.intel.com> wrote:
>> > How about a per socket array like this:
>> > uint32_t domain_switch[1024];
>> > 
>> > Every bit represents a domain id. Then, we can handle this case as below:
>> > 1. In 'psr_cpu_init()', clear the array to be 0. I think this place is enough to
>> >    cover socket offline case. We do not need to clear it in 
>> > 'free_socket_resources'.
>> > 
>> > 2. In 'psr_ctxt_switch_to()', test_and_set_bit(domain_id, domain_switch) to set
>> >    the bit to 1 according to domain_id. If the old value is 0 and the 
>> >    'psr_cos_ids[socket]' is not 0, restore 'psr_cos_ids[socket]' to be 0.
>> > 
>> > 3. In 'psr_set_val()', test_and_set_bit(domain_id, domain_switch) to set the bit
>> >    to 1 too. Then, update 'psr_cos_ids[socket]' according to find/pick flow.
>> > 
>> > Then, we only use 4KB for one socket.
>> 
>> This looks to come closer to something I'd consider acceptable, but
>> I may not understand your intentions in full yet: For one, there's
>> nowhere you clear the bit (other than presumably during socket
>> cleanup). 
> 
> Actually, clear the array in 'free_socket_resources' has same effect. I can
> move clear action into it.

That wasn't my point - I was asking about clearing individual bits.
Point being that if you only ever set bits in the map, you'll likely
end up iterating through all active domains anyway.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-13 11:31                       ` Jan Beulich
@ 2017-04-13 11:44                         ` Yi Sun
  2017-04-13 11:50                           ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-13 11:44 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-13 05:31:41, Jan Beulich wrote:
> >>> On 13.04.17 at 13:11, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-13 04:58:06, Jan Beulich wrote:
> >> >>> On 13.04.17 at 12:49, <yi.y.sun@linux.intel.com> wrote:
> >> > How about a per socket array like this:
> >> > uint32_t domain_switch[1024];
> >> > 
> >> > Every bit represents a domain id. Then, we can handle this case as below:
> >> > 1. In 'psr_cpu_init()', clear the array to be 0. I think this place is enough to
> >> >    cover socket offline case. We do not need to clear it in 
> >> > 'free_socket_resources'.
> >> > 
> >> > 2. In 'psr_ctxt_switch_to()', test_and_set_bit(domain_id, domain_switch) to set
> >> >    the bit to 1 according to domain_id. If the old value is 0 and the 
> >> >    'psr_cos_ids[socket]' is not 0, restore 'psr_cos_ids[socket]' to be 0.
> >> > 
> >> > 3. In 'psr_set_val()', test_and_set_bit(domain_id, domain_switch) to set the bit
> >> >    to 1 too. Then, update 'psr_cos_ids[socket]' according to find/pick flow.
> >> > 
> >> > Then, we only use 4KB for one socket.
> >> 
> >> This looks to come closer to something I'd consider acceptable, but
> >> I may not understand your intentions in full yet: For one, there's
> >> nowhere you clear the bit (other than presumably during socket
> >> cleanup). 
> > 
> > Actually, clear the array in 'free_socket_resources' has same effect. I can
> > move clear action into it.
> 
> That wasn't my point - I was asking about clearing individual bits.
> Point being that if you only ever set bits in the map, you'll likely
> end up iterating through all active domains anyway.
> 
If entering 'free_socket_resources', that means no more actions to
the array on this socket except clearing it. Can I just memset this array
of the socekt to 0?

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-13 11:44                         ` Yi Sun
@ 2017-04-13 11:50                           ` Jan Beulich
  2017-04-18 10:55                             ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-13 11:50 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 13.04.17 at 13:44, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-13 05:31:41, Jan Beulich wrote:
>> >>> On 13.04.17 at 13:11, <yi.y.sun@linux.intel.com> wrote:
>> > On 17-04-13 04:58:06, Jan Beulich wrote:
>> >> >>> On 13.04.17 at 12:49, <yi.y.sun@linux.intel.com> wrote:
>> >> > How about a per socket array like this:
>> >> > uint32_t domain_switch[1024];
>> >> > 
>> >> > Every bit represents a domain id. Then, we can handle this case as below:
>> >> > 1. In 'psr_cpu_init()', clear the array to be 0. I think this place is enough to
>> >> >    cover socket offline case. We do not need to clear it in 
>> >> > 'free_socket_resources'.
>> >> > 
>> >> > 2. In 'psr_ctxt_switch_to()', test_and_set_bit(domain_id, domain_switch) to set
>> >> >    the bit to 1 according to domain_id. If the old value is 0 and the 
>> >> >    'psr_cos_ids[socket]' is not 0, restore 'psr_cos_ids[socket]' to be 0.
>> >> > 
>> >> > 3. In 'psr_set_val()', test_and_set_bit(domain_id, domain_switch) to set the bit
>> >> >    to 1 too. Then, update 'psr_cos_ids[socket]' according to find/pick flow.
>> >> > 
>> >> > Then, we only use 4KB for one socket.
>> >> 
>> >> This looks to come closer to something I'd consider acceptable, but
>> >> I may not understand your intentions in full yet: For one, there's
>> >> nowhere you clear the bit (other than presumably during socket
>> >> cleanup). 
>> > 
>> > Actually, clear the array in 'free_socket_resources' has same effect. I can
>> > move clear action into it.
>> 
>> That wasn't my point - I was asking about clearing individual bits.
>> Point being that if you only ever set bits in the map, you'll likely
>> end up iterating through all active domains anyway.
>> 
> If entering 'free_socket_resources', that means no more actions to
> the array on this socket except clearing it. Can I just memset this array
> of the socekt to 0?

You can, afaict, but unless first you act on the set bits I can't see why
you would want to track the bits in the first place. Or maybe I'm still
not understanding your intention, in which case I guess the best you
can do is simply implement your plan, and we'll discuss it in v11 review.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-13 11:50                           ` Jan Beulich
@ 2017-04-18 10:55                             ` Yi Sun
  2017-04-18 11:46                               ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-18 10:55 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

[-- Attachment #1: Type: text/plain, Size: 2400 bytes --]

On 17-04-13 05:50:01, Jan Beulich wrote:
> >>> On 13.04.17 at 13:44, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-13 05:31:41, Jan Beulich wrote:
> >> >>> On 13.04.17 at 13:11, <yi.y.sun@linux.intel.com> wrote:
> >> > On 17-04-13 04:58:06, Jan Beulich wrote:
> >> >> >>> On 13.04.17 at 12:49, <yi.y.sun@linux.intel.com> wrote:
> >> >> > How about a per socket array like this:
> >> >> > uint32_t domain_switch[1024];
> >> >> > 
> >> >> > Every bit represents a domain id. Then, we can handle this case as below:
> >> >> > 1. In 'psr_cpu_init()', clear the array to be 0. I think this place is enough to
> >> >> >    cover socket offline case. We do not need to clear it in 
> >> >> > 'free_socket_resources'.
> >> >> > 
> >> >> > 2. In 'psr_ctxt_switch_to()', test_and_set_bit(domain_id, domain_switch) to set
> >> >> >    the bit to 1 according to domain_id. If the old value is 0 and the 
> >> >> >    'psr_cos_ids[socket]' is not 0, restore 'psr_cos_ids[socket]' to be 0.
> >> >> > 
> >> >> > 3. In 'psr_set_val()', test_and_set_bit(domain_id, domain_switch) to set the bit
> >> >> >    to 1 too. Then, update 'psr_cos_ids[socket]' according to find/pick flow.
> >> >> > 
> >> >> > Then, we only use 4KB for one socket.
> >> >> 
> >> >> This looks to come closer to something I'd consider acceptable, but
> >> >> I may not understand your intentions in full yet: For one, there's
> >> >> nowhere you clear the bit (other than presumably during socket
> >> >> cleanup). 
> >> > 
> >> > Actually, clear the array in 'free_socket_resources' has same effect. I can
> >> > move clear action into it.
> >> 
> >> That wasn't my point - I was asking about clearing individual bits.
> >> Point being that if you only ever set bits in the map, you'll likely
> >> end up iterating through all active domains anyway.
> >> 
> > If entering 'free_socket_resources', that means no more actions to
> > the array on this socket except clearing it. Can I just memset this array
> > of the socekt to 0?
> 
> You can, afaict, but unless first you act on the set bits I can't see why
> you would want to track the bits in the first place. Or maybe I'm still
> not understanding your intention, in which case I guess the best you
> can do is simply implement your plan, and we'll discuss it in v11 review.
> 
I made a test patch based on v10 and attached it in mail. Could you please
help to check it? Thanks!

> Jan

[-- Attachment #2: domids_final.patch --]
[-- Type: text/x-diff, Size: 8569 bytes --]

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index a85ea99..ef8d3e9 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -125,6 +125,8 @@ struct feat_node {
     uint32_t cos_reg_val[MAX_COS_REG_CNT];
 };
 
+#define PSR_DOM_IDS_NUM ((DOMID_IDLE + 1) / sizeof(uint32_t))
+
 /*
  * PSR features are managed per socket. Below structure defines the members
  * used to manage these features.
@@ -134,9 +136,11 @@ struct feat_node {
  *             COS ID. Every entry of cos_ref corresponds to one COS ID.
  */
 struct psr_socket_info {
-    struct feat_node *features[PSR_SOCKET_MAX_FEAT];
     spinlock_t ref_lock;
+    spinlock_t dom_ids_lock;
+    struct feat_node *features[PSR_SOCKET_MAX_FEAT];
     unsigned int cos_ref[MAX_COS_REG_CNT];
+    uint32_t dom_ids[PSR_DOM_IDS_NUM];
 };
 
 struct psr_assoc {
@@ -194,26 +198,11 @@ static void free_socket_resources(unsigned int socket)
 {
     unsigned int i;
     struct psr_socket_info *info = socket_info + socket;
-    struct domain *d;
+    unsigned long flag;
 
     if ( !info )
         return;
 
-    /* Restore domain cos id to 0 when socket is offline. */
-    for_each_domain ( d )
-    {
-        unsigned int cos = d->arch.psr_cos_ids[socket];
-        if ( cos == 0 )
-            continue;
-
-        spin_lock(&info->ref_lock);
-        ASSERT(!cos || info->cos_ref[cos]);
-        info->cos_ref[cos]--;
-        spin_unlock(&info->ref_lock);
-
-        d->arch.psr_cos_ids[socket] = 0;
-    }
-
     /*
      * Free resources of features. The global feature object, e.g. feat_l3_cat,
      * may not be freed here if it is not added into array. It is simply being
@@ -221,12 +210,17 @@ static void free_socket_resources(unsigned int socket)
      */
     for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
     {
-        if ( !info->features[i] )
-            continue;
-
         xfree(info->features[i]);
         info->features[i] = NULL;
     }
+
+    spin_lock(&info->ref_lock);
+    memset(info->cos_ref, 0, MAX_COS_REG_CNT * sizeof(unsigned int));
+    spin_unlock(&info->ref_lock);
+
+    spin_lock_irqsave(&info->dom_ids_lock, flag);
+    memset(info->dom_ids, 0, PSR_DOM_IDS_NUM * sizeof(uint32_t));
+    spin_unlock_irqrestore(&info->dom_ids_lock, flag);
 }
 
 static bool feat_init_done(const struct psr_socket_info *info)
@@ -682,9 +676,34 @@ void psr_ctxt_switch_to(struct domain *d)
         psr_assoc_rmid(&reg, d->arch.psr_rmid);
 
     if ( psra->cos_mask )
-        psr_assoc_cos(&reg, d->arch.psr_cos_ids ?
-                      d->arch.psr_cos_ids[cpu_to_socket(smp_processor_id())] :
-                      0, psra->cos_mask);
+    {
+        unsigned int socket = cpu_to_socket(smp_processor_id());
+        struct psr_socket_info *info = socket_info + socket;
+        int old_bit;
+
+        spin_lock(&info->dom_ids_lock);
+
+        old_bit = test_and_set_bit(d->domain_id, info->dom_ids);
+
+        /*
+         * If old_bit is 0, that means this is the first time the domain is
+         * switched to this socket or domain's COS ID has not been set since
+         * the socket is online. So, the domain's COS ID on this socket should
+         * be default value, 0. If not, that means this socket has been offline
+         * and the domain's COS ID has been set when the socket was online. So,
+         * this COS ID is invalid and we have to restore it to 0.
+         */
+        if ( d->arch.psr_cos_ids &&
+             old_bit == 0 &&
+             d->arch.psr_cos_ids[socket] != 0 )
+            d->arch.psr_cos_ids[socket] = 0;
+
+        spin_unlock(&info->dom_ids_lock);
+
+        psr_assoc_cos(&reg,
+                      d->arch.psr_cos_ids ? d->arch.psr_cos_ids[socket] : 0,
+                      psra->cos_mask);
+    }
 
     if ( reg != psra->val )
     {
@@ -1146,40 +1165,6 @@ static int write_psr_msr(unsigned int socket, unsigned int cos,
     return 0;
 }
 
-static void restore_default_val(unsigned int socket, unsigned int cos,
-                                enum psr_feat_type feat_type)
-{
-    unsigned int i, j;
-    uint32_t default_val;
-    const struct psr_socket_info *info = get_socket_info(socket);
-
-    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
-    {
-        const struct feat_node *feat = info->features[i];
-        /*
-         * There are four judgements:
-         * 1. Input 'feat_type' is valid so we have to get feature according to
-         *    it. If current feature type (i) does not match 'feat_type', we
-         *    need skip it, so continue to check next feature.
-         * 2. Input 'feat_type' is 'PSR_SOCKET_MAX_FEAT' which means we should
-         *    handle all features in this case. So, go to next loop.
-         * 3. Do not need restore the COS value back to default if cos_num is 1,
-         *    e.g. L3 CAT. Because next value setting will overwrite it.
-         * 4. 'feat' we got is NULL, continue.
-         */
-        if ( ( feat_type != PSR_SOCKET_MAX_FEAT && feat_type != i ) ||
-             !feat || feat->props->cos_num == 1 )
-            continue;
-
-        for ( j = 0; j < feat->props->cos_num; j++ )
-        {
-            feat->props->get_val(feat, 0, feat->props->type[j], &default_val);
-            write_psr_msr(socket, cos, default_val,
-                          feat->props->type[j], i);
-        }
-    }
-}
-
 /* The whole set process is protected by domctl_lock. */
 int psr_set_val(struct domain *d, unsigned int socket,
                 uint32_t val, enum cbm_type type)
@@ -1191,6 +1176,7 @@ int psr_set_val(struct domain *d, unsigned int socket,
     struct psr_socket_info *info = get_socket_info(socket);
     unsigned int array_len;
     enum psr_feat_type feat_type;
+    unsigned long flag;
 
     if ( IS_ERR(info) )
         return PTR_ERR(info);
@@ -1286,22 +1272,6 @@ int psr_set_val(struct domain *d, unsigned int socket,
     ASSERT(!cos || ref[cos]);
     ASSERT(!old_cos || ref[old_cos]);
     ref[old_cos]--;
-
-    /*
-     * Step 6:
-     * For features,  e.g. CDP, which cos_num is more than 1, we have to
-     * restore the old_cos value back to default when ref[old_cos] is 0.
-     * Otherwise, user will see wrong values when this COS ID is reused. E.g.
-     * user wants to set DATA to 0x3ff for a new domain. He hopes to see the
-     * DATA is set to 0x3ff and CODE should be the default value, 0x7ff. But
-     * if the COS ID picked for this action is the one that has been used by
-     * other domain and the CODE has been set to 0x1ff. Then, user will see
-     * DATA: 0x3ff, CODE: 0x1ff. So, we have to restore COS values for features
-     * using multiple COSs.
-     */
-    if ( old_cos && !ref[old_cos] )
-        restore_default_val(socket, old_cos, feat_type);
-
     spin_unlock(&info->ref_lock);
 
     /*
@@ -1310,7 +1280,10 @@ int psr_set_val(struct domain *d, unsigned int socket,
      * which COS the domain is using on the socket. One domain can only use
      * one COS ID at same time on each socket.
      */
+    spin_lock_irqsave(&info->dom_ids_lock, flag);
     d->arch.psr_cos_ids[socket] = cos;
+    test_and_set_bit(d->domain_id, info->dom_ids);
+    spin_unlock_irqrestore(&info->dom_ids_lock, flag);
 
     xfree(val_array);
     return ret;
@@ -1336,6 +1309,7 @@ static void psr_free_cos(struct domain *d)
     for ( socket = 0; socket < nr_sockets; socket++ )
     {
         struct psr_socket_info *info;
+        unsigned long flag;
 
         /* cos 0 is default one which does not need be handled. */
         cos = d->arch.psr_cos_ids[socket];
@@ -1346,14 +1320,11 @@ static void psr_free_cos(struct domain *d)
         spin_lock(&info->ref_lock);
         ASSERT(!cos || info->cos_ref[cos]);
         info->cos_ref[cos]--;
-        /*
-         * The 'cos_ref[cos]' of 'd' is 0 now so we need restore corresponding
-         * COS registers to default value. Because this case happens when a
-         * domain is destroied, we need restore all features.
-         */
-        if ( !info->cos_ref[cos] )
-            restore_default_val(socket, cos, PSR_SOCKET_MAX_FEAT);
         spin_unlock(&info->ref_lock);
+
+        spin_lock_irqsave(&info->dom_ids_lock, flag);
+        clear_bit(d->domain_id, info->dom_ids);
+        spin_unlock_irqrestore(&info->dom_ids_lock, flag);
     }
 
     xfree(d->arch.psr_cos_ids);
@@ -1453,6 +1424,7 @@ static void psr_cpu_init(void)
         goto assoc_init;
 
     spin_lock_init(&info->ref_lock);
+    spin_lock_init(&info->dom_ids_lock);
 
     cpuid_count_leaf(PSR_CPUID_LEVEL_CAT, 0, &regs);
     if ( regs.b & PSR_RESOURCE_TYPE_L3 )

[-- Attachment #3: Type: text/plain, Size: 127 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-18 10:55                             ` Yi Sun
@ 2017-04-18 11:46                               ` Jan Beulich
  2017-04-19  8:22                                 ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-18 11:46 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 18.04.17 at 12:55, <yi.y.sun@linux.intel.com> wrote:
> I made a test patch based on v10 and attached it in mail. Could you please
> help to check it? Thanks!

This looks reasonable at the first glance, albeit I continue to be
unconvinced that this is the only (reasonable) way of solving the
problem. After all we don't have to go through similar hoops for
any other of the register state associated with a vCPU. There
are a number of cosmetic issues, but commenting on an attached
(rather than inlined) patch is a little difficult.

One remark regarding the locking though: Acquiring a lock in the
context switch path should be made as low risk of long stalls as
possible. Therefore you will want to consider using r/w locks
instead of spin locks here, which would allow parallelism on all
cores of a socket as long as COS IDs aren't being updated.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-18 11:46                               ` Jan Beulich
@ 2017-04-19  8:22                                 ` Yi Sun
  2017-04-19  9:00                                   ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-19  8:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-18 05:46:43, Jan Beulich wrote:
> >>> On 18.04.17 at 12:55, <yi.y.sun@linux.intel.com> wrote:
> > I made a test patch based on v10 and attached it in mail. Could you please
> > help to check it? Thanks!
> 
> This looks reasonable at the first glance, albeit I continue to be
> unconvinced that this is the only (reasonable) way of solving the
> problem. After all we don't have to go through similar hoops for
> any other of the register state associated with a vCPU. There

Sorry, I do not understand your meaning clearly.
1. DYM the ASSOC register which is set in 'psr_ctxt_switch_to'? In this patch,
   we check 'dom_ids' array to know if the domain's cos id has not been set but
   its 'psr_cos_ids[socket]' already has a non zero value. This case means the
   socket offline has happened so that we have to restore the domain's
   'psr_cos_ids[socket]' to default value 0 which points to default COS MSR.
   I think we have discussed this in previous mails and achieved agreement on
   such logic. 
2. DYM the COS MSRs? We have discussed this before. Per your comments, when
   socket is online, the registers values may be modified by FW and others.
   So, we have to restore them to default value which is being done in
   'cat_init_feature'.

So, what is your exactly meaning here? Thanks!

> are a number of cosmetic issues, but commenting on an attached
> (rather than inlined) patch is a little difficult.
> 
Sorry for that, I will directly send patch out next time.

> One remark regarding the locking though: Acquiring a lock in the
> context switch path should be made as low risk of long stalls as
> possible. Therefore you will want to consider using r/w locks
> instead of spin locks here, which would allow parallelism on all
> cores of a socket as long as COS IDs aren't being updated.
> 
In 'psr_ctxt_switch_to', I use the lock only to protect 'write' actions.
So, I do not understand why read-write lock is better? Anything I don't
consider? Please help to point out. Thanks!

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-19  8:22                                 ` Yi Sun
@ 2017-04-19  9:00                                   ` Jan Beulich
  2017-04-20  2:14                                     ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-19  9:00 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 19.04.17 at 10:22, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-18 05:46:43, Jan Beulich wrote:
>> >>> On 18.04.17 at 12:55, <yi.y.sun@linux.intel.com> wrote:
>> > I made a test patch based on v10 and attached it in mail. Could you please
>> > help to check it? Thanks!
>> 
>> This looks reasonable at the first glance, albeit I continue to be
>> unconvinced that this is the only (reasonable) way of solving the
>> problem. After all we don't have to go through similar hoops for
>> any other of the register state associated with a vCPU. There
> 
> Sorry, I do not understand your meaning clearly.
> 1. DYM the ASSOC register which is set in 'psr_ctxt_switch_to'? In this patch,
>    we check 'dom_ids' array to know if the domain's cos id has not been set but
>    its 'psr_cos_ids[socket]' already has a non zero value. This case means the
>    socket offline has happened so that we have to restore the domain's
>    'psr_cos_ids[socket]' to default value 0 which points to default COS MSR.
>    I think we have discussed this in previous mails and achieved agreement on
>    such logic. 
> 2. DYM the COS MSRs? We have discussed this before. Per your comments, when
>    socket is online, the registers values may be modified by FW and others.
>    So, we have to restore them to default value which is being done in
>    'cat_init_feature'.
> 
> So, what is your exactly meaning here? Thanks!

Neither of the two. I'm comparing with COS/PSR-_unrelated_
handling of register state. After all there are other MSRs which
we need to put into the right state after a core comes online.

>> are a number of cosmetic issues, but commenting on an attached
>> (rather than inlined) patch is a little difficult.
>> 
> Sorry for that, I will directly send patch out next time.
> 
>> One remark regarding the locking though: Acquiring a lock in the
>> context switch path should be made as low risk of long stalls as
>> possible. Therefore you will want to consider using r/w locks
>> instead of spin locks here, which would allow parallelism on all
>> cores of a socket as long as COS IDs aren't being updated.
>> 
> In 'psr_ctxt_switch_to', I use the lock only to protect 'write' actions.
> So, I do not understand why read-write lock is better? Anything I don't
> consider? Please help to point out. Thanks!

Hmm, looking again I can see that r/w locks indeed may not help
here. However, what you say still doesn't look correct to me: You
acquire the lock depending on _only_ psra->cos_mask being non-
zero. This means that all cores on one socket are being serialized
here. Quite possibly all you need is for some of the checks done
inside the locked region to be replicated (but _not_ moved) to the
outer if(), to limit the number of times where the lock is to be
acquired.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-19  9:00                                   ` Jan Beulich
@ 2017-04-20  2:14                                     ` Yi Sun
  2017-04-20  9:43                                       ` Jan Beulich
  2017-04-21  6:18                                       ` Jan Beulich
  0 siblings, 2 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-20  2:14 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-19 03:00:29, Jan Beulich wrote:
> >>> On 19.04.17 at 10:22, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-18 05:46:43, Jan Beulich wrote:
> >> >>> On 18.04.17 at 12:55, <yi.y.sun@linux.intel.com> wrote:
> >> > I made a test patch based on v10 and attached it in mail. Could you please
> >> > help to check it? Thanks!
> >> 
> >> This looks reasonable at the first glance, albeit I continue to be
> >> unconvinced that this is the only (reasonable) way of solving the

Do you have any other suggestion on this? Thanks!

> >> problem. After all we don't have to go through similar hoops for
> >> any other of the register state associated with a vCPU. There
> > 
> > Sorry, I do not understand your meaning clearly.
> > 1. DYM the ASSOC register which is set in 'psr_ctxt_switch_to'? In this patch,
> >    we check 'dom_ids' array to know if the domain's cos id has not been set but
> >    its 'psr_cos_ids[socket]' already has a non zero value. This case means the
> >    socket offline has happened so that we have to restore the domain's
> >    'psr_cos_ids[socket]' to default value 0 which points to default COS MSR.
> >    I think we have discussed this in previous mails and achieved agreement on
> >    such logic. 
> > 2. DYM the COS MSRs? We have discussed this before. Per your comments, when
> >    socket is online, the registers values may be modified by FW and others.
> >    So, we have to restore them to default value which is being done in
> >    'cat_init_feature'.
> > 
> > So, what is your exactly meaning here? Thanks!
> 
> Neither of the two. I'm comparing with COS/PSR-_unrelated_
> handling of register state. After all there are other MSRs which
> we need to put into the right state after a core comes online.
> 
For PSR feature, the 'cos_reg_val[]' of the feature is destroied when socket
is offline. The values in it are all 0 when socket is online again. This causes
error when user shows the CBMs. So, we have two options when socket is online:
1. Read COS MSRs values and save them into 'cos_reg_val[]'.
2. Restore COS MSRs values and 'cos_reg_val[]' values to default CBM.

Per our discussion on v9 patch 05, we decided to adopt option 2. Below are
what we discussed. FYR.

[v9 patch 05]
>> >> > +    /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
>> >> > +    feat->cos_reg_val[0] = cat_default_val(cat.cbm_len);
>> >> > +    /*
>> >> > +     * To handle cpu offline and then online case, we need read MSRs back to
>> >> > +     * save values into cos_reg_val array.
>> >> > +     */
>> >> > +    for ( i = 1; i <= cat.cos_max; i++ )
>> >> > +    {
>> >> > +        rdmsrl(MSR_IA32_PSR_L3_MASK(i), val);
>> >> > +        feat->cos_reg_val[i] = (uint32_t)val;
>> >> > +    }
>> >> 
[Jan]
>> >> You mention this in the changes done, but I don't understand why 
>> >> you do this. What meaning to these values have to you? If you want 
>> >> hardware and cached values to match up, the much more conventional 
>> >> way of enforcing this would be to write the values you actually 
>> >> want (normally all zero).
>> >>
[Sun Yi] 
>> > When all cpus on a socket are offline, the free_feature() is called 
>> > to free features resources so that the values saved in 
>> > cos_reg_val[] are lost. When the socket is online again, features 
>> > are allocated again so that cos_reg_val[] members are all 
>> > initialized to 0. Only is cos_reg_val[0] initialized to default value in this function in old codes.
>> > 
>> > But domain is still alive so that its cos id on the socket is kept. 
>> > The corresponding MSR value is kept too per test. To make 
>> > cos_reg_val[] values be same as HW to not to mislead user, we 
>> > should read back the valid values on HW into cos_reg_val[].
>> 
[Jan]
>> Okay, I understand the background, but I don't view this solution as 
>> viable: Once the last core on a socket goes offline, all references 
>> to it should be cleaned up. After all what will be brought back 
>> online may be a different physical CPU altogether; you can't assume 
>> MSR values to have survived even if it is the same CPU which comes 
>> back online, as it may have undergone a reset cycle, or BIOS/SMM may 
>> have played with the MSRs.
>> That's even a possibility for a single core coming back online, so 
>> you have to reload MSRs explicitly anyway if implicit reloading (i.e. 
>> once vCPU-s get scheduled onto it) doesn't suffice.
>> 
[Sun Yi]
> So, you think the MSRs values may not be valid after such process and 
> reloading (write MSRs to default value) is needed. If so, I would like 
> to do more operations in 'free_feature()':
> 1. Iterate all domains working on the offline socket to change
>    'd->arch.psr_cos_ids[socket]' to COS 0, i.e restore it back to init
>    status.
> 2. Restore 'socket_info[socket].cos_ref[]' to all 0.
> 
> These can make the socket's info be totally restored back to init status.
[Jan]
Yes, that's what I think is needed.

> >> are a number of cosmetic issues, but commenting on an attached
> >> (rather than inlined) patch is a little difficult.
> >> 
> > Sorry for that, I will directly send patch out next time.
> > 
> >> One remark regarding the locking though: Acquiring a lock in the
> >> context switch path should be made as low risk of long stalls as
> >> possible. Therefore you will want to consider using r/w locks
> >> instead of spin locks here, which would allow parallelism on all
> >> cores of a socket as long as COS IDs aren't being updated.
> >> 
> > In 'psr_ctxt_switch_to', I use the lock only to protect 'write' actions.
> > So, I do not understand why read-write lock is better? Anything I don't
> > consider? Please help to point out. Thanks!
> 
> Hmm, looking again I can see that r/w locks indeed may not help
> here. However, what you say still doesn't look correct to me: You
> acquire the lock depending on _only_ psra->cos_mask being non-
> zero. This means that all cores on one socket are being serialized
> here. Quite possibly all you need is for some of the checks done
> inside the locked region to be replicated (but _not_ moved) to the
> outer if(), to limit the number of times where the lock is to be
> acquired.
> 
I think your suggestions is to check old_cos outer the lock region.
If it is not 0 which means the domain's cos id does not need restore
to 0, we can directly set it into ASSOC register. That can avoid
unnecessary lock. I will send out the test patch again to ask your
help to provide review comments (you said there are also 'a number
of cosmetics issues'). Thanks!

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* [PATCH] dom_ids array implementation.
  2017-04-01 13:53 ` [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework Yi Sun
  2017-04-11 15:01   ` Jan Beulich
@ 2017-04-20  5:38   ` Yi Sun
  2017-04-26 10:04     ` Jan Beulich
  1 sibling, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-20  5:38 UTC (permalink / raw)
  To: xen-devel; +Cc: Yi Sun, jbeulich

Hi, Jan,

Please help to review this patch. Thank you!

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
 xen/arch/x86/psr.c | 135 ++++++++++++++++++++++-------------------------------
 1 file changed, 55 insertions(+), 80 deletions(-)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index a85ea99..7bc212f 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -125,6 +125,8 @@ struct feat_node {
     uint32_t cos_reg_val[MAX_COS_REG_CNT];
 };
 
+#define PSR_DOM_IDS_NUM ((DOMID_IDLE + 1) / sizeof(uint32_t))
+
 /*
  * PSR features are managed per socket. Below structure defines the members
  * used to manage these features.
@@ -134,9 +136,11 @@ struct feat_node {
  *             COS ID. Every entry of cos_ref corresponds to one COS ID.
  */
 struct psr_socket_info {
-    struct feat_node *features[PSR_SOCKET_MAX_FEAT];
     spinlock_t ref_lock;
+    spinlock_t dom_ids_lock;
+    struct feat_node *features[PSR_SOCKET_MAX_FEAT];
     unsigned int cos_ref[MAX_COS_REG_CNT];
+    uint32_t dom_ids[PSR_DOM_IDS_NUM];
 };
 
 struct psr_assoc {
@@ -194,26 +198,11 @@ static void free_socket_resources(unsigned int socket)
 {
     unsigned int i;
     struct psr_socket_info *info = socket_info + socket;
-    struct domain *d;
+    unsigned long flag;
 
     if ( !info )
         return;
 
-    /* Restore domain cos id to 0 when socket is offline. */
-    for_each_domain ( d )
-    {
-        unsigned int cos = d->arch.psr_cos_ids[socket];
-        if ( cos == 0 )
-            continue;
-
-        spin_lock(&info->ref_lock);
-        ASSERT(!cos || info->cos_ref[cos]);
-        info->cos_ref[cos]--;
-        spin_unlock(&info->ref_lock);
-
-        d->arch.psr_cos_ids[socket] = 0;
-    }
-
     /*
      * Free resources of features. The global feature object, e.g. feat_l3_cat,
      * may not be freed here if it is not added into array. It is simply being
@@ -221,12 +210,17 @@ static void free_socket_resources(unsigned int socket)
      */
     for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
     {
-        if ( !info->features[i] )
-            continue;
-
         xfree(info->features[i]);
         info->features[i] = NULL;
     }
+
+    spin_lock(&info->ref_lock);
+    memset(info->cos_ref, 0, MAX_COS_REG_CNT * sizeof(unsigned int));
+    spin_unlock(&info->ref_lock);
+
+    spin_lock_irqsave(&info->dom_ids_lock, flag);
+    memset(info->dom_ids, 0, PSR_DOM_IDS_NUM * sizeof(uint32_t));
+    spin_unlock_irqrestore(&info->dom_ids_lock, flag);
 }
 
 static bool feat_init_done(const struct psr_socket_info *info)
@@ -682,9 +676,37 @@ void psr_ctxt_switch_to(struct domain *d)
         psr_assoc_rmid(&reg, d->arch.psr_rmid);
 
     if ( psra->cos_mask )
-        psr_assoc_cos(&reg, d->arch.psr_cos_ids ?
-                      d->arch.psr_cos_ids[cpu_to_socket(smp_processor_id())] :
-                      0, psra->cos_mask);
+    {
+        unsigned int socket = cpu_to_socket(smp_processor_id());
+        struct psr_socket_info *info = socket_info + socket;
+
+        if ( test_bit(d->domain_id, info->dom_ids) )
+            goto set_assoc;
+
+        spin_lock(&info->dom_ids_lock);
+
+        int old_bit = test_and_set_bit(d->domain_id, info->dom_ids);
+
+        /*
+         * If old_bit is 0, that means this is the first time the domain is
+         * switched to this socket or domain's COS ID has not been set since
+         * the socket is online. So, the domain's COS ID on this socket should
+         * be default value, 0. If not, that means this socket has been offline
+         * and the domain's COS ID has been set when the socket was online. So,
+         * this COS ID is invalid and we have to restore it to 0.
+         */
+        if ( d->arch.psr_cos_ids &&
+             old_bit == 0 &&
+             d->arch.psr_cos_ids[socket] != 0 )
+            d->arch.psr_cos_ids[socket] = 0;
+
+        spin_unlock(&info->dom_ids_lock);
+
+ set_assoc:
+        psr_assoc_cos(&reg,
+                      d->arch.psr_cos_ids ? d->arch.psr_cos_ids[socket] : 0,
+                      psra->cos_mask);
+    }
 
     if ( reg != psra->val )
     {
@@ -1146,40 +1168,6 @@ static int write_psr_msr(unsigned int socket, unsigned int cos,
     return 0;
 }
 
-static void restore_default_val(unsigned int socket, unsigned int cos,
-                                enum psr_feat_type feat_type)
-{
-    unsigned int i, j;
-    uint32_t default_val;
-    const struct psr_socket_info *info = get_socket_info(socket);
-
-    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
-    {
-        const struct feat_node *feat = info->features[i];
-        /*
-         * There are four judgements:
-         * 1. Input 'feat_type' is valid so we have to get feature according to
-         *    it. If current feature type (i) does not match 'feat_type', we
-         *    need skip it, so continue to check next feature.
-         * 2. Input 'feat_type' is 'PSR_SOCKET_MAX_FEAT' which means we should
-         *    handle all features in this case. So, go to next loop.
-         * 3. Do not need restore the COS value back to default if cos_num is 1,
-         *    e.g. L3 CAT. Because next value setting will overwrite it.
-         * 4. 'feat' we got is NULL, continue.
-         */
-        if ( ( feat_type != PSR_SOCKET_MAX_FEAT && feat_type != i ) ||
-             !feat || feat->props->cos_num == 1 )
-            continue;
-
-        for ( j = 0; j < feat->props->cos_num; j++ )
-        {
-            feat->props->get_val(feat, 0, feat->props->type[j], &default_val);
-            write_psr_msr(socket, cos, default_val,
-                          feat->props->type[j], i);
-        }
-    }
-}
-
 /* The whole set process is protected by domctl_lock. */
 int psr_set_val(struct domain *d, unsigned int socket,
                 uint32_t val, enum cbm_type type)
@@ -1191,6 +1179,7 @@ int psr_set_val(struct domain *d, unsigned int socket,
     struct psr_socket_info *info = get_socket_info(socket);
     unsigned int array_len;
     enum psr_feat_type feat_type;
+    unsigned long flag;
 
     if ( IS_ERR(info) )
         return PTR_ERR(info);
@@ -1286,22 +1275,6 @@ int psr_set_val(struct domain *d, unsigned int socket,
     ASSERT(!cos || ref[cos]);
     ASSERT(!old_cos || ref[old_cos]);
     ref[old_cos]--;
-
-    /*
-     * Step 6:
-     * For features,  e.g. CDP, which cos_num is more than 1, we have to
-     * restore the old_cos value back to default when ref[old_cos] is 0.
-     * Otherwise, user will see wrong values when this COS ID is reused. E.g.
-     * user wants to set DATA to 0x3ff for a new domain. He hopes to see the
-     * DATA is set to 0x3ff and CODE should be the default value, 0x7ff. But
-     * if the COS ID picked for this action is the one that has been used by
-     * other domain and the CODE has been set to 0x1ff. Then, user will see
-     * DATA: 0x3ff, CODE: 0x1ff. So, we have to restore COS values for features
-     * using multiple COSs.
-     */
-    if ( old_cos && !ref[old_cos] )
-        restore_default_val(socket, old_cos, feat_type);
-
     spin_unlock(&info->ref_lock);
 
     /*
@@ -1310,7 +1283,10 @@ int psr_set_val(struct domain *d, unsigned int socket,
      * which COS the domain is using on the socket. One domain can only use
      * one COS ID at same time on each socket.
      */
+    spin_lock_irqsave(&info->dom_ids_lock, flag);
     d->arch.psr_cos_ids[socket] = cos;
+    test_and_set_bit(d->domain_id, info->dom_ids);
+    spin_unlock_irqrestore(&info->dom_ids_lock, flag);
 
     xfree(val_array);
     return ret;
@@ -1336,6 +1312,7 @@ static void psr_free_cos(struct domain *d)
     for ( socket = 0; socket < nr_sockets; socket++ )
     {
         struct psr_socket_info *info;
+        unsigned long flag;
 
         /* cos 0 is default one which does not need be handled. */
         cos = d->arch.psr_cos_ids[socket];
@@ -1346,14 +1323,11 @@ static void psr_free_cos(struct domain *d)
         spin_lock(&info->ref_lock);
         ASSERT(!cos || info->cos_ref[cos]);
         info->cos_ref[cos]--;
-        /*
-         * The 'cos_ref[cos]' of 'd' is 0 now so we need restore corresponding
-         * COS registers to default value. Because this case happens when a
-         * domain is destroied, we need restore all features.
-         */
-        if ( !info->cos_ref[cos] )
-            restore_default_val(socket, cos, PSR_SOCKET_MAX_FEAT);
         spin_unlock(&info->ref_lock);
+
+        spin_lock_irqsave(&info->dom_ids_lock, flag);
+        clear_bit(d->domain_id, info->dom_ids);
+        spin_unlock_irqrestore(&info->dom_ids_lock, flag);
     }
 
     xfree(d->arch.psr_cos_ids);
@@ -1453,6 +1427,7 @@ static void psr_cpu_init(void)
         goto assoc_init;
 
     spin_lock_init(&info->ref_lock);
+    spin_lock_init(&info->dom_ids_lock);
 
     cpuid_count_leaf(PSR_CPUID_LEVEL_CAT, 0, &regs);
     if ( regs.b & PSR_RESOURCE_TYPE_L3 )
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-20  2:14                                     ` Yi Sun
@ 2017-04-20  9:43                                       ` Jan Beulich
  2017-04-20 13:02                                         ` Lars Kurth
  2017-04-21  6:18                                       ` Jan Beulich
  1 sibling, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-20  9:43 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 20.04.17 at 04:14, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-19 03:00:29, Jan Beulich wrote:
>> >>> On 19.04.17 at 10:22, <yi.y.sun@linux.intel.com> wrote:
>> > On 17-04-18 05:46:43, Jan Beulich wrote:
>> >> >>> On 18.04.17 at 12:55, <yi.y.sun@linux.intel.com> wrote:
>> >> > I made a test patch based on v10 and attached it in mail. Could you please
>> >> > help to check it? Thanks!
>> >> 
>> >> This looks reasonable at the first glance, albeit I continue to be
>> >> unconvinced that this is the only (reasonable) way of solving the
> 
> Do you have any other suggestion on this? Thanks!

I'm sorry, but no. I'm already spending _much_ more time on this
series than I should be, considering its (afaic) relatively low
priority. I really have to ask you to properly think through both
the data layout and code structure, including considering
alternatives especially in places where you can _anticipate_
controversy.

>> >> problem. After all we don't have to go through similar hoops for
>> >> any other of the register state associated with a vCPU. There
>> > 
>> > Sorry, I do not understand your meaning clearly.
>> > 1. DYM the ASSOC register which is set in 'psr_ctxt_switch_to'? In this patch,
>> >    we check 'dom_ids' array to know if the domain's cos id has not been set but
>> >    its 'psr_cos_ids[socket]' already has a non zero value. This case means the
>> >    socket offline has happened so that we have to restore the domain's
>> >    'psr_cos_ids[socket]' to default value 0 which points to default COS MSR.
>> >    I think we have discussed this in previous mails and achieved agreement on
>> >    such logic. 
>> > 2. DYM the COS MSRs? We have discussed this before. Per your comments, when
>> >    socket is online, the registers values may be modified by FW and others.
>> >    So, we have to restore them to default value which is being done in
>> >    'cat_init_feature'.
>> > 
>> > So, what is your exactly meaning here? Thanks!
>> 
>> Neither of the two. I'm comparing with COS/PSR-_unrelated_
>> handling of register state. After all there are other MSRs which
>> we need to put into the right state after a core comes online.
>> 
> For PSR feature, the 'cos_reg_val[]' of the feature is destroied when socket
> is offline. The values in it are all 0 when socket is online again. This causes
> error when user shows the CBMs. So, we have two options when socket is 
> online:
> 1. Read COS MSRs values and save them into 'cos_reg_val[]'.
> 2. Restore COS MSRs values and 'cos_reg_val[]' values to default CBM.

This re-states what you want to do; it does not answer my question.
Along the lines of what you say, for example FS and GS base MSRs
come back as zero too after a socket has been (re-)onlined. We
don't need to go through any hoops there, nevertheless.

> Per our discussion on v9 patch 05, we decided to adopt option 2. Below are
> what we discussed. FYR.

Well, we decided option 2 is better than option 1. I'm still
unconvinced there's not a yet better alternative.

>> >> are a number of cosmetic issues, but commenting on an attached
>> >> (rather than inlined) patch is a little difficult.
>> >> 
>> > Sorry for that, I will directly send patch out next time.
>> > 
>> >> One remark regarding the locking though: Acquiring a lock in the
>> >> context switch path should be made as low risk of long stalls as
>> >> possible. Therefore you will want to consider using r/w locks
>> >> instead of spin locks here, which would allow parallelism on all
>> >> cores of a socket as long as COS IDs aren't being updated.
>> >> 
>> > In 'psr_ctxt_switch_to', I use the lock only to protect 'write' actions.
>> > So, I do not understand why read-write lock is better? Anything I don't
>> > consider? Please help to point out. Thanks!
>> 
>> Hmm, looking again I can see that r/w locks indeed may not help
>> here. However, what you say still doesn't look correct to me: You
>> acquire the lock depending on _only_ psra->cos_mask being non-
>> zero. This means that all cores on one socket are being serialized
>> here. Quite possibly all you need is for some of the checks done
>> inside the locked region to be replicated (but _not_ moved) to the
>> outer if(), to limit the number of times where the lock is to be
>> acquired.
>> 
> I think your suggestions is to check old_cos outer the lock region.

My suggestion is to check as much state as possible, to prevent
having to acquire the lock whenever possible.

> If it is not 0 which means the domain's cos id does not need restore
> to 0, we can directly set it into ASSOC register. That can avoid
> unnecessary lock. I will send out the test patch again to ask your
> help to provide review comments (you said there are also 'a number
> of cosmetics issues').

And I would hope you would try to eliminate some (if not all) yourself
first. After all you can easily go over your patch yourself, checking
for e.g. style violations.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-20  9:43                                       ` Jan Beulich
@ 2017-04-20 13:02                                         ` Lars Kurth
  2017-04-20 13:21                                           ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Lars Kurth @ 2017-04-20 13:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, chao.p.peng, mengxu, xen-devel, Yi Sun, roger.pau

Apologies for stepping in here. But it feels to me that this thread is at risk of becoming unproductive.

> On 20 Apr 2017, at 10:43, Jan Beulich <jbeulich@suse.com> wrote:
> 
>>>> On 20.04.17 at 04:14, <yi.y.sun@linux.intel.com> wrote:
>> On 17-04-19 03:00:29, Jan Beulich wrote:
>>>>>> On 19.04.17 at 10:22, <yi.y.sun@linux.intel.com> wrote:
>>>> On 17-04-18 05:46:43, Jan Beulich wrote:
>>>>>>>> On 18.04.17 at 12:55, <yi.y.sun@linux.intel.com> wrote:
>>>>>> I made a test patch based on v10 and attached it in mail. Could you please
>>>>>> help to check it? Thanks!
>>>>> 
>>>>> This looks reasonable at the first glance, albeit I continue to be
>>>>> unconvinced that this is the only (reasonable) way of solving the
>> 
>> Do you have any other suggestion on this? Thanks!
> 
> I'm sorry, but no. I'm already spending _much_ more time on this
> series than I should be, considering its (afaic) relatively low
> priority. I really have to ask you to properly think through both
> the data layout and code structure, including considering
> alternatives especially in places where you can _anticipate_
> controversy.

Jan, can you confirm whether you are happy with the current proposal. You say "this looks reasonable at the first glance", but then go on to say that there may be other "reasonable ways" of solving the problem at hand.

This is not very concrete and hard to respond to from a contributors point of view: it would be good to establish whether a) the v10 proposal in this area is good enough, b) whether there are any concrete improvements to be made.

You say please think through whether "you can _anticipate_ controversy", but at the same time you can't currently identify/think of any issues. That seems to suggest to me that maybe the proposal is good enough. Or that something is missing, which has not been articulated. Taking into account language barriers, more clarity would probably be helpful.

>> Per our discussion on v9 patch 05, we decided to adopt option 2. Below are
>> what we discussed. FYR.
> 
> Well, we decided option 2 is better than option 1. I'm still
> unconvinced there's not a yet better alternative.

I suppose that is the same type of argument. Aka we looked at a number of options, seem to have agreed one is better than the other. But there is no clarity as to whether in this case option 2 is good enough and what concrete steps would be needed to get to a better alternative.

Of course I may have missed some of the context here in some of the older threads and thus I may have missed some of the context. 

>> If it is not 0 which means the domain's cos id does not need restore
>> to 0, we can directly set it into ASSOC register. That can avoid
>> unnecessary lock. I will send out the test patch again to ask your
>> help to provide review comments (you said there are also 'a number
>> of cosmetics issues').
> 
> And I would hope you would try to eliminate some (if not all) yourself
> first. After all you can easily go over your patch yourself, checking
> for e.g. style violations.

I think this is fair enough.

Best Regards
Lars
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-20 13:02                                         ` Lars Kurth
@ 2017-04-20 13:21                                           ` Jan Beulich
  2017-04-20 16:52                                             ` Lars Kurth
  2017-04-21  1:13                                             ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 114+ messages in thread
From: Jan Beulich @ 2017-04-20 13:21 UTC (permalink / raw)
  To: Lars Kurth
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, Yi Sun, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 20.04.17 at 15:02, <lars.kurth.xen@gmail.com> wrote:
>> On 20 Apr 2017, at 10:43, Jan Beulich <jbeulich@suse.com> wrote:
>> 
>>>>> On 20.04.17 at 04:14, <yi.y.sun@linux.intel.com> wrote:
>>> On 17-04-19 03:00:29, Jan Beulich wrote:
>>>>>>> On 19.04.17 at 10:22, <yi.y.sun@linux.intel.com> wrote:
>>>>> On 17-04-18 05:46:43, Jan Beulich wrote:
>>>>>>>>> On 18.04.17 at 12:55, <yi.y.sun@linux.intel.com> wrote:
>>>>>>> I made a test patch based on v10 and attached it in mail. Could you please
>>>>>>> help to check it? Thanks!
>>>>>> 
>>>>>> This looks reasonable at the first glance, albeit I continue to be
>>>>>> unconvinced that this is the only (reasonable) way of solving the
>>> 
>>> Do you have any other suggestion on this? Thanks!
>> 
>> I'm sorry, but no. I'm already spending _much_ more time on this
>> series than I should be, considering its (afaic) relatively low
>> priority. I really have to ask you to properly think through both
>> the data layout and code structure, including considering
>> alternatives especially in places where you can _anticipate_
>> controversy.
> 
> Jan, can you confirm whether you are happy with the current proposal. You 
> say "this looks reasonable at the first glance", but then go on to say that 
> there may be other "reasonable ways" of solving the problem at hand.
> 
> This is not very concrete and hard to respond to from a contributors point 
> of view: it would be good to establish whether a) the v10 proposal in this 
> area is good enough, b) whether there are any concrete improvements to be 
> made.

I understand it's not very concrete, but please understand that with
the over 100 patches wanting looking at right now it is simply
impossible for me to give precise suggestions everywhere. I really
have to be allowed to defer to the originator to come up with
possible better mechanisms (or defend what there is by addressing
questions raised), especially with - as said - the amount of time spent
here already having been way higher than is justifiable. Just go and
compare v10 with one of the initial versions: Almost all of the data
layout and code flow have fundamentally changed, mostly based on
feedback I gave. I'm sorry for saying that, but to me this is an
indication that things hadn't been thought through well in the design
phase, i.e. before even submitting a first RFC.

> You say please think through whether "you can _anticipate_ controversy", but 
> at the same time you can't currently identify/think of any issues. That seems 
> to suggest to me that maybe the proposal is good enough. Or that something is 
> missing, which has not been articulated. Taking into account language 
> barriers, more clarity would probably be helpful.

I've given criteria by which I have the gut feeling (but no more)
that this isn't the right approach. I'm absolutely fine to be
convinced that my gut feeling is wrong. That would require to
simply answer the question I raised multiple times, and which was
repeatedly "answered" by simply re-stating previously expressed
facts.

If this reaction of mine is not acceptable, all I can do is refrain
from further looking at this series. And Yi, I certainly apologize
for perhaps not doing these reviews wholeheartedly, since -
as also expressed before - I continue to not really view this as
very important functionality. Yet considering for how long some
of the versions hadn't been looked at by anyone at all, the
alternative would have been to simply let it sit further without
looking at it. I actually take this lack of interest by others as an
indication that I'm not the only one considering this nice to have,
but no more.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-20 13:21                                           ` Jan Beulich
@ 2017-04-20 16:52                                             ` Lars Kurth
  2017-04-21  6:11                                               ` Jan Beulich
  2017-04-21  1:13                                             ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 114+ messages in thread
From: Lars Kurth @ 2017-04-20 16:52 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wei Liu, He Chen, Andrew Cooper, Dario Faggioli,
	ian.jackson, Yi Sun, mengxu, xen-devel, chao.p.peng,
	Roger Pau Monne


> On 20 Apr 2017, at 14:21, Jan Beulich <jbeulich@suse.com> wrote:
> 
>>>> On 20.04.17 at 15:02, <lars.kurth.xen@gmail.com> wrote:
>>> On 20 Apr 2017, at 10:43, Jan Beulich <jbeulich@suse.com> wrote:
>>> 
>>>>>> On 20.04.17 at 04:14, <yi.y.sun@linux.intel.com> wrote:
>>>> On 17-04-19 03:00:29, Jan Beulich wrote:
>>>>>>>> On 19.04.17 at 10:22, <yi.y.sun@linux.intel.com> wrote:
>>>>>> On 17-04-18 05:46:43, Jan Beulich wrote:
>>>>>>>>>> On 18.04.17 at 12:55, <yi.y.sun@linux.intel.com> wrote:
>>>>>>>> I made a test patch based on v10 and attached it in mail. Could you please
>>>>>>>> help to check it? Thanks!
>>>>>>> 

[Item 1]

>>>>>>> This looks reasonable at the first glance, albeit I continue to be
>>>>>>> unconvinced that this is the only (reasonable) way of solving the
>>>> 
>>>> Do you have any other suggestion on this? Thanks!
>>> 
>>> I'm sorry, but no. I'm already spending _much_ more time on this
>>> series than I should be, considering its (afaic) relatively low
>>> priority. I really have to ask you to properly think through both
>>> the data layout and code structure, including considering
>>> alternatives especially in places where you can _anticipate_
>>> controversy.
>> 
>> Jan, can you confirm whether you are happy with the current proposal. You 
>> say "this looks reasonable at the first glance", but then go on to say that 
>> there may be other "reasonable ways" of solving the problem at hand.
>> 
>> This is not very concrete and hard to respond to from a contributors point 
>> of view: it would be good to establish whether a) the v10 proposal in this 
>> area is good enough, b) whether there are any concrete improvements to be 
>> made.

> I understand it's not very concrete, but please understand that with
> the over 100 patches wanting looking at right now it is simply
> impossible for me to give precise suggestions everywhere. I really
> have to be allowed to defer to the originator to come up with
> possible better mechanisms (or defend what there is by addressing
> questions raised),

Jan, I don't object to the principle of deferring issues to a contributor, for contributor to defend their viewpoint or to come up with a better mechanism. I just observed, that I could not make a lot of sense what you were looking for in this particular review. I am assuming that it would be even harder for Yi. 

> especially with - as said - the amount of time spent
> here already having been way higher than is justifiable.

And of course we have passed the 4.9 code freeze, so some of the pressure is off. At the same time I understand that because of the upcoming releases you need to focus on bug fixes, etc.

> Just go and
> compare v10 with one of the initial versions: Almost all of the data
> layout and code flow have fundamentally changed, mostly based on
> feedback I gave. I'm sorry for saying that, but to me this is an
> indication that things hadn't been thought through well in the design
> phase, i.e. before even submitting a first RFC.

That is good feedback which may contain some valuable lessons. Once we are through this (or maybe at the summit) it may be worthwhile to look at what has gone wrong and see how we can do better in future.

[Item 2]

>> You say please think through whether "you can _anticipate_ controversy", but 
>> at the same time you can't currently identify/think of any issues. That seems 
>> to suggest to me that maybe the proposal is good enough. Or that something is 
>> missing, which has not been articulated. Taking into account language 
>> barriers, more clarity would probably be helpful.
> 
> I've given criteria by which I have the gut feeling (but no more)
> that this isn't the right approach. I'm absolutely fine to be
> convinced that my gut feeling is wrong. That would require to
> simply answer the question I raised multiple times, and which was
> repeatedly "answered" by simply re-stating previously expressed
> facts.

I have not followed the full thread. But it seems that we have communications issue there. Normally this happens when expectations don't quite match and one party (in this case Yi) does not quite get what the other one is looking for. Maybe the best approach would be for Yi to get some of these things resolved during a short IRC conversation with you. I did see him and others resolve some previous issues more effectively on IRC. 

> If this reaction of mine is not acceptable, all I can do is refrain
> from further looking at this series. And Yi, I certainly apologize
> for perhaps not doing these reviews wholeheartedly, since -
> as also expressed before - I continue to not really view this as
> very important functionality.

As I said earlier, I stepped in, as I didn't really understand what was going on. I think this is a little clearer now. 

So to summarise: 

On item 1: it appears that you are looking for a some more justification why some of the changes were made, maybe with a rationale for some of the choices that were made. Given that this is quite a complex series which has diverged quite a lot from the design, the goal is to make it easier for either you (or someone else) to sanity check the proposal which on the face of things look OK. But you have some doubts and you can't easily check against the design as it is out-of-date.

On item 2: you think something may not be quite right, but you can't really decide until a couple of questions (not quite sure which, but I am sure Yi can locate them) are answered.

Let me know whether this is actually true. 

> Yet considering for how long some
> of the versions hadn't been looked at by anyone at all, the
> alternative would have been to simply let it sit further without
> looking at it. I actually take this lack of interest by others as an
> indication that I'm not the only one considering this nice to have,
> but no more.

That's a good point. There have been some comments from others, but it is also true that 66% of comments on this series did come from you (see https://xen.biterg.io:443/goto/72a753919aa8e3205d03da7b430d2696). On the other hand, this is also a series which is hard to hand over to someone else.

Regards
Lars



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-20 13:21                                           ` Jan Beulich
  2017-04-20 16:52                                             ` Lars Kurth
@ 2017-04-21  1:13                                             ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 114+ messages in thread
From: Konrad Rzeszutek Wilk @ 2017-04-21  1:13 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, Lars Kurth, andrew.cooper3,
	dario.faggioli, ian.jackson, chao.p.peng, mengxu, xen-devel,
	Yi Sun, roger.pau

> If this reaction of mine is not acceptable, all I can do is refrain
> from further looking at this series. And Yi, I certainly apologize
> for perhaps not doing these reviews wholeheartedly, since -
> as also expressed before - I continue to not really view this as
> very important functionality. Yet considering for how long some
> of the versions hadn't been looked at by anyone at all, the
> alternative would have been to simply let it sit further without
> looking at it. I actually take this lack of interest by others as an
> indication that I'm not the only one considering this nice to have,
> but no more.

I do have an interest in this series. And I can certainly give it
a review - but once you get your teeth in a patchset I feel it is
bit counterintuitive to review it - as you do a much much better
job that I could have.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-20 16:52                                             ` Lars Kurth
@ 2017-04-21  6:11                                               ` Jan Beulich
  0 siblings, 0 replies; 114+ messages in thread
From: Jan Beulich @ 2017-04-21  6:11 UTC (permalink / raw)
  To: Lars Kurth
  Cc: Kevin Tian, Wei Liu, He Chen, Andrew Cooper, Dario Faggioli,
	ian.jackson, Yi Sun, mengxu, xen-devel, chao.p.peng,
	Roger Pau Monne

>>> On 20.04.17 at 18:52, <lars.kurth.xen@gmail.com> wrote:
> So to summarise: 
> 
> On item 1: it appears that you are looking for a some more justification why 
> some of the changes were made, maybe with a rationale for some of the choices 
> that were made. Given that this is quite a complex series which has diverged 
> quite a lot from the design, the goal is to make it easier for either you (or 
> someone else) to sanity check the proposal which on the face of things look 
> OK. But you have some doubts and you can't easily check against the design as 
> it is out-of-date.
> 
> On item 2: you think something may not be quite right, but you can't really 
> decide until a couple of questions (not quite sure which, but I am sure Yi 
> can locate them) are answered.
> 
> Let me know whether this is actually true. 

Well, afaict both actually boil down to the same single question
regarding the special handling of CAT MSRs after onlining (at
runtime) a core on a socket all of whose cores had been offline,
namely considering that other CPU registers don't require any
such special treatment (in context switch code or elsewhere).

As to the design possibly being out of date - I have to admit I
didn't even check whether the accompanying documentation
has been kept up to date with the actual code changes. The
matter here really isn't with comparing with the design, but
rather whether the design choice (written down or not) was
an appropriate one.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-20  2:14                                     ` Yi Sun
  2017-04-20  9:43                                       ` Jan Beulich
@ 2017-04-21  6:18                                       ` Jan Beulich
  2017-04-24  6:40                                         ` Yi Sun
  1 sibling, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-21  6:18 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 20.04.17 at 04:14, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-19 03:00:29, Jan Beulich wrote:
>> >>> On 19.04.17 at 10:22, <yi.y.sun@linux.intel.com> wrote:
>> > On 17-04-18 05:46:43, Jan Beulich wrote:
>> >> >>> On 18.04.17 at 12:55, <yi.y.sun@linux.intel.com> wrote:
>> >> > I made a test patch based on v10 and attached it in mail. Could you please
>> >> > help to check it? Thanks!
>> >> 
>> >> This looks reasonable at the first glance, albeit I continue to be
>> >> unconvinced that this is the only (reasonable) way of solving the
> 
> Do you have any other suggestion on this? Thanks!
> 
>> >> problem. After all we don't have to go through similar hoops for
>> >> any other of the register state associated with a vCPU. There
>> > 
>> > Sorry, I do not understand your meaning clearly.
>> > 1. DYM the ASSOC register which is set in 'psr_ctxt_switch_to'? In this patch,
>> >    we check 'dom_ids' array to know if the domain's cos id has not been set but
>> >    its 'psr_cos_ids[socket]' already has a non zero value. This case means the
>> >    socket offline has happened so that we have to restore the domain's
>> >    'psr_cos_ids[socket]' to default value 0 which points to default COS MSR.
>> >    I think we have discussed this in previous mails and achieved agreement on
>> >    such logic. 
>> > 2. DYM the COS MSRs? We have discussed this before. Per your comments, when
>> >    socket is online, the registers values may be modified by FW and others.
>> >    So, we have to restore them to default value which is being done in
>> >    'cat_init_feature'.
>> > 
>> > So, what is your exactly meaning here? Thanks!
>> 
>> Neither of the two. I'm comparing with COS/PSR-_unrelated_
>> handling of register state. After all there are other MSRs which
>> we need to put into the right state after a core comes online.
>> 
> For PSR feature, the 'cos_reg_val[]' of the feature is destroied when socket
> is offline. The values in it are all 0 when socket is online again. This causes
> error when user shows the CBMs. So, we have two options when socket is online:
> 1. Read COS MSRs values and save them into 'cos_reg_val[]'.
> 2. Restore COS MSRs values and 'cos_reg_val[]' values to default CBM.
> 
> Per our discussion on v9 patch 05, we decided to adopt option 2.

Btw., having thought some more about this, putting code into the
context switch path when there is an alternative is probably the
wrong thing after all, i.e. if special treatment _is_ really needed,
doing it in less frequently executed code would likely be better.
But as before - much depends on clarifying why special treatment
is needed here, but not elsewhere (and to avoid questions, with
"elsewhere" I mean outside of PSR/CAT code - there's plenty of
other CPU register state to take as reference).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-21  6:18                                       ` Jan Beulich
@ 2017-04-24  6:40                                         ` Yi Sun
  2017-04-24  6:55                                           ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-24  6:40 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-21 00:18:27, Jan Beulich wrote:
> >>> On 20.04.17 at 04:14, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-19 03:00:29, Jan Beulich wrote:
> >> >>> On 19.04.17 at 10:22, <yi.y.sun@linux.intel.com> wrote:
> >> > On 17-04-18 05:46:43, Jan Beulich wrote:
> >> >> >>> On 18.04.17 at 12:55, <yi.y.sun@linux.intel.com> wrote:
> >> >> > I made a test patch based on v10 and attached it in mail. Could you please
> >> >> > help to check it? Thanks!
> >> >> 
> >> >> This looks reasonable at the first glance, albeit I continue to be
> >> >> unconvinced that this is the only (reasonable) way of solving the
> > 
> > Do you have any other suggestion on this? Thanks!
> > 
> >> >> problem. After all we don't have to go through similar hoops for
> >> >> any other of the register state associated with a vCPU. There
> >> > 
> >> > Sorry, I do not understand your meaning clearly.
> >> > 1. DYM the ASSOC register which is set in 'psr_ctxt_switch_to'? In this patch,
> >> >    we check 'dom_ids' array to know if the domain's cos id has not been set but
> >> >    its 'psr_cos_ids[socket]' already has a non zero value. This case means the
> >> >    socket offline has happened so that we have to restore the domain's
> >> >    'psr_cos_ids[socket]' to default value 0 which points to default COS MSR.
> >> >    I think we have discussed this in previous mails and achieved agreement on
> >> >    such logic. 
> >> > 2. DYM the COS MSRs? We have discussed this before. Per your comments, when
> >> >    socket is online, the registers values may be modified by FW and others.
> >> >    So, we have to restore them to default value which is being done in
> >> >    'cat_init_feature'.
> >> > 
> >> > So, what is your exactly meaning here? Thanks!
> >> 
> >> Neither of the two. I'm comparing with COS/PSR-_unrelated_
> >> handling of register state. After all there are other MSRs which
> >> we need to put into the right state after a core comes online.
> >> 
> > For PSR feature, the 'cos_reg_val[]' of the feature is destroied when socket
> > is offline. The values in it are all 0 when socket is online again. This causes
> > error when user shows the CBMs. So, we have two options when socket is online:
> > 1. Read COS MSRs values and save them into 'cos_reg_val[]'.
> > 2. Restore COS MSRs values and 'cos_reg_val[]' values to default CBM.
> > 
> > Per our discussion on v9 patch 05, we decided to adopt option 2.
> 
> Btw., having thought some more about this, putting code into the
> context switch path when there is an alternative is probably the
> wrong thing after all, i.e. if special treatment _is_ really needed,
> doing it in less frequently executed code would likely be better.
> But as before - much depends on clarifying why special treatment
> is needed here, but not elsewhere (and to avoid questions, with
> "elsewhere" I mean outside of PSR/CAT code - there's plenty of
> other CPU register state to take as reference).
> 
Hi, Jan,

As what we talked on IRC last Friday, I have got answers for your
two final questions below:
1. Why domain setting is designed to per-socket, any reason? 
Answer: There is a real case from Intel's customer. HSX (Haswell server)
and BDX (Broadwell server) processors are plugged into each socket of a
Grantley platform. HSX does not support CAT but BDX does.

You and Chao Peng discussed this before for CAT feature enabling patches.
The asymmetry supporting is agreed.
http://markmail.org/message/xcq5odezfngszvcb#query:+page:1+mid:smfz7fbatbnxs3ti+state:results
http://markmail.org/message/xcq5odezfngszvcb#query:+page:1+mid:wlovqpg7oj63ejte+state:results

2. Why cannot the previous setting to the domains be kept when socket is online?
Answer: If the asymmetry system is supported, we cannot assume the configuration
can be applied to new socket.

BRs,
Sun Yi

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-24  6:40                                         ` Yi Sun
@ 2017-04-24  6:55                                           ` Jan Beulich
  2017-04-25  7:15                                             ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-24  6:55 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 24.04.17 at 08:40, <yi.y.sun@linux.intel.com> wrote:
> As what we talked on IRC last Friday, I have got answers for your
> two final questions below:
> 1. Why domain setting is designed to per-socket, any reason? 
> Answer: There is a real case from Intel's customer. HSX (Haswell server)
> and BDX (Broadwell server) processors are plugged into each socket of a
> Grantley platform. HSX does not support CAT but BDX does.
> 
> You and Chao Peng discussed this before for CAT feature enabling patches.
> The asymmetry supporting is agreed.

I don't see any agreement in those threads. The first sub-thread is
merely mentioning this configuration, while the second is only about
nr_sockets calculation.

> http://markmail.org/message/xcq5odezfngszvcb#query:+page:1+mid:smfz7fbatbnxs 
> 3ti+state:results
> http://markmail.org/message/xcq5odezfngszvcb#query:+page:1+mid:wlovqpg7oj63e 
> jte+state:results
> 
> 2. Why cannot the previous setting to the domains be kept when socket is online?
> Answer: If the asymmetry system is supported, we cannot assume the configuration
> can be applied to new socket.

I'm afraid we'd have problems elsewhere if we tried to run Xen on
a mixed-model system. Unless you can prove PSR/CAT is the only
relevant (read: affecting Xen's behavior) hardware difference
between Haswell and Broadwell (for example, isn't the former
SMEP only, but the latter SMEP+SMAP?), I don't buy this as a
reason to have more complicated than necessary code.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-24  6:55                                           ` Jan Beulich
@ 2017-04-25  7:15                                             ` Yi Sun
  2017-04-25  8:24                                               ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-25  7:15 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-24 00:55:38, Jan Beulich wrote:
> >>> On 24.04.17 at 08:40, <yi.y.sun@linux.intel.com> wrote:
> > As what we talked on IRC last Friday, I have got answers for your
> > two final questions below:
> > 1. Why domain setting is designed to per-socket, any reason? 
> > Answer: There is a real case from Intel's customer. HSX (Haswell server)
> > and BDX (Broadwell server) processors are plugged into each socket of a
> > Grantley platform. HSX does not support CAT but BDX does.
> > 
> > You and Chao Peng discussed this before for CAT feature enabling patches.
> > The asymmetry supporting is agreed.
> 
> I don't see any agreement in those threads. The first sub-thread is
> merely mentioning this configuration, while the second is only about
> nr_sockets calculation.
> 
> > http://markmail.org/message/xcq5odezfngszvcb#query:+page:1+mid:smfz7fbatbnxs 
> > 3ti+state:results
> > http://markmail.org/message/xcq5odezfngszvcb#query:+page:1+mid:wlovqpg7oj63e 
> > jte+state:results
> > 
> > 2. Why cannot the previous setting to the domains be kept when socket is online?
> > Answer: If the asymmetry system is supported, we cannot assume the configuration
> > can be applied to new socket.
> 
> I'm afraid we'd have problems elsewhere if we tried to run Xen on
> a mixed-model system. Unless you can prove PSR/CAT is the only
> relevant (read: affecting Xen's behavior) hardware difference
> between Haswell and Broadwell (for example, isn't the former
> SMEP only, but the latter SMEP+SMAP?), I don't buy this as a
> reason to have more complicated than necessary code.
> 
Sorry, this may cause potential issue and is not a good example. But from SW
view, there is another case where the per-socket supporting is important in
real-time scenarios. You may have a real-time domain on one socket that requires
different masks (especially for code/data) to guarantee its performance vs.
other general-purpose domains run on a different socket. In that case it’s a
heterogeneous software usage model (rather than heterogeneous hardware). And,
we should not force same masks setting on different sockets in such case.
Because CLOS are a scarce software resource which should be wasted. One of
the most important reasons for managing CLOS independently across sockets is
to preserve the flexibility in using CLOS which is key as they are a scarce
resource. 

BRs,
Sun Yi

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-25  7:15                                             ` Yi Sun
@ 2017-04-25  8:24                                               ` Jan Beulich
  2017-04-25  8:40                                                 ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-25  8:24 UTC (permalink / raw)
  To: Yi Sun
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

>>> On 25.04.17 at 09:15, <yi.y.sun@linux.intel.com> wrote:
> Sorry, this may cause potential issue and is not a good example. But from SW
> view, there is another case where the per-socket supporting is important in
> real-time scenarios. You may have a real-time domain on one socket that requires
> different masks (especially for code/data) to guarantee its performance vs.
> other general-purpose domains run on a different socket. In that case it’s a
> heterogeneous software usage model (rather than heterogeneous hardware). And,
> we should not force same masks setting on different sockets in such case.

I don't follow: The COS IDs for the real-time and general purpose
domains would be different, wouldn't they? Thus there would be
different masks in use, as intended.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
  2017-04-25  8:24                                               ` Jan Beulich
@ 2017-04-25  8:40                                                 ` Yi Sun
  0 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-25  8:40 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, he.chen, andrew.cooper3, dario.faggioli,
	ian.jackson, mengxu, xen-devel, chao.p.peng, roger.pau

On 17-04-25 02:24:40, Jan Beulich wrote:
> >>> On 25.04.17 at 09:15, <yi.y.sun@linux.intel.com> wrote:
> > Sorry, this may cause potential issue and is not a good example. But from SW
> > view, there is another case where the per-socket supporting is important in
> > real-time scenarios. You may have a real-time domain on one socket that requires
> > different masks (especially for code/data) to guarantee its performance vs.
> > other general-purpose domains run on a different socket. In that case it’s a
> > heterogeneous software usage model (rather than heterogeneous hardware). And,
> > we should not force same masks setting on different sockets in such case.
> 
> I don't follow: The COS IDs for the real-time and general purpose
> domains would be different, wouldn't they? Thus there would be
> different masks in use, as intended.
> 
Yes, you are right. But as above case, the real-time domain only runs on one
socket. If per-socket supporting is enabled, we can allocate this COS ID to
other domains on other sockets. This can help to improve the scarce cache
utilization.

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH] dom_ids array implementation.
  2017-04-20  5:38   ` [PATCH] dom_ids array implementation Yi Sun
@ 2017-04-26 10:04     ` Jan Beulich
  2017-04-27  2:38       ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-26 10:04 UTC (permalink / raw)
  To: Yi Sun; +Cc: xen-devel

>>> On 20.04.17 at 07:38, <yi.y.sun@linux.intel.com> wrote:
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -125,6 +125,8 @@ struct feat_node {
>      uint32_t cos_reg_val[MAX_COS_REG_CNT];
>  };
>  
> +#define PSR_DOM_IDS_NUM ((DOMID_IDLE + 1) / sizeof(uint32_t))

Instead of this, please use ...

> @@ -134,9 +136,11 @@ struct feat_node {
>   *             COS ID. Every entry of cos_ref corresponds to one COS ID.
>   */
>  struct psr_socket_info {
> -    struct feat_node *features[PSR_SOCKET_MAX_FEAT];
>      spinlock_t ref_lock;
> +    spinlock_t dom_ids_lock;
> +    struct feat_node *features[PSR_SOCKET_MAX_FEAT];
>      unsigned int cos_ref[MAX_COS_REG_CNT];
> +    uint32_t dom_ids[PSR_DOM_IDS_NUM];

... DECLARE_BITMAP() here.

Also please try to space apart the two locks, to avoid false cacheline
conflicts (e.g. the new lock may well go immediately before the array
it pairs with).

> @@ -221,12 +210,17 @@ static void free_socket_resources(unsigned int socket)
>       */
>      for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
>      {
> -        if ( !info->features[i] )
> -            continue;
> -
>          xfree(info->features[i]);
>          info->features[i] = NULL;
>      }
> +
> +    spin_lock(&info->ref_lock);
> +    memset(info->cos_ref, 0, MAX_COS_REG_CNT * sizeof(unsigned int));
> +    spin_unlock(&info->ref_lock);
> +
> +    spin_lock_irqsave(&info->dom_ids_lock, flag);
> +    memset(info->dom_ids, 0, PSR_DOM_IDS_NUM * sizeof(uint32_t));

bitmap_clear()

I'm also not convinced you need to acquire either of the two locks
here - you're cleaning up the socket after all, so nothing can be
running on it anymore.

> @@ -682,9 +676,37 @@ void psr_ctxt_switch_to(struct domain *d)
>          psr_assoc_rmid(&reg, d->arch.psr_rmid);
>  
>      if ( psra->cos_mask )
> -        psr_assoc_cos(&reg, d->arch.psr_cos_ids ?
> -                      d->arch.psr_cos_ids[cpu_to_socket(smp_processor_id())] :
> -                      0, psra->cos_mask);
> +    {
> +        unsigned int socket = cpu_to_socket(smp_processor_id());
> +        struct psr_socket_info *info = socket_info + socket;
> +
> +        if ( test_bit(d->domain_id, info->dom_ids) )

likely()

> +            goto set_assoc;

I'm not convinced "goto" is reasonable to use here - this is not an
error path. If you're afraid of the extra indentation level, make a
helper function.

> +        spin_lock(&info->dom_ids_lock);
> +
> +        int old_bit = test_and_set_bit(d->domain_id, info->dom_ids);

Please don't mix declarations and statements. Also bool please,
but then again the variable isn't really needed anyway.

> +        /*
> +         * If old_bit is 0, that means this is the first time the domain is
> +         * switched to this socket or domain's COS ID has not been set since
> +         * the socket is online. So, the domain's COS ID on this socket should
> +         * be default value, 0. If not, that means this socket has been offline
> +         * and the domain's COS ID has been set when the socket was online. So,
> +         * this COS ID is invalid and we have to restore it to 0.
> +         */
> +        if ( d->arch.psr_cos_ids &&
> +             old_bit == 0 &&
> +             d->arch.psr_cos_ids[socket] != 0 )

Why don't you replicate the other two conditions in the if() trying to
avoid taking the lock? (Especially if above you indeed intend to use
a helper function, abstracting the full condition into another one
would be very desirable.)

> +            d->arch.psr_cos_ids[socket] = 0;
> +
> +        spin_unlock(&info->dom_ids_lock);

And then, as a whole: As indicated before, ideally you'd keep this
out of the context switch path altogether. What are the alternatives?

> @@ -1310,7 +1283,10 @@ int psr_set_val(struct domain *d, unsigned int socket,
>       * which COS the domain is using on the socket. One domain can only use
>       * one COS ID at same time on each socket.
>       */
> +    spin_lock_irqsave(&info->dom_ids_lock, flag);
>      d->arch.psr_cos_ids[socket] = cos;
> +    test_and_set_bit(d->domain_id, info->dom_ids);

Why test_and_ when you don't use the result?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH] dom_ids array implementation.
  2017-04-26 10:04     ` Jan Beulich
@ 2017-04-27  2:38       ` Yi Sun
  2017-04-27  6:48         ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-27  2:38 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On 17-04-26 04:04:15, Jan Beulich wrote:
> >>> On 20.04.17 at 07:38, <yi.y.sun@linux.intel.com> wrote:
> > --- a/xen/arch/x86/psr.c
> > +++ b/xen/arch/x86/psr.c
> > @@ -125,6 +125,8 @@ struct feat_node {
> >      uint32_t cos_reg_val[MAX_COS_REG_CNT];
> >  };
> >  
> > +#define PSR_DOM_IDS_NUM ((DOMID_IDLE + 1) / sizeof(uint32_t))
> 
> Instead of this, please use ...
> 
> > @@ -134,9 +136,11 @@ struct feat_node {
> >   *             COS ID. Every entry of cos_ref corresponds to one COS ID.
> >   */
> >  struct psr_socket_info {
> > -    struct feat_node *features[PSR_SOCKET_MAX_FEAT];
> >      spinlock_t ref_lock;
> > +    spinlock_t dom_ids_lock;
> > +    struct feat_node *features[PSR_SOCKET_MAX_FEAT];
> >      unsigned int cos_ref[MAX_COS_REG_CNT];
> > +    uint32_t dom_ids[PSR_DOM_IDS_NUM];
> 
> ... DECLARE_BITMAP() here.
> 
> Also please try to space apart the two locks, to avoid false cacheline
> conflicts (e.g. the new lock may well go immediately before the array
> it pairs with).
> 
Got it, thanks a lot for the suggestions!

> > @@ -221,12 +210,17 @@ static void free_socket_resources(unsigned int socket)
> >       */
> >      for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> >      {
> > -        if ( !info->features[i] )
> > -            continue;
> > -
> >          xfree(info->features[i]);
> >          info->features[i] = NULL;
> >      }
> > +
> > +    spin_lock(&info->ref_lock);
> > +    memset(info->cos_ref, 0, MAX_COS_REG_CNT * sizeof(unsigned int));
> > +    spin_unlock(&info->ref_lock);
> > +
> > +    spin_lock_irqsave(&info->dom_ids_lock, flag);
> > +    memset(info->dom_ids, 0, PSR_DOM_IDS_NUM * sizeof(uint32_t));
> 
> bitmap_clear()
> 
> I'm also not convinced you need to acquire either of the two locks
> here - you're cleaning up the socket after all, so nothing can be
> running on it anymore.
> 
Can domain destroy happens at the same time when a socket is offline?

> > @@ -682,9 +676,37 @@ void psr_ctxt_switch_to(struct domain *d)
> >          psr_assoc_rmid(&reg, d->arch.psr_rmid);
> >  
> >      if ( psra->cos_mask )
> > -        psr_assoc_cos(&reg, d->arch.psr_cos_ids ?
> > -                      d->arch.psr_cos_ids[cpu_to_socket(smp_processor_id())] :
> > -                      0, psra->cos_mask);
> > +    {
> > +        unsigned int socket = cpu_to_socket(smp_processor_id());
> > +        struct psr_socket_info *info = socket_info + socket;
> > +
> > +        if ( test_bit(d->domain_id, info->dom_ids) )
> 
> likely()
> 
Ok, thanks!

> > +            goto set_assoc;
> 
> I'm not convinced "goto" is reasonable to use here - this is not an
> error path. If you're afraid of the extra indentation level, make a
> helper function.
> 
Then, it seems a helper function is needed.

> > +        spin_lock(&info->dom_ids_lock);
> > +
> > +        int old_bit = test_and_set_bit(d->domain_id, info->dom_ids);
> 
> Please don't mix declarations and statements. Also bool please,
> but then again the variable isn't really needed anyway.
> 
Got it. Thanks!

> > +        /*
> > +         * If old_bit is 0, that means this is the first time the domain is
> > +         * switched to this socket or domain's COS ID has not been set since
> > +         * the socket is online. So, the domain's COS ID on this socket should
> > +         * be default value, 0. If not, that means this socket has been offline
> > +         * and the domain's COS ID has been set when the socket was online. So,
> > +         * this COS ID is invalid and we have to restore it to 0.
> > +         */
> > +        if ( d->arch.psr_cos_ids &&
> > +             old_bit == 0 &&
> > +             d->arch.psr_cos_ids[socket] != 0 )
> 
> Why don't you replicate the other two conditions in the if() trying to
> avoid taking the lock? (Especially if above you indeed intend to use
> a helper function, abstracting the full condition into another one
> would be very desirable.)
> 
Ok, will move the two conditions to above 'if()', like below.

if ( likely(test_bit(d->domain_id, info->dom_ids)) ||
     !d->arch.psr_cos_ids ||
     !d->arch.psr_cos_ids[socket] )

Accordingly, the later codes should be:

spin_lock(&info->dom_ids_lock);
set_bit(d->domain_id, info->dom_ids);
d->arch.psr_cos_ids[socket] = 0;
spin_unlock(&info->dom_ids_lock);

> > +            d->arch.psr_cos_ids[socket] = 0;
> > +
> > +        spin_unlock(&info->dom_ids_lock);
> 
> And then, as a whole: As indicated before, ideally you'd keep this
> out of the context switch path altogether. What are the alternatives?
> 
To restore domains' "psr_cos_ids[socket]" to default when socket offline
happens, we have three time windows:
1. When socket is offline, in "free_socket_resources()";
2. When socket is online, in "psr_cpu_init()";
3. When context switch happens, in "psr_ctxt_switch_to()".

Option 1 and 2 have same effect and option 1 is more natural than 2. So, we can
do this restore action at "1" or "3".

I have two alternatives below. Please help to check which you think is better:
1. The first version of the patch iterates valid domain list to restore them one
by one. Per your comments, it may take much time. That is the reason I submitted
this patch to spread out the restore action of all domains. If you think
"psr_cos_ids[socket]" restore action happens in context switch path is not good,
can we use a tasklet in "free_socket_resources()" to iterate the domain list and
restore their "psr_cos_ids"?

2. Or, can we use a tasklet in "psr_ctxt_switch_to()" to do above work? The side
effect is that the domain's COS ID used in this switch is not right. The valid
COS ID may be set in next context switch.

> > @@ -1310,7 +1283,10 @@ int psr_set_val(struct domain *d, unsigned int socket,
> >       * which COS the domain is using on the socket. One domain can only use
> >       * one COS ID at same time on each socket.
> >       */
> > +    spin_lock_irqsave(&info->dom_ids_lock, flag);
> >      d->arch.psr_cos_ids[socket] = cos;
> > +    test_and_set_bit(d->domain_id, info->dom_ids);
> 
> Why test_and_ when you don't use the result?
> 
Sorry, set_bit() should be enough here.

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH] dom_ids array implementation.
  2017-04-27  2:38       ` Yi Sun
@ 2017-04-27  6:48         ` Jan Beulich
  2017-04-27  9:30           ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-27  6:48 UTC (permalink / raw)
  To: Yi Sun; +Cc: xen-devel

>>> On 27.04.17 at 04:38, <yi.y.sun@linux.intel.com> wrote:
> On 17-04-26 04:04:15, Jan Beulich wrote:
>> >>> On 20.04.17 at 07:38, <yi.y.sun@linux.intel.com> wrote:
>> > @@ -221,12 +210,17 @@ static void free_socket_resources(unsigned int socket)
>> >       */
>> >      for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
>> >      {
>> > -        if ( !info->features[i] )
>> > -            continue;
>> > -
>> >          xfree(info->features[i]);
>> >          info->features[i] = NULL;
>> >      }
>> > +
>> > +    spin_lock(&info->ref_lock);
>> > +    memset(info->cos_ref, 0, MAX_COS_REG_CNT * sizeof(unsigned int));
>> > +    spin_unlock(&info->ref_lock);
>> > +
>> > +    spin_lock_irqsave(&info->dom_ids_lock, flag);
>> > +    memset(info->dom_ids, 0, PSR_DOM_IDS_NUM * sizeof(uint32_t));
>> 
>> bitmap_clear()
>> 
>> I'm also not convinced you need to acquire either of the two locks
>> here - you're cleaning up the socket after all, so nothing can be
>> running on it anymore.
>> 
> Can domain destroy happens at the same time when a socket is offline?

Well, yes and no - it depends on what path exactly you sit here.
Large parts of CPU onlining/offlining happen in stop-machine
context, which would exclude domain destruction going on in
parallel.

>> > +        /*
>> > +         * If old_bit is 0, that means this is the first time the domain is
>> > +         * switched to this socket or domain's COS ID has not been set since
>> > +         * the socket is online. So, the domain's COS ID on this socket should
>> > +         * be default value, 0. If not, that means this socket has been offline
>> > +         * and the domain's COS ID has been set when the socket was online. So,
>> > +         * this COS ID is invalid and we have to restore it to 0.
>> > +         */
>> > +        if ( d->arch.psr_cos_ids &&
>> > +             old_bit == 0 &&
>> > +             d->arch.psr_cos_ids[socket] != 0 )
>> 
>> Why don't you replicate the other two conditions in the if() trying to
>> avoid taking the lock? (Especially if above you indeed intend to use
>> a helper function, abstracting the full condition into another one
>> would be very desirable.)
>> 
> Ok, will move the two conditions to above 'if()', like below.
> 
> if ( likely(test_bit(d->domain_id, info->dom_ids)) ||
>      !d->arch.psr_cos_ids ||
>      !d->arch.psr_cos_ids[socket] )
> 
> Accordingly, the later codes should be:
> 
> spin_lock(&info->dom_ids_lock);
> set_bit(d->domain_id, info->dom_ids);
> d->arch.psr_cos_ids[socket] = 0;
> spin_unlock(&info->dom_ids_lock);

Then you didn't fully understand: The test_and_ portion _cannot_
be moved out of the locked region, but a simple test_bit() can be
replicated prior to taking the lock.

>> > +            d->arch.psr_cos_ids[socket] = 0;
>> > +
>> > +        spin_unlock(&info->dom_ids_lock);
>> 
>> And then, as a whole: As indicated before, ideally you'd keep this
>> out of the context switch path altogether. What are the alternatives?
>> 
> To restore domains' "psr_cos_ids[socket]" to default when socket offline
> happens, we have three time windows:
> 1. When socket is offline, in "free_socket_resources()";
> 2. When socket is online, in "psr_cpu_init()";
> 3. When context switch happens, in "psr_ctxt_switch_to()".
> 
> Option 1 and 2 have same effect and option 1 is more natural than 2. So, we can
> do this restore action at "1" or "3".
> 
> I have two alternatives below. Please help to check which you think is better:
> 1. The first version of the patch iterates valid domain list to restore them one
> by one. Per your comments, it may take much time. That is the reason I submitted
> this patch to spread out the restore action of all domains. If you think
> "psr_cos_ids[socket]" restore action happens in context switch path is not good,
> can we use a tasklet in "free_socket_resources()" to iterate the domain list and
> restore their "psr_cos_ids"?

If that tasklet (a) doesn't again take overly long and (b) is
guaranteed to finish before the same socket may come back online
again, then yes. Otherwise both the iterate-over-all-domains and
the in-context-switch approaches have downsides, but the latter
would then seem preferable (because it only affects performance
without risking the system's health). The question is whether some
3rd method can't be found.

> 2. Or, can we use a tasklet in "psr_ctxt_switch_to()" to do above work? The side
> effect is that the domain's COS ID used in this switch is not right. The valid
> COS ID may be set in next context switch.

I think this would complicate things while at the same time making
them worse.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH] dom_ids array implementation.
  2017-04-27  6:48         ` Jan Beulich
@ 2017-04-27  9:30           ` Yi Sun
  2017-04-27  9:39             ` Jan Beulich
  0 siblings, 1 reply; 114+ messages in thread
From: Yi Sun @ 2017-04-27  9:30 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On 17-04-27 00:48:43, Jan Beulich wrote:
> >>> On 27.04.17 at 04:38, <yi.y.sun@linux.intel.com> wrote:
> > On 17-04-26 04:04:15, Jan Beulich wrote:
> >> >>> On 20.04.17 at 07:38, <yi.y.sun@linux.intel.com> wrote:
> >> > @@ -221,12 +210,17 @@ static void free_socket_resources(unsigned int socket)
> >> >       */
> >> >      for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> >> >      {
> >> > -        if ( !info->features[i] )
> >> > -            continue;
> >> > -
> >> >          xfree(info->features[i]);
> >> >          info->features[i] = NULL;
> >> >      }
> >> > +
> >> > +    spin_lock(&info->ref_lock);
> >> > +    memset(info->cos_ref, 0, MAX_COS_REG_CNT * sizeof(unsigned int));
> >> > +    spin_unlock(&info->ref_lock);
> >> > +
> >> > +    spin_lock_irqsave(&info->dom_ids_lock, flag);
> >> > +    memset(info->dom_ids, 0, PSR_DOM_IDS_NUM * sizeof(uint32_t));
> >> 
> >> bitmap_clear()
> >> 
> >> I'm also not convinced you need to acquire either of the two locks
> >> here - you're cleaning up the socket after all, so nothing can be
> >> running on it anymore.
> >> 
> > Can domain destroy happens at the same time when a socket is offline?
> 
> Well, yes and no - it depends on what path exactly you sit here.
> Large parts of CPU onlining/offlining happen in stop-machine
> context, which would exclude domain destruction going on in
> parallel.
> 
The 'free_socket_resources' may be called when 'CPU_UP_CANCELED' or 'CPUU_DEAD'
happens. For 'CPUU_DEAD', stop-machine is executed. But for 'CPU_UP_CANCELED',
I cannot see this. 'CPU_UP_CANCELED' happens when cpu up fails and
'free_socket_resources' is called only when the cpu is the last one on socket.
So for 'CPU_UP_CANCELED' case, 'free_socket_resources' should not be called.
So, I think you are right and we can remove the spin_lock protections here.

> >> > +        /*
> >> > +         * If old_bit is 0, that means this is the first time the domain is
> >> > +         * switched to this socket or domain's COS ID has not been set since
> >> > +         * the socket is online. So, the domain's COS ID on this socket should
> >> > +         * be default value, 0. If not, that means this socket has been offline
> >> > +         * and the domain's COS ID has been set when the socket was online. So,
> >> > +         * this COS ID is invalid and we have to restore it to 0.
> >> > +         */
> >> > +        if ( d->arch.psr_cos_ids &&
> >> > +             old_bit == 0 &&
> >> > +             d->arch.psr_cos_ids[socket] != 0 )
> >> 
> >> Why don't you replicate the other two conditions in the if() trying to
> >> avoid taking the lock? (Especially if above you indeed intend to use
> >> a helper function, abstracting the full condition into another one
> >> would be very desirable.)
> >> 
> > Ok, will move the two conditions to above 'if()', like below.
> > 
> > if ( likely(test_bit(d->domain_id, info->dom_ids)) ||
> >      !d->arch.psr_cos_ids ||
> >      !d->arch.psr_cos_ids[socket] )
> > 
> > Accordingly, the later codes should be:
> > 
> > spin_lock(&info->dom_ids_lock);
> > set_bit(d->domain_id, info->dom_ids);
> > d->arch.psr_cos_ids[socket] = 0;
> > spin_unlock(&info->dom_ids_lock);
> 
> Then you didn't fully understand: The test_and_ portion _cannot_
> be moved out of the locked region, but a simple test_bit() can be
> replicated prior to taking the lock.
> 
Oh, sorry, I should use test_and_ here to check the bit again which may be
changed before the lock. Thanks for pointing it out!

> >> > +            d->arch.psr_cos_ids[socket] = 0;
> >> > +
> >> > +        spin_unlock(&info->dom_ids_lock);
> >> 
> >> And then, as a whole: As indicated before, ideally you'd keep this
> >> out of the context switch path altogether. What are the alternatives?
> >> 
> > To restore domains' "psr_cos_ids[socket]" to default when socket offline
> > happens, we have three time windows:
> > 1. When socket is offline, in "free_socket_resources()";
> > 2. When socket is online, in "psr_cpu_init()";
> > 3. When context switch happens, in "psr_ctxt_switch_to()".
> > 
> > Option 1 and 2 have same effect and option 1 is more natural than 2. So, we can
> > do this restore action at "1" or "3".
> > 
> > I have two alternatives below. Please help to check which you think is better:
> > 1. The first version of the patch iterates valid domain list to restore them one
> > by one. Per your comments, it may take much time. That is the reason I submitted
> > this patch to spread out the restore action of all domains. If you think
> > "psr_cos_ids[socket]" restore action happens in context switch path is not good,
> > can we use a tasklet in "free_socket_resources()" to iterate the domain list and
> > restore their "psr_cos_ids"?
> 
> If that tasklet (a) doesn't again take overly long and (b) is
> guaranteed to finish before the same socket may come back online
> again, then yes. Otherwise both the iterate-over-all-domains and
> the in-context-switch approaches have downsides, but the latter
> would then seem preferable (because it only affects performance
> without risking the system's health). The question is whether some
> 3rd method can't be found.
> 
I have another solution now. We may move the psr_cos_ids[socket] restore action
into 'psr_get_val' and only set the bit of 'dom_ids[]' in 'psr_set_val'.
1. When socket is offline, the dom_ids[] is cleared.
2. When socket is online, we have four places to use psr_cos_ids[socket]:
   a. psr_ctxt_, we can use test_bit() atomically check if the bit is set. If
      not, that means the d->arch.psr_cos_ids[socket] is invalid at this time.
      Then, we use 0 to set ASSOC register. But we don't restore psr_cos_ids
      here and do not set dom_ids[]. So, we do not need the spin_lock.
   b. psr_get_val, we use test_bit() to check if the bit is 0 and the
      d->arch.psr_cos_ids[socket] is not 0. If yes, that means this domain's
      cos id has not been restored yet. So we restore it to 0.
   c. psr_set_val, we set the bit in dom_ids[] and set
      d->arch.psr_cos_ids[socket]. As, psr_set_val cannot happen when
      psr_get_val is called, so no protection is needed.
   d. psr_free_cos, clear the bit and free d->arch.psr_cos_ids. This place
      cannot happen at the same time that the above three functions called.
      So, no protection needed.

Per above analysis, we do not need lock protection. So, the CPU serialization
issue can be solved. How do you think?

> > 2. Or, can we use a tasklet in "psr_ctxt_switch_to()" to do above work? The side
> > effect is that the domain's COS ID used in this switch is not right. The valid
> > COS ID may be set in next context switch.
> 
> I think this would complicate things while at the same time making
> them worse.
> 
> Jan
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH] dom_ids array implementation.
  2017-04-27  9:30           ` Yi Sun
@ 2017-04-27  9:39             ` Jan Beulich
  2017-04-27 12:03               ` Yi Sun
  0 siblings, 1 reply; 114+ messages in thread
From: Jan Beulich @ 2017-04-27  9:39 UTC (permalink / raw)
  To: Yi Sun; +Cc: xen-devel

>>> On 27.04.17 at 11:30, <yi.y.sun@linux.intel.com> wrote:
> I have another solution now. We may move the psr_cos_ids[socket] restore action
> into 'psr_get_val' and only set the bit of 'dom_ids[]' in 'psr_set_val'.
> 1. When socket is offline, the dom_ids[] is cleared.
> 2. When socket is online, we have four places to use psr_cos_ids[socket]:
>    a. psr_ctxt_, we can use test_bit() atomically check if the bit is set. If
>       not, that means the d->arch.psr_cos_ids[socket] is invalid at this time.
>       Then, we use 0 to set ASSOC register. But we don't restore psr_cos_ids
>       here and do not set dom_ids[]. So, we do not need the spin_lock.
>    b. psr_get_val, we use test_bit() to check if the bit is 0 and the
>       d->arch.psr_cos_ids[socket] is not 0. If yes, that means this domain's
>       cos id has not been restored yet. So we restore it to 0.
>    c. psr_set_val, we set the bit in dom_ids[] and set
>       d->arch.psr_cos_ids[socket]. As, psr_set_val cannot happen when
>       psr_get_val is called, so no protection is needed.
>    d. psr_free_cos, clear the bit and free d->arch.psr_cos_ids. This place
>       cannot happen at the same time that the above three functions called.
>       So, no protection needed.
> 
> Per above analysis, we do not need lock protection. So, the CPU serialization
> issue can be solved. How do you think?

Looks promising.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH] dom_ids array implementation.
  2017-04-27  9:39             ` Jan Beulich
@ 2017-04-27 12:03               ` Yi Sun
  0 siblings, 0 replies; 114+ messages in thread
From: Yi Sun @ 2017-04-27 12:03 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On 17-04-27 03:39:02, Jan Beulich wrote:
> >>> On 27.04.17 at 11:30, <yi.y.sun@linux.intel.com> wrote:
> > I have another solution now. We may move the psr_cos_ids[socket] restore action
> > into 'psr_get_val' and only set the bit of 'dom_ids[]' in 'psr_set_val'.
> > 1. When socket is offline, the dom_ids[] is cleared.
> > 2. When socket is online, we have four places to use psr_cos_ids[socket]:
> >    a. psr_ctxt_, we can use test_bit() atomically check if the bit is set. If
> >       not, that means the d->arch.psr_cos_ids[socket] is invalid at this time.
> >       Then, we use 0 to set ASSOC register. But we don't restore psr_cos_ids
> >       here and do not set dom_ids[]. So, we do not need the spin_lock.
> >    b. psr_get_val, we use test_bit() to check if the bit is 0 and the
> >       d->arch.psr_cos_ids[socket] is not 0. If yes, that means this domain's
> >       cos id has not been restored yet. So we restore it to 0.
> >    c. psr_set_val, we set the bit in dom_ids[] and set
> >       d->arch.psr_cos_ids[socket]. As, psr_set_val cannot happen when
> >       psr_get_val is called, so no protection is needed.
> >    d. psr_free_cos, clear the bit and free d->arch.psr_cos_ids. This place
> >       cannot happen at the same time that the above three functions called.
> >       So, no protection needed.
> > 
> > Per above analysis, we do not need lock protection. So, the CPU serialization
> > issue can be solved. How do you think?
> 
> Looks promising.
> 
Thank you! Then, it seems we have addressed most issues. I will prepare v11 and
send it out in soon.

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 114+ messages in thread

end of thread, other threads:[~2017-04-27 12:03 UTC | newest]

Thread overview: 114+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-01 13:53 [PATCH v10 00/25] Enable L2 Cache Allocation Technology & Refactor psr.c Yi Sun
2017-04-01 13:53 ` [PATCH v10 01/25] docs: create Cache Allocation Technology (CAT) and Code and Data Prioritization (CDP) feature document Yi Sun
2017-04-01 13:53 ` [PATCH v10 02/25] x86: refactor psr: remove L3 CAT/CDP codes Yi Sun
2017-04-01 13:53 ` [PATCH v10 03/25] x86: refactor psr: implement main data structures Yi Sun
2017-04-03 15:50   ` Jan Beulich
2017-04-05  3:12     ` Yi Sun
2017-04-05  8:20       ` Jan Beulich
2017-04-05  8:45         ` Yi Sun
2017-04-01 13:53 ` [PATCH v10 04/25] x86: move cpuid_count_leaf from cpuid.c to processor.h Yi Sun
2017-04-01 13:53 ` [PATCH v10 05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow Yi Sun
2017-04-05 15:10   ` Jan Beulich
2017-04-06  5:49     ` Yi Sun
2017-04-06  8:32       ` Jan Beulich
2017-04-06  9:22         ` Yi Sun
2017-04-06  9:34           ` Jan Beulich
2017-04-06 10:02             ` Yi Sun
2017-04-06 14:02               ` Jan Beulich
2017-04-07  5:17                 ` Yi Sun
2017-04-07  8:48                   ` Jan Beulich
2017-04-07  9:08                     ` Yi Sun
2017-04-07  9:46                       ` Jan Beulich
2017-04-10  3:27                         ` Yi Sun
2017-04-10 12:43                           ` Yi Sun
2017-04-01 13:53 ` [PATCH v10 06/25] x86: refactor psr: L3 CAT: implement Domain init/free and schedule flows Yi Sun
2017-04-05 15:23   ` Jan Beulich
2017-04-06  6:01     ` Yi Sun
2017-04-06  8:34       ` Jan Beulich
2017-04-01 13:53 ` [PATCH v10 07/25] x86: refactor psr: L3 CAT: implement get hw info flow Yi Sun
2017-04-05 15:37   ` Jan Beulich
2017-04-06  6:05     ` Yi Sun
2017-04-06  8:36       ` Jan Beulich
2017-04-06 11:16         ` Yi Sun
2017-04-06 14:04           ` Jan Beulich
2017-04-07  5:39             ` Yi Sun
2017-04-01 13:53 ` [PATCH v10 08/25] x86: refactor psr: L3 CAT: implement get value flow Yi Sun
2017-04-05 15:51   ` Jan Beulich
2017-04-06  6:10     ` Yi Sun
2017-04-06  8:40       ` Jan Beulich
2017-04-06 11:13         ` Yi Sun
2017-04-06 14:08           ` Jan Beulich
2017-04-07  5:40             ` Yi Sun
2017-04-01 13:53 ` [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework Yi Sun
2017-04-11 15:01   ` Jan Beulich
2017-04-12  5:53     ` Yi Sun
2017-04-12  9:09       ` Jan Beulich
2017-04-12 12:23         ` Yi Sun
2017-04-12 12:42           ` Jan Beulich
2017-04-13  8:11             ` Yi Sun
2017-04-13  9:41               ` Jan Beulich
2017-04-13 10:49                 ` Yi Sun
2017-04-13 10:58                   ` Jan Beulich
2017-04-13 11:11                     ` Yi Sun
2017-04-13 11:26                       ` Yi Sun
2017-04-13 11:31                       ` Jan Beulich
2017-04-13 11:44                         ` Yi Sun
2017-04-13 11:50                           ` Jan Beulich
2017-04-18 10:55                             ` Yi Sun
2017-04-18 11:46                               ` Jan Beulich
2017-04-19  8:22                                 ` Yi Sun
2017-04-19  9:00                                   ` Jan Beulich
2017-04-20  2:14                                     ` Yi Sun
2017-04-20  9:43                                       ` Jan Beulich
2017-04-20 13:02                                         ` Lars Kurth
2017-04-20 13:21                                           ` Jan Beulich
2017-04-20 16:52                                             ` Lars Kurth
2017-04-21  6:11                                               ` Jan Beulich
2017-04-21  1:13                                             ` Konrad Rzeszutek Wilk
2017-04-21  6:18                                       ` Jan Beulich
2017-04-24  6:40                                         ` Yi Sun
2017-04-24  6:55                                           ` Jan Beulich
2017-04-25  7:15                                             ` Yi Sun
2017-04-25  8:24                                               ` Jan Beulich
2017-04-25  8:40                                                 ` Yi Sun
2017-04-20  5:38   ` [PATCH] dom_ids array implementation Yi Sun
2017-04-26 10:04     ` Jan Beulich
2017-04-27  2:38       ` Yi Sun
2017-04-27  6:48         ` Jan Beulich
2017-04-27  9:30           ` Yi Sun
2017-04-27  9:39             ` Jan Beulich
2017-04-27 12:03               ` Yi Sun
2017-04-01 13:53 ` [PATCH v10 10/25] x86: refactor psr: L3 CAT: set value: assemble features value array Yi Sun
2017-04-11 15:11   ` Jan Beulich
2017-04-12  5:55     ` Yi Sun
2017-04-12  9:13       ` Jan Beulich
2017-04-12 12:26         ` Yi Sun
2017-04-01 13:53 ` [PATCH v10 11/25] x86: refactor psr: L3 CAT: set value: implement cos finding flow Yi Sun
2017-04-11 15:17   ` Jan Beulich
2017-04-01 13:53 ` [PATCH v10 12/25] x86: refactor psr: L3 CAT: set value: implement cos id picking flow Yi Sun
2017-04-11 15:20   ` Jan Beulich
2017-04-01 13:53 ` [PATCH v10 13/25] x86: refactor psr: L3 CAT: set value: implement write msr flow Yi Sun
2017-04-11 15:25   ` Jan Beulich
2017-04-12  6:04     ` Yi Sun
2017-04-01 13:53 ` [PATCH v10 14/25] x86: refactor psr: CDP: implement CPU init and free flow Yi Sun
2017-04-01 13:53 ` [PATCH v10 15/25] x86: refactor psr: CDP: implement get hw info flow Yi Sun
2017-04-01 13:53 ` [PATCH v10 16/25] x86: refactor psr: CDP: implement get value flow Yi Sun
2017-04-11 15:39   ` Jan Beulich
2017-04-12  6:05     ` Yi Sun
2017-04-12  9:14       ` Jan Beulich
2017-04-01 13:53 ` [PATCH v10 17/25] x86: refactor psr: CDP: implement set value callback functions Yi Sun
2017-04-11 16:03   ` Jan Beulich
2017-04-12  6:14     ` Yi Sun
2017-04-01 13:53 ` [PATCH v10 18/25] x86: L2 CAT: implement CPU init and free flow Yi Sun
2017-04-12 15:18   ` Jan Beulich
2017-04-13  8:12     ` Yi Sun
2017-04-13  8:16       ` Jan Beulich
2017-04-01 13:53 ` [PATCH v10 19/25] x86: L2 CAT: implement get hw info flow Yi Sun
2017-04-01 13:53 ` [PATCH v10 20/25] x86: L2 CAT: implement get value flow Yi Sun
2017-04-01 13:53 ` [PATCH v10 21/25] x86: L2 CAT: implement set " Yi Sun
2017-04-12 15:23   ` Jan Beulich
2017-04-01 13:53 ` [PATCH v10 22/25] tools: L2 CAT: support get HW info for L2 CAT Yi Sun
2017-04-12 15:24   ` Jan Beulich
2017-04-01 13:53 ` [PATCH v10 23/25] tools: L2 CAT: support show cbm " Yi Sun
2017-04-01 13:53 ` [PATCH v10 24/25] tools: L2 CAT: support set " Yi Sun
2017-04-01 13:53 ` [PATCH v10 25/25] docs: add L2 CAT description in docs Yi Sun

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.