From: Yi Sun <yi.y.sun@linux.intel.com>
To: xen-devel@lists.xenproject.org
Cc: "Yi Sun" <yi.y.sun@linux.intel.com>,
	"Konrad Rzeszutek Wilk" <konrad.wilk@oracle.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Ian Jackson" <ian.jackson@eu.citrix.com>,
	"Julien Grall" <julien.grall@arm.com>,
	"Jan Beulich" <jbeulich@suse.com>,
	"Chao Peng" <chao.p.peng@linux.intel.com>,
	"Wei Liu" <wei.liu2@citrix.com>,
	"Daniel De Graaf" <dgdegra@tycho.nsa.gov>,
	"Roger Pau Monné" <roger.pau@citrix.com>
Subject: [PATCH v6 01/16] docs: create Memory Bandwidth Allocation (MBA) feature document
Date: Sun,  8 Oct 2017 15:23:46 +0800
Message-ID: <1507447441-5422-2-git-send-email-yi.y.sun@linux.intel.com>
In-Reply-To: <1507447441-5422-1-git-send-email-yi.y.sun@linux.intel.com>

This patch creates the MBA feature document in docs/features/. It describes
the key points for implementing MBA, which is described in detail in the Intel
SDM section "Introduction to Memory Bandwidth Allocation".

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Daniel De Graaf <dgdegra@tycho.nsa.gov>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Chao Peng <chao.p.peng@linux.intel.com>
CC: Julien Grall <julien.grall@arm.com>
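
Reviewer aid, illustrative only: the rounding that the document attributes to
HW in linear and non-linear mode can be sketched in C as follows. The helper
names are hypothetical (they are not Xen or libxl functions), and the sketch
assumes the requested value is already within 0..MBA_MAX and that MBA_MAX is
below 100 in linear mode.

```c
#include <assert.h>

/* Hypothetical helper: linear-mode precision, defined as 100 - MBA_MAX. */
unsigned int mba_linear_step(unsigned int thrtl_max)
{
    assert(thrtl_max < 100);    /* assumption: MBA_MAX is a percentage < 100 */
    return 100 - thrtl_max;
}

/* Linear mode: round the requested delay down to a multiple of the step. */
unsigned int mba_round_linear(unsigned int thrtl, unsigned int thrtl_max)
{
    unsigned int step = mba_linear_step(thrtl_max);

    return thrtl - thrtl % step;
}

/*
 * Non-linear mode: round the requested delay down to a power of two.
 * Zero is kept as zero. Range checking is left to the caller.
 */
unsigned int mba_round_nonlinear(unsigned int thrtl)
{
    unsigned int p = 1;

    if (thrtl == 0)
        return 0;
    while (p <= thrtl / 2)
        p *= 2;
    return p;
}
```

With MBA_MAX = 90, as in the document's example, `mba_round_linear(12, 90)`
yields 10, matching the "12% is rounded down to 10%" case.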

v6:
    - fix some words.
      (suggested by Roger Pau Monné)
v5:
    - correct some words.
      (suggested by Roger Pau Monné)
    - change 'xl psr-mba-set 1 0xa' to 'xl psr-mba-set 1 10'.
      (suggested by Roger Pau Monné)
v4:
    - add 'domain-name' as parameter of 'psr-mba-show/psr-mba-set'.
      (suggested by Roger Pau Monné)
    - fix some wordings.
      (suggested by Roger Pau Monné)
    - explain how user can know the MBA_MAX.
      (suggested by Roger Pau Monné)
    - move the description of 'Linear mode/Non-linear mode' into section
      of 'psr-mba-show'.
      (suggested by Roger Pau Monné)
    - change 'per-thread' to 'per-hyper-thread' to make it clearer.
      (suggested by Roger Pau Monné)
    - upgrade revision number.
v3:
    - remove 'closed-loop' related description.
      (suggested by Roger Pau Monné)
    - explain 'linear' and 'non-linear' before mentioning them.
      (suggested by Roger Pau Monné)
    - adjust description of 'psr-mba-set'.
      (suggested by Roger Pau Monné)
    - explain 'MBA_MAX'.
      (suggested by Roger Pau Monné)
    - remove 'n<64'.
      (suggested by Roger Pau Monné)
    - fix some wordings.
      (suggested by Roger Pau Monné)
    - add context in 'Testing' part to make things more clear.
      (suggested by Roger Pau Monné)
v2:
    - declare 'HW' in Terminology.
      (suggested by Chao Peng)
    - replace 'COS ID of VCPU' to 'COS ID of domain'.
      (suggested by Chao Peng)
    - replace 'COS register' to 'Thrtl MSR'.
      (suggested by Chao Peng)
    - add description for 'psr-mba-show' to state that the decimal value is
      shown for linear mode but hexadecimal value is shown for non-linear mode.
      (suggested by Chao Peng)
    - remove content in 'Areas for improvement'.
      (suggested by Chao Peng)
    - use '<>' to specify mandatory argument to a command.
      (suggested by Wei Liu)
v1:
    - remove a special character to avoid the error when building pandoc.
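
Reviewer aid, illustrative only: the Thrtl MSR layout and the "COS beyond the
max COS of MBA" rule described in the document can be sketched in C as follows.
The helper names are hypothetical (not Xen functions); only the base address
0xD50 and the "max COS 8 versus COS 9" example are taken from the document.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Address of IA32_L2_QOS_Ext_BW_Thrtl_0, from the table in the document. */
#define MBA_THRTL_MSR_BASE 0xD50u

/* Hypothetical helper: true if 'cos' is within the MBA range 0..mba_cos_max. */
bool mba_cos_in_range(unsigned int cos, unsigned int mba_cos_max)
{
    return cos <= mba_cos_max;
}

/* Hypothetical helper: address of the Thrtl MSR for COS n, i.e. 0xD50 + n. */
uint32_t mba_thrtl_msr(unsigned int cos, unsigned int mba_cos_max)
{
    /* A COS beyond the MBA maximum has no Thrtl MSR of its own. */
    assert(mba_cos_in_range(cos, mba_cos_max));
    return MBA_THRTL_MSR_BASE + cos;
}
```

In the document's example, `mba_cos_in_range(9, 8)` is false, so for MBA the HW
behaves as if the default value were set, while the same COS 9 is still within
an L3 CAT maximum of 16.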
---
 docs/features/intel_psr_mba.pandoc | 297 +++++++++++++++++++++++++++++++++++++
 1 file changed, 297 insertions(+)
 create mode 100644 docs/features/intel_psr_mba.pandoc

diff --git a/docs/features/intel_psr_mba.pandoc b/docs/features/intel_psr_mba.pandoc
new file mode 100644
index 0000000..86df661
--- /dev/null
+++ b/docs/features/intel_psr_mba.pandoc
@@ -0,0 +1,297 @@
+% Intel Memory Bandwidth Allocation (MBA) Feature
+% Revision 1.8
+
+\clearpage
+
+# Basics
+
+---------------- ----------------------------------------------------
+         Status: **Tech Preview**
+
+Architecture(s): Intel x86
+
+   Component(s): Hypervisor, toolstack
+
+       Hardware: MBA is supported on Skylake Server and beyond
+---------------- ----------------------------------------------------
+
+# Terminology
+
+* CAT         Cache Allocation Technology
+* CBM         Capacity BitMasks
+* CDP         Code and Data Prioritization
+* COS/CLOS    Class of Service
+* HW          Hardware
+* MBA         Memory Bandwidth Allocation
+* MSRs        Machine Specific Registers
+* PSR         Intel Platform Shared Resource
+* THRTL       Throttle value or delay value
+
+# Overview
+
+The Memory Bandwidth Allocation (MBA) feature provides indirect and approximate
+control over the memory bandwidth available per core. It gives the OS or
+hypervisor the ability to slow misbehaving apps/domains by using a credit-based
+throttling mechanism.
+
+# User details
+
+* Feature Enabling:
+
+  Add "psr=mba" to the boot line parameters to enable the MBA feature.
+
+* xl interfaces:
+
+  1. `psr-mba-show [domain-id|domain-name]`:
+
+     Show the memory bandwidth throttling for a domain. The type of data shown
+     depends on the mode.
+
+     There are two modes:
+     Linear mode: the input precision is defined as 100-(MBA_MAX). For instance,
+     if the MBA_MAX value is 90, the input precision is 10%. Values that are not
+     an exact multiple of the precision (e.g., 12%) are rounded down (e.g., to
+     a 10% delay) by HW automatically. The response to the throttling value is
+     linear.
+
+     Non-linear mode: input delay values are powers of two from zero to the
+     MBA_MAX value from CPUID. In this case, any value that is not a power of
+     two is rounded down to the nearest power of two by HW automatically. The
+     response to the throttling value is non-linear.
+
+     For linear mode, it shows the decimal value. For non-linear mode, it shows
+     the hexadecimal value.
+
+  2. `psr-mba-set [OPTIONS] <domain-id|domain-name> <throttling>`:
+
+     Set the memory bandwidth throttling for a domain.
+
+     Options:
+     '-s': Specify the socket to process, otherwise all sockets are processed.
+
+     The throttling value set in the register indicates the approximate amount
+     of delay applied to traffic between the core and memory. A higher
+     throttling value results in lower bandwidth. The max throttling value
+     (MBA_MAX) supported is obtained through CPUID inside the hypervisor. Users
+     can fetch the MBA_MAX value using the `psr-hwinfo` xl command.
+
+# Technical details
+
+MBA is a member of the Intel PSR features and shares the base PSR
+infrastructure in Xen.
+
+## Hardware perspective
+
+  MBA defines a range of MSRs to support specifying a delay value (Thrtl) per
+  COS, with details below.
+
+  ```
+   +----------------------------+----------------+
+   | MSR (per socket)           |    Address     |
+   +----------------------------+----------------+
+   | IA32_L2_QOS_Ext_BW_Thrtl_0 |     0xD50      |
+   +----------------------------+----------------+
+   | ...                        |  ...           |
+   +----------------------------+----------------+
+   | IA32_L2_QOS_Ext_BW_Thrtl_n |     0xD50+n    |
+   +----------------------------+----------------+
+  ```
+
+  When a context switch happens, the COS ID of the domain is written to the
+  per-hyper-thread MSR `IA32_PQR_ASSOC`, and the hardware then enforces the
+  bandwidth allocation according to the throttling value in the Thrtl MSR.
+
+## The relationship between MBA and CAT/CDP
+
+  Generally speaking, MBA is completely independent of CAT/CDP, and any
+  combination may be applied at any time, e.g. enabling MBA with CAT
+  disabled.
+
+  Note, however, that MBA shares the COS infrastructure with CAT, although
+  MBA is enumerated by a different CPUID leaf from CAT (which means that the
+  max COS of MBA may differ from that of CAT). In some cases, a domain is
+  permitted to have a COS that is beyond the max COS of one (or more) PSR
+  features but within that of the others. For instance, assume the max COS
+  of MBA is 8 but the max COS of L3 CAT is 16. When a domain is assigned 9
+  as its COS, the L3 CAT CBM associated with COS 9 is enforced, but for MBA
+  the HW behaves as if the default value were set, since COS 9 is beyond the
+  max COS (8) of MBA.
+
+## Design Overview
+
+* Core COS/Thrtl association
+
+  When enforcing Memory Bandwidth Allocation, all cores of domains have the
+  same default Thrtl MSR (COS0), which stores the default Thrtl (0). The
+  default Thrtl MSR is used only in the hypervisor and is transparent to the
+  tool stack and user.
+
+  System administrators can change the PSR allocation policy at runtime by
+  using the tool stack. Since MBA shares COS IDs with CAT/CDP, a COS ID
+  corresponds to a 2-tuple, like [CBM, Thrtl], when only CAT is enabled. When
+  CDP is enabled, the COS ID corresponds to a 3-tuple, like [Code_CBM,
+  Data_CBM, Thrtl]. If neither CAT nor CDP is enabled, things are simpler,
+  since one COS ID corresponds to one Thrtl.
+
+* VCPU schedule
+
+  This part reuses CAT COS infrastructure.
+
+* Multi-sockets
+
+  Different sockets may have different MBA capabilities (like max COS),
+  although the capabilities are consistent within a socket. MBA capabilities
+  are therefore specified per socket.
+
+  This part reuses CAT COS infrastructure.
+
+## Implementation Description
+
+* Hypervisor interfaces:
+
+  1. Boot line param: "psr=mba" to enable the feature.
+
+  2. SYSCTL:
+          - XEN_SYSCTL_PSR_MBA_get_info: Get system MBA information.
+
+  3. DOMCTL:
+          - XEN_DOMCTL_PSR_MBA_OP_GET_THRTL: Get throttling for a domain.
+          - XEN_DOMCTL_PSR_MBA_OP_SET_THRTL: Set throttling for a domain.
+
+* xl interfaces:
+
+  1. psr-mba-show [domain-id|domain-name]
+          Show the system/domain runtime MBA throttling value. For linear mode,
+          it shows the decimal value. For non-linear mode, it shows the
+          hexadecimal value.
+          => XEN_SYSCTL_PSR_MBA_get_info/XEN_DOMCTL_PSR_MBA_OP_GET_THRTL
+
+  2. psr-mba-set [OPTIONS] <domain-id|domain-name> <throttling>
+          Set the bandwidth throttling for a domain.
+          => XEN_DOMCTL_PSR_MBA_OP_SET_THRTL
+
+  3. psr-hwinfo
+          Show PSR HW information, including L3 CAT/CDP/L2 CAT/MBA.
+          => XEN_SYSCTL_PSR_MBA_get_info
+
+* Key data structure:
+
+  1. Feature HW info
+
+     ```
+     struct {
+         unsigned int thrtl_max;
+         bool linear;
+     } mba;
+     ```
+     - Member `thrtl_max`
+
+       `thrtl_max` is the max throttling value to be set, i.e. MBA_MAX.
+
+     - Member `linear`
+
+       `linear` indicates whether the response to the delay value is linear.
+
+     As mentioned above, MBA is a member of the Intel PSR features and shares
+     the base PSR infrastructure in Xen. For example, 'cos_max' is a common HW
+     property of all features. For details of the other data structures,
+     please refer to 'intel_psr_cat_cdp.pandoc'.
+
+# Limitations
+
+MBA can only work on HW which supports it (check CPUID).
+
+# Testing
+
+The following commands can be used to verify MBA on HW that supports it.
+
+For example:
+  1. The user can get the MBA hardware info through the 'psr-hwinfo' command.
+     The result shows whether the hardware works in linear mode or non-linear
+     mode, the max throttling value (MBA_MAX), and so on.
+
+    root@:~$ xl psr-hwinfo --mba
+    Memory Bandwidth Allocation (MBA):
+    Socket ID       : 0
+    Linear Mode     : Enabled
+    Maximum COS     : 7
+    Maximum Throttling Value: 90
+    Default Throttling Value: 0
+
+  2. Then, the user can set a throttling value for a domain. For example, set
+     '10', i.e. a 10% delay.
+
+    root@:~$ xl psr-mba-set 1 10
+
+  3. The user can check the current configuration of the domain through
+     'psr-mba-show'. For linear mode, the decimal value is shown.
+
+    root@:~$ xl psr-mba-show 1
+    Socket ID       : 0
+    Default THRTL   : 0
+       ID                     NAME            THRTL
+        1                 ubuntu14             10
+
+# Areas for improvement
+
+N/A
+
+# Known issues
+
+N/A
+
+# References
+
+"INTEL RESOURCE DIRECTOR TECHNOLOGY (INTEL RDT) ALLOCATION FEATURES" [Intel 64 and IA-32 Architectures Software Developer Manuals, vol3](http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html)
+
+# History
+
+------------------------------------------------------------------------
+Date       Revision Version  Notes
+---------- -------- -------- -------------------------------------------
+2017-01-10 1.0      Xen 4.9  Design document written
+2017-07-10 1.1      Xen 4.10 Changes:
+                             1. Modify data structure according to latest
+                                codes;
+                             2. Add content for 'Areas for improvement';
+                             3. Other minor changes.
+2017-08-09 1.2      Xen 4.10 Changes:
+                             1. Remove a special character to avoid error when
+                                building pandoc.
+2017-08-15 1.3      Xen 4.10 Changes:
+                             1. Add terminology 'HW'.
+                             2. Change 'COS ID of VCPU' to 'COS ID of domain'.
+                             3. Change 'COS register' to 'Thrtl MSR'.
+                             4. Explain the value shown for 'psr-mba-show' under
+                                different modes.
+                             5. Remove content in 'Areas for improvement'.
+2017-08-16 1.4      Xen 4.10 Changes:
+                             1. Add '<>' for mandatory argument.
+2017-08-30 1.5      Xen 4.10 Changes:
+                             1. Modify words in 'Overview' to make it easier to
+                                understand.
+                             2. Explain 'linear/non-linear' modes before
+                                mentioning them.
+                             3. Explain throttling value more accurately.
+                             4. Explain 'MBA_MAX'.
+                             5. Correct some words in 'Design Overview'.
+                             6. Change 'mba_info' to 'mba' according to code
+                                changes. Also, modify contents of it.
+                             7. Add context in 'Testing' part to make things
+                                more clear.
+                             8. Remove 'n<64' to avoid out-of-sync.
+2017-09-21 1.6      Xen 4.10 Changes:
+                             1. Add 'domain-name' as parameter of 'psr-mba-show/
+                                psr-mba-set'.
+                             2. Fix some wordings.
+                             3. Explain how user can know the MBA_MAX.
+                             4. Move the description of 'Linear mode/Non-linear
+                                mode' into section of 'psr-mba-show'.
+                             5. Change 'per-thread' to 'per-hyper-thread'.
+2017-09-29 1.7      Xen 4.10 Changes:
+                             1. Correct some words.
+                             2. Change 'xl psr-mba-set 1 0xa' to
+                                'xl psr-mba-set 1 10'
+2017-10-08 1.8      Xen 4.10 Changes:
+                             1. Correct some words.
+---------- -------- -------- -------------------------------------------
-- 
1.9.1



