* [RFC Design Doc] Intel L2 Cache Allocation Technology (L2 CAT) Feature enabling
@ 2016-05-12  9:40 He Chen
  2016-05-12 10:05 ` Jan Beulich
  2016-05-12 10:06 ` Andrew Cooper
  0 siblings, 2 replies; 12+ messages in thread
From: He Chen @ 2016-05-12  9:40 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Ian Jackson, Chao Peng, Jan Beulich, Andrew Cooper

% Intel L2 Cache Allocation Technology (L2 CAT) Feature
% Revision 1.0

\clearpage

Hi all,

We plan to bring a new PQoS feature called Intel L2 Cache Allocation
Technology (L2 CAT) to Xen.

L2 CAT is supported on Atom codename Goldmont and beyond. “Big-core”
Xeon does not support L2 CAT in current generations.

This is the initial design of L2 CAT. It might be a little long and
detailed; I hope that is all right.

Comments and suggestions are welcome :-)

# Basics

---------------- ----------------------------------------------------
         Status: **Tech Preview**

Architecture(s): Intel x86

   Component(s): Hypervisor, toolstack

       Hardware: Atom codename Goldmont and beyond
---------------- ----------------------------------------------------

# Overview

L2 CAT allows an OS or hypervisor/VMM to control the allocation of a
CPU's shared L2 cache based on application priority or Class of Service
(COS). Each COS is configured using a capacity bitmask (CBM) which
represents cache capacity and indicates the degree of overlap and
isolation between classes. Once L2 CAT is configured, the processor
allows access to portions of the L2 cache according to the established
COS.
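
For illustration only (the CBM values below are hypothetical): with an
8-bit CBM, the degree of overlap or isolation between two classes can be
read directly off their bitmasks:

```c
#include <stdint.h>

/* Hypothetical example CBMs for an 8-way L2 cache: COS0 may touch the
 * whole cache, while COS1 and COS2 are confined to disjoint halves. */
#define COS0_CBM 0xffU  /* 11111111b - default, fully open */
#define COS1_CBM 0x0fU  /* 00001111b - low half only */
#define COS2_CBM 0xf0U  /* 11110000b - high half only */

/* Two classes overlap iff their CBMs share at least one set bit;
 * disjoint CBMs mean fully isolated cache portions. */
static int cbm_overlap(uint32_t a, uint32_t b)
{
    return (a & b) != 0;
}
```

Here COS1 and COS2 are fully isolated from each other, while the default
COS0 overlaps both.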

# Technical information

L2 CAT is a member of the Intel PQoS feature family and part of CAT; it
shares some base PSR infrastructure in Xen.

## Hardware perspective

L2 CAT defines a new range of MSRs that hold the different L2 cache
access patterns, known as CBMs (Capacity BitMasks); each CBM is
associated with a COS.

```

                        +----------------------------+----------------+
   IA32_PQR_ASSOC       | MSR (per socket)           |    Address     |
 +----+---+-------+     +----------------------------+----------------+
 |    |COS|       |     | IA32_L2_QOS_MASK_0         |     0xD10      |
 +----+---+-------+     +----------------------------+----------------+
        └-------------> | ...                        |  ...           |
                        +----------------------------+----------------+
                        | IA32_L2_QOS_MASK_n         | 0xD10+n (n<64) |
                        +----------------------------+----------------+
```

When a context switch happens, the COS of the VCPU is written to the
per-thread MSR `IA32_PQR_ASSOC`, and hardware then enforces L2 cache
allocation according to the corresponding CBM.
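
As a sketch of that write (helper name is ours; register layout per the
SDM), the 64-bit `IA32_PQR_ASSOC` value combines the monitoring RMID in
its low bits with the allocation COS in its high bits:

```c
#include <stdint.h>

#define MSR_IA32_PQR_ASSOC 0x00000c8fU

/* Compose the IA32_PQR_ASSOC value: bits 9:0 carry the RMID used by
 * monitoring features (CMT/MBM), bits 63:32 select the COS used by
 * allocation features (L2/L3 CAT).  Helper name is illustrative. */
static inline uint64_t pqr_assoc_val(uint32_t rmid, uint32_t cos)
{
    return ((uint64_t)cos << 32) | (rmid & 0x3ffU);
}
```

On context switch the hypervisor would write this value with the
scheduled VCPU's COS; the hardware then indexes `IA32_L2_QOS_MASK_cos`
to find the CBM to enforce.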

## The relationship between L2 CAT and L3 CAT/CDP

L2 CAT is independent of L3 CAT/CDP: L2 CAT may be enabled while L3
CAT/CDP is disabled, or both may be enabled at the same time.

L2 CAT uses a new range of MSRs from 0xD10 to 0xD10+n (n<64), following
the L3 CAT/CDP MSRs, and supports setting L2 cache access patterns that
differ from the L3 ones.

N.B. L2 CAT and L3 CAT/CDP share the same COS field in the same
association register `IA32_PQR_ASSOC`, which means one COS corresponds
to a pair of L2 CBM and L3 CBM.

In the initial implementation, L2 CAT first shows up on Atom codename
Goldmont, and no platform supports both L2 and L3 CAT so far.

## Design Overview

* Core COS/CBM association

  When enforcing L2 CAT, all cores of all domains start with the same
  default COS (COS0), which is associated with a fully open CBM (an
  all-ones bitmask) that grants access to all of the L2 cache. The
  default COS is used only in the hypervisor and is transparent to the
  tool stack and user.

  The system administrator can change the PQoS allocation policy at
  runtime via the tool stack. Since L2 CAT shares COS with L3 CAT/CDP,
  a COS corresponds to a 2-tuple, [L2 CBM, L3 CBM], when only CAT is
  enabled; when CDP is enabled, one COS corresponds to a 3-tuple,
  [L2 CBM, L3 Code_CBM, L3 Data_CBM]. If neither L3 CAT nor L3 CDP is
  enabled, things are simpler: one COS corresponds to one L2 CBM.

* VCPU schedule

  This part reuses L3 CAT COS infrastructure.

* Multi-sockets

  Different sockets may have different L2 CAT capabilities (e.g. max
  COS), although the capability is consistent across cores within a
  socket. So the L2 CAT capability is recorded per socket.

## Implementation Description

* Hypervisor interfaces:

  1. Ext: Boot line parameter "psr=cat" now enables both L2 CAT and
          L3 CAT if supported by hardware.

  2. New: SYSCTL:
          - XEN_SYSCTL_PSR_CAT_get_l2_info: Get L2 CAT information.

  3. New: DOMCTL:
          - XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM: Get L2 CBM for a domain.
          - XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM: Set L2 CBM for a domain.

* xl interfaces:

  1. Ext: psr-cat-show: Show system/domain L2 CAT information.
          => XEN_SYSCTL_PSR_CAT_get_l2_info /
             XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM

  2. Ext: psr-cat-cbm-set -l2 domain-id cbm
          Set L2 CBM for a domain.
          => XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM

* Key data structure:

  1. Combined PSR bitmasks structure

     ```
     struct psr_mask {
         struct l3_cat {
             union {
                 uint64_t cbm;        /* plain L3 CAT mask */
                 struct {
                     uint64_t code;   /* L3 CDP code mask */
                     uint64_t data;   /* L3 CDP data mask */
                 };
             };
         } l3;
         struct l2_cat {
             uint64_t cbm;            /* L2 CAT mask */
         } l2;
         unsigned int ref;            /* reference count of this COS */
     };
     ```

     As mentioned above, when more than one PQoS enforcement feature is
     enabled, one COS corresponds to several capacity bitmasks. So we
     combine these bitmasks into one structure; when setting a new
     platform shared resource allocation pattern for a guest, this
     combined structure makes it easy to compare, modify and write the
     masks.
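
     As a sketch of that use (structure simplified and names
     hypothetical), reusing an existing COS for a requested
     [L2 CBM, L3 CBM] pair becomes a single comparison per entry:

     ```c
     #include <stdint.h>

     /* Simplified stand-in for the combined mask structure above. */
     struct mask_entry {
         uint64_t l2_cbm;
         uint64_t l3_cbm;
         unsigned int ref;    /* how many domains use this COS */
     };

     /* Return the first COS whose masks match the requested pair so
      * it can be shared, or -1 if a free COS must be allocated. */
     static int find_matching_cos(const struct mask_entry *tbl,
                                  unsigned int cos_max,
                                  uint64_t l2_cbm, uint64_t l3_cbm)
     {
         for (unsigned int cos = 0; cos <= cos_max; cos++)
             if (tbl[cos].l2_cbm == l2_cbm && tbl[cos].l3_cbm == l3_cbm)
                 return (int)cos;
         return -1;
     }
     ```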

  2. Per-socket PSR allocation features information structure

      ```
      struct psr_cat_lvl_info {
          unsigned int cbm_len;
          unsigned int cos_max;
      };

      #define PSR_SOCKET_L3_CAT (1 << 0)
      #define PSR_SOCKET_L3_CDP (1 << 1)
      #define PSR_SOCKET_L2_CAT (1 << 2)

      struct psr_socket_alloc_info {
          unsigned long features;
          struct psr_cat_lvl_info cat_lvl[4];
          struct psr_mask *cos_to_mask;
          spinlock_t mask_lock;
      };
      ```

      We collect all PSR allocation features information of a socket in
      this `struct psr_socket_alloc_info`.

      - Member `features`

        `features` is a bitmap indicating which features are enabled
        on the current socket. We define the `features` bitmap as:

        bit 0~1: L3 CAT status; [01] stands for L3 CAT only and [10]
                 stands for L3 CDP enabled.

        bit 2: L2 CAT status.

      - Member `cos_to_mask`

        `cos_to_mask` is an array of `psr_mask` structures which
        represents the capacity bitmasks of the allocation features;
        it is indexed by COS.
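
      Tying the two members together: the number of capacity bitmasks a
      single COS entry carries follows from `features`, matching the
      1-/2-/3-tuple cases in the design overview (sketch, assuming the
      bit definitions above; the helper name is ours):

      ```c
      #define PSR_SOCKET_L3_CAT (1UL << 0)
      #define PSR_SOCKET_L3_CDP (1UL << 1)
      #define PSR_SOCKET_L2_CAT (1UL << 2)

      /* How many capacity bitmasks one COS carries for a feature set:
       * [L2], [L2, L3], or [L2, L3 code, L3 data]. */
      static unsigned int masks_per_cos(unsigned long features)
      {
          unsigned int n = 0;

          if (features & PSR_SOCKET_L2_CAT)
              n += 1;
          if (features & PSR_SOCKET_L3_CDP)
              n += 2;          /* CDP: separate code and data masks */
          else if (features & PSR_SOCKET_L3_CAT)
              n += 1;          /* plain L3 CAT: a single mask */
          return n;
      }
      ```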

# User information

* Feature Enabling:

  Add "psr=cat" to the boot line parameters to enable CAT at all
  supported cache levels.

* xl interfaces:

  1. `psr-cat-show [domain-id]`:

     Show system/domain L2 & L3 CAT information.

  2. `psr-cat-cbm-set [OPTIONS] domain-id cbm`:

     New options `-l2` and `-l3` are added.
     `-l2`: Specify cbm for L2 cache.
     `-l3`: Specify cbm for L3 cache.

     If neither `-l2` nor `-l3` is given, operating on the LLC (Last
     Level Cache) is the default behavior. Since there is no CDP
     support on the L2 cache, giving `-l2` together with `--code` or
     `--data` results in an error message.

# References

[Intel® 64 and IA-32 Architectures Software Developer Manuals](http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html)

# History

------------------------------------------------------------------------
Date       Revision Version  Notes
---------- -------- -------- -------------------------------------------
2016-05-12 1.0      Xen 4.7  Initial design
---------- -------- -------- -------------------------------------------

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* Re: [RFC Design Doc] Intel L2 Cache Allocation Technology (L2 CAT) Feature enabling
  2016-05-12  9:40 [RFC Design Doc] Intel L2 Cache Allocation Technology (L2 CAT) Feature enabling He Chen
@ 2016-05-12 10:05 ` Jan Beulich
  2016-05-13  6:26   ` He Chen
  2016-05-12 10:06 ` Andrew Cooper
  1 sibling, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2016-05-12 10:05 UTC (permalink / raw)
  To: He Chen; +Cc: Andrew Cooper, xen-devel, Wei Liu, Ian Jackson, Chao Peng

>>> On 12.05.16 at 11:40, <he.chen@linux.intel.com> wrote:
> % Intel L2 Cache Allocation Technology (L2 CAT) Feature
> % Revision 1.0
> 
> \clearpage
> 
> Hi all,
> 
> We plan to bring new PQoS feature called Intel L2 Cache Allocation
> Technology (L2 CAT) to Xen.
> 
> L2 CAT is supported on Atom codename Goldmont and beyond. “Big-core”
> Xeon does not support L2 CAT in current generations.

Looks mostly like a direct (and hence reasonable) extension of what
we have for L3 right now. One immediate question I have is whether
tying this to per-socket information is a good idea. As soon as Xeon-s
would also gain such functionality, things would (aiui) need to become
per-core (as L2 is per core there iirc).

The other question is whether with Xen we care enough about Atoms
to add code that's of use only there.

Jan



* Re: [RFC Design Doc] Intel L2 Cache Allocation Technology (L2 CAT) Feature enabling
  2016-05-12  9:40 [RFC Design Doc] Intel L2 Cache Allocation Technology (L2 CAT) Feature enabling He Chen
  2016-05-12 10:05 ` Jan Beulich
@ 2016-05-12 10:06 ` Andrew Cooper
  1 sibling, 0 replies; 12+ messages in thread
From: Andrew Cooper @ 2016-05-12 10:06 UTC (permalink / raw)
  To: xen-devel, Ian Jackson, Wei Liu, Jan Beulich, Chao Peng

On 12/05/16 10:40, He Chen wrote:
> % Intel L2 Cache Allocation Technology (L2 CAT) Feature
> % Revision 1.0
>
> \clearpage
>
> Hi all,
>
> We plan to bring new PQoS feature called Intel L2 Cache Allocation
> Technology (L2 CAT) to Xen.
>
> L2 CAT is supported on Atom codename Goldmont and beyond. “Big-core”
> Xeon does not support L2 CAT in current generations.
>
> This is the initial design of L2 CAT. It might be a little long and
> detailed, hope it doesn't matter.
>
> Comments and suggestions are welcome :-)

First of all, thank you very much for choosing to do the doc like this.
It is nice to see this format starting to get used.

> ## The relationship between L2 CAT and L3 CAT/CDP
>
> L2 CAT is independent of L3 CAT/CDP, which means L2 CAT would be enabled
> while L3 CAT/CDP is disabled, or L2 CAT and L3 CAT/CDP are all enabled.

The wording here is a little odd, given that no hardware currently
supports both L2 and L3.

It might be easier to say:

L2 CAT is independent of L3 CAT/CDP, and both may be enabled at the same
time.


Otherwise, everything else looks great.

~Andrew



* Re: [RFC Design Doc] Intel L2 Cache Allocation Technology (L2 CAT) Feature enabling
  2016-05-12 10:05 ` Jan Beulich
@ 2016-05-13  6:26   ` He Chen
  2016-05-13  6:48     ` Jan Beulich
  0 siblings, 1 reply; 12+ messages in thread
From: He Chen @ 2016-05-13  6:26 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich, ChaoPeng

On Thu, May 12, 2016 at 04:05:36AM -0600, Jan Beulich wrote:
> >>> On 12.05.16 at 11:40, <he.chen@linux.intel.com> wrote:
> > % Intel L2 Cache Allocation Technology (L2 CAT) Feature
> > % Revision 1.0
> > 
> > \clearpage
> > 
> > Hi all,
> > 
> > We plan to bring new PQoS feature called Intel L2 Cache Allocation
> > Technology (L2 CAT) to Xen.
> > 
> > L2 CAT is supported on Atom codename Goldmont and beyond. “Big-core”
> > Xeon does not support L2 CAT in current generations.
> 
> Looks mostly like a direct (and hence reasonable) extension of what
> we have for L3 right now. One immediate question I have is whether
> tying this to per-socket information is a good idea. As soon as Xeon-s
> would also gain such functionality, things would (aiui) need to become
> per-core (as L2 is per core there iirc).
> 

The L2 cache capability is the same across all cores in a socket, so we
make it per-socket to balance code complexity and accessibility.

I am not an expert in the scheduler; do you mean that in some cases a
domain would apply different L2 cache access patterns when it is
scheduled on different cores, even though the cores are in the same
socket?

> The other question is whether with Xen we care enough about Atoms
> to add code that's of use only there.
> 

L2 CAT is a platform-independent feature; although it first shows up in
Atoms, I believe it will appear in other platforms soon.

Thanks,
-He



* Re: [RFC Design Doc] Intel L2 Cache Allocation Technology (L2 CAT) Feature enabling
  2016-05-13  6:26   ` He Chen
@ 2016-05-13  6:48     ` Jan Beulich
  2016-05-13  7:43       ` Andrew Cooper
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2016-05-13  6:48 UTC (permalink / raw)
  To: He Chen; +Cc: Andrew Cooper, xen-devel, Wei Liu, Ian Jackson, ChaoPeng

>>> On 13.05.16 at 08:26, <he.chen@linux.intel.com> wrote:
> On Thu, May 12, 2016 at 04:05:36AM -0600, Jan Beulich wrote:
>> >>> On 12.05.16 at 11:40, <he.chen@linux.intel.com> wrote:
>> > We plan to bring new PQoS feature called Intel L2 Cache Allocation
>> > Technology (L2 CAT) to Xen.
>> > 
>> > L2 CAT is supported on Atom codename Goldmont and beyond. “Big-core”
>> > Xeon does not support L2 CAT in current generations.
>> 
>> Looks mostly like a direct (and hence reasonable) extension of what
>> we have for L3 right now. One immediate question I have is whether
>> tying this to per-socket information is a good idea. As soon as Xeon-s
>> would also gain such functionality, things would (aiui) need to become
>> per-core (as L2 is per core there iirc).
>> 
> 
> L2 Cache capability keeps the same through all cores in a socket, so we
> make it per-socket to balance code complexity and accessibility.
> 
> I am not a expert in scheduler, do you mean in some cases, a domain
> would apply different L2 cache access pattern when it is scheduled on
> different cores even though the cores are in the same socket?

No, I mean different domains may be running on different cores,
and hence different policies may be needed to accommodate them
all.

Jan



* Re: [RFC Design Doc] Intel L2 Cache Allocation Technology (L2 CAT) Feature enabling
  2016-05-13  6:48     ` Jan Beulich
@ 2016-05-13  7:43       ` Andrew Cooper
  2016-05-13  8:55         ` Jan Beulich
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Cooper @ 2016-05-13  7:43 UTC (permalink / raw)
  To: Jan Beulich, He Chen; +Cc: Ian Jackson, xen-devel, Wei Liu, ChaoPeng

On 13/05/2016 07:48, Jan Beulich wrote:
>>>> On 13.05.16 at 08:26, <he.chen@linux.intel.com> wrote:
>> On Thu, May 12, 2016 at 04:05:36AM -0600, Jan Beulich wrote:
>>>>>> On 12.05.16 at 11:40, <he.chen@linux.intel.com> wrote:
>>>> We plan to bring new PQoS feature called Intel L2 Cache Allocation
>>>> Technology (L2 CAT) to Xen.
>>>>
>>>> L2 CAT is supported on Atom codename Goldmont and beyond. “Big-core”
>>>> Xeon does not support L2 CAT in current generations.
>>> Looks mostly like a direct (and hence reasonable) extension of what
>>> we have for L3 right now. One immediate question I have is whether
>>> tying this to per-socket information is a good idea. As soon as Xeon-s
>>> would also gain such functionality, things would (aiui) need to become
>>> per-core (as L2 is per core there iirc).
>>>
>> L2 Cache capability keeps the same through all cores in a socket, so we
>> make it per-socket to balance code complexity and accessibility.
>>
>> I am not a expert in scheduler, do you mean in some cases, a domain
>> would apply different L2 cache access pattern when it is scheduled on
>> different cores even though the cores are in the same socket?
> No, I mean different domains may be running on different cores,
> and hence different policies may be needed to accommodate them
> all.

From the description, it sounds like L2 behaves almost exactly like L3. 
There is one set of capacity bitmaps which apply to all L2 caches in the
socket, and the specific capacity bitmap in effect is specified by
PSR_ASSOC CLOS, which is context switched with the vcpu.

~Andrew



* Re: [RFC Design Doc] Intel L2 Cache Allocation Technology (L2 CAT) Feature enabling
  2016-05-13  7:43       ` Andrew Cooper
@ 2016-05-13  8:55         ` Jan Beulich
  2016-05-13  9:23           ` Andrew Cooper
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2016-05-13  8:55 UTC (permalink / raw)
  To: Andrew Cooper, He Chen; +Cc: Ian Jackson, xen-devel, Wei Liu, ChaoPeng

>>> On 13.05.16 at 09:43, <andrew.cooper3@citrix.com> wrote:
> On 13/05/2016 07:48, Jan Beulich wrote:
>>>>> On 13.05.16 at 08:26, <he.chen@linux.intel.com> wrote:
>>> On Thu, May 12, 2016 at 04:05:36AM -0600, Jan Beulich wrote:
>>>>>>> On 12.05.16 at 11:40, <he.chen@linux.intel.com> wrote:
>>>>> We plan to bring new PQoS feature called Intel L2 Cache Allocation
>>>>> Technology (L2 CAT) to Xen.
>>>>>
>>>>> L2 CAT is supported on Atom codename Goldmont and beyond. “Big-core”
>>>>> Xeon does not support L2 CAT in current generations.
>>>> Looks mostly like a direct (and hence reasonable) extension of what
>>>> we have for L3 right now. One immediate question I have is whether
>>>> tying this to per-socket information is a good idea. As soon as Xeon-s
>>>> would also gain such functionality, things would (aiui) need to become
>>>> per-core (as L2 is per core there iirc).
>>>>
>>> L2 Cache capability keeps the same through all cores in a socket, so we
>>> make it per-socket to balance code complexity and accessibility.
>>>
>>> I am not a expert in scheduler, do you mean in some cases, a domain
>>> would apply different L2 cache access pattern when it is scheduled on
>>> different cores even though the cores are in the same socket?
>> No, I mean different domains may be running on different cores,
>> and hence different policies may be needed to accommodate them
>> all.
> 
> From the description, it sounds like L2 behaves almost exactly like L3. 
> There is one set of capacity bitmaps which apply to all L2 caches in the
> socket, and the specific capacity bitmap in effect is specified by
> PSR_ASSOC CLOS, which is context switched with the vcpu.

Well, I suppose the description is implying per-socket L2s. For per-
core L2s I'd expect the MSRs to also become per-core.

But anyway, L2 or L3 - I can't see how this context switching would
DTRT when there are vCPU-s of different domains on the same
socket (or core, if L2s and MSRs were per-core): The one getting
scheduled later onto a socket (core) would simply overwrite what
got written for the one which had been scheduled earlier.

Jan



* Re: [RFC Design Doc] Intel L2 Cache Allocation Technology (L2 CAT) Feature enabling
  2016-05-13  8:55         ` Jan Beulich
@ 2016-05-13  9:23           ` Andrew Cooper
  2016-05-13 16:17             ` Dario Faggioli
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Cooper @ 2016-05-13  9:23 UTC (permalink / raw)
  To: Jan Beulich, He Chen; +Cc: Ian Jackson, xen-devel, Wei Liu, ChaoPeng

On 13/05/16 09:55, Jan Beulich wrote:
>>>> On 13.05.16 at 09:43, <andrew.cooper3@citrix.com> wrote:
>> On 13/05/2016 07:48, Jan Beulich wrote:
>>>>>> On 13.05.16 at 08:26, <he.chen@linux.intel.com> wrote:
>>>> On Thu, May 12, 2016 at 04:05:36AM -0600, Jan Beulich wrote:
>>>>>>>> On 12.05.16 at 11:40, <he.chen@linux.intel.com> wrote:
>>>>>> We plan to bring new PQoS feature called Intel L2 Cache Allocation
>>>>>> Technology (L2 CAT) to Xen.
>>>>>>
>>>>>> L2 CAT is supported on Atom codename Goldmont and beyond. “Big-core”
>>>>>> Xeon does not support L2 CAT in current generations.
>>>>> Looks mostly like a direct (and hence reasonable) extension of what
>>>>> we have for L3 right now. One immediate question I have is whether
>>>>> tying this to per-socket information is a good idea. As soon as Xeon-s
>>>>> would also gain such functionality, things would (aiui) need to become
>>>>> per-core (as L2 is per core there iirc).
>>>>>
>>>> L2 Cache capability keeps the same through all cores in a socket, so we
>>>> make it per-socket to balance code complexity and accessibility.
>>>>
>>>> I am not a expert in scheduler, do you mean in some cases, a domain
>>>> would apply different L2 cache access pattern when it is scheduled on
>>>> different cores even though the cores are in the same socket?
>>> No, I mean different domains may be running on different cores,
>>> and hence different policies may be needed to accommodate them
>>> all.
>> From the description, it sounds like L2 behaves almost exactly like L3. 
>> There is one set of capacity bitmaps which apply to all L2 caches in the
>> socket, and the specific capacity bitmap in effect is specified by
>> PSR_ASSOC CLOS, which is context switched with the vcpu.
> Well, I suppose the description is implying per-socket L2s. For per-
> core L2s I'd expect the MSRs to also become per-core.
>
> But anyway, L2 or L3 - I can't see how this context switching would
> DTRT when there are vCPU-s of different domains on the same
> socket (or core, if L2s and MSRs were per-core): The one getting
> scheduled later onto a socket (core) would simply overwrite what
> got written for the one which had been scheduled earlier.

PSR_ASSOC is a per-thread MSR which selects the CLOS to use.  CLOS is
currently managed per-domain in Xen, and context switched with vcpu.

Xen programs different capacity bitmaps into IA32_L2_QOS_MASK_0 ...
IA32_L2_QOS_MASK_n, and the CLOS selects which bitmap is enforced.

~Andrew



* Re: [RFC Design Doc] Intel L2 Cache Allocation Technology (L2 CAT) Feature enabling
  2016-05-13  9:23           ` Andrew Cooper
@ 2016-05-13 16:17             ` Dario Faggioli
  2016-05-16  8:23               ` He Chen
  0 siblings, 1 reply; 12+ messages in thread
From: Dario Faggioli @ 2016-05-13 16:17 UTC (permalink / raw)
  To: Andrew Cooper, Jan Beulich, He Chen
  Cc: Wei Liu, Ian Jackson, ChaoPeng, xen-devel



On Fri, 2016-05-13 at 10:23 +0100, Andrew Cooper wrote:
> On 13/05/16 09:55, Jan Beulich wrote:
> > 
> > But anyway, L2 or L3 - I can't see how this context switching would
> > DTRT when there are vCPU-s of different domains on the same
> > socket (or core, if L2s and MSRs were per-core): The one getting
> > scheduled later onto a socket (core) would simply overwrite what
> > got written for the one which had been scheduled earlier.
> PSR_ASSOC is a per-thread MSR which selects the CLOS to use.  CLOS is
> currently managed per-domain in Xen, and context switched with vcpu.
> 
Yep, exactly. I did look a bit into this for CMT (so, not L3 CAT, but
it's not that different).

Doing things on a per-vcpu basis could be interesting, and that's even
more the case if we get to do L2 stuff, but there are too few RMIDs
available for such a configuration to be really useful.

> Xen programs different capacity bitmaps into IA32_L2_QOS_MASK_0 ...
> IA32_L2_QOS_MASK_n, and the CLOS selects which bitmap is enforced.
> 
So, basically, just to figure out if I understood (i.e., this is for He
Chen).

If we have a 2 sockets, with socket 0 containing cores 0,1,2,3 and
socket 1 containing cores 4,5,6,7, it will be possible to specify two
different "L2 reservation values" (in the form of CBMs, of course), for
a domain:
 - one would be how much L2 cache the domain will be able to use (say,
   X) when running on socket 0, which means on cores 0,1,2 or 3
 - another would be how much L2 cache the domain will be able to use
   (say, Y) when running on socket 1, which means on cores 4,5,6, or 7

Which in turn means, in case L2 is per core, the domain will get X of
core 0's L2, X of core 1's L2, X of core 2's L2 and X of core 3's L2.
On socket 1, it will get Y of core 4's L2, Y of core 5's L2 cache etc.
etc.

And so, in summary what we would not be able to specify is a different
value for the L2 reservations of, for instance, core 1 and core 3
(i.e., of cores that are part of the same socket).

Does this summary make sense?

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)




* Re: [RFC Design Doc] Intel L2 Cache Allocation Technology (L2 CAT) Feature enabling
  2016-05-13 16:17             ` Dario Faggioli
@ 2016-05-16  8:23               ` He Chen
  2016-05-16  9:44                 ` Dario Faggioli
  0 siblings, 1 reply; 12+ messages in thread
From: He Chen @ 2016-05-16  8:23 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Wei Liu, Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich, Chao Peng

On Fri, May 13, 2016 at 06:17:53PM +0200, Dario Faggioli wrote:
> On Fri, 2016-05-13 at 10:23 +0100, Andrew Cooper wrote:
> > On 13/05/16 09:55, Jan Beulich wrote:
> > > 
> > > But anyway, L2 or L3 - I can't see how this context switching would
> > > DTRT when there are vCPU-s of different domains on the same
> > > socket (or core, if L2s and MSRs were per-core): The one getting
> > > scheduled later onto a socket (core) would simply overwrite what
> > > got written for the one which had been scheduled earlier.
> > PSR_ASSOC is a per-thread MSR which selects the CLOS to use.  CLOS is
> > currently managed per-domain in Xen, and context switched with vcpu.
> > 
> Yep, exactly. I did look a bit into this for CMT (so, not L3 CAT, but
> it's not that different).
> 
> Doing things on a per-vcpu basis could be interesting, and that's even
> more the case if we get to do L2 stuff, but there are two few RMIDs
> available for such a configuration to be really useful.
> 
> > Xen programs different capacity bitmaps into IA32_L2_QOS_MASK_0 ...
> > IA32_L2_QOS_MASK_n, and the CLOS selects which bitmap is enforced.
> > 
> So, basically, just to figure out if I understood (i.e., this is for He
> Chen).
> 
> If we have a 2 sockets, with socket 0 containing cores 0,1,2,3 and
> socket 1 containing cores 4,5,6,7, it will be possible to specify two
> different "L2 reservation values" (in the form of CBMs, of course), for
> a domain:
>  - one would be how much L2 cache the domain will be able to use (say 
>    X) when running on socket 1, which means on cores 0,1,2 or 3
>  - another would be how much L2 cache the domain will be able to (say, 
>    Y use when running on socket 2, which means on cores 4,5,6, or 7
> 
> Which in turn means, in case L2 is per core, the domain will get X of
> core 0's L2, X of core 1's L2, X of core 2's L2 and X of core 3's L2.
> On socket 1, it will get Y of core 4' L2, Y of core 5's L2 cache etc.
> etc.
> 
> And so, in summary what we would not be able to specify is a different
> value for the L2 reservations of, for instance, core 1 and core 3
> (i.e., of cores that are part of the same socket).
> 
> Does this summary make sense?

Yes, great example, and that is exactly how L3 CAT works now.
Let's look at the source to make it clear:
```
void psr_ctxt_switch_to(struct domain *d)
{
    ...
    if ( psra->cos_mask )
        psr_assoc_cos(&reg, d->arch.psr_cos_ids ?
                      d->arch.psr_cos_ids[cpu_to_socket(smp_processor_id())] :
                      0, psra->cos_mask);
    ...
}
```
`psr_cos_ids` is indexed by socket_id which leads to the per-socket
cache enforcement.

As Andrew said, CLOS is currently managed per-domain in Xen and it works
well so far. So in the initial design, I am inclined to carry this
behavior (per-socket) over to L2 CAT, to keep consistency between L2 and
L3 CAT. Any thoughts?

Thanks,
-He




* Re: [RFC Design Doc] Intel L2 Cache Allocation Technology (L2 CAT) Feature enabling
  2016-05-16  8:23               ` He Chen
@ 2016-05-16  9:44                 ` Dario Faggioli
  0 siblings, 0 replies; 12+ messages in thread
From: Dario Faggioli @ 2016-05-16  9:44 UTC (permalink / raw)
  To: He Chen
  Cc: Wei Liu, Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich, Chao Peng



On Mon, 2016-05-16 at 16:23 +0800, He Chen wrote:
> As Andrew said, CLOS is currently managed per-domain in Xen and it
> works
> well so far. So in initial design, I am inclined to continue this
> behavior (per-socket) to L2 CAT to keep the consistency between L2
> and
> L3 CAT. Any thoughts?
> 
FWIW, I think this is fine. If at some point we'll want something
different, we can always extend.

Perhaps, let's keep this (the fact that we may want to make things more
fine grained) in mind when designing the interface.

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)




* [RFC Design Doc] Intel L2 Cache Allocation Technology (L2 CAT) Feature enabling
@ 2016-05-12  9:31 He Chen
  0 siblings, 0 replies; 12+ messages in thread
From: He Chen @ 2016-05-12  9:31 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, stefano.stabellini, andrew.cooper3,
	ian.jackson, jbeulich, chao.p.peng, keir

% Intel L2 Cache Allocation Technology (L2 CAT) Feature
% Revision 1.0

\clearpage

Hi all,

We plan to bring a new PQoS feature, Intel L2 Cache Allocation
Technology (L2 CAT), to Xen.

L2 CAT is supported on Atom codename Goldmont and beyond. “Big-core”
Xeon does not support L2 CAT in current generations.

This is the initial design of L2 CAT. It is somewhat long and
detailed; we hope that is not a problem.

Comments and suggestions are welcome :-)

# Basics

---------------- ----------------------------------------------------
         Status: **Tech Preview**

Architecture(s): Intel x86

   Component(s): Hypervisor, toolstack

       Hardware: Atom codename Goldmont and beyond
---------------- ----------------------------------------------------

# Overview

L2 CAT allows an OS or Hypervisor/VMM to control allocation of a
CPU's shared L2 cache based on application priority or Class of Service
(COS). Each COS is configured using capacity bitmasks (CBMs), which
represent cache capacity and indicate the degree of overlap and
isolation between classes. Once L2 CAT is configured, the processor
allows access to portions of the L2 cache according to the established
COS.

# Technical information

L2 CAT is one of the Intel PQoS features and a part of CAT; it shares
some of the base PSR infrastructure in Xen.

## Hardware perspective

L2 CAT defines a new range of MSRs to assign different L2 cache access
patterns, known as CBMs (Capacity BitMasks); each CBM is associated
with a COS.

```

                        +----------------------------+----------------+
   IA32_PQR_ASSOC       | MSR (per socket)           |    Address     |
 +----+---+-------+     +----------------------------+----------------+
 |    |COS|       |     | IA32_L2_QOS_MASK_0         |     0xD10      |
 +----+---+-------+     +----------------------------+----------------+
        └-------------> | ...                        |  ...           |
                        +----------------------------+----------------+
                        | IA32_L2_QOS_MASK_n         | 0xD10+n (n<64) |
                        +----------------------------+----------------+
```

When a context switch happens, the COS of the VCPU is written to the
per-thread MSR `IA32_PQR_ASSOC`, and hardware then enforces L2 cache
allocation according to the corresponding CBM.
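As a concrete sketch of that write (the helper name here is an assumption; the register layout, with the COS in bits 63:32 of `IA32_PQR_ASSOC` and the monitoring RMID in the low bits, is from the SDM):

```c
#include <stdint.h>

#define MSR_IA32_PQR_ASSOC 0x0C8FU  /* per-thread association register */

/* Compose the IA32_PQR_ASSOC value written on context switch:
 * COS in bits 63:32, RMID (used by PQoS monitoring) in the low bits. */
static uint64_t pqr_assoc_val(uint32_t cos, uint32_t rmid)
{
    return ((uint64_t)cos << 32) | rmid;
}
```

On the real context-switch path this value would then be handed to a `wrmsr` for `MSR_IA32_PQR_ASSOC`.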

## The relationship between L2 CAT and L3 CAT/CDP

L2 CAT is independent of L3 CAT/CDP: L2 CAT can be enabled while
L3 CAT/CDP is disabled, or both can be enabled at the same time.

L2 CAT uses a new range of MSRs from 0xD10 to 0xD10+n (n < 64),
following the L3 CAT/CDP MSRs, and supports setting L2 cache access
patterns that differ from those of the L3 cache.

N.B. L2 CAT and L3 CAT/CDP share the same COS field in the same
association register `IA32_PQR_ASSOC`, which means one COS corresponds
to a pair of L2 CBM and L3 CBM.

In the initial implementation, L2 CAT first shows up on Atom codename
Goldmont, and so far no platform supports both L2 and L3 CAT.

## Design Overview

* Core COS/CBM association

  When enforcing L2 CAT, all cores of all domains start with the same
  default COS (COS0), which is associated with the fully open CBM
  (all-ones bitmask) and thus allows access to the whole L2 cache. The
  default COS is used only in the hypervisor and is transparent to the
  tool stack and the user.

  The system administrator can change the PQoS allocation policy at
  runtime through the tool stack. Since L2 CAT shares COS with
  L3 CAT/CDP, with only CAT enabled a COS corresponds to a 2-tuple
  [L2 CBM, L3 CBM]; when CDP is enabled, one COS corresponds to a
  3-tuple [L2 CBM, L3 Code_CBM, L3 Data_CBM]. If neither L3 CAT nor
  L3 CDP is enabled, things are simpler: one COS corresponds to one
  L2 CBM.

* VCPU schedule

  This part reuses L3 CAT COS infrastructure.

* Multi-sockets

  Different sockets may have different L2 CAT capabilities (e.g. max
  COS), although the capability is consistent within one socket. So
  L2 CAT capability is reported per socket.
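For reference, the per-socket capability comes from CPUID leaf 0x10; a minimal decode of the L2 CAT sub-leaf might look like the following (the function name is illustrative; the bit layout, with EAX[4:0] holding the CBM length minus one and EDX[15:0] the maximum COS number, is from the SDM):

```c
#include <stdint.h>

/* Capability fields for one allocation feature on one socket. */
struct psr_cat_lvl_info {
    unsigned int cbm_len;
    unsigned int cos_max;
};

/* Decode the EAX/EDX values returned by CPUID leaf 0x10, sub-leaf 2
 * (the L2 CAT resource) into the capability fields above. */
static struct psr_cat_lvl_info decode_l2_cat(uint32_t eax, uint32_t edx)
{
    struct psr_cat_lvl_info info;
    info.cbm_len = (eax & 0x1f) + 1;
    info.cos_max = edx & 0xffff;
    return info;
}
```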

## Implementation Description

* Hypervisor interfaces:

  1. Ext: Boot line parameter "psr=cat" now enables both L2 CAT and
          L3 CAT if the hardware supports them.

  2. New: SYSCTL:
          - XEN_SYSCTL_PSR_CAT_get_l2_info: Get L2 CAT information.

  3. New: DOMCTL:
          - XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM: Get L2 CBM for a domain.
          - XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM: Set L2 CBM for a domain.
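One detail the SET operation has to cover: the SDM requires a CAT capacity bitmask to be a non-empty, contiguous run of set bits. A sketch of that validation (the function name is an assumption, not Xen's actual helper):

```c
#include <stdbool.h>
#include <stdint.h>

/* A CBM is valid if it is non-zero, fits in cbm_len bits (cbm_len < 64
 * assumed here), and its set bits form one contiguous run. */
static bool cbm_is_valid(unsigned int cbm_len, uint64_t cbm)
{
    if (cbm == 0 || cbm >= (1ULL << cbm_len))
        return false;

    uint64_t lowest = cbm & -cbm;     /* isolate the lowest set bit */
    uint64_t norm = cbm / lowest;     /* shift the run down to bit 0 */
    return (norm & (norm + 1)) == 0;  /* all-ones iff contiguous */
}
```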

* xl interfaces:

  1. Ext: psr-cat-show: Show system/domain L2 CAT information.
          => XEN_SYSCTL_PSR_CAT_get_l2_info /
             XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM

  2. Ext: psr-cat-cbm-set -l2 domain-id cbm
          Set the L2 CBM for a domain.
          => XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM

* Key data structure:

  1. Combined PSR bitmasks structure

     ```
     struct psr_mask {
         struct {
             union {
                 uint64_t cbm;        /* plain CAT: one L3 mask */
                 struct {
                     uint64_t code;   /* CDP: separate code/data masks */
                     uint64_t data;
                 };
             };
         } l3_cat;
         struct {
             uint64_t cbm;
         } l2_cat;
         unsigned int ref;
     };
     ```

     As mentioned above, when more than one PQoS enforcement feature
     is enabled, one COS corresponds to several capacity bitmasks. So
     we combine these bitmasks into one structure; when setting a new
     platform shared-resource allocation pattern for a guest, this
     combined structure makes it easy to compare, modify and write the
     masks.

  2. Per-socket PSR allocation features information structure

      ```
      struct psr_cat_lvl_info {
          unsigned int cbm_len;
          unsigned int cos_max;
      };

      #define PSR_SOCKET_L3_CAT (1 << 0)
      #define PSR_SOCKET_L3_CDP (1 << 1)
      #define PSR_SOCKET_L2_CAT (1 << 2)

      struct psr_socket_alloc_info {
          unsigned long features;
          struct psr_cat_lvl_info cat_lvl[4];
          struct psr_mask *cos_to_mask;
          spinlock_t mask_lock;
      };
      ```

      We collect all the PSR allocation feature information of a socket
      in this `struct psr_socket_alloc_info`.

      - Member `features`

        `features` is a bitmap indicating which features are enabled on
        the current socket. We define the `features` bitmap as:

        bits 0~1: L3 CAT status; [01] means only L3 CAT is enabled and
                  [10] means L3 CDP is enabled.

        bit 2: L2 CAT status.

      - Member `cos_to_mask`

        `cos_to_mask` is an array of `psr_mask` structures which
        represents the capacity bitmasks of the allocation features; it
        is indexed by COS.
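Putting the pieces together, a lookup on this structure could be sketched as below (simplified: the CDP code/data union is collapsed into a single L3 mask, and all names beyond those defined above are assumptions):

```c
#include <stdint.h>

#define PSR_SOCKET_L2_CAT (1u << 2)

struct psr_mask {
    uint64_t l3_cbm;  /* simplified: CDP code/data union omitted */
    uint64_t l2_cbm;
    unsigned int ref;
};

struct psr_socket_alloc_info {
    unsigned long features;
    struct psr_mask *cos_to_mask;
};

/* Return the L2 CBM for a COS, or an all-ones (no restriction) mask
 * when L2 CAT is not enabled on this socket. */
static uint64_t get_l2_cbm(const struct psr_socket_alloc_info *info,
                           unsigned int cos)
{
    if (!(info->features & PSR_SOCKET_L2_CAT))
        return ~0ULL;
    return info->cos_to_mask[cos].l2_cbm;
}
```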

# User information

* Feature Enabling:

  Add "psr=cat" to the boot line parameters to enable all supported
  levels of CAT.

* xl interfaces:

  1. `psr-cat-show [domain-id]`:

     Show system/domain L2 & L3 CAT information.

  2. `psr-cat-cbm-set [OPTIONS] domain-id cbm`:

     New options `-l2` and `-l3` are added.
     `-l2`: Specify cbm for L2 cache.
     `-l3`: Specify cbm for L3 cache.

     If neither `-l2` nor `-l3` is given, LLC (Last Level Cache) is the
     default target. Since there is no CDP support on the L2 cache, an
     error message is shown if `-l2` is given together with `--code` or
     `--data`.

# References

[Intel® 64 and IA-32 Architectures Software Developer Manuals](http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html)

# History

------------------------------------------------------------------------
Date       Revision Version  Notes
---------- -------- -------- -------------------------------------------
2016-05-12 1.0      Xen 4.7  Initial design
---------- -------- -------- -------------------------------------------


end of thread, other threads:[~2016-05-16  9:44 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-12  9:40 [RFC Design Doc] Intel L2 Cache Allocation Technology (L2 CAT) Feature enabling He Chen
2016-05-12 10:05 ` Jan Beulich
2016-05-13  6:26   ` He Chen
2016-05-13  6:48     ` Jan Beulich
2016-05-13  7:43       ` Andrew Cooper
2016-05-13  8:55         ` Jan Beulich
2016-05-13  9:23           ` Andrew Cooper
2016-05-13 16:17             ` Dario Faggioli
2016-05-16  8:23               ` He Chen
2016-05-16  9:44                 ` Dario Faggioli
2016-05-12 10:06 ` Andrew Cooper
  -- strict thread matches above, loose matches on Subject: below --
2016-05-12  9:31 He Chen
